Computer Science and Software Engineering Capstone Presentations
Fall Quarter
December 18, 2020
Yang Hu, "OpenTelemetry Prometheus Remote Write Exporter" (Amazon Web Services, Inc.)
Faculty Advisor: Dr. Erika Parsons
Abstract: Application metrics such as response latency, failed
requests, and instance CPU utilization help engineers address performance
issues and identify potential optimizations; organizations seek insight
into the internal state of their services to improve the experience of
their customers. Yet as software architectures become more
microservice-oriented, container-based, and complex, it becomes difficult to
achieve good observability. The OpenTelemetry project is an in-progress open
source project, driven by Google, Microsoft, AWS, and other industry
players, that is considered the standard for observability in cloud-native
and microservice-oriented applications. It provides a set of SDKs with which
developers can instrument their codebases to collect application telemetry,
and a collector that supports sending telemetry to different data processing
backends.

A popular existing metrics backend in the world of cloud computing is
Prometheus, a monitoring solution that stores and visualizes metrics,
supports user queries over them, and sends alerts based on them.
Despite its popularity, it has several shortcomings: 1. It does not have
scalable or highly available storage; metrics are stored only on local disk.
2. It gathers metrics using a poll-based model, which requires engineers to do
additional network configuration to allow Prometheus to access the
application. This is undesirable for many users. 3. It does not have native
multi-tenant support. To address these issues, Prometheus has a remote write
feature that supports pushing the metrics it receives to another backend.
This backend should be scalable, accept pushed metrics, and have native
multi-tenant support. Remote write backends include Cortex and Thanos.
However, using another backend as a workaround is inconvenient and does not
address the issue that users do not want an external service polling or
accessing their application. To address this need and encourage users to
send metrics directly to a remote write backend without running Prometheus,
a proposal was made to add a Prometheus remote write exporter to the
OpenTelemetry project so that metrics can be sent to Cortex or Thanos
directly.

On behalf of AWS, I designed, implemented, tested,
and contributed a Prometheus remote write exporter to the OpenTelemetry
Collector project. I added the feature to convert metric data between
different formats, satisfied interface requirements on both ends of the
metric pipeline, wrote code following robustness principles, participated in
discussions around metric semantics and eventually contributed to the
specification of the OpenTelemetry project, implemented security features to
safeguard users' resources, and implemented end-to-end testing.
Updated December 15, 2020