Jaeger V2 Unveiled: Distributed Tracing Powered by OpenTelemetry

Dotan Horovits (@horovits)

Distributed Tracing has proven essential in monitoring microservices architectures and containerized workloads. The leading open source tool is Jaeger, a graduated project under the Cloud Native Computing Foundation, which was originally developed internally at Uber.

Now Jaeger is fast approaching V2, which introduces a new architecture with deep OpenTelemetry integration and promises more flexibility, performance, extensibility and ease of use.

On a recent episode of OpenObservability Talks, I sat down with Yuri Shkuro, who created Jaeger during his days at Uber (he is now at Meta) and remains a core maintainer of the project, to hear all about the upcoming major release.

Journey to OpenTelemetry started at Jaeger V1

Jaeger has been adopting OpenTelemetry for many years now, and has adapted its scope accordingly. The Jaeger project originally provided a full suite that included SDKs (tracers in Jaeger terms), an agent, and then backend services for query engine and UI.

With the introduction of OpenTelemetry, Jaeger has sunset its SDKs, moving to using OpenTelemetry SDKs for instrumenting applications with tracing.

The Jaeger Agent was also announced as deprecated around the first half of 2024 and will be sunset soon, as it’s being replaced by an implementation based on the OpenTelemetry Collector.

Another important interoperability step was made in V1 when Jaeger provided native support for OTLP, the OpenTelemetry native protocol. This meant that OpenTelemetry users could send their data without need for redundant format transformation.

Moreover, Jaeger UI was enhanced to understand the OTLP data format. Users could simply export their OpenTelemetry data as a JSON file and upload it to Jaeger UI. The UI itself still needed to resort to the backend to translate this data, but these first steps laid the foundations for a deeper OpenTelemetry integration, which is now coming with Jaeger V2.
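To make the JSON export idea concrete, here is a minimal sketch that builds a single-span trace in the OTLP/JSON shape using only the Python standard library. The function names, the service and span names, and the attribute are my own illustrative assumptions, and I have not verified this exact file against the Jaeger UI upload, so treat it as a picture of the data shape rather than a tested workflow.

```python
import json
import secrets
import time


def make_otlp_trace(service_name: str, span_name: str, duration_ms: int = 25) -> dict:
    """Build a minimal single-span trace in OTLP/JSON shape (illustrative helper)."""
    end_ns = time.time_ns()
    start_ns = end_ns - duration_ms * 1_000_000
    return {
        "resourceSpans": [
            {
                "resource": {
                    "attributes": [
                        {"key": "service.name", "value": {"stringValue": service_name}}
                    ]
                },
                "scopeSpans": [
                    {
                        # Hypothetical instrumentation scope name.
                        "scope": {"name": "example-instrumentation"},
                        "spans": [
                            {
                                # OTLP/JSON encodes IDs as hex strings.
                                "traceId": secrets.token_hex(16),  # 32 hex chars
                                "spanId": secrets.token_hex(8),  # 16 hex chars
                                "name": span_name,
                                "kind": 2,  # SPAN_KIND_SERVER
                                # 64-bit integers are written as strings in JSON.
                                "startTimeUnixNano": str(start_ns),
                                "endTimeUnixNano": str(end_ns),
                                "attributes": [
                                    {
                                        "key": "http.request.method",
                                        "value": {"stringValue": "GET"},
                                    }
                                ],
                            }
                        ],
                    }
                ],
            }
        ]
    }


def write_trace_file(path: str, trace: dict) -> None:
    """Write the trace to a JSON file that could then be uploaded or inspected."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(trace, f, indent=2)
```

Calling `write_trace_file("trace.json", make_otlp_trace("checkout", "GET /cart"))` would produce a small file in this shape; the file name and values here are placeholders.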

Jaeger V2 is going all-in on OpenTelemetry

Jaeger V1 has been around for some seven years, with a steady cadence of updates — over 60 releases, in fact. However, as distributed tracing has evolved, so has the need for a more flexible and powerful data processing pipeline. The OpenTelemetry Collector, with its advanced capabilities and extensibility, provided a natural upgrade path, which was the base for a new architecture for Jaeger.

Jaeger V2 now uses OpenTelemetry’s data format for ingesting traces and integrates OpenTelemetry Collector’s more adaptable and versatile architecture. This architectural shift addresses several limitations of V1. For example, the OpenTelemetry Collector supports a wider variety of protocols and allows data to be received from multiple sources, such as Kafka, Redis, and application-level metrics, and then processed through custom pipelines. This includes critical processing functions like sampling, filtering, and batching, operations that were previously more rigid or cumbersome in Jaeger V1.
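The pipeline idea is easier to see in a toy model. The sketch below is not the OpenTelemetry Collector API; it is a conceptual stand-in in plain Python where each processing stage (filtering, sampling, batching) is a small generator chained after the previous one. The `Span` type, the `GET /healthz` filter rule and all function names are assumptions made up for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List


@dataclass(frozen=True)
class Span:
    trace_id: str  # hex string
    service: str
    name: str


def filter_health_checks(spans: Iterable[Span]) -> Iterator[Span]:
    """Filtering stage: drop noisy spans (here, a hypothetical health-check route)."""
    for span in spans:
        if span.name != "GET /healthz":
            yield span


def sample_by_trace_id(spans: Iterable[Span], percent: int) -> Iterator[Span]:
    """Sampling stage: the decision depends only on the trace ID, so all
    spans of one trace are kept or dropped together."""
    for span in spans:
        if int(span.trace_id, 16) % 100 < percent:
            yield span


def batch(spans: Iterable[Span], size: int) -> Iterator[List[Span]]:
    """Batching stage: group spans into lists of at most `size` items."""
    current: List[Span] = []
    for span in spans:
        current.append(span)
        if len(current) == size:
            yield current
            current = []
    if current:
        yield current


def run_pipeline(received: Iterable[Span], sample_percent: int, batch_size: int) -> List[List[Span]]:
    """Chain the stages in order and return the batches an exporter would receive."""
    kept = filter_health_checks(received)
    sampled = sample_by_trace_id(kept, sample_percent)
    return list(batch(sampled, batch_size))
```

The point of the design is that stages are independent and composable: swapping the sampling rule or the batch size does not touch the other stages, which is the flexibility the pipeline approach is meant to provide.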

Another compelling reason for V2 is the plugin ecosystem. The OpenTelemetry Collector has a rich contributor base, with plugins that add custom exporters, processors, and other functionality. Jaeger V1 had to work around limitations in Go’s plugin support, which hindered extensibility. In contrast, OpenTelemetry Collector’s builder framework allows users to compile tailored versions of the collector binary with only the modules they need, avoiding overhead. While Jaeger V2 doesn’t yet support dynamic extensions via the OpenTelemetry Collector Builder, Yuri mentioned it could in the future as demand grows.

By embracing OpenTelemetry’s flexible pipeline approach, Jaeger V2 is well-equipped to support modern, scalable observability needs, positioning it as a future-proof tracing solution in the OpenTelemetry ecosystem.

Storage V2: Streamlined Performance with OpenTelemetry Compatibility

Jaeger does not come with its own database as a trace storage backend. Instead, it integrates with well-established databases, including Cassandra, Elasticsearch and OpenSearch. It also integrates via a gRPC API with other well-known databases such as ClickHouse.

The upcoming Storage V2 architecture marks a pivotal upgrade, bringing Jaeger closer to the OpenTelemetry (OTLP) standard. The primary changes in Storage V2 are focused on efficiency and compatibility.

First, it natively supports OTLP payloads, eliminating the need for data model translations between Jaeger’s format and storage backends. This reduces overhead and aligns the tracing ecosystem around a unified data model, streamlining integration with other OTLP-based tools.

Second, Storage V2 introduces batch processing for spans, a significant efficiency boost. Historically, Jaeger’s V1 storage pipeline handled one span at a time, an approach initially optimized for Cassandra but less effective for other databases.

The shift to batch processing allows Jaeger to make better use of storage solutions like ClickHouse, enhancing throughput and resource usage. This batching aligns well with OpenTelemetry’s batch-oriented data handling, further future-proofing Jaeger for more complex, scalable observability needs.
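A rough way to see why batching matters is to count round trips to the database. The sketch below uses a hypothetical in-memory `CountingStore` as a stand-in for a real backend; it is not Jaeger’s storage API, and real gains depend on the database and network involved.

```python
from typing import Dict, List


class CountingStore:
    """In-memory stand-in for a database that counts write round trips."""

    def __init__(self) -> None:
        self.rows: List[Dict[str, str]] = []
        self.round_trips = 0

    def insert_many(self, rows: List[Dict[str, str]]) -> None:
        # Each call models one request to the database, however many rows it carries.
        self.round_trips += 1
        self.rows.extend(rows)


def write_one_at_a_time(store: CountingStore, spans: List[Dict[str, str]]) -> None:
    """Span-at-a-time style: one round trip per span."""
    for span in spans:
        store.insert_many([span])


def write_in_batches(store: CountingStore, spans: List[Dict[str, str]], batch_size: int) -> None:
    """Batch style: one round trip per group of up to `batch_size` spans."""
    for start in range(0, len(spans), batch_size):
        store.insert_many(spans[start:start + batch_size])
```

With 1,000 spans and a batch size of 100, the first function makes 1,000 calls to the store and the second makes 10, while both end up storing the same rows.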

Extending OpenTelemetry to manage storage

Jaeger’s storage integration, as Yuri explained, takes OpenTelemetry a step further by tackling what lies beyond the typical telemetry generation and collection tasks.

OpenTelemetry, by design, doesn’t handle storage — it’s focused on capturing and exporting telemetry data. But Jaeger, embracing the OpenTelemetry Collector framework, has introduced robust storage management functionalities that extend OpenTelemetry’s capabilities.

Jaeger’s approach includes handling complex tasks like dynamic schema creation and management for various storage backends (e.g., Cassandra or Elasticsearch). Yuri highlighted that Jaeger’s storage backends need to accommodate different storage formats. This means Jaeger must efficiently transform data into a compatible schema and handle queries based on the underlying storage. This versatility ensures that Jaeger operates seamlessly across different backends by customizing queries for each, be it with a simple JSON query in Elasticsearch or a more manual indexing process in Cassandra.
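The contrast between the two styles of backend can be sketched with two toy in-memory stores behind one interface. This is only a conceptual model: the class names, the span fields and the `find_trace_ids` method are hypothetical and do not reflect Jaeger’s actual storage interfaces or schemas. The first store filters whole documents at query time, loosely in the spirit of a search engine; the second maintains its own lookup index at write time, loosely in the spirit of manual indexing on a key-value or wide-column database.

```python
from collections import defaultdict
from typing import Dict, List, Protocol, Set


class TraceStore(Protocol):
    """Hypothetical common interface that each backend implements its own way."""

    def write_span(self, span: Dict[str, str]) -> None: ...

    def find_trace_ids(self, service: str) -> Set[str]: ...


class DocumentStore:
    """Search-engine style: store whole documents, filter them at query time."""

    def __init__(self) -> None:
        self._docs: List[Dict[str, str]] = []

    def write_span(self, span: Dict[str, str]) -> None:
        self._docs.append(span)

    def find_trace_ids(self, service: str) -> Set[str]:
        return {doc["trace_id"] for doc in self._docs if doc["service"] == service}


class ManuallyIndexedStore:
    """Key-value style: the application maintains a lookup index at write time."""

    def __init__(self) -> None:
        self._spans_by_trace: Dict[str, List[Dict[str, str]]] = defaultdict(list)
        self._trace_ids_by_service: Dict[str, Set[str]] = defaultdict(set)

    def write_span(self, span: Dict[str, str]) -> None:
        self._spans_by_trace[span["trace_id"]].append(span)
        # Extra write: keep the service -> trace IDs index up to date.
        self._trace_ids_by_service[span["service"]].add(span["trace_id"])

    def find_trace_ids(self, service: str) -> Set[str]:
        # Return a copy so callers cannot mutate the index.
        return set(self._trace_ids_by_service.get(service, set()))
```

Both stores answer the same question with the same result, but the work lands in different places: at query time in the first, at write time in the second.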

I find this move by Jaeger to integrate storage support truly fascinating, as it exemplifies how OpenTelemetry can evolve to support more end-to-end observability needs. It even raises the possibility that other projects in the OpenTelemetry community could explore similar expansions beyond data capture and processing.

Jaeger V2 RC is here, GA soon to follow

Jaeger already has a release candidate for the new major release, Jaeger V2 RC1, and the website already has a designated docs page for the next release with some information.

The next step is general availability (GA). Yuri emphasized that Jaeger’s GA will be a significant milestone, signaling stability and readiness for widespread production use. According to Yuri, the GA should be released in early November 2024, leading up to KubeCon North America 2024.

While some non-core features, like full Kubernetes operator support, might still require further refinement post-GA, the core functionality and configuration framework are nearly ready.

Yuri described the new configuration format, which is now being finalized, as a major improvement. It offers a standardized and familiar setup based on the OpenTelemetry configuration format, superseding the cumbersome CLI flag-based configuration in Jaeger V1.

Notably, this GA release will also phase out the Jaeger agent, as newer OpenTelemetry SDKs support direct HTTP and gRPC communications, bypassing the need for UDP agents. Instead, users who prefer host-local configurations can leverage the OpenTelemetry Collector as a drop-in replacement. For organizations ready to transition, Jaeger V2’s GA provides a structured pathway to scale up while benefiting from the latest advancements in the OpenTelemetry ecosystem.

Want to learn more? Check out the OpenObservability Talks episode: Jaeger V2 Unveiled: Powered by OpenTelemetry.

Dotan Horovits (@horovits)

Technology evangelist, CNCF Ambassador, open source enthusiast, DevOps aficionado. Social: @horovits YouTube: @horovits Podcast: OpenObservability Talks