End-of-Year Observability Retrospective with Charity Majors
As 2024 draws to an end, I invited Charity Majors to run a retrospective on the year in the 2024 concluding episode of OpenObservability Talks. Charity and I recently delivered keynotes at Open Source Observability Day, which sparked fascinating discussions on the evolution of open observability and its impact on the broader industry, a good precursor for our fireside chat.
Together we ran a yearly postmortem on the key insights and trends, exploring what the observability community and industry have accomplished this year, as well as what’s on the horizon for observability in 2025 and beyond.
Charity Majors is the co-founder and CTO of Honeycomb. She pioneered the concept of modern Observability, drawing on her years of experience building and managing massive distributed systems at Parse (acquired by Facebook), Facebook, and Linden Lab building Second Life. She is the co-author of Observability Engineering and Database Reliability Engineering (O’Reilly).
OpenTelemetry trends
OpenTelemetry has soared this year in adoption, signaling a unified way to generate and collect telemetry while avoiding vendor lock-in.
Furthermore, adoption of OpenTelemetry’s open specification and semantic conventions means that vendors can do more complicated and interesting things with your data without you having to put in a lot of effort upfront for that to happen.
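To make that concrete, here is a minimal sketch (my illustration, not an SDK snippet) of why shared attribute names matter: when every producer emits the same keys, any consumer can compute something like an error rate without per-vendor field mapping. The attribute names below are taken from OpenTelemetry’s stable HTTP semantic conventions; the events and helper function are made up for illustration.

```python
# Sketch: standardized attribute names (OpenTelemetry HTTP semantic
# conventions) let any tool interpret telemetry without custom mapping.
# The span data and helper below are illustrative, not from any SDK.

SPANS = [
    {"http.request.method": "GET",  "url.path": "/checkout", "http.response.status_code": 200},
    {"http.request.method": "POST", "url.path": "/checkout", "http.response.status_code": 500},
    {"http.request.method": "GET",  "url.path": "/health",   "http.response.status_code": 200},
]

def error_rate(spans):
    """Fraction of spans with a 5xx status, relying only on the
    standard attribute name to locate the status code."""
    if not spans:
        return 0.0
    errors = sum(1 for s in spans if s.get("http.response.status_code", 0) >= 500)
    return errors / len(spans)

print(error_rate(SPANS))  # one span out of three is a 5xx
```

Because the keys are standardized, this same function works on spans from any compliant instrumentation, which is the effort-saving Charity describes.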
While Charity feels that OpenTelemetry in 2024 was about stabilizing things and growing adoption, I feel that we’ve yet to scratch the surface on many fronts. Indeed, in classic IT backend monitoring with logs, metrics, and traces it’s getting there, but newer use cases like client-side monitoring, real-user monitoring (RUM), and CI/CD observability are still far from stability.
Are we at Observability 2.0? Or 3.0?
Charity has been advocating for observability 2.0 in 2024. In her view, Observability 2.0 has one source of truth, wide structured log events, from which you can derive all the other data types. This is contrasted with the “three pillars of observability” approach, namely logs, metrics and traces.
This brings two advantages:
- Cost-effectiveness with a unified data storage and avoiding replicating data across telemetry signals (logs, metrics, traces)
- The ability to slice and dice data, by retaining as much high-cardinality data as possible.
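The “one source of truth” idea can be sketched in a few lines (a toy illustration of the concept, not Honeycomb’s implementation): keep a single store of wide, high-cardinality structured events, and derive metric-like and trace-like views from it on demand instead of emitting three separate signals.

```python
# Toy sketch of Observability 2.0's "wide events" idea: one store of
# wide structured events, from which metric and trace views are derived.
# All field names and values here are invented for illustration.
from collections import Counter

EVENTS = [
    {"trace_id": "t1", "service": "cart", "duration_ms": 12, "status": "ok",  "user_id": "u42"},
    {"trace_id": "t1", "service": "pay",  "duration_ms": 88, "status": "err", "user_id": "u42"},
    {"trace_id": "t2", "service": "cart", "duration_ms": 9,  "status": "ok",  "user_id": "u7"},
]

def derive_metric(events, key):
    """A metric view: counts grouped by any attribute.
    High-cardinality keys like user_id work just as well."""
    return Counter(e[key] for e in events)

def derive_trace(events, trace_id):
    """A trace view: all events sharing a trace_id, in recorded order."""
    return [e for e in events if e["trace_id"] == trace_id]

print(derive_metric(EVENTS, "status"))   # metric-shaped aggregate
print(len(derive_trace(EVENTS, "t1")))   # trace-shaped slice
```

Since the raw events are never pre-aggregated, the same data answers both questions, which is where the cost and slice-and-dice advantages come from.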
I’m wary of the semantic versioning approach to branding names. And, not surprisingly, shortly after, Hazel Weakly published a blog post advocating for Observability 3.0. I’m not going down this rabbit hole.
The transformation I see in observability is the data analytics paradigm shift. You can call it 2.0 or 3.0, but we need to move away from obsessing about logs, metrics and traces (and newer signals such as profiles), because they are merely the raw data. What we’re interested in are the insights! This means:
- enriching and correlating data
- unified querying, visualization and alerting
- improving the signal-to-noise ratio
- collecting data of different sources, formats and types
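As a minimal sketch of the enrich-and-correlate step (entirely illustrative: the field names and deployment table are made up), you can join raw events with deployment metadata and surface which release correlates with errors, turning raw signals into an actual insight:

```python
# Illustrative sketch: enrich raw events with deployment metadata, then
# correlate error rates by release to answer "which release is failing?".
# All service names, releases, and fields are invented for this example.
from collections import defaultdict

DEPLOYS = {"cart": "v1.4", "pay": "v2.0"}  # enrichment source: service -> release

RAW = [
    {"service": "cart", "status": "ok"},
    {"service": "pay",  "status": "err"},
    {"service": "pay",  "status": "err"},
    {"service": "cart", "status": "ok"},
]

def error_rate_by_release(events, deploys):
    totals, errors = defaultdict(int), defaultdict(int)
    for e in events:
        release = deploys.get(e["service"], "unknown")  # enrichment: join on service
        totals[release] += 1
        if e["status"] == "err":
            errors[release] += 1
    return {r: errors[r] / totals[r] for r in totals}

print(error_rate_by_release(RAW, DEPLOYS))  # {'v1.4': 0.0, 'v2.0': 1.0}
```

The point is that the correlation (v2.0 is the problem) only emerges after enrichment and unified querying; no single raw signal contains it.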
Observability for developers and business insights
Charity commented that the business and marketing side of the house has had these data analytics capabilities for 20 years already. Just look at Vertica and the various data lakes and columnar stores used for these cases. It’s time we started thinking more like data analysts ourselves. “We’re kinda playing catchup when it comes to some of this stuff,” Charity commented, “that is a little bit embarrassing.”
Observability for Charity is focused on how you develop your code. Is the code being used the way I expected? Does it function the way I designed it? I double down on this notion: for me, observability is about the product, the UX, the business. You can be an entrepreneur trying to tune the PLG experience or the pricing model; either way, you’d need SaaS observability.
AI for DevOps and Observability
You can’t summarize 2024 without talking about Artificial Intelligence (AI), can you?
Charity urged caution against overpromising. While AI can streamline data processing and insights generation, its impact is most meaningful when paired with strong foundational practices.
I emphasized that although it was overly hyped, I’ve seen practical benefits. One example is querying with natural language, which lends itself to Generative AI, and lowers the barrier to entry for engineers needing to master Lucene for logs, PromQL for metrics, and who knows which other domain-specific languages (DSLs) they have in their systems and observability tools. Helping us find correlations, so that we don’t need to rely on visualizations so heavily is another place where I see AI assisting us.
Platform engineering is maturing
Platform engineering was another hot topic in 2024, and one with direct relevance for observability. This year we really saw platform engineering becoming production grade (as I predicted a year ago :-)), and in many organizations the observability stack and practices have shifted to be governed by a central platform team at scale.
Charity noted the importance of holistic organizational change to support platform engineering, ensuring developers can own, understand, and operate their code.
We also discussed the importance of treating platform as a product. This is also the topic of the current research we conduct at the Cloud Native Computing Foundation’s Platform working group. If you’re up for it, take the Platform as a Product survey and help the open source community understand how it is being practiced out there.
We got to talk about more stuff, such as controlling cost and additional use cases for observability.
Want to learn more? Check out the OpenObservability Talks episode: End-of-Year Observability Retrospective with Charity Majors.