Continuous Observability: Shedding Light on CI/CD Pipelines
DevOps is not only about operating software in production, but also about releasing that software to production. Well-functioning continuous integration/continuous delivery (CI/CD) pipelines are critical for the business, and this calls for quality observability to ensure that Lead Time for Changes (the time it takes a commit to reach production) is kept short and that broken and flaky pipelines are quickly identified and remediated.
On an episode of OpenObservability Talks, I hosted Oleg Nenashev, a core maintainer and board member in the Jenkins project, as well as an Ambassador at the Continuous Delivery Foundation (CDF) and Cloud Native Computing Foundation (CNCF). Oleg is a community builder and Developer Advocate at Gradle Inc. We discussed CI/CD, observability, the prominent open source projects and foundations, and a new proposal for extending OpenTelemetry to natively support CI/CD observability use cases. Here are some of the highlights of our fireside chat.
The Expanding Toolset of CI/CD Observability
The open source space boasts an array of CI/CD tools. The CDF features projects like Jenkins and Tekton, while the CNCF hosts cloud-native counterparts such as Argo and Flux. Various tools cater to different use cases and implementation preferences.
In addition, there is a vast ecosystem of observability tools to inspect CI/CD pipelines. By leveraging existing observability solutions like OpenSearch, Prometheus, and Jaeger, organizations can extend observability from production environments to their release cycles. This holistic approach enables a deep understanding of the pipeline’s performance, bottlenecks, and potential areas of improvement.
The diverse landscape offers choices for developers and organizations, enabling them to adopt the tools that best suit their needs. However, the multitude of tools, each with its own proprietary conventions, creates fragmentation that inhibits end-to-end observability across the toolchain. This is where open standards and open specifications come into play.
Standardizing on Continuous Delivery with the CDEvents Specification
Standardization plays a vital role in ensuring seamless integration and interoperability between CI/CD tools. It allows for the creation of a common language and protocol that different tools can use to communicate effectively. One significant standardization effort is the CDEvents specification, a project under the CDF which aims to define semantic conventions and schemas for continuous delivery events.
The CDEvents specification provides a concrete standard for capturing and exchanging data related to CI/CD pipelines. With CDEvents as a standard, tools from different stages of the CI/CD process can communicate seamlessly, providing a high-level view of the entire pipeline. The ability to navigate between granular details and high-level insights facilitates comprehensive analysis and optimization of the software delivery process.
Implementing CDEvents brings several benefits to the CI/CD observability landscape. It allows for the aggregation of data from multiple tools, providing a holistic view of the entire delivery process. The standardization also simplifies the creation of developer tooling, making it easier to debug, troubleshoot, and analyze pipeline activities. Additionally, CDEvents enables integration with other systems and tools, enhancing capabilities such as progressive delivery, A/B testing, and security analysis.
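To make the aggregation idea concrete, here is a minimal Python sketch of how events from different tools could be merged into one timeline per change once they share a common shape. The `DeliveryEvent` fields, the tool names, and the change IDs are assumptions made up for this illustration; they are not the actual CDEvents schema, which is defined by the specification itself.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class DeliveryEvent:
    """Simplified, hypothetical event envelope (NOT the real CDEvents schema)."""
    source: str          # tool that emitted the event, e.g. "jenkins"
    subject: str         # what the event is about, e.g. "pipelinerun"
    predicate: str       # what happened, e.g. "started" or "finished"
    change_id: str       # commit or change the event relates to
    timestamp: datetime  # when the event occurred


def group_by_change(events):
    """Aggregate events from several tools into one sorted timeline per change."""
    timelines = defaultdict(list)
    for event in events:
        timelines[event.change_id].append(event)
    for timeline in timelines.values():
        timeline.sort(key=lambda e: e.timestamp)
    return dict(timelines)


def ts(minute):
    """Helper producing made-up timestamps for the sample data."""
    return datetime(2024, 1, 1, 12, minute, tzinfo=timezone.utc)


# Sample events arriving out of order from two different (illustrative) tools.
events = [
    DeliveryEvent("argo", "service", "deployed", "abc123", ts(30)),
    DeliveryEvent("jenkins", "pipelinerun", "started", "abc123", ts(0)),
    DeliveryEvent("jenkins", "pipelinerun", "finished", "abc123", ts(12)),
    DeliveryEvent("jenkins", "pipelinerun", "started", "def456", ts(5)),
]

timelines = group_by_change(events)
for change_id, timeline in timelines.items():
    steps = [f"{e.source}:{e.subject}.{e.predicate}" for e in timeline]
    print(change_id, "->", ", ".join(steps))
```

The point of the sketch is that the grouping logic needs no per-tool parsing: because every event carries the same fields, a single function can stitch the CI and CD stages together.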
But to achieve standardization of CI/CD observability, we also need to address the telemetry collection side of things.
Realizing the Value of CI/CD Observability with OpenTelemetry
The community has been working on standardizing the collection of telemetry data for monitoring production systems. The result is OpenTelemetry, an observability framework for generating and collecting telemetry data, whether logs, metrics, traces or other signals. OpenTelemetry is an open source project under the CNCF that includes APIs, SDKs and tools, as well as an open specification for vendor-agnostic telemetry data representation and transmission. This specification and its associated protocol, OTLP, have been widely adopted, with many observability tools and vendors already natively supporting them.
It is only natural to enhance OpenTelemetry to also cover CI/CD observability use cases. This is the goal of a new OpenTelemetry extension proposal that I submitted and discussed with Oleg. The first step is to formalize semantic conventions for CI/CD within the OpenTelemetry specification, which presents a good fit for collaboration with the CDEvents project.
“Now having unified standards based on OpenTelemetry, CDEvents and other technologies, we can actually rebuild the whole delivery trace from commit [through production]…you can just hook on a value stream and build observability from the very beginning before the code appears,” Oleg says. “You get observability to all stages and all tools used for these stages.”
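As a rough illustration of what a delivery trace from commit through production could offer, the Python sketch below models each stage as a span and derives two numbers from it: the overall lead time and the slowest stage. The `Span` class is a simplified stand-in rather than the OpenTelemetry SDK type, and the stage names, durations, and attribute keys are invented for this example; they are not taken from any published semantic convention.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class Span:
    """Minimal stand-in for a trace span (NOT the OpenTelemetry SDK type)."""
    name: str
    start: datetime
    end: datetime
    parent: Optional[str] = None  # name of the parent span; None for the root
    attributes: dict = field(default_factory=dict)

    @property
    def duration(self):
        return self.end - self.start


def lead_time(spans):
    """Elapsed time from the earliest span start to the latest span end."""
    return max(s.end for s in spans) - min(s.start for s in spans)


def slowest_stage(spans):
    """Longest child span, i.e. the stage that dominates the delivery trace."""
    stages = [s for s in spans if s.parent is not None]
    return max(stages, key=lambda s: s.duration)


t0 = datetime(2024, 1, 1, 9, 0, tzinfo=timezone.utc)


def at(minutes):
    """Helper producing made-up timestamps relative to the commit time."""
    return t0 + timedelta(minutes=minutes)


# Hypothetical delivery trace; attribute keys are placeholders for illustration.
spans = [
    Span("delivery", at(0), at(47)),
    Span("commit", at(0), at(1), parent="delivery", attributes={"tool": "git"}),
    Span("build", at(1), at(13), parent="delivery", attributes={"tool": "jenkins"}),
    Span("test", at(13), at(38), parent="delivery", attributes={"tool": "jenkins"}),
    Span("deploy", at(38), at(47), parent="delivery", attributes={"tool": "argo"}),
]

print("lead time:", lead_time(spans))
print("slowest stage:", slowest_stage(spans).name)
```

With real telemetry, the same two questions (how long does a change take to ship, and where does the time go) would be answered by the tracing backend rather than by hand-written functions; the sketch only shows why a single trace spanning all tools makes them easy to ask.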
We would like to invite developers and practitioners to participate in the discussions, contribute ideas, and help shape the future of CI/CD observability, the OpenTelemetry semantic conventions and CDEvents. Discussions take place in the #cicd-o11y channel of the CNCF Slack workspace and in the #cdevents channel of the CDF workspace, both open for collaboration and knowledge sharing.
Want to learn more? Check out the OpenObservability Talks episode: Continuous Observability: Shedding Light on CI/CD Pipelines.