Distributed Tracing and Observability

TODO

How are distributed logging and tracing supposed to work for asynchronous systems, e.g. applications that use messaging or streaming products with patterns like pub-sub?
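One common answer, sketched here rather than taken from any particular product: since there is no synchronous call to carry the trace, the producer attaches the trace context to the message headers and the consumer continues the same trace from them. The sketch below uses the W3C Trace Context `traceparent` layout (`version-traceid-spanid-flags`). The `publish` and `consume` helpers and the in-memory `queue.Queue` standing in for a broker are assumptions for illustration, not a real messaging API.

```python
import queue
import secrets


def new_traceparent():
    """Start a new trace: version 00, random ids, sampled flag 01."""
    trace_id = secrets.token_hex(16)  # 32 hex characters
    span_id = secrets.token_hex(8)    # 16 hex characters
    return f"00-{trace_id}-{span_id}-01"


def child_traceparent(parent):
    """Keep the trace id, mint a new span id for the next hop."""
    version, trace_id, _parent_span_id, flags = parent.split("-")
    return f"{version}-{trace_id}-{secrets.token_hex(8)}-{flags}"


def publish(broker, payload, traceparent):
    # Hypothetical producer: the context travels as a message header.
    broker.put({"headers": {"traceparent": traceparent}, "payload": payload})


def consume(broker):
    # Hypothetical consumer: read the header and continue the same trace.
    message = broker.get()
    incoming = message["headers"]["traceparent"]
    return message["payload"], incoming, child_traceparent(incoming)


broker = queue.Queue()  # stand-in for a real topic or queue
producer_ctx = new_traceparent()
publish(broker, "order-created", producer_ctx)
payload, incoming_ctx, consumer_ctx = consume(broker)
```

The consumer's context shares the producer's trace id but has its own span id, which is what lets a tracing backend stitch the two sides together even though they ran at different times.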

https://stackify.com/what-is-observability-everything-a-beginner-needs-to-know/

  1. How to set up notifications and observability alerts?

OpenTelemetry

Traces: https://opentelemetry.io/docs/concepts/signals/traces/

As requests flow through distributed systems, it’s important to keep track of how they travel, as this can be useful for monitoring and troubleshooting.

Tracing allows you to track the journey of a request as it moves through different services in a distributed environment. It provides a way to understand the flow of operations across these services, making it easier to pinpoint performance issues or errors.

Using tracing, you can break an operation down into smaller parts by identifying what happened, where, when, and how, along with any other relevant information. This structured approach makes debugging more effective and efficient.

Tracing is a fundamental aspect of observability. A trace is a collection of spans, providing a high-level view of how a specific request or transaction moves through various services within a distributed environment. Imagine a trace as a comprehensive map that outlines the path a request takes through the system.
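To make the "trace as a collection of spans" idea concrete, here is a minimal sketch of one way to model it: every span carries the trace id plus a pointer to its parent span, and the "map" is recovered by grouping spans by trace id and walking the parent links. The `Span` class, `render_trace` function, and the service names are made up for illustration and are not the OpenTelemetry API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Span:
    trace_id: str             # shared by every span in the same trace
    span_id: str              # unique per span
    parent_id: Optional[str]  # None marks the root span
    name: str


def render_trace(spans, trace_id):
    """Return indented lines showing the span tree for one trace."""
    children = {}
    for s in spans:
        if s.trace_id == trace_id:
            children.setdefault(s.parent_id, []).append(s)

    lines = []

    def walk(parent_id, depth):
        for s in children.get(parent_id, []):
            lines.append("  " * depth + s.name)
            walk(s.span_id, depth + 1)

    walk(None, 0)  # start from the root span(s)
    return lines


spans = [
    Span("t1", "a", None, "GET /checkout"),
    Span("t1", "b", "a", "inventory-service: reserve"),
    Span("t1", "c", "a", "payment-service: charge"),
    Span("t1", "d", "c", "db: insert payment"),
    Span("t2", "e", None, "GET /health"),  # belongs to a different trace
]
trace_lines = render_trace(spans, "t1")
print("\n".join(trace_lines))
```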

Spans: https://signoz.io/blog/opentelemetry-spans/

Spans are useful for understanding performance issues within a single service, e.g. which functions are taking too long to complete.

An OpenTelemetry span represents a single unit of work within a system. It encapsulates information about a specific operation, including its start time, duration, associated attributes, and any events or errors that occur during its execution.
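As a rough illustration of what a span records, the sketch below wraps a unit of work in a context manager and captures the fields listed above: start time, duration, attributes, events, and an error status. It uses only the standard library; the `span` helper and the `finished_spans` list are assumptions for this note, not the OpenTelemetry SDK.

```python
import time
from contextlib import contextmanager

finished_spans = []  # stand-in for an exporter


@contextmanager
def span(name, **attributes):
    record = {
        "name": name,
        "attributes": dict(attributes),
        "events": [],
        "status": "OK",
        "start_time": time.time(),
    }
    started = time.perf_counter()
    try:
        yield record
    except Exception as exc:
        # Record the failure on the span, then let it propagate.
        record["status"] = "ERROR"
        record["events"].append({"name": "exception", "message": str(exc)})
        raise
    finally:
        record["duration_s"] = time.perf_counter() - started
        finished_spans.append(record)


with span("load-user", user_id=42) as s:
    s["events"].append({"name": "cache-miss"})
    time.sleep(0.01)  # pretend to do some work

try:
    with span("charge-card"):
        raise ValueError("card declined")
except ValueError:
    pass  # the span has already recorded the error
```

Comparing `duration_s` across the recorded spans is the simplest way to see which operation inside a service is slow.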

Spring Cloud Sleuth

TODO

https://www.baeldung.com/spring-cloud-sleuth-single-application

Zipkin

TODO

https://www.baeldung.com/tracing-services-with-zipkin

