Best practices for distributed tracing in Flink
I’m considering distributed tracing in the context of Flink. I have the following questions:
1. How do I implement tracing internally, within the Flink pipeline itself? That is, how do I propagate the tracing context between the different operators, from the sources to the sinks? (See the first sketch right after this list.)
2. How do I glue things together at the edges? That is, how do I extract the context when reading from the sources and inject it when writing to the sinks?
For 2 in particular, I just need to support Kafka sources & sinks. I guess the typical thing would be to use the Kafka headers for that, as described here. This also has the advantage that it does not require changes to the payload schemas, for example. A sketch of what I have in mind follows below.
More generally, are there any (Flink-specific) libraries/integrations available which facilitate the task at hand, e.g. by decorating transformations with tracing capabilities, as done here for Kafka Streams? See also this related question, or this interceptor-like wrapper, which could be another option for effectively enlarging the context for tracing purposes.
For what it’s worth, I’m mostly interested in solutions based on OpenTracing and/or OpenTelemetry.