-
Notifications
You must be signed in to change notification settings - Fork 114
Add SQL commenter as context propagation for databases #406
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Add SQL commenter as context propagation for databases #406
Conversation
Thanks for opening your first pull request! If you haven't yet signed our Contributor License Agreement (CLA), then please do so that we can accept your contribution. A link should appear shortly in this PR if you have not already signed one. |
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #406 +/- ##
============================================
- Coverage 81.55% 81.29% -0.27%
- Complexity 1659 1692 +33
============================================
Files 131 135 +4
Lines 6848 6923 +75
============================================
+ Hits 5585 5628 +43
- Misses 1263 1295 +32 Flags with carried forward coverage won't be shown. Click here to find out more.
Continue to review full report in Codecov by Sentry.
🚀 New features to boost your workflow:
|
@@ -121,6 +125,18 @@ public static function register(): void | |||
$span->setAttributes($attributes); | |||
|
|||
Context::storage()->attach($span->storeInContext($parent)); | |||
if (self::isSqlCommenterEnabled() && $sqlStatement !== 'undefined') { | |||
$sqlStatement = self::appendSqlComments($sqlStatement, true); | |||
$span->setAttributes([ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's redundant to include the sql comments in the statement, that info is already available in the trace. Does the spec suggest to do this? (comment applies to several similar bits of code)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The spec doesn't suggest this, it's a bit from our own experience--capturing the final statement in the attribute helps easily diagnose integration issues by looking at the trace (e.g. database monitoring expects certain info in the comment but didn't see it) rather than digging through server logs/system views.
But there has been the concern that including this in db.query.text
makes the value high cardinality which may not be handled well by all tracing backends. What we did for Python contrib was to make it a config option whether to collect query text before or after injection. Defer to PHP maintainers though, if you simply want to always capture pre-injection.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, there is no spec / requirement to update db.query.text
.
As the instrumentation library changed the sql statement and the database is actually going to execute the modified statement, so updating TraceAttributes::DB_QUERY_TEXT
to reflect the truth and it helps diagnose too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tracking the actual (mutated) statement that the DB executed does sound reasonable, but I do share the concern that it makes the value high cardinality. Can we make it opt-in?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure. Done!
|
||
use OpenTelemetry\API\Trace\Propagation\TraceContextPropagator; | ||
|
||
class Opentelemetry |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you look at what contrib-auto-slim does with the still-in-draft response propagation? I don't think this is a good class name, what does it do? Should there be an interface that it implements?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure which still in draft PR. Can you put the url here?
Agreed it is not a very good name, my initial thought is to keep the google sqlcommenter code. This class is to retrieve the trace context / service name required to comment sql statement. Do you have any idea about the class name?
I think using interface is a good idea as Instrumentation\MySqli
will also call the same in order to retrieve trace context / service name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think what this class needs to be is an instance of a TextMapPropagator, or perhaps its own interface like DatabaseQueryPropagationSetter
(the terminology should be spec-driven), where "carrier" is a string (database query text), and this is where the query is injected with sql comments.
In my mind, it could work something like this (based on https://github.com/open-telemetry/opentelemetry-php-contrib/blob/main/src/Instrumentation/Slim/src/SlimInstrumentation.php#L100):
if ($dbPropagationEnabled && $doServiceNamePropagation) {
if (class_exists(`\OpenTelemetry\...\ServiceNamePropagator`)) {
$prop = new \OpenTelemetry\...\ServiceNamePropagator();
$prop->inject($dbQueryString, DatabaseQueryPropagationSetter::instance(), $scope->context());
}
}
//similar for trace context propagation
I also wonder if we should use Globals::propagator()
for the trace context propagation. Users can configure and use different propagators or even provide their own (for example, for AWS x-ray), and may want those values to be propagated as well as/instead of ours? (Has this come up in spec discussions?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I understand correctly, it seems we don't need the Opentelemetry class but putting the injection part to the instrumentation code.
I updated the code in this commit. Please correct me if I get it wrong.
SQL commenter in theory is to put information about the code in the query statement and it is not limited to trace context / service name. However, the majority implementation is putting a trace context ( or the proposed service name) so that user can correlate the trace with the database queries. Also, there is a limit about the length of the query statement so I am not so sure about opening up to different propagators and that is not in the spec discussions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Re:
I also wonder if we should use Globals::propagator() for the trace context propagation. Users can configure and use different propagators or even provide their own (for example, for AWS x-ray), and may want those values to be propagated as well as/instead of ours? (Has this come up in spec discussions?)
Yes this was discussed in the spec, there seems consensus that the default would be the global propagator.
And users should be able to configure their own propagator(s), which if multiple, can be passed via a composite propagator. So my take is our implementation should accept a composite propagator instead of the current logic of running through "if with trace context", "if service propagator exists".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using Globals::propagator()
can add flexibility to configure other propagators, but it would also add high
cardinality risk and make the comment longer.
I don't think it is a good idea to include Globals
as we just want to correlate the trace with the database query.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
High cardinality will result just from the traceontext propagation, but it is a valid concern about length of comment. We may need yet another 😬 config to limit that.
{ | ||
$setter ??= ArrayAccessGetterSetter::getInstance(); | ||
|
||
$detector = new Detectors\Service(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will need some thought. Resources including service.name
can come from a variety of sources (for example, a yaml file in declarative config). I think the best source is the resource that's associated with the tracer, but that's not currently accessible. Perhaps service name needs to be added to SpanContext, so that it can be accessed in the same way that other components are for context propagation? That's a spec change, we won't just add it to our API and SDK.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, definitely the best source is from Tracer
. Agreed it requires a spec change and involves a change in the API and SDK. I will monitor the discussion of the semconv to see what is the recommendation to get the service name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this an active question or issue in spec and/or semconv? If so, can you link to it? I think it should block this being merged.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this is the semconv PR. I saw you put comments there. Indeed there is still no breakthrough on how to get the service name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The most recent comments on that PR suggest that it should be provided by the user and it may be the service name or some completely different user-defined value. We can wait for that to be agreed upon, but to me that means it's not coming from a detector, but actual user input. For an auto-instrumentation, that could be an env var or from declarative configuration.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the recommendation for MS SQL Server is to use CONTEXT_INFO
instead of sqlcomment as propagation method. As mentioned in #406 (comment) i really think we should limit this first cut to just the DBMS we're testing against, e.g. MySQL and maybe Postgres.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. Limited to postgresql
and mysql
only
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@brettmc The spec mentioned the The instrumentation implementation SHOULD append the comment to the end of the query
. Should I remove the option to allow prepending the query?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It goes on to say that Semantic conventions for individual database systems MAY specify different format, which may include different position, encoding, or schema, depending on the specific database system's requirements or preferences.
Do we know of any databases where prepending could cause an issue, or is it just user preference (in which case, removing the option seems ok to me) ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd vote to keep this option in, based on the semconv discussion where there are valid reasons for the user to choose prepend depending on database.
@jerrytfleung i know it's not addressed yet in the semconv, but we probably do want configuration to explicitly opt into particular DB systems. For example only enable for MySQL and not for MS SQL Server, etc. |
I am not sure if I get it correctly. Does it mean we need a different implementation for MS SQL Server? From this and that, it is using SET CONTEXT_INFO to pass information to MS SQL Server. |
That's right, there's now the MS SQL Server-specific semconv that states trace context propagation should be achieved via the |
private static function addSqlComments(string $query): string | ||
{ | ||
$comments = []; | ||
$prop = TraceContextPropagator::getInstance(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
other propagators can be configured, and the correct way to get them would be via Globals
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. Done.
@@ -329,4 +374,63 @@ private static function isDistributeStatementToLinkedSpansEnabled(): bool | |||
|
|||
return filter_var(get_cfg_var('otel.instrumentation.pdo.distribute_statement_to_linked_spans'), FILTER_VALIDATE_BOOLEAN, FILTER_NULL_ON_FAILURE) ?? false; | |||
} | |||
|
|||
private static function isSqlCommenterEnabled(): bool |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think a lot of this code could be moved into its own class, and then be unit tested. It seems like it shouldn't be called "sql commenter" - that was the name of the product before it was donated to otel, but it looks like the objective is to define how to do something similar/better. I guess we're waiting on approved spec changes to see what the names of things should be?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From what I can see in the pending approval PR, the name should be something like "Context Propagation", and SQL commenter is a way to realize the "Context Propagation".
Is ContextPropagation
a good class name for you?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. Moved to its own class, named ContextPropagation
. Please see if it looks good. Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you were to add the sql-server propagation mentioned elsewhere in this PR, would it be in that class, or in a different, similar class? I feel like it should be in a different one, and so maybe SqlPropagator
is an interface, and SqlCommentPropagator
is this class, with potentially a SqlServerPropagator
coming later which uses set_context_info?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not going to implement the sql-server propagation in this PR so as to make this scope manageable but I am happy to add the placeholder class SqlServerPropagator
first.
Done! I added the class SqlPropagatorInterface
, SqlCommenterPropagator
& SqlServerPropagator
.
@bobstrecansky open-telemetry/semantic-conventions#2495 has been merged to the spec. Can you please unblock this PR? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Jerry I added a couple questions/comments. Also, now that the semconv is merged, maybe the README can refer to that instead.
case 'postgresql': | ||
case 'mysql': | ||
$comments = []; | ||
Globals::propagator()->inject($comments); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jerrytfleung the semconv states:
The instrumentation SHOULD allow users to pass a propagator to overwrite the global propagator. If no propagator is provided by the user, instrumentation SHOULD use the global propagator.
It doesn't seem we currently support this. And side curiosity why not have the SqlCommentPropagator
use global propagator directly instead of passing around $comments
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is because SqlCommentPropagator is not registered in Registry. The Globals::propagator() needs environment variable OTEL_PROPAGATORS https://github.com/open-telemetry/opentelemetry-php/blob/main/src/SDK/Propagation/PropagatorFactory.php#L21 to initialize the propagator list.
|
||
namespace OpenTelemetry\Contrib\Instrumentation\PDO; | ||
|
||
class SqlServerPropagator implements SqlPropagatorInterface |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jerrytfleung naming nit ;)... IMHO ContextInfoPropagator
makes more sense given the other one is called SqlCommentPropagator
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure. Done
Co-authored-by: Brett McBride <[email protected]>
Co-authored-by: Brett McBride <[email protected]>
Co-authored-by: Brett McBride <[email protected]>
6e65ec4
to
ee41b2a
Compare
Description:
Currently, there isn't any sqlcommenter for the database instrumentation library that allows customers to correlate their span with the database query. This PR adds the implementation for sqlcommenter according to semantic-conventions PR
Testing:
Unit Testing:
Wrote unit tests for sql comment for correctness and they're all passing.
Manual Testing:
From PostgreSQL database server log,
We are able to find the modified SQL statement