-
Notifications
You must be signed in to change notification settings - Fork 1k
OpenSearch Transport v3.0 Implementation #14823
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
import io.opentelemetry.instrumentation.api.instrumenter.Instrumenter; | ||
import io.opentelemetry.instrumentation.api.instrumenter.SpanKindExtractor; | ||
|
||
public final class OpenSearchJavaInstrumenterFactory { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if this is only used in the OpenSearchJavaSingletons class, perhaps just move it there instead of having it's own class? Often we'll have factories in the library instrumentation when it's also needed in the javaagent, reducing duplication, but as far as I can tell, it's only used in one place here
import com.google.auto.value.AutoValue; | ||
|
||
@AutoValue | ||
public abstract class OpenSearchJavaRequest { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment applies to many of these classes, but I'm not sure including "Java" in the class names is necessary
public abstract class OpenSearchJavaRequest { | |
public abstract class OpenSearchRequest { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jaydeluca I also struggled with this. The reason for the OpenSearchJava*
prefix is clear (distinguishing from OpenTelemetryRest*
and it comes from opentelemtry-java instrumentation).
Still, I'd like to settle on a direction although both have solid reasons:
Which one is the better choice
- Do not use
Java
in the class names, even if it comes from opentelemetry-java instrumentation. - Keep using
Java
in the class names, even if it feels a bit awkward.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In other instrumentations we don't have a java
package and we also don't use Java
in class names (with the exception of JavaHttpServer...
and JavaHttpClient..
). There are module names such as graphql-java
and openai-java-1.1
for libraries that java in their name though other libraries that have a similar naming scheme don't have java
in their name. I'd drop the java
package and the Java
in class name, in the context of this project we are dealing only with java frameworks anyway.
import org.opensearch.client.transport.OpenSearchTransport; | ||
import org.opensearch.client.transport.httpclient5.ApacheHttpClient5TransportBuilder; | ||
|
||
public class OpenSearchJavaApacheHttpClient5TransportTest extends AbstractOpenSearchJavaTest { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
public class OpenSearchJavaApacheHttpClient5TransportTest extends AbstractOpenSearchJavaTest { | |
class OpenSearchJavaApacheHttpClient5TransportTest extends AbstractOpenSearchJavaTest { |
import software.amazon.awssdk.utils.AttributeMap; | ||
|
||
@SuppressWarnings("deprecation") // using deprecated semconv | ||
public class OpenSearchJavaAwsSdk2TransportTest extends AbstractOpenSearchJavaTest { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
public class OpenSearchJavaAwsSdk2TransportTest extends AbstractOpenSearchJavaTest { | |
class OpenSearchJavaAwsSdk2TransportTest extends AbstractOpenSearchJavaTest { |
@SuppressWarnings( | ||
"deprecation") // RestClientTransport is deprecated but still the correct way for OpenSearch | ||
// Java 3.0 | ||
public class OpenSearchJavaRestClientTransportTest extends AbstractOpenSearchJavaTest { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
public class OpenSearchJavaRestClientTransportTest extends AbstractOpenSearchJavaTest { | |
class OpenSearchJavaRestClientTransportTest extends AbstractOpenSearchJavaTest { |
Because you are adding a new instrumentation, I think it would be a good idea to make it already "indy-ready" to make it future-proof and prevent having to rework it later with a PR like #14842. I haven't looked closely, but this should be quite close to the equivalent instrumentation for Elasticsearch, which has already been made "indy-compliant". |
@SylvainJuge good point, i haven't thought about that, i will look at "indy-ready" |
@mznet you can view this as an optional task, we'll merge this PR even if you don't do this |
@SuppressWarnings("deprecation") // using deprecated DbSystemIncubatingValues | ||
@Override | ||
public String getDbSystem(OpenSearchJavaRequest request) { | ||
return DbIncubatingAttributes.DbSystemIncubatingValues.OPENSEARCH; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
consider using the non-deprecated DbSystemNameIncubatingValues
instead
@Deprecated | ||
@Override | ||
@Nullable | ||
public String getUser(OpenSearchJavaRequest request) { | ||
return null; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you can remove this, there is a default implementation in DbClientCommonAttributesGetter
@Deprecated | ||
@Override | ||
@Nullable | ||
public String getConnectionString(OpenSearchJavaRequest request) { | ||
return null; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you can remove this, there is a default implementation in DbClientCommonAttributesGetter
@Override | ||
@Nullable | ||
public String getDbQueryText(OpenSearchJavaRequest request) { | ||
return request.getMethod() + " " + request.getOperation(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This probably isn't correct, ideally this should return the full query that is sanitized.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@laurit when i first wrote the implementation, extracting the full query text seemed impossible because the request only provided limited information. on second thought, it might actually be possible, so i’ll take another look.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using the full query isn't always desirable either because it needs to be sanitized. If it can't be sanitized then it can't be enabled by default. Unfortunately sanitizing might not be easy. I think that since the existing opensearch-rest instrumentation uses the same query text we can use it here too for now. In the future we could use the same approach to fill the query summary (https://opentelemetry.io/docs/specs/semconv/database/database-spans/#generating-a-summary-of-the-query) instead and set the query text to null. Before we can do that we have to implement support for the db.query.summary
attribute. Also since db.query.summary
is only in the stable semconv would need to figure out what to do with the old semconv (continue filling the query text as is or set it to null). cc @trask
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@laurit I was finally able to extract the full query text, but composing the sanitization logic will take more time. The existing sanitization logic can’t be reused since this isn’t just sql or sql-like but it’s a json query. I’ve roughly figured out how to approach opensearch query sanitization, but I’ll need more time to validate that it’s the right solution.
So I agree with your review. Let’s leave it as is for now, and I’ll come back once I’ve implemented the sanitization logic and can extract the full query text with it.
@SylvainJuge Since the review has already progressed this far, it might not be a good idea to change the code at this point. It would be better to add the indy-compatible instrumentation code in a separate PR. Thank you for pointing out. cc @laurit |
Looks like I did something wrong on this branch. I’ll redo it. |
2e37420
to
e66488d
Compare
I did the rebase wrong, so other people’s commits ended up in this branch’s history. I force pushed to the pre-rebase commit and cherry-picked just the commits I wanted on top. |
/easycla |
This PR adds tracing for OpenSearch Transport interface, enabling in-depth tracing of OpenSearch requests across all Transport implementations.
Closes #14652