Skip to content

Add tracing support using Micrometer #1695

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 8 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions driver-core/build.gradle.kts
Original file line number Diff line number Diff line change
Expand Up @@ -54,6 +54,7 @@ dependencies {

optionalImplementation(libs.snappy.java)
optionalImplementation(libs.zstd.jni)
optionalImplementation(libs.micrometer)

testImplementation(project(path = ":bson", configuration = "testArtifacts"))
testImplementation(libs.reflections)
Expand Down
31 changes: 31 additions & 0 deletions driver-core/src/main/com/mongodb/MongoClientSettings.java
Original file line number Diff line number Diff line change
Expand Up @@ -30,9 +30,11 @@
import com.mongodb.connection.SslSettings;
import com.mongodb.connection.TransportSettings;
import com.mongodb.event.CommandListener;
import com.mongodb.internal.tracing.TracingManager;
import com.mongodb.lang.Nullable;
import com.mongodb.spi.dns.DnsClient;
import com.mongodb.spi.dns.InetAddressResolver;
import com.mongodb.internal.tracing.Tracer;
import org.bson.UuidRepresentation;
import org.bson.codecs.BsonCodecProvider;
import org.bson.codecs.BsonValueCodecProvider;
Expand Down Expand Up @@ -118,6 +120,7 @@ public final class MongoClientSettings {
private final InetAddressResolver inetAddressResolver;
@Nullable
private final Long timeoutMS;
private final TracingManager tracingManager;

/**
* Gets the default codec registry. It includes the following providers:
Expand Down Expand Up @@ -238,6 +241,7 @@ public static final class Builder {
private ContextProvider contextProvider;
private DnsClient dnsClient;
private InetAddressResolver inetAddressResolver;
private TracingManager tracingManager;

private Builder() {
}
Expand Down Expand Up @@ -275,6 +279,7 @@ private Builder(final MongoClientSettings settings) {
if (settings.heartbeatSocketTimeoutSetExplicitly) {
heartbeatSocketTimeoutMS = settings.heartbeatSocketSettings.getReadTimeout(MILLISECONDS);
}
tracingManager = settings.tracingManager;
}

/**
Expand Down Expand Up @@ -723,6 +728,20 @@ Builder heartbeatSocketTimeoutMS(final int heartbeatSocketTimeoutMS) {
return this;
}

/**
* Sets the tracer to use for creating Spans for operations and commands.
*
* @param tracer the tracer
* @see com.mongodb.tracing.MicrometerTracer
* @return this
* @since 5.5
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* @since 5.5
* @since 5.6

*/
@Alpha(Reason.CLIENT)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Q: Is this going to be released as alpha?

public Builder tracer(final Tracer tracer) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that Tracer and related types are part of the API, they should not be in com.mongodb.internal

this.tracingManager = new TracingManager(tracer);
return this;
}

/**
* Build an instance of {@code MongoClientSettings}.
*
Expand Down Expand Up @@ -1040,6 +1059,17 @@ public ContextProvider getContextProvider() {
return contextProvider;
}

/**
* Get the tracer to create Spans for operations and commands.
*
* @return this
* @since 5.5
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* @since 5.5
* @since 5.6

*/
@Alpha(Reason.CLIENT)
public TracingManager getTracingManager() {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be more consistent with the rest of the various settings classes to return the same type that was used to configure, rather than a wrapping manager class, i.e.

public Tracer getTracer() {
    return tracer;
}

It's easy enough to create the TracingManager within the MongoClientImpl constuctors.

Also, clients are allowed to create multiple MongoClient instances from the same MongoClientSettings, and if that happens then we'll have two multiple MongoClient instances sharing the same TracingManager instance.

return tracingManager;
}

@Override
public boolean equals(final Object o) {
if (this == o) {
Expand Down Expand Up @@ -1156,5 +1186,6 @@ private MongoClientSettings(final Builder builder) {
heartbeatConnectTimeoutSetExplicitly = builder.heartbeatConnectTimeoutMS != 0;
contextProvider = builder.contextProvider;
timeoutMS = builder.timeoutMS;
tracingManager = (builder.tracingManager == null) ? TracingManager.NO_OP : builder.tracingManager;
}
}
4 changes: 2 additions & 2 deletions driver-core/src/main/com/mongodb/MongoNamespace.java
Original file line number Diff line number Diff line change
Expand Up @@ -91,7 +91,7 @@ public static void checkCollectionNameValidity(final String collectionName) {
public MongoNamespace(final String fullName) {
notNull("fullName", fullName);
this.fullName = fullName;
this.databaseName = getDatatabaseNameFromFullName(fullName);
this.databaseName = getDatabaseNameFromFullName(fullName);
Copy link
Preview

Copilot AI Jul 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Method name corrected from 'getDatatabaseNameFromFullName' to 'getDatabaseNameFromFullName' - this appears to be an unrelated typo fix that should not be included in a tracing feature PR.

Copilot uses AI. Check for mistakes.

this.collectionName = getCollectionNameFullName(fullName);
checkDatabaseNameValidity(databaseName);
checkCollectionNameValidity(collectionName);
Expand Down Expand Up @@ -190,7 +190,7 @@ private static String getCollectionNameFullName(final String namespace) {
return namespace.substring(firstDot + 1);
}

private static String getDatatabaseNameFromFullName(final String namespace) {
private static String getDatabaseNameFromFullName(final String namespace) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like a good change, but how did it become part of this PR, as it's the only change to this file.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's just fixing a typo

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I thought CoPilot actually suggested it, and wasn't just commenting on it.

Still, our practice generally is to avoid including drive-by changes like this into PRs, especially when they are in files that are otherwise uninvolved in the PR.

Feel free to open a new PR with just this change though. No ticket required.

int firstDot = namespace.indexOf('.');
if (firstDot == -1) {
return "";
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -186,6 +186,10 @@ BsonDocument getCommandDocument(final ByteBufferBsonOutput bsonOutput) {
}
}

BsonDocument getCommand() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

recommend adding a javadoc to help disambiguate this from the getCommandDocument method.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why was this method added when there is already getCommandDocument, which provides a correct and performant command document (The command field does not, since it is most likely an instance of BsonDocumentWrapper, and it will not contain fields from the PAYLOAD_TYPE_1_DOCUMENT_SEQUENCE used in bulk writes)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This does return the original BsonDocument passed to the CommandMessage (this is needed to get the command name for example without parsing the document via getCommandDocument).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, but in some cases, when the command itself is an instance of BsonDocumentWrapper, this will result in hydrating into a Map, which is something that we assiduously try to avoid during encoding.

return command;
}

/**
* Get the field name from a buffer positioned at the start of the document sequence identifier of an OP_MSG Section of type
* `PAYLOAD_TYPE_1_DOCUMENT_SEQUENCE`.
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -51,6 +51,9 @@
import com.mongodb.internal.logging.StructuredLogger;
import com.mongodb.internal.session.SessionContext;
import com.mongodb.internal.time.Timeout;
import com.mongodb.internal.tracing.Span;
import com.mongodb.internal.tracing.TraceContext;
import com.mongodb.internal.tracing.TracingManager;
import com.mongodb.lang.Nullable;
import org.bson.BsonBinaryReader;
import org.bson.BsonDocument;
Expand Down Expand Up @@ -94,6 +97,18 @@
import static com.mongodb.internal.connection.ProtocolHelper.isCommandOk;
import static com.mongodb.internal.logging.LogMessage.Level.DEBUG;
import static com.mongodb.internal.thread.InterruptionUtil.translateInterruptedException;
import static com.mongodb.internal.tracing.Tags.CLIENT_CONNECTION_ID;
import static com.mongodb.internal.tracing.Tags.CURSOR_ID;
import static com.mongodb.internal.tracing.Tags.NAMESPACE;
import static com.mongodb.internal.tracing.Tags.QUERY_SUMMARY;
import static com.mongodb.internal.tracing.Tags.QUERY_TEXT;
import static com.mongodb.internal.tracing.Tags.SERVER_ADDRESS;
import static com.mongodb.internal.tracing.Tags.SERVER_CONNECTION_ID;
import static com.mongodb.internal.tracing.Tags.SERVER_PORT;
import static com.mongodb.internal.tracing.Tags.SERVER_TYPE;
import static com.mongodb.internal.tracing.Tags.SESSION_ID;
import static com.mongodb.internal.tracing.Tags.SYSTEM;
import static com.mongodb.internal.tracing.Tags.TRANSACTION_NUMBER;
import static java.util.Arrays.asList;

/**
Expand Down Expand Up @@ -376,7 +391,7 @@ public <T> T sendAndReceive(final CommandMessage message, final Decoder<T> decod
message, decoder, operationContext);
try {
return sendAndReceiveInternal.get();
} catch (MongoCommandException e) {
} catch (Throwable e) {
if (reauthenticationIsTriggered(e)) {
return reauthenticateAndRetry(sendAndReceiveInternal, operationContext);
}
Expand All @@ -391,9 +406,8 @@ public <T> void sendAndReceiveAsync(final CommandMessage message, final Decoder<

AsyncSupplier<T> sendAndReceiveAsyncInternal = c -> sendAndReceiveAsyncInternal(
message, decoder, operationContext, c);
beginAsync().<T>thenSupply(c -> {
sendAndReceiveAsyncInternal.getAsync(c);
}).onErrorIf(e -> reauthenticationIsTriggered(e), (t, c) -> {

beginAsync().thenSupply(sendAndReceiveAsyncInternal::getAsync).onErrorIf(this::reauthenticationIsTriggered, (t, c) -> {
reauthenticateAndRetryAsync(sendAndReceiveAsyncInternal, operationContext, c);
}).finish(callback);
}
Expand Down Expand Up @@ -432,15 +446,31 @@ public boolean reauthenticationIsTriggered(@Nullable final Throwable t) {
private <T> T sendAndReceiveInternal(final CommandMessage message, final Decoder<T> decoder,
final OperationContext operationContext) {
CommandEventSender commandEventSender;
Span tracingSpan = createTracingSpan(message, operationContext);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider pushing all the changes made to this class down into LoggingCommandEventSender, which already contains a lot of similar logic. LoggingCommandEventSender might need some refactoring as a result, but it seems better than messing too much with this class, which already has enough responsibilities as it is.


try (ByteBufferBsonOutput bsonOutput = new ByteBufferBsonOutput(this)) {
message.encode(bsonOutput, operationContext);
commandEventSender = createCommandEventSender(message, bsonOutput, operationContext);
BsonDocument commandDocument = message.getCommandDocument(bsonOutput);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This always hydrates the commandDocument, even if its potentially not used, this will impact performance. So it hydrated if:

tracingSpan == null and

 boolean listensOrLogs = commandListener != null || COMMAND_PROTOCOL_LOGGER.isRequired(DEBUG, getClusterId());
        if (!recordEverything && (isMonitoringConnection || !opened() || !authenticated.get() || !listensOrLogs)) {
            return new NoOpCommandEventSender();

We should avoid this and only create it if its used.


commandEventSender = createCommandEventSender(message, commandDocument, operationContext);
commandEventSender.sendStartedEvent();

if (tracingSpan != null && operationContext.getTracingManager().isCommandPayloadEnabled()) {
tracingSpan.tag(QUERY_TEXT, commandDocument.toJson());
}

try {
sendCommandMessage(message, bsonOutput, operationContext);
} catch (Exception e) {
if (tracingSpan != null) {
tracingSpan.error(e);
}
commandEventSender.sendFailedEvent(e);
throw e;
} finally {
if (tracingSpan != null) {
tracingSpan.end();
}
}
}

Expand Down Expand Up @@ -553,7 +583,8 @@ private <T> void sendAndReceiveAsyncInternal(final CommandMessage message, final

try {
message.encode(bsonOutput, operationContext);
CommandEventSender commandEventSender = createCommandEventSender(message, bsonOutput, operationContext);
BsonDocument commandDocument = message.getCommandDocument(bsonOutput);
CommandEventSender commandEventSender = createCommandEventSender(message, commandDocument, operationContext);
commandEventSender.sendStartedEvent();
Compressor localSendCompressor = sendCompressor;
if (localSendCompressor == null || SECURITY_SENSITIVE_COMMANDS.contains(message.getCommandDocument(bsonOutput).getFirstKey())) {
Expand Down Expand Up @@ -952,19 +983,87 @@ public void onResult(@Nullable final ByteBuf result, @Nullable final Throwable t

private static final StructuredLogger COMMAND_PROTOCOL_LOGGER = new StructuredLogger("protocol.command");

private CommandEventSender createCommandEventSender(final CommandMessage message, final ByteBufferBsonOutput bsonOutput,
private CommandEventSender createCommandEventSender(final CommandMessage message, final BsonDocument commandDocument,
final OperationContext operationContext) {
boolean listensOrLogs = commandListener != null || COMMAND_PROTOCOL_LOGGER.isRequired(DEBUG, getClusterId());
if (!recordEverything && (isMonitoringConnection || !opened() || !authenticated.get() || !listensOrLogs)) {
return new NoOpCommandEventSender();
}
return new LoggingCommandEventSender(
SECURITY_SENSITIVE_COMMANDS, SECURITY_SENSITIVE_HELLO_COMMANDS, description, commandListener,
operationContext, message, bsonOutput,
operationContext, message, commandDocument,
COMMAND_PROTOCOL_LOGGER, loggerSettings);
}

private ClusterId getClusterId() {
return description.getConnectionId().getServerId().getClusterId();
}

/**
* Creates a tracing span for the given command message.
* <p>
* The span is only created if tracing is enabled and the command is not security-sensitive.
* It attaches various tags to the span, such as database system, namespace, query summary, opcode,
* server address, port, server type, client and server connection IDs, and, if applicable,
* transaction number and session ID. For cursor fetching commands, the parent context is retrieved using the cursor ID.
* If command payload tracing is enabled, the command document is also attached as a tag.
*
* @param message the command message to trace
* @param operationContext the operation context containing tracing and session information
* @return the created {@link Span}, or {@code null} if tracing is not enabled or the command is security-sensitive
*/
@Nullable
private Span createTracingSpan(final CommandMessage message, final OperationContext operationContext) {
TracingManager tracingManager = operationContext.getTracingManager();
BsonDocument command = message.getCommand();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that message.getCommand is available does it make sense to use it everywhere where in InternalStreamConnection for example where: SECURITY_SENSITIVE_COMMANDS.contains(message.getCommandDocument(bsonOutput).getFirstKey()) is used.

String commandName = command.getFirstKey();
if (!tracingManager.isEnabled()
|| SECURITY_SENSITIVE_COMMANDS.contains(commandName)
|| SECURITY_SENSITIVE_HELLO_COMMANDS.contains(commandName)) {
return null;
}

// Retrieving the appropriate parent context for the span.
TraceContext parentContext;
long cursorId = -1;
if (command.containsKey("getMore")) {
cursorId = command.getInt64("getMore").longValue();
parentContext = tracingManager.getCursorParentContext(cursorId);
} else {
parentContext = tracingManager.getParentContext(operationContext.getId());
}

Span span = tracingManager
.addSpan("Command " + commandName, parentContext)
.tag(SYSTEM, "mongodb")
.tag(NAMESPACE, message.getNamespace().getDatabaseName())
.tag(QUERY_SUMMARY, command.toString());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be command.toJson() ?


if (cursorId != -1) {
span.tag(CURSOR_ID, cursorId);
}

tagServerAndConnectionInfo(span, message);
tagSessionAndTransactionInfo(span, operationContext);

return span;
}

private void tagServerAndConnectionInfo(final Span span, final CommandMessage message) {
span.tag(SERVER_ADDRESS, serverId.getAddress().getHost())
.tag(SERVER_PORT, String.valueOf(serverId.getAddress().getPort()))
.tag(SERVER_TYPE, message.getSettings().getServerType().name())
.tag(CLIENT_CONNECTION_ID, this.description.getConnectionId().toString())
.tag(SERVER_CONNECTION_ID, String.valueOf(this.description.getConnectionId().getServerValue()));
}

private void tagSessionAndTransactionInfo(final Span span, final OperationContext operationContext) {
SessionContext sessionContext = operationContext.getSessionContext();
if (sessionContext.hasSession() && !sessionContext.isImplicitSession()) {
span.tag(TRANSACTION_NUMBER, String.valueOf(sessionContext.getTransactionNumber()))
.tag(SESSION_ID, String.valueOf(sessionContext.getSessionId()
.get(sessionContext.getSessionId().getFirstKey())
.asBinary().asUuid()));
}
}
}
Original file line number Diff line number Diff line change
Expand Up @@ -78,7 +78,7 @@ class LoggingCommandEventSender implements CommandEventSender {
@Nullable final CommandListener commandListener,
final OperationContext operationContext,
final CommandMessage message,
final ByteBufferBsonOutput bsonOutput,
final BsonDocument commandDocument,
final StructuredLogger logger,
final LoggerSettings loggerSettings) {
this.description = description;
Expand All @@ -88,7 +88,7 @@ class LoggingCommandEventSender implements CommandEventSender {
this.loggerSettings = loggerSettings;
this.startTimeNanos = System.nanoTime();
this.message = message;
this.commandDocument = message.getCommandDocument(bsonOutput);
this.commandDocument = commandDocument;
this.commandName = commandDocument.getFirstKey();
this.redactionRequired = securitySensitiveCommands.contains(commandName)
|| (securitySensitiveHelloCommands.contains(commandName) && commandDocument.containsKey("speculativeAuthenticate"));
Expand Down
Loading