Skip to content

Conversation

@addaleax
Copy link
Collaborator

This adds new features to the mongodb-runner package, both for programmatic and CLI usage, in order to let drivers adopt this as their main MongoDB cluster management tool.

CLI improvements:

  • Support --config <file> to provide a config file
  • Let users trigger verbose debug mode through the CLI
  • Let users specify config options via environment variables

Core features:

  • Custom arguments for the individual replset members, individual shards, or individual mongoses (and multiple mongoses, through that)
  • Support for enabling authentication on clusters as part of their startup

Related: mongodb-labs/drivers-evergreen-tools#701

async addAuthIfNeeded(): Promise<void> {
if (!this.users?.length) return;
// Sleep to give time for a possible replset election to settle.
await sleep(1000);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there something that could tell us a replset election is done earlier? Seems like a place where things could add up timewise.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, this is taken straight from @blink1073's commits, I've asked myself the same question but he can probably answer more easily. We do wait for a primary to be self-reporting as the winner of the election before, so I'm not entirely sure that this is necessary

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Despite the wait for the self-report, I was getting a notPrimary error about 50% of the time before I added the sleep.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another wrinkle: if a keyFile is used, the initial metadata insert fails with the error below. In my branch I deferred the buildInfo check if there was a keyFile. The cluster start still works in the end with the failure.

DEBUG:   mongodb-runner failed to get buildInfo, treating as closed server MongoServerError: Authentication failed.
    at Connection.sendCommand (/Users/steve.silvester/workspace/drivers-evergreen-tools/.evergreen/orchestration/node_modules/mongodb/lib/cmap/connection.js:306:27)
    at process.processTicksAndRejections (node:internal/process/task_queues:103:5)
    at async Connection.command (/Users/steve.silvester/workspace/drivers-evergreen-tools/.evergreen/orchestration/node_modules/mongodb/lib/cmap/connection.js:334:26)
    at async executeScram (/Users/steve.silvester/workspace/drivers-evergreen-tools/.evergreen/orchestration/node_modules/mongodb/lib/cmap/auth/scram.js:79:22)
    at async ScramSHA256.auth (/Users/steve.silvester/workspace/drivers-evergreen-tools/.evergreen/orchestration/node_modules/mongodb/lib/cmap/auth/scram.js:39:16)
    at async performInitialHandshake (/Users/steve.silvester/workspace/drivers-evergreen-tools/.evergreen/orchestration/node_modules/mongodb/lib/cmap/connect.js:104:13)
    at async connect (/Users/steve.silvester/workspace/drivers-evergreen-tools/.evergreen/orchestration/node_modules/mongodb/lib/cmap/connect.js:24:9) {
  errorLabelSet: Set(2) { 'HandshakeError', 'ResetPool' },
...

isClosed(): boolean {
return this.servers.length === 0 && this.shards.length === 0;
// eslint-disable-next-line @typescript-eslint/no-unused-vars
for (const _ of this.children()) return true;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should it be returning false if there are children? Not sure I follow the logic here, maybe worth a comment? I might be missing something.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This matches the existing logic – if all servers have been removed from the cluster, than it's closed. But I'm happy to add a comment here

Copy link
Member

@Anemy Anemy Nov 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should it be

Suggested change
for (const _ of this.children()) return true;
for (const _ of this.children()) return false;

Then? Wouldn't return true mean there are servers or shards? Thanks for adding a comment! Apologies if I'm totally missing this.

type: 'string',
describe: 'Configure OIDC authentication on the server',
})
.config()

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Haha I can't believe I missed this option. 😅

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I verified that it passes options through that aren't defined in the cli options, so we don't need to add all of the new options to the cli.

rsMembers: RSMemberOptions[];
};

export type ShardedOptions = {
Copy link

@blink1073 blink1073 Nov 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The way the shards are defined in our config files is more like a list of rsMembers, so I think we need to add:

| {
    shards?: never;
    shardArgs?: RSMemberOptions[][];
}
```

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, will do


export type MongoClusterOptions = Pick<
MongoServerOptions,
'logDir' | 'tmpDir' | 'args' | 'binDir' | 'docker'

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to pick defaultConnectionOptions here as well to handle the TLS options.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That doesn't sound quite right – can you describe the use case here?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How else will internal client returned by withClient be able to use TLS?

Copy link

@blink1073 blink1073 Nov 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We pass these options on to mongo-orchestration to use in its connection string for internal clients:

ssl = true
tlsCertificateKeyFile = clientCertFile
tlsAllowInvalidCertificates = true

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants