@MadLittleMods MadLittleMods commented Oct 23, 2025

Cheaper logcontext debug logs (random_string_insecure_fast(...))

Follow-up to #18966

During the weekly Backend team meeting, it was mentioned that random_string(...) was taking a significant amount of CPU on matrix.org. This makes sense as it relies on secrets.choice(...), a cryptographically secure function that is inherently computationally expensive. And since #18966, we're calling random_string(...) as part of a bunch of logcontext utilities.

Since we don't need cryptographically secure random strings for our debug logs, this PR introduces a new random_string_insecure_fast(...) function built on random.choice(...), whose underlying pseudo-random number generator is "both fast and threadsafe" according to the Python docs.

Dev notes

Pull Request Checklist

  • Pull request is based on the develop branch
  • Pull request includes a changelog file. The entry should:
    • Be a short description of your change which makes sense to users. "Fixed a bug that prevented receiving messages from other servers." instead of "Moved X method from EventStore to EventWorkerStore.".
    • Use markdown where necessary, mostly for code blocks.
    • End with either a period (.) or an exclamation mark (!).
    • Start with a capital letter.
    • Feel free to credit yourself, by adding a sentence "Contributed by @github_username." or "Contributed by [Your Name]." to the end of the entry.
  • Code style is correct (run the linters)

return "".join(secrets.choice(_string_with_symbols) for _ in range(length))


def pseudo_random_string(length: int) -> str:
Contributor Author

I also considered naming this random_string_insecure_fast

🤷

Contributor

I think we should call this insecure_random_string. Being pseudo-random doesn't necessarily imply insecure, that's what CSPRNGs are for, after all.

It's maybe not a huge deal, but given the sheer hazard of using the wrong type of random in the wrong place, I much prefer the clear and simple insecure label, because it brings your attention to an important (negative) caveat. In theory that could raise some alarm bells at a critical time during review.

On the other hand, I don't think it's important to say fast — if someone consciously thinks about the speed at PR review time, they can look it up. (But I'm also not against calling it 'fast', to be fair!)

Contributor Author

@MadLittleMods MadLittleMods Oct 30, 2025

Originally, I didn't even consider that a simple, innocuous-sounding random_string(...) utility would have a noticeable impact in the app when used in the logcontext code. Nothing about the name hints at "cryptographically"/"crypt", which is what I would normally look for when I need something for a secure context.

Adding fast at least hints that the other variant could be slow, which prompts more consideration about which one to choose.

I'll go with random_string_insecure_fast (suffix) so it appears more readily and obviously next to random_string in typeahead.

Comment on lines 862 to 866
instance_id = pseudo_random_string(5)
calling_context = current_context()
logcontext_debug_logger.debug(
    "run_in_background(%s): called with logcontext=%s", instance_id, calling_context
)
Contributor Author

As another alternative, we could gate the random string creation behind if logcontext_debug_logger.isEnabledFor(logging.DEBUG)
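The gating alternative could look roughly like the sketch below. The logger name, the `run_in_background_sketch` function, and its string-typed `calling_context` parameter are hypothetical stand-ins; only `Logger.isEnabledFor(logging.DEBUG)` and the log message format come from the thread.

```python
import logging
import random
import string
from typing import Optional

# Hypothetical logger name; the real project configures its own.
logcontext_debug_logger = logging.getLogger("example.logcontext_debug")


def random_string_insecure_fast(length: int) -> str:
    """Non-cryptographic random string, fine for debug-log correlation IDs."""
    return "".join(random.choice(string.ascii_letters) for _ in range(length))


def run_in_background_sketch(calling_context: str) -> Optional[str]:
    """Only pay for the random ID when debug logging is actually enabled."""
    instance_id: Optional[str] = None
    if logcontext_debug_logger.isEnabledFor(logging.DEBUG):
        instance_id = random_string_insecure_fast(5)
        logcontext_debug_logger.debug(
            "run_in_background(%s): called with logcontext=%s",
            instance_id,
            calling_context,
        )
    return instance_id
```

With this shape, the string is never generated when the logger is above DEBUG, at the cost of an extra branch and an `instance_id` that may be `None` for any later log lines that want to reference it.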

@MadLittleMods MadLittleMods marked this pull request as ready for review October 23, 2025 18:38
@MadLittleMods MadLittleMods requested a review from a team as a code owner October 23, 2025 18:39
@MadLittleMods MadLittleMods changed the title from "Cheaper logcontext debug logs (pseudo_random_string(...))" to "Cheaper logcontext debug logs (random_string_insecure_fast(...))" Oct 30, 2025
@MadLittleMods MadLittleMods merged commit f0aae62 into develop Oct 30, 2025
75 of 78 checks passed
@MadLittleMods MadLittleMods deleted the madlittlemods/cheaper-logcontext-debug-logs branch October 30, 2025 16:47
@MadLittleMods
Contributor Author

Thanks for the review @reivilibre 🐡
