Skip to content

Refactor Counter metrics to be homeserver-scoped #18656

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 52 commits into from
Jul 25, 2025

Conversation

MadLittleMods
Copy link
Contributor

@MadLittleMods MadLittleMods commented Jul 4, 2025

Bulk refactor Counter metrics to be homeserver-scoped. We also add lints to make sure that new Counter metrics don't sneak in without using the server_name label (SERVER_NAME_LABEL).

All of the "Fill in" commits are just bulk refactor.

Part of #18592

Testing strategy

  1. Add the metrics listener in your homeserver.yaml
    listeners:
      # This is just showing how to configure metrics either way
      #
      # `http` `metrics` resource
      - port: 9322
        type: http
        bind_addresses: ['127.0.0.1']
        resources:
          - names: [metrics]
            compress: false
      # `metrics` listener
      - port: 9323
        type: metrics
        bind_addresses: ['127.0.0.1']
  2. Start the homeserver: poetry run synapse_homeserver --config-path homeserver.yaml
  3. Fetch http://localhost:9322/_synapse/metrics and/or http://localhost:9323/metrics
  4. Observe response includes the synapse_user_registrations_total, synapse_http_server_response_count_total, etc metrics with the server_name label

Dev notes

mypy plugins

https://mypy.readthedocs.io/en/latest/extending_mypy.html

poetry run mypy synapse/util/ratelimitutils.py

ErrorCodes defined in mypy plugins seemingly cannot be ignored by --disable-error-code python/mypy#12987

(doesn't work because of above issue)

poetry run mypy --disable-error-code missing-server-name-label
...
mypy: error: Invalid error code(s): missing-server-name-label

Refactor notes

More special refactors:

  • synapse/http/request_metrics.py
  • synapse/metrics/background_process_metrics.py

Unused metrics:

  • notified_events_counter
  • push_rules_invalidation_counter
  • push_rules_state_size_counter
  • remove_pusher_counter

Todo

Pull Request Checklist

  • Pull request is based on the develop branch
  • Pull request includes a changelog file. The entry should:
    • Be a short description of your change which makes sense to users. "Fixed a bug that prevented receiving messages from other servers." instead of "Moved X method from EventStore to EventWorkerStore.".
    • Use markdown where necessary, mostly for code blocks.
    • End with either a period (.) or an exclamation mark (!).
    • Start with a capital letter.
    • Feel free to credit yourself, by adding a sentence "Contributed by @github_username." or "Contributed by [Your Name]." to the end of the entry.
  • Code style is correct (run the linters)

@MadLittleMods MadLittleMods changed the title Add lints to ensure Prometheus metrics have the server_name label Refactor Counter metrics to be homeserver-scoped Jul 4, 2025
…f `synapse/http/request_metrics.py`)

```
synapse/logging/opentracing.py:1084: error: Cannot determine type of "request_metrics"  [has-type]
synapse/logging/opentracing.py:1100: error: Cannot determine type of "request_metrics"  [has-type]
synapse/logging/opentracing.py:1106: error: Unused "type: ignore" comment  [unused-ignore]
synapse/logging/opentracing.py:1106: error: Cannot determine type of "request_metrics"  [has-type]
synapse/logging/opentracing.py:1106: note: Error code "has-type" not covered by "type: ignore" comment

synapse/http/server.py:141: error: Cannot determine type of "request_metrics"  [has-type]
synapse/http/server.py:151: error: Cannot determine type of "request_metrics"  [has-type]
synapse/http/server.py:333: error: Cannot determine type of "request_metrics"  [has-type]
synapse/http/server.py:546: error: Cannot determine type of "request_metrics"  [has-type]
synapse/http/site.py:132: error: Cannot determine type of "server_name"  [has-type]
synapse/http/site.py:341: error: Cannot determine type of "request_metrics"  [has-type]
```
Conflicts:
	synapse/handlers/device.py
	synapse/metrics/__init__.py
	synapse/storage/_base.py
	synapse/storage/databases/main/end_to_end_keys.py
MadLittleMods added a commit that referenced this pull request Jul 23, 2025
Part of #18592

Separated out of #18656
because it's a bigger, unique piece of the refactor


### Testing strategy

 1. Add the `metrics` listener in your `homeserver.yaml`
    ```yaml
    listeners:
      # This is just showing how to configure metrics either way
      #
      # `http` `metrics` resource
      - port: 9322
        type: http
        bind_addresses: ['127.0.0.1']
        resources:
          - names: [metrics]
            compress: false
      # `metrics` listener
      - port: 9323
        type: metrics
        bind_addresses: ['127.0.0.1']
    ```
1. Start the homeserver: `poetry run synapse_homeserver --config-path
homeserver.yaml`
1. Fetch `http://localhost:9322/_synapse/metrics` and/or
`http://localhost:9323/metrics`
1. Observe response includes the background processs metrics
(`synapse_background_process_start_count`,
`synapse_background_process_db_txn_count_total`, etc) with the
`server_name` label
Conflicts:
	synapse/appservice/scheduler.py
	synapse/handlers/device.py
	synapse/handlers/typing.py
	synapse/push/pusherpool.py
	synapse/replication/tcp/redis.py
	synapse/util/task_scheduler.py
@@ -65,6 +89,85 @@ def get_method_signature_hook(
return None


def check_prometheus_metric_instantiation(ctx: FunctionSigContext) -> CallableType:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

New linting to prevent new Counter metrics from being introduced without the SERVER_NAME_LABEL

@MadLittleMods MadLittleMods marked this pull request as ready for review July 23, 2025 20:06
@MadLittleMods MadLittleMods requested a review from a team as a code owner July 23, 2025 20:06
MadLittleMods added a commit that referenced this pull request Jul 23, 2025
MadLittleMods added a commit that referenced this pull request Jul 23, 2025
Wait for #18656 to be merged
so we have `self.server_name` available.
MadLittleMods added a commit that referenced this pull request Jul 23, 2025
Wait for #18656 to merge
so we have access to `self.server_name`
MadLittleMods added a commit that referenced this pull request Jul 23, 2025
Wait for #18656 to be merged
so we have `self.server_name` available.
MadLittleMods added a commit that referenced this pull request Jul 23, 2025
Wait for #18656 to merge
so we have access to `self.server_name`
MadLittleMods added a commit that referenced this pull request Jul 23, 2025
MadLittleMods added a commit that referenced this pull request Jul 24, 2025
Copy link
Member

@devonh devonh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These changes all look good and are pretty straightforward swaps to add the new server name label.

Thanks for taking the time to cleanup things like docstrings along the way :)

@@ -137,8 +139,6 @@ def __init__(

logger.info("Created Mailer for app_name %s", app_name)

emails_sent_counter.labels("password_reset")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Huh, odd. Your cleanup looks good to me 🤷

@MadLittleMods MadLittleMods merged commit 2c236be into develop Jul 25, 2025
74 of 78 checks passed
@MadLittleMods MadLittleMods deleted the madlittlemods/18592-metric-lint-labels branch July 25, 2025 19:58
@MadLittleMods
Copy link
Contributor Author

Thanks for the review @devonh 🐎

MadLittleMods added a commit that referenced this pull request Jul 29, 2025
Bulk refactor `Histogram` metrics to be homeserver-scoped. We also add
lints to make sure that new `Histogram` metrics don't sneak in without
using the `server_name` label (`SERVER_NAME_LABEL`).

Part of #18592



### Testing strategy

 1. Add the `metrics` listener in your `homeserver.yaml`
    ```yaml
    listeners:
      # This is just showing how to configure metrics either way
      #
      # `http` `metrics` resource
      - port: 9322
        type: http
        bind_addresses: ['127.0.0.1']
        resources:
          - names: [metrics]
            compress: false
      # `metrics` listener
      - port: 9323
        type: metrics
        bind_addresses: ['127.0.0.1']
    ```
1. Start the homeserver: `poetry run synapse_homeserver --config-path
homeserver.yaml`
1. Fetch `http://localhost:9322/_synapse/metrics` and/or
`http://localhost:9323/metrics`
1. Observe response includes the TODO metrics with the `server_name`
label

### Todo

- [x] Wait for #18656 to merge


### Dev notes

```
LoggingDatabaseConnection
make_conn
make_pool
make_fake_db_pool
```

### Pull Request Checklist

<!-- Please read
https://element-hq.github.io/synapse/latest/development/contributing_guide.html
before submitting your pull request -->

* [x] Pull request is based on the develop branch
* [x] Pull request includes a [changelog
file](https://element-hq.github.io/synapse/latest/development/contributing_guide.html#changelog).
The entry should:
- Be a short description of your change which makes sense to users.
"Fixed a bug that prevented receiving messages from other servers."
instead of "Moved X method from `EventStore` to `EventWorkerStore`.".
  - Use markdown where necessary, mostly for `code blocks`.
  - End with either a period (.) or an exclamation mark (!).
  - Start with a capital letter.
- Feel free to credit yourself, by adding a sentence "Contributed by
@github_username." or "Contributed by [Your Name]." to the end of the
entry.
* [x] [Code
style](https://element-hq.github.io/synapse/latest/code_style.html) is
correct (run the
[linters](https://element-hq.github.io/synapse/latest/development/contributing_guide.html#run-the-linters))
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants