Create shared schemas collector for DBM integrations #21720

Open

sethsamuel wants to merge 9 commits into master from seth.samuel/DBMON-5799-create-shared-schema-collector

Contributor

sethsamuel commented Oct 21, 2025 (edited)

What does this PR do?

Adds a shared schema collector for the DBM integrations (Postgres, MySQL, SQL Server).

Motivation

This class centralizes the logic the integrations share: iteration, buffering, submission, etc. Each integration will implement a subclass that handles the actual data retrieval and mapping. See #21501 for the Postgres implementation.
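As a rough illustration of that split, here is a minimal sketch, not the code in this PR. Only `_get_databases` and the `database['name']` lookup appear in the reviewed snippets; `SchemaCollectorSketch`, `_get_rows`, `_map_row`, `_flush`, `batch_size`, and `submitted` are hypothetical names chosen for the example, and submission is simulated by appending to a list.

```python
from typing import Any, Dict, Iterable, List


class SchemaCollectorSketch:
    """Shared loop: iterate databases, buffer mapped rows, submit in batches."""

    def __init__(self, batch_size: int = 2) -> None:
        self.batch_size = batch_size  # hypothetical setting
        self.submitted: List[List[Dict[str, Any]]] = []  # stands in for real submission
        self._buffer: List[Dict[str, Any]] = []

    # Integration-specific hooks, overridden by each subclass.
    def _get_databases(self) -> List[Dict[str, Any]]:
        raise NotImplementedError

    def _get_rows(self, database_name: str) -> Iterable[Dict[str, Any]]:
        raise NotImplementedError

    def _map_row(self, database: Dict[str, Any], row: Dict[str, Any]) -> Dict[str, Any]:
        raise NotImplementedError

    # Shared logic lives in the base class.
    def collect_schemas(self) -> None:
        for database in self._get_databases():
            for row in self._get_rows(database['name']):
                self._buffer.append(self._map_row(database, row))
                if len(self._buffer) >= self.batch_size:
                    self._flush()
        self._flush()  # submit whatever is left over

    def _flush(self) -> None:
        if self._buffer:
            self.submitted.append(self._buffer)
            self._buffer = []


class FakeCollector(SchemaCollectorSketch):
    """Toy subclass that returns canned data instead of querying a database."""

    def _get_databases(self):
        return [{'name': 'db1'}, {'name': 'db2'}]

    def _get_rows(self, database_name):
        if database_name == 'db1':
            return [{'table': 't1'}, {'table': 't2'}, {'table': 't3'}]
        return [{'table': 'u1'}]

    def _map_row(self, database, row):
        return {'database': database['name'], 'table': row['table']}


collector = FakeCollector(batch_size=2)
collector.collect_schemas()
```

With this shape, a subclass only supplies retrieval and mapping; batching and flushing stay in one place.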

Review checklist (to be filled by reviewers)

Feature or bugfix MUST have appropriate tests (unit, integration, e2e)
Add the qa/skip-qa label if the PR doesn't need to be tested during QA.
If you need to backport this PR to another branch, you can add the backport/<branch-name> label to the PR and it will automatically open a backport PR once this one is merged


Create shared schemas collector for DBM integrations

6b80e5a

temporal-github-worker-1 bot added agent/review-requested ecosystems/review-requested product/review-requested labels

datadog-agent-integrations-bot bot added the base_package label

codecov bot commented Oct 21, 2025 (edited)

Codecov Report

❌ Patch coverage is 86.81319% with 24 lines in your changes missing coverage. Please review.
✅ Project coverage is 91.10%. Comparing base (497be69) to head (9c2daa0).
⚠️ Report is 17 commits behind head on master.

sethsamuel added 3 commits

October 21, 2025 14:36

WIP

96e5260

WIP

04f8163


Changelog

4624b88

sethsamuel commented

datadog_checks_base/tests/base/utils/test_persistent_cache.py

    
    class TestCheck(AgentCheck):
        __test__ = False

Contributor Author

sethsamuel Oct 21, 2025

Fixes a pytest warning: setting `__test__ = False` stops pytest from trying to collect `TestCheck` as a test class.

sethsamuel added 3 commits

October 21, 2025 14:49


Warning

a68f875

Remove unused

aa0e0dd

Lint

3c64896

sethsamuel marked this pull request as ready for review

October 23, 2025 13:35

sethsamuel requested review from a team as code owners

October 23, 2025 13:35

datadog-agent-integrations-bot bot added team/agent-integrations team/database-monitoring-agent labels

chatgpt-codex-connector bot reviewed

chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

datadog_checks_base/datadog_checks/base/utils/db/schemas.py (resolved)

datadog_checks_base/datadog_checks/base/utils/db/schemas.py (resolved)

sethsamuel added 2 commits

October 24, 2025 09:04


AI Fixes

da84647

Fix

9c2daa0

eric-weaver reviewed

datadog_checks_base/datadog_checks/base/checks/db.py

Comment on lines +24 to +26

    
    @property
    def reported_hostname(self) -> str | None:
        raise NotImplementedError("reported_hostname is not implemented for this check")

Contributor

eric-weaver Oct 24, 2025

Suggested change

    - @property
    - def reported_hostname(self) -> str | None:
    -     raise NotImplementedError("reported_hostname is not implemented for this check")
    + @property
    + @abstractmethod
    + def reported_hostname(self) -> str | None:
    +     pass

An alternative to raising NotImplementedError would be to decorate these as abstract methods (from abc import abstractmethod). The benefit is that a missing implementation raises an error immediately when the class is instantiated, whereas the current approach raises lazily, on first property access. Marking them with abstractmethod should also get picked up by IDEs and linters, I believe.
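The difference in timing can be seen in a small standalone sketch. The class names below are hypothetical, and the sketch assumes the base class derives from `abc.ABC` (or otherwise uses `ABCMeta`), which `@abstractmethod` needs in order to be enforced; whether the real `DatabaseCheck` does so is not shown in this thread.

```python
from abc import ABC, abstractmethod
from typing import Optional


class LazyBase:
    """Current approach: the error surfaces only when the property is read."""

    @property
    def reported_hostname(self) -> Optional[str]:
        raise NotImplementedError("reported_hostname is not implemented for this check")


class AbstractBase(ABC):
    """Suggested approach: instantiation fails if the property is not overridden."""

    @property
    @abstractmethod
    def reported_hostname(self) -> Optional[str]:
        pass


class LazyCheck(LazyBase):
    pass  # forgot to override reported_hostname


class IncompleteCheck(AbstractBase):
    pass  # forgot to override reported_hostname


lazy = LazyCheck()  # succeeds; the problem is still hidden
try:
    lazy.reported_hostname
    lazy_error = None
except NotImplementedError as exc:
    lazy_error = exc  # raised only at first access

try:
    IncompleteCheck()
    abstract_error = None
except TypeError as exc:
    abstract_error = exc  # raised at instantiation
```

Note that the abstract variant raises `TypeError` at construction time rather than `NotImplementedError`, so any existing code that catches the latter would need a look.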

eric-weaver reviewed

datadog_checks_base/datadog_checks/base/utils/db/schemas.py

    
            when the current collection started.
    """
    _collection_started_at: int | None = None

Contributor

eric-weaver Oct 24, 2025

Suggested change

    - _collection_started_at: int | None = None

This can be removed. We use this attribute at the instance level, not the class level, so the class-level declaration just gets shadowed by the assignment on self.
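A minimal sketch of the shadowing behavior being described, using hypothetical class names. It assumes the attribute is initialized in `__init__` once the class-level declaration is dropped; the constructor itself is not shown in this thread.

```python
class WithClassDefault:
    # Class-level default, as in the reviewed snippet.
    _collection_started_at = None

    def start(self, now: int) -> None:
        # Creates an instance attribute that shadows the class attribute.
        self._collection_started_at = now


class WithInitOnly:
    def __init__(self) -> None:
        # Instance-level initialization; no class-level declaration needed.
        self._collection_started_at = None

    def start(self, now: int) -> None:
        self._collection_started_at = now


a = WithClassDefault()
a.start(123)

b = WithInitOnly()
b.start(456)
```

After `a.start(123)`, the class attribute is still `None`; only the instance sees `123`. Both variants behave the same from the caller's point of view, which is why the class-level line is redundant under that assumption.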

eric-weaver reviewed

datadog_checks_base/datadog_checks/base/utils/db/schemas.py

Comment on lines +57 to +60

    
    self._dbms = check.__class__.__name__.lower()
    if self._dbms == 'postgresql':
        # Backwards compatibility for metrics namespacing
        self._dbms = 'postgres'

Contributor

eric-weaver Oct 24, 2025

I think we can move this onto DatabaseCheck itself, or add an abstract DatabaseCheck.dbms method. I can see this being useful elsewhere.
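One way the first option might look, as a sketch only. The `DatabaseCheck` below is a stand-in rather than the real base class, and exposing `dbms` as a property is an assumption; the name derivation and the `postgresql` special case are copied from the snippet under review. The two subclasses are hypothetical.

```python
class DatabaseCheck:
    """Stand-in for the real base class; only the relevant part is shown."""

    @property
    def dbms(self) -> str:
        name = self.__class__.__name__.lower()
        if name == 'postgresql':
            # Backwards compatibility for metrics namespacing
            return 'postgres'
        return name


class PostgreSql(DatabaseCheck):
    pass


class MySql(DatabaseCheck):
    pass


postgres_dbms = PostgreSql().dbms
mysql_dbms = MySql().dbms
```

The collector could then read `check.dbms` instead of deriving the name itself, and other utilities would get the same value.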

eric-weaver reviewed

datadog_checks_base/datadog_checks/base/utils/db/schemas.py

Comment on lines +74 to +78

    
    This method will enforce non-overlapping invocations and
    returns False if the previous collection was still in progress when invoked again.
    """
    if self._collection_started_at is not None:
        return False

Contributor

eric-weaver Oct 24, 2025

Let's chat more about this one offline. I might be missing something, but I don't immediately see how this function is ever called in a non-blocking way, including in your reference PR where this would occur today
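To make the question concrete, here is a small sketch of such a guard. It assumes the marker is cleared in a `finally` block when collection ends, which is not shown in the reviewed snippet; `GuardedCollector`, `ReentrantCollector`, and `_collect` are hypothetical names, and `now_ms` is reimplemented locally.

```python
import time
from typing import Optional


def now_ms() -> int:
    return int(time.time() * 1000)


class GuardedCollector:
    def __init__(self) -> None:
        self._collection_started_at: Optional[int] = None

    def collect_schemas(self) -> bool:
        if self._collection_started_at is not None:
            return False  # a collection is already in progress
        self._collection_started_at = now_ms()
        try:
            self._collect()
            return True
        finally:
            # Assumed reset; lets the next invocation proceed.
            self._collection_started_at = None

    def _collect(self) -> None:
        pass  # real data retrieval would happen here


class ReentrantCollector(GuardedCollector):
    """Calls collect_schemas again while a collection is still running."""

    def __init__(self) -> None:
        super().__init__()
        self.nested_result: Optional[bool] = None

    def _collect(self) -> None:
        self.nested_result = self.collect_schemas()


sequential = GuardedCollector()
sequential_results = [sequential.collect_schemas(), sequential.collect_schemas()]

reentrant = ReentrantCollector()
outer_result = reentrant.collect_schemas()
```

Under these assumptions, back-to-back blocking calls both succeed, and the guard only returns False when a second call arrives before the first has finished (re-entrancy here, or another thread in practice).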

eric-weaver reviewed

datadog_checks_base/datadog_checks/base/utils/db/schemas.py

    
    status = "success"
    try:
        self._collection_started_at = now_ms()
        databases = self._get_databases()

Contributor

eric-weaver Oct 24, 2025

Can we add a debug log here that reports the number of databases we'll be collecting?

eric-weaver reviewed

datadog_checks_base/datadog_checks/base/utils/db/schemas.py

    
    self._collection_started_at = now_ms()
    databases = self._get_databases()
    for database in databases:
        database_name = database['name']

Contributor

eric-weaver Oct 24, 2025

Similarly here: we shouldn't log the DB name, to avoid exposing customer data, but let's emit a debug log like "collecting schemas for database ". I think these will be helpful reference markers when debugging stalls or issues on large collections.
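A sketch of what the two requested debug logs could look like, using the standard `logging` module. The logger name, the standalone `collect_schemas` function, and the "N of M" position marker are assumptions made for illustration; the reviewer's suggested message text is left open in the comment above.

```python
import logging

log = logging.getLogger("schema_collector_sketch")  # hypothetical logger name


def collect_schemas(databases):
    """Walk the databases, logging progress markers that never include a name."""
    total = len(databases)
    log.debug("Collecting schemas for %d databases", total)
    collected = 0
    for position, database in enumerate(databases, start=1):
        # Log the position only; the database name may be customer data.
        log.debug("Collecting schemas for database %d of %d", position, total)
        _ = database['name']  # the real code would query this database here
        collected += 1
    return collected
```

A position marker gives a reference point for stalls on large collections without writing any database name to the log.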

Labels

agent/review-requested base_package ecosystems/review-requested product/review-requested team/agent-integrations team/database-monitoring-agent