@sethsamuel sethsamuel commented Oct 21, 2025

What does this PR do?

Adds a shared schema collector for the DBM integrations (Postgres, MySQL, SQLServer).

Motivation

This class centralizes shared logic around iteration, buffering, submission, etc. Individual integrations will implement subclasses that handle actual data retrieval and mapping. See #21501 for the Postgres implementation.

Review checklist (to be filled by reviewers)

  • Feature or bugfix MUST have appropriate tests (unit, integration, e2e)
  • Add the qa/skip-qa label if the PR doesn't need to be tested during QA.
  • If you need to backport this PR to another branch, you can add the backport/<branch-name> label to the PR and it will automatically open a backport PR once this one is merged

codecov bot commented Oct 21, 2025

Codecov Report

❌ Patch coverage is 86.81319% with 24 lines in your changes missing coverage. Please review.
✅ Project coverage is 91.10%. Comparing base (497be69) to head (9c2daa0).
⚠️ Report is 17 commits behind head on master.

class TestCheck(AgentCheck):
    __test__ = False
Contributor Author

Fixes a pytest warning: __test__ = False keeps pytest from trying to collect this helper class as a test.
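
As a rough sketch of why the attribute matters: pytest collects classes whose names start with Test by default, and it warns when such a class defines a constructor. The base class below is a hypothetical stand-in, not the real AgentCheck.

```python
class AgentCheckStub:
    """Hypothetical stand-in for AgentCheck; it only needs a constructor."""

    def __init__(self, name):
        self.name = name


class TestCheck(AgentCheckStub):
    # Mark the class as a helper rather than a test case. Without this,
    # pytest tries to collect it (the name starts with "Test") and warns
    # that it cannot, because the class has an __init__ constructor.
    __test__ = False


check = TestCheck("example")
```

The class still behaves like any other check in the tests that use it; only pytest's collection step is affected.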

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Comment on lines +24 to +26
@property
def reported_hostname(self) -> str | None:
    raise NotImplementedError("reported_hostname is not implemented for this check")
Contributor

Suggested change
- @property
- def reported_hostname(self) -> str | None:
-     raise NotImplementedError("reported_hostname is not implemented for this check")
+ @property
+ @abstractmethod
+ def reported_hostname(self) -> str | None:
+     pass

An alternative to raising NotImplementedError would be to decorate these with @abstractmethod (from abc import abstractmethod). The benefit is that a subclass missing an implementation fails immediately at instantiation, whereas the current approach raises lazily on first property access. I believe IDEs and linters also pick up on abstractmethod.
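
A small sketch of the difference between the two approaches. The class names are hypothetical, and it assumes the base class can inherit from abc.ABC (or otherwise use ABCMeta), since @abstractmethod is only enforced in that case. Optional[str] is used in place of str | None so the sketch also runs on older Python versions.

```python
from abc import ABC, abstractmethod
from typing import Optional


class LazyBase:
    """Current approach: the error surfaces only when the property is read."""

    @property
    def reported_hostname(self) -> Optional[str]:
        raise NotImplementedError("reported_hostname is not implemented for this check")


class AbstractBase(ABC):
    """Suggested approach: subclasses must override the property to be instantiable."""

    @property
    @abstractmethod
    def reported_hostname(self) -> Optional[str]:
        pass


class LazyCheck(LazyBase):
    pass


class AbstractCheck(AbstractBase):
    pass


lazy = LazyCheck()  # succeeds; the failure is deferred to first access
try:
    lazy.reported_hostname
except NotImplementedError as exc:
    lazy_error = exc

try:
    AbstractCheck()  # fails here, before any property is touched
except TypeError as exc:
    eager_error = exc
```

With the abstract version, a subclass that forgets the property cannot be constructed at all, so the mistake shows up in the first test that builds the check.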

when the current collection started.
"""

_collection_started_at: int | None = None
Contributor

Suggested change
- _collection_started_at: int | None = None

This can be removed. We use it at the instance level, not the class level, so it gets overwritten on self anyway.
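
To illustrate the point, here is a minimal sketch, assuming the attribute is assigned on self in the constructor (the class names are hypothetical). The instance assignment shadows the class-level default, so the class-level line adds nothing.

```python
from typing import Optional


class WithClassDefault:
    # Class-level default, as in the line under review.
    _collection_started_at: Optional[int] = None

    def __init__(self):
        # The instance attribute shadows the class attribute from here on.
        self._collection_started_at = None


class WithoutClassDefault:
    def __init__(self):
        # Declaring and annotating on the instance is enough.
        self._collection_started_at: Optional[int] = None


a = WithClassDefault()
b = WithoutClassDefault()
a._collection_started_at = 1234
b._collection_started_at = 1234
```

Both instances behave the same; the only difference is an unused attribute left on the first class.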

Comment on lines +57 to +60
self._dbms = check.__class__.__name__.lower()
if self._dbms == 'postgresql':
    # Backwards compatibility for metrics namespacing
    self._dbms = 'postgres'
Contributor

I think we can move this onto DatabaseCheck itself, or make DatabaseCheck.dbms an abstract method. I can see this being useful elsewhere.
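
One possible shape for this, as a sketch only. The classes below are hypothetical stand-ins (the real DatabaseCheck and integration class names may differ); the default mirrors the class-name logic from the diff, and an integration overrides it where the name must stay stable.

```python
class DatabaseCheckSketch:
    """Hypothetical stand-in for DatabaseCheck."""

    @property
    def dbms(self) -> str:
        # Default: derive the name from the check class, as the collector does today.
        return self.__class__.__name__.lower()


class PostgreSql(DatabaseCheckSketch):
    """Hypothetical integration whose class name lowercases to 'postgresql'."""

    @property
    def dbms(self) -> str:
        # Backwards compatibility for metrics namespacing
        return "postgres"


class MySql(DatabaseCheckSketch):
    """Hypothetical integration that keeps the default."""
```

The collector could then read check.dbms instead of special-casing 'postgresql' itself.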

Comment on lines +74 to +78
This method will enforce non-overlapping invocations and
returns False if the previous collection was still in progress when invoked again.
"""
if self._collection_started_at is not None:
    return False
Contributor

Let's chat more about this one offline. I might be missing something, but I don't immediately see how this function is ever called in a non-blocking way, including in your reference PR, where this would occur today.
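
To make the question concrete, here is a rough sketch of such a guard. It assumes the timestamp is cleared in a finally block, which the quoted lines do not show; the class, the block flag, and the two events are hypothetical and exist only to force an overlap. Called sequentially, the guard never fires; it only returns False when a second caller arrives from another thread while the first is still running.

```python
import threading
import time


class GuardedCollector:
    """Hypothetical collector with a non-overlap guard."""

    def __init__(self):
        self._collection_started_at = None
        self.entered = threading.Event()  # set once a blocking run is in progress
        self.release = threading.Event()  # set to let a blocking run finish

    def collect(self, block=False):
        # Note: this check-then-set is not atomic; a lock would be needed
        # for a strict guarantee under real concurrency.
        if self._collection_started_at is not None:
            return False
        self._collection_started_at = int(time.time() * 1000)  # stand-in for now_ms()
        try:
            if block:
                self.entered.set()
                self.release.wait(timeout=5)
            return True
        finally:
            self._collection_started_at = None


collector = GuardedCollector()
sequential = [collector.collect(), collector.collect()]

worker = threading.Thread(target=collector.collect, kwargs={"block": True})
worker.start()
collector.entered.wait(timeout=5)
overlapping = collector.collect()  # arrives while the worker is mid-collection
collector.release.set()
worker.join()
```

If every caller invokes the method synchronously from the same thread, only the sequential case applies and the early return is unreachable.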

status = "success"
try:
    self._collection_started_at = now_ms()
    databases = self._get_databases()
Contributor

Can we add a debug log here that reports the number of databases we'll be collecting?

self._collection_started_at = now_ms()
databases = self._get_databases()
for database in databases:
    database_name = database['name']
Contributor

Similarly here: we don't need to log the DB name, to avoid customer data, but let's emit a debug log like "collecting schemas for database ". I think these will be helpful reference markers when debugging stalls or issues on large collections.
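
A sketch of what the two debug markers could look like, logging a count and a position rather than any database name. The logger name, the function, and the message wording are all hypothetical; the real collector would presumably use its own logger.

```python
import logging

log = logging.getLogger("schema_collector_sketch")


def collect_schemas(databases):
    """Walk the databases, emitting debug markers without logging any names."""
    total = len(databases)
    log.debug("collecting schemas for %d databases", total)
    collected = []
    for index, database in enumerate(databases, start=1):
        # Log a position, not database['name'], to keep customer data out of logs.
        log.debug("collecting schemas for database %d of %d", index, total)
        collected.append(database["name"])
    return collected
```

On a large collection that stalls, the last "N of M" line in the debug log shows roughly where it stopped.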
