-
Notifications
You must be signed in to change notification settings - Fork 1.5k
Create shared schemas collector for DBM integrations #21720
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Create shared schemas collector for DBM integrations #21720
Conversation
This comment has been minimized.
This comment has been minimized.
Codecov Report❌ Patch coverage is Additional details and impacted files🚀 New features to boost your workflow:
|
|
|
||
|
|
||
| class TestCheck(AgentCheck): | ||
| __test__ = False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixes a warning in pytest
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Codex Review
Here are some automated review suggestions for this pull request.
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| @property | ||
| def reported_hostname(self) -> str | None: | ||
| raise NotImplementedError("reported_hostname is not implemented for this check") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| @property | |
| def reported_hostname(self) -> str | None: | |
| raise NotImplementedError("reported_hostname is not implemented for this check") | |
| @property | |
| @abstractmethod | |
| def reported_hostname(self) -> str | None: | |
| pass |
An alternative to raising NotImplementedError would be to decorate these as abstractmethods (from abc import abstractmethod). The benefit being unimplemented abstract methods will throw errors immediately upon class instantiation if missing vs the current approach which will lazily throw the error at first property access. Marking it abstractmethod also should get picked up in IDE's and linters I believe.
| when the current collection started. | ||
| """ | ||
|
|
||
| _collection_started_at: int | None = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| _collection_started_at: int | None = None |
This should be able to be removed. We're using this on the class instance level and not the class level so this gets overwritten on self
| self._dbms = check.__class__.__name__.lower() | ||
| if self._dbms == 'postgresql': | ||
| # Backwards compatibility for metrics namespacing | ||
| self._dbms = 'postgres' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can move this onto DatabaseCheck itself or have DatabaseCheck.dbms abstract method. I can see this being useful elsewhere
| This method will enforce non-overlapping invocations and | ||
| returns False if the previous collection was still in progress when invoked again. | ||
| """ | ||
| if self._collection_started_at is not None: | ||
| return False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's chat more about this one offline. I might be missing something, but I don't immediately see how this function is ever called in a non-blocking way, including in your reference PR where this would occur today
| status = "success" | ||
| try: | ||
| self._collection_started_at = now_ms() | ||
| databases = self._get_databases() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we add a debug log here and report the number of databases we'll be collecting
| self._collection_started_at = now_ms() | ||
| databases = self._get_databases() | ||
| for database in databases: | ||
| database_name = database['name'] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similarly here, we don't need to log the DB name to avoid customer data, but let's emit a debug log like "collecting schemas for database ". I think this will be helpful reference markers if needing to debug stalls/issues on large collections
What does this PR do?
Adds a shared schema collector for the DBM integrations (Postgres, MySQL, SQLServer).
Motivation
This class centralizes shared logic around iteration, buffering, submission, etc. Individual integrations will implement subclasses that handle actual data retrieval and mapping. See #21501 for the Postgres implementation.
Review checklist (to be filled by reviewers)
qa/skip-qalabel if the PR doesn't need to be tested during QA.backport/<branch-name>label to the PR and it will automatically open a backport PR once this one is merged