refactor(profiling): start porting memalloc to C++ #12519
Draft
+78
−203
Python profiling is implemented in several different programming
languages: Python, Cython, C, C++, and we also call out to a Rust
library. Particularly on the native side, working on this code carries a
lot of cognitive load. In the long term we should try to consolidate the
code as much as we reasonably can, so it's easier to work on. Most of
the native code is C++. The memory profiler, memalloc, is an outlier and
is written in C. This PR starts porting memalloc to C++. There are a few
benefits I expect to get from this:
macro. I have already found this hard to debug, for example when I
found out it was prone to integer arithmetic bugs when the index type
was set to a 16-bit unsigned integer. We shouldn't even be able to do
that for a container. C++ has vectors that do this, but better.
mutex and atomic support in the standard library, with support for the
platforms we're interested in.
has hash tables. I expect that heap profiling will have poor scaling
behavior for heaps with lots of objects, and our tracking data
structure is an array. Having the door open to easily switch to a
different data structure, without having to roll our own in C or add
a 3rd-party dependency, would be good.
language, it's similar enough that we can do a gradual bug-for-bug
translation and be able to reason about its correctness (or at least
lack of significant deviance from past behavior) as we go. Rust has
all the things listed above, but it would be a much bigger jump to
go from C to a basic working Rust version than to go to C++.
Eventually we might be able to simplify this code enough in C++ that a
Rust translation actually seems easy. So Rust isn't off the table.
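To make the standard-library points in the list above concrete, here is a minimal sketch. It is not code from this PR: `grow_capacity_u16`, `append_n`, and `heap_tracker` are hypothetical names, and the wraparound shown is only one way a 16-bit index type can produce an arithmetic bug, not necessarily the one that was hit in memalloc.

```cpp
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <unordered_map>
#include <vector>

// Hypothetical helper: capacity doubling stored back into a 16-bit index type.
// The multiplication is done in int, but the narrowing cast wraps modulo 65536.
inline std::uint16_t grow_capacity_u16(std::uint16_t capacity)
{
    return static_cast<std::uint16_t>(capacity * 2);
}

// std::vector keeps its size as std::size_t, so there is no index type to
// misconfigure. Appends n elements and returns the resulting size.
inline std::size_t append_n(std::vector<int>& v, std::size_t n)
{
    for (std::size_t i = 0; i < n; ++i) {
        v.push_back(static_cast<int>(i));
    }
    return v.size();
}

// Hypothetical tracker for live allocations, keyed by pointer.
// Uses only standard-library synchronization and containers.
class heap_tracker
{
  public:
    // Records (or overwrites) the size associated with ptr.
    void track(const void* ptr, std::size_t size)
    {
        std::lock_guard<std::mutex> guard(lock_);
        live_[ptr] = size;
    }

    // Returns true if ptr was being tracked. Average O(1) lookup,
    // compared with a linear scan over an array of entries.
    bool untrack(const void* ptr)
    {
        std::lock_guard<std::mutex> guard(lock_);
        return live_.erase(ptr) == 1;
    }

    std::size_t live_count() const
    {
        std::lock_guard<std::mutex> guard(lock_);
        return live_.size();
    }

  private:
    mutable std::mutex lock_;
    std::unordered_map<const void*, std::size_t> live_;
};
```

A real allocator hook has extra constraints this sketch ignores, such as reentrancy when the container itself allocates, so treat it as an illustration of the available building blocks rather than a proposed design.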
This commit starts by renaming the memalloc files and switching some of
the dynamic array types to std::vectors. It otherwise makes no
significant changes; just enough to keep it compiling. Followup changes
will port over the synchronization code and make the `traceback_t`
implementation a bit easier to understand (e.g. using a vector rather
than allocating space for frames past the end of the actual
`traceback_t` struct).
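As a rough illustration of the `traceback_t` change described above, the sketch below contrasts the two layouts. The field names, `traceback_c_style`, `traceback_vec`, and `c_style_alloc_size` are all assumptions made for the example and are not taken from the memalloc source.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical frame record; the fields are illustrative only.
struct frame_t
{
    const char* filename;
    const char* name;
    unsigned int lineno;
};

// C-style layout: the struct declares one frame, and the allocation is
// sized so that nframe entries actually live past the end of the struct.
struct traceback_c_style
{
    std::uint16_t nframe;
    frame_t frames[1];
};

// Bytes that must be allocated by hand for a C-style traceback with nframe frames.
inline std::size_t c_style_alloc_size(std::uint16_t nframe)
{
    return offsetof(traceback_c_style, frames) +
           static_cast<std::size_t>(nframe) * sizeof(frame_t);
}

// Vector-based layout: the container owns the frame storage, so there is
// no manual size arithmetic and no separate frame count to keep in sync.
struct traceback_vec
{
    std::vector<frame_t> frames;

    void push_frame(const char* filename, const char* name, unsigned int lineno)
    {
        frames.push_back(frame_t{ filename, name, lineno });
    }
};
```

The trade-off is one extra heap allocation per traceback for the vector's storage, in exchange for code that is easier to read and harder to size incorrectly.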