
feat(profiling): add support for pytorch profiling #9154

Merged Dec 13, 2024 (114 commits; diff shows changes from 75 commits)

Commits
363064e
Port some of the original PR to new libdatadog
sanchda May 3, 2024
570f5fa
Add some of the pytorch stuff
sanchda May 3, 2024
48e389b
Remove unused tag
sanchda May 3, 2024
6f6eabb
Merge branch 'main' into peterg17/pytorch_profiling_integration2
peterg17 May 21, 2024
4770f40
[PROF-9710] Add instrumentation for torch.profiler.
peterg17 May 23, 2024
78c5b6b
[PROF-9710] Format profiling module with black.
peterg17 May 23, 2024
137e66f
fixup! Add some of the pytorch stuff
peterg17 May 23, 2024
193d8e0
fixup! [PROF-9710] Add instrumentation for torch.profiler.
peterg17 May 23, 2024
17f18ec
fixup! [PROF-9710] Add instrumentation for torch.profiler.
peterg17 May 23, 2024
a7d52f9
fixup! [PROF-9710] Add instrumentation for torch.profiler.
peterg17 May 24, 2024
7c598b0
fixup! [PROF-9710] Add instrumentation for torch.profiler.
peterg17 May 26, 2024
56fa87f
[PROF-9710] Add pytorch integration release note.
peterg17 May 28, 2024
2dfabe3
Merge branch 'main' into peterg17/pytorch_profiling_integration2
sanchda May 30, 2024
ac7c7c2
Merge branch 'main' into peterg17/pytorch_profiling_integration2
sanchda Jun 4, 2024
f90dffb
Some fixups
sanchda Jun 4, 2024
2c579f5
Add better type management
sanchda Jun 4, 2024
a671241
Merge branch 'main' into peterg17/pytorch_profiling_integration2
sanchda Jun 5, 2024
6b3bfe6
Merge branch 'main' into peterg17/pytorch_profiling_integration2
peterg17 Jul 9, 2024
c9940a2
Fix leftover merge conflict, format.
peterg17 Jul 9, 2024
c0d8ef9
fixup! Fix leftover merge conflict, format.
peterg17 Jul 9, 2024
46f46f1
fixup! Fix leftover merge conflict, format.
peterg17 Jul 9, 2024
d331e26
fixup! Fix leftover merge conflict, format.
peterg17 Jul 9, 2024
824d109
Apply ruff format fix for _ddup.pyi.
peterg17 Jul 9, 2024
1aaf847
fixup! Fix leftover merge conflict, format.
peterg17 Jul 9, 2024
514fec6
fixup! Fix leftover merge conflict, format.
peterg17 Jul 9, 2024
f00352c
Fix type annotation warning.
peterg17 Jul 9, 2024
380b5d7
Fix default pytorch config value (needs to be False).
peterg17 Jul 10, 2024
3e906ae
Add documentation for PyTorch profiling integration.
peterg17 Jul 10, 2024
4047701
fixup! Add documentation for PyTorch profiling integration.
peterg17 Jul 10, 2024
c38b7d2
Revert profiling changes unrelated to PyTorch.
peterg17 Jul 10, 2024
09b36c3
Add pytorch profiler terms to docs spelling word list.
peterg17 Jul 10, 2024
55784df
Fix pytorch documentation spelling warnings.
peterg17 Jul 10, 2024
4cb74e7
Improve PyTorch documentation wording.
peterg17 Jul 10, 2024
d0637e1
Remove unused parameters from pytorch wrapper.
peterg17 Jul 10, 2024
df721ab
fixup! Remove unused parameters from pytorch wrapper.
peterg17 Jul 10, 2024
00d3270
fixup! Improve PyTorch documentation wording.
peterg17 Jul 10, 2024
6d7c1f2
Improve doc formatting.
peterg17 Jul 10, 2024
5e4eb0d
fixup! Improve doc formatting.
peterg17 Jul 10, 2024
1f76aa3
fixup! Improve doc formatting.
peterg17 Jul 10, 2024
96a746a
[PROF-9710] Cleanup from PR feedback.
peterg17 Jul 11, 2024
28ed224
fixup! [PROF-9710] Cleanup from PR feedback.
peterg17 Jul 11, 2024
17e5ef5
Merge branch 'main' into peterg17/pytorch_profiling_integration2
sanchda Jul 12, 2024
a24383e
Update docs/advanced_usage.rst
danielsn Dec 9, 2024
5adcc86
Merge branch 'main' into peterg17/pytorch_profiling_integration2
danielsn Dec 9, 2024
58d4133
Merge branch 'main' into peterg17/pytorch_profiling_integration2
danielsn Dec 9, 2024
2857bd4
format and import wrapt
danielsn Dec 9, 2024
8eeeeb9
fix ProfilingConfig
danielsn Dec 9, 2024
a753fdf
[PROF-9710] Add pytorch gpu tests Github Workflow.
peterg17 Dec 10, 2024
e80bbab
[PROF-9710] Fix pytorch CI.
peterg17 Dec 10, 2024
a2201ea
[PROF-9710] Try fixing pytorch CI again.
peterg17 Dec 10, 2024
39f7fbe
get rid of attrs, and use the correct name for get/set_original
danielsn Dec 10, 2024
2a52598
make ruff happy
danielsn Dec 10, 2024
b6f1d26
ci(profiling): add torch dependency to pytorch CI test.
peterg17 Dec 10, 2024
a0754b3
ci(profiling): add torchvision dependency for pytorch CI test.
peterg17 Dec 10, 2024
9c82135
PR comments
danielsn Dec 10, 2024
712a74d
fix wrong variable checked for null
danielsn Dec 10, 2024
5f0f3e4
nicer refactor of sample collection
danielsn Dec 10, 2024
3cd874f
nicer comment formatting
danielsn Dec 10, 2024
c2c1e41
ci(profiling): add pytorch cpu test in CI for integration.
peterg17 Dec 10, 2024
f23dd74
Log interesting events
danielsn Dec 10, 2024
092c23c
log the event if it had no data
danielsn Dec 10, 2024
8e58455
ci(profiling): see pytorch test output in stdout.
peterg17 Dec 10, 2024
71ef8b0
ci(profiling): print out pytorch test stdout.
peterg17 Dec 10, 2024
754d1fe
super init class
danielsn Dec 10, 2024
eabbf1d
simplify super:
danielsn Dec 10, 2024
61709c2
pass recorder not tracer
danielsn Dec 10, 2024
69c5962
ci(profiling): import ddtrace auto into pytorch test scripts.
peterg17 Dec 10, 2024
58642c0
ci(profiling): ignore ruff import errors for ddtrace auto lines.
peterg17 Dec 10, 2024
d7b6417
remove duplicate arg to pytorch constructor.
peterg17 Dec 10, 2024
d6c1307
make debug logging less verbose.
peterg17 Dec 10, 2024
4e8c419
better log events
danielsn Dec 10, 2024
7ed98c0
track cputime
danielsn Dec 10, 2024
62bcd71
use new pytorch function trace_start_ns()
peterg17 Dec 10, 2024
819936d
use device memory usage
danielsn Dec 10, 2024
7e048ed
Merge branch 'main' into peterg17/pytorch_profiling_integration2
danielsn Dec 10, 2024
fa90dab
Update .github/workflows/pytorch_gpu_tests.yml
danielsn Dec 11, 2024
6cb0f81
Update .github/workflows/pytorch_gpu_tests.yml
danielsn Dec 11, 2024
a6bcff5
Update .github/workflows/pytorch_gpu_tests.yml
danielsn Dec 11, 2024
374b0cd
Merge branch 'main' into peterg17/pytorch_profiling_integration2
danielsn Dec 11, 2024
18069f0
ci(profiling): use Github Actions GPU runner for pytorch workflow.
peterg17 Dec 11, 2024
3c3f4ec
ci(profiling): handle pytorch profiling start timestamp.
peterg17 Dec 11, 2024
b00adeb
ci(profiling): adjust pytorch test program to use ddtrace-run.
peterg17 Dec 11, 2024
5a457c4
ci(profiling): add env vars to trigger pytorch integration test.
peterg17 Dec 11, 2024
30d9755
debug pytorch CI.
peterg17 Dec 11, 2024
471c6f1
more CI debugging for pytorch test.
peterg17 Dec 11, 2024
5dd28de
fix(profiling): handle gpu memory across different pytorch versions.
peterg17 Dec 11, 2024
6a00184
ci(profiling): enable ddtrace debug logging in pytorch test.
peterg17 Dec 11, 2024
992f4d7
ci(profiling): enable file output for pprof in pytorch test
peterg17 Dec 11, 2024
da54e20
Merge branch 'main' into peterg17/pytorch_profiling_integration2
danielsn Dec 11, 2024
ffc8f12
cleanup the collection code, and make it randomly sample if too many …
Dec 11, 2024
90f0731
clean up time calculation code for pytorch.
peterg17 Dec 11, 2024
9e4ebd9
use time_elapsed variable.
peterg17 Dec 11, 2024
1cbda1d
fix(profiling): insert static file name to avoid sample being dropped.
peterg17 Dec 12, 2024
f421a79
Make timeline and flamegraphs work
Dec 12, 2024
1f2cc36
fix(profiling): comment out security linting violation, not applicable.
peterg17 Dec 13, 2024
ed6acd0
debug pytorch pprof output.
peterg17 Dec 13, 2024
c312099
ci(profiling): debug pytorch pprof output job.
peterg17 Dec 13, 2024
d5f94d6
ci(profiling): debug pprof output.
peterg17 Dec 13, 2024
124bda1
ci(profiling): debug pprof output filename.
peterg17 Dec 13, 2024
2ce6f8d
handle pprof output in test for different cases.
peterg17 Dec 13, 2024
39d22ae
more pytorch CI debugging.
peterg17 Dec 13, 2024
cc3e89d
use better pprof parsing function for test.
peterg17 Dec 13, 2024
0bf0597
add lz4 dependency.
peterg17 Dec 13, 2024
41fdd83
ci(profiling): fix pytorch test pprof file prefix.
peterg17 Dec 13, 2024
3299218
ci(profiling): refactor pytorch profiling tests into their own file.
peterg17 Dec 13, 2024
a77e167
ci(profiling): fix gpu time sample test.
peterg17 Dec 13, 2024
9e36686
better names for the pseudoframes and CUDA lane
Dec 13, 2024
4df82d8
ci(profiling): fix pytorch gpu test.
peterg17 Dec 13, 2024
c69660a
ci(profiling): remove thread id/name from pytorch test.
peterg17 Dec 13, 2024
25339fe
add push_absolute_ns
Dec 13, 2024
3d23dfc
reorder frames
Dec 13, 2024
1af4ce2
ci(profiling): move pytorch test to fix profiling CI, rewords docs.
peterg17 Dec 13, 2024
a4492b7
Merge branch 'main' into peterg17/pytorch_profiling_integration2
danielsn Dec 13, 2024
9393216
Trigger Build
Dec 13, 2024
40 changes: 40 additions & 0 deletions .github/workflows/pytorch_gpu_tests.yml
@@ -0,0 +1,40 @@
name: Pytorch Unit Tests (with GPU)

on:
pull_request:
branches:
- 'main'
paths:
- 'ddtrace/profiling/collector/pytorch.py'
workflow_dispatch:

jobs:
unit-tests:
strategy:
matrix:
os: [ubuntu-latest]
arch: [x86_64]
runs-on: ${{ matrix.os }}
steps:
- uses: actions/checkout@v4
# Include all history and tags
with:
fetch-depth: 0

- uses: actions/setup-python@v5
name: Install Python
with:
python-version: '3.12'

- uses: actions-rust-lang/setup-rust-toolchain@v1
- name: Install latest stable toolchain and rustfmt
run: rustup update stable && rustup default stable && rustup component add rustfmt clippy

- name: Install hatch
run: pip install hatch

- name: Install PyTorch
run: pip install torch

- name: Run tests
run: hatch run profiling_pytorch:test
@@ -44,6 +44,9 @@ extern "C"
void ddup_push_release(Datadog::Sample* sample, int64_t release_time, int64_t count);
void ddup_push_alloc(Datadog::Sample* sample, int64_t size, int64_t count);
void ddup_push_heap(Datadog::Sample* sample, int64_t size);
void ddup_push_gpu_gputime(Datadog::Sample* sample, int64_t time, int64_t count);
void ddup_push_gpu_memory(Datadog::Sample* sample, int64_t mem, int64_t count);
void ddup_push_gpu_flops(Datadog::Sample* sample, int64_t flops, int64_t count);
void ddup_push_lock_name(Datadog::Sample* sample, std::string_view lock_name);
void ddup_push_threadinfo(Datadog::Sample* sample,
int64_t thread_id,
@@ -56,6 +59,7 @@ extern "C"
void ddup_push_trace_type(Datadog::Sample* sample, std::string_view trace_type);
void ddup_push_exceptioninfo(Datadog::Sample* sample, std::string_view exception_type, int64_t count);
void ddup_push_class_name(Datadog::Sample* sample, std::string_view class_name);
void ddup_push_gpu_device_name(Datadog::Sample*, std::string_view device_name);
void ddup_push_frame(Datadog::Sample* sample,
std::string_view _name,
std::string_view _filename,
@@ -45,7 +45,8 @@ namespace Datadog {
X(local_root_span_id, "local root span id") \
X(trace_type, "trace type") \
X(class_name, "class name") \
X(lock_name, "lock name")
X(lock_name, "lock name") \
X(gpu_device_name, "gpu device name")

#define X_ENUM(a, b) a,
#define X_STR(a, b) b,
@@ -100,6 +100,9 @@ class Sample
bool push_release(int64_t lock_time, int64_t count);
bool push_alloc(int64_t size, int64_t count);
bool push_heap(int64_t size);
bool push_gpu_gputime(int64_t time, int64_t count);
bool push_gpu_memory(int64_t size, int64_t count);
bool push_gpu_flops(int64_t flops, int64_t count);

// Adds metadata to sample
bool push_lock_name(std::string_view lock_name);
@@ -117,6 +120,9 @@
bool is_timeline_enabled() const;
static void set_timeline(bool enabled);

// Pytorch GPU metadata
bool push_gpu_device_name(std::string_view device_name);

// Assumes frames are pushed in leaf-order
void push_frame(std::string_view name, // for ddog_prof_Function
std::string_view filename, // for ddog_prof_Function
@@ -11,7 +11,10 @@ enum SampleType : unsigned int
LockRelease = 1 << 4,
Allocation = 1 << 5,
Heap = 1 << 6,
All = CPU | Wall | Exception | LockAcquire | LockRelease | Allocation | Heap
GPUTime = 1 << 7,
GPUMemory = 1 << 8,
GPUFlops = 1 << 9,
All = CPU | Wall | Exception | LockAcquire | LockRelease | Allocation | Heap | GPUTime | GPUMemory | GPUFlops
};

// Every Sample object has a corresponding `values` vector, since libdatadog expects contiguous values per sample.
@@ -30,6 +33,12 @@ struct ValueIndex
unsigned short alloc_space;
unsigned short alloc_count;
unsigned short heap_space;
unsigned short gpu_time;
unsigned short gpu_count;
unsigned short gpu_alloc_space;
unsigned short gpu_alloc_count;
unsigned short gpu_flops;
unsigned short gpu_flops_samples; // Should be "count," but flops is already a count
};

} // namespace Datadog
@@ -193,6 +193,24 @@ ddup_push_heap(Datadog::Sample* sample, int64_t size) // cppcheck-suppress unuse
sample->push_heap(size);
}

void
ddup_push_gpu_gputime(Datadog::Sample* sample, int64_t time, int64_t count) // cppcheck-suppress unusedFunction
{
sample->push_gpu_gputime(time, count);
}

void
ddup_push_gpu_memory(Datadog::Sample* sample, int64_t size, int64_t count) // cppcheck-suppress unusedFunction
{
sample->push_gpu_memory(size, count);
}

void
ddup_push_gpu_flops(Datadog::Sample* sample, int64_t flops, int64_t count) // cppcheck-suppress unusedFunction
{
sample->push_gpu_flops(flops, count);
}

void
ddup_push_lock_name(Datadog::Sample* sample, std::string_view lock_name) // cppcheck-suppress unusedFunction
{
@@ -252,6 +270,12 @@ ddup_push_class_name(Datadog::Sample* sample, std::string_view class_name) // cp
sample->push_class_name(class_name);
}

void
ddup_push_gpu_device_name(Datadog::Sample* sample, std::string_view gpu_device_name) // cppcheck-suppress unusedFunction
{
sample->push_gpu_device_name(gpu_device_name);
}

void
ddup_push_frame(Datadog::Sample* sample, // cppcheck-suppress unusedFunction
std::string_view _name,
17 changes: 17 additions & 0 deletions ddtrace/internal/datadog/profiling/dd_wrapper/src/profile.cpp
@@ -89,6 +89,23 @@ Datadog::Profile::setup_samplers()
if (0U != (type_mask & SampleType::Heap)) {
val_idx.heap_space = get_value_idx("heap-space", "bytes");
}
if (0U != (type_mask & SampleType::GPUTime)) {
val_idx.gpu_time = get_value_idx("gpu-time", "nanoseconds");
val_idx.gpu_count = get_value_idx("gpu-samples", "count");
}
if (0U != (type_mask & SampleType::GPUMemory)) {
// In the backend the unit is called 'gpu-space', but maybe for consistency
// it should be gpu-alloc-space
// gpu-alloc-samples may be unused, but it's passed along for scaling purposes
val_idx.gpu_alloc_space = get_value_idx("gpu-space", "bytes");
val_idx.gpu_alloc_count = get_value_idx("gpu-alloc-samples", "count");
}
if (0U != (type_mask & SampleType::GPUFlops)) {
// Technically "FLOPS" is a unit, but we call it a 'count' because no
// other profiler uses it as a unit.
val_idx.gpu_flops = get_value_idx("gpu-flops", "count");
val_idx.gpu_flops_samples = get_value_idx("gpu-flops-samples", "count");
}

// Whatever the first sampler happens to be is the default "period" for the profile
// The value of 1 is a pointless default.
46 changes: 46 additions & 0 deletions ddtrace/internal/datadog/profiling/dd_wrapper/src/sample.cpp
@@ -262,6 +262,42 @@ Datadog::Sample::push_heap(int64_t size)
return false;
}

bool
Datadog::Sample::push_gpu_gputime(int64_t time, int64_t count)
{
if (0U != (type_mask & SampleType::GPUTime)) {
values[profile_state.val().gpu_time] += time * count;
values[profile_state.val().gpu_count] += count;
return true;
}
std::cout << "bad push gpu" << std::endl;
return false;
}

bool
Datadog::Sample::push_gpu_memory(int64_t size, int64_t count)
{
if (0U != (type_mask & SampleType::GPUMemory)) {
values[profile_state.val().gpu_alloc_space] += size * count;
values[profile_state.val().gpu_alloc_count] += count;
return true;
}
std::cout << "bad push gpu memory" << std::endl;
return false;
}

bool
Datadog::Sample::push_gpu_flops(int64_t size, int64_t count)
{
if (0U != (type_mask & SampleType::GPUFlops)) {
values[profile_state.val().gpu_flops] += size * count;
values[profile_state.val().gpu_flops_samples] += count;
return true;
}
std::cout << "bad push gpu flops" << std::endl;
return false;
}

bool
Datadog::Sample::push_lock_name(std::string_view lock_name)
{
@@ -351,6 +387,16 @@ Datadog::Sample::push_class_name(std::string_view class_name)
return true;
}

bool
Datadog::Sample::push_gpu_device_name(std::string_view device_name)
{
if (!push_label(ExportLabelKey::gpu_device_name, device_name)) {
std::cout << "bad push" << std::endl;
return false;
}
return true;
}

bool
Datadog::Sample::push_monotonic_ns(int64_t _monotonic_ns)
{
24 changes: 14 additions & 10 deletions ddtrace/internal/datadog/profiling/ddup/_ddup.pyi
@@ -19,19 +19,23 @@ def start() -> None: ...
def upload() -> None: ...

class SampleHandle:
def push_cputime(self, value: int, count: int) -> None: ...
def push_walltime(self, value: int, count: int) -> None: ...
def flush_sample(self) -> None: ...
def push_acquire(self, value: int, count: int) -> None: ...
def push_release(self, value: int, count: int) -> None: ...
def push_alloc(self, value: int, count: int) -> None: ...
def push_class_name(self, class_name: StringType) -> None: ...
def push_cputime(self, value: int, count: int) -> None: ...
def push_exceptioninfo(self, exc_type: Union[None, bytes, str, type], count: int) -> None: ...
def push_frame(self, name: StringType, filename: StringType, address: int, line: int) -> None: ...
def push_gpu_device_name(self, device_name: StringType) -> None: ...
def push_gpu_flops(self, value: int, count: int) -> None: ...
def push_gpu_gputime(self, value: int, count: int) -> None: ...
def push_gpu_memory(self, value: int, count: int) -> None: ...
def push_heap(self, value: int) -> None: ...
def push_lock_name(self, lock_name: StringType) -> None: ...
def push_frame(self, name: StringType, filename: StringType, address: int, line: int) -> None: ...
def push_threadinfo(self, thread_id: int, thread_native_id: int, thread_name: StringType) -> None: ...
def push_monotonic_ns(self, monotonic_ns: int) -> None: ...
def push_release(self, value: int, count: int) -> None: ...
def push_span(self, span: Optional[Span]) -> None: ...
def push_task_id(self, task_id: Optional[int]) -> None: ...
def push_task_name(self, task_name: StringType) -> None: ...
def push_exceptioninfo(self, exc_type: Union[None, bytes, str, type], count: int) -> None: ...
def push_class_name(self, class_name: StringType) -> None: ...
def push_span(self, span: Optional[Span]) -> None: ...
def push_monotonic_ns(self, monotonic_ns: int) -> None: ...
def flush_sample(self) -> None: ...
def push_threadinfo(self, thread_id: int, thread_native_id: int, thread_name: StringType) -> None: ...
def push_walltime(self, value: int, count: int) -> None: ...
32 changes: 32 additions & 0 deletions ddtrace/internal/datadog/profiling/ddup/_ddup.pyx
@@ -67,6 +67,9 @@ cdef extern from "ddup_interface.hpp":
void ddup_push_release(Sample *sample, int64_t release_time, int64_t count)
void ddup_push_alloc(Sample *sample, int64_t size, int64_t count)
void ddup_push_heap(Sample *sample, int64_t size)
void ddup_push_gpu_gputime(Sample *sample, int64_t gputime, int64_t count)
void ddup_push_gpu_memory(Sample *sample, int64_t size, int64_t count)
void ddup_push_gpu_flops(Sample *sample, int64_t flops, int64_t count)
void ddup_push_lock_name(Sample *sample, string_view lock_name)
void ddup_push_threadinfo(Sample *sample, int64_t thread_id, int64_t thread_native_id, string_view thread_name)
void ddup_push_task_id(Sample *sample, int64_t task_id)
@@ -76,6 +79,7 @@ cdef extern from "ddup_interface.hpp":
void ddup_push_trace_type(Sample *sample, string_view trace_type)
void ddup_push_exceptioninfo(Sample *sample, string_view exception_type, int64_t count)
void ddup_push_class_name(Sample *sample, string_view class_name)
void ddup_push_gpu_device_name(Sample *sample, string_view device_name)
void ddup_push_frame(Sample *sample, string_view _name, string_view _filename, uint64_t address, int64_t line)
void ddup_push_monotonic_ns(Sample *sample, int64_t monotonic_ns)
void ddup_flush_sample(Sample *sample)
@@ -301,6 +305,18 @@ cdef call_ddup_push_class_name(Sample* sample, class_name: StringType):
if utf8_data != NULL:
ddup_push_class_name(sample, string_view(utf8_data, utf8_size))

cdef call_ddup_push_gpu_device_name(Sample* sample, device_name: StringType):
if not device_name:
return
if isinstance(device_name, bytes):
ddup_push_gpu_device_name(sample, string_view(<const char*>device_name, len(device_name)))
return
cdef const char* utf8_data
cdef Py_ssize_t utf8_size
utf8_data = PyUnicode_AsUTF8AndSize(device_name, &utf8_size)
if utf8_data != NULL:
ddup_push_gpu_device_name(sample, string_view(utf8_data, utf8_size))

cdef call_ddup_push_trace_type(Sample* sample, trace_type: StringType):
if not trace_type:
return
@@ -447,6 +463,18 @@ cdef class SampleHandle:
if self.ptr is not NULL:
ddup_push_heap(self.ptr, clamp_to_int64_unsigned(value))

def push_gpu_gputime(self, value: int, count: int) -> None:
if self.ptr is not NULL:
ddup_push_gpu_gputime(self.ptr, clamp_to_int64_unsigned(value), clamp_to_int64_unsigned(count))

def push_gpu_memory(self, value: int, count: int) -> None:
if self.ptr is not NULL:
ddup_push_gpu_memory(self.ptr, clamp_to_int64_unsigned(value), clamp_to_int64_unsigned(count))

def push_gpu_flops(self, value: int, count: int) -> None:
if self.ptr is not NULL:
ddup_push_gpu_flops(self.ptr, clamp_to_int64_unsigned(value), clamp_to_int64_unsigned(count))

def push_lock_name(self, lock_name: StringType) -> None:
if self.ptr is not NULL:
call_ddup_push_lock_name(self.ptr, lock_name)
@@ -493,6 +521,10 @@ cdef class SampleHandle:
if self.ptr is not NULL:
call_ddup_push_class_name(self.ptr, class_name)

def push_gpu_device_name(self, device_name: StringType) -> None:
if self.ptr is not NULL:
call_ddup_push_gpu_device_name(self.ptr, device_name)

def push_span(self, span: Optional[Span]) -> None:
if self.ptr is NULL:
return