Fix unneeded read in trace_parser #189

Open, wants to merge 5 commits into base: dev

ammrat13

The parse_from_string method in trace_parser.cc currently reads in one delta for each active thread. However, tracer_tool.cu does not write the delta for the first thread since it will always be zero. This commit updates the behavior of trace_parser to match the intended trace format.

Even though this bug was present, it had no effect in the past since the base_delta_decompress function is correct.
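
To make the format mismatch concrete, here is a minimal sketch of a reader that consumes a base address followed by one delta per active thread except the first. It is not the code from `trace_parser.cc`: the function name `decode_base_delta`, the constant `kWarpSize`, the decimal token format, and the choice to treat each delta as relative to the previous active thread are all assumptions made for illustration.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <vector>

// Hypothetical warp width for this sketch.
constexpr std::size_t kWarpSize = 32;

// Hypothetical helper, not the accel-sim implementation.
// Reads one base address for the first active lane, then exactly
// (active lanes - 1) signed deltas, because the first lane's delta is
// always zero and is assumed to be omitted from the trace.
// Assumption: each delta is relative to the previous active lane.
std::vector<std::uint64_t> decode_base_delta(
    std::istream &in, const std::bitset<kWarpSize> &mask) {
  std::vector<std::uint64_t> addrs(kWarpSize, 0);
  std::uint64_t last = 0;
  bool first_found = false;
  for (std::size_t lane = 0; lane < kWarpSize; ++lane) {
    if (!mask.test(lane)) continue;  // inactive lanes have no entry
    if (!first_found) {
      in >> last;  // base address, no delta token for this lane
      first_found = true;
    } else {
      std::int64_t delta = 0;
      in >> delta;
      // Unsigned wraparound makes a negative delta subtract correctly.
      last += static_cast<std::uint64_t>(delta);
    }
    addrs[lane] = last;
  }
  return addrs;
}
```

With lanes 0, 2, 3 and 5 active and the input `4096 4 4 -8 99`, this sketch reads the base and three deltas and leaves `99` in the stream for whatever field comes next. A reader that asked for four deltas instead would consume `99` as well, which is the kind of over-read this PR removes.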

ammrat13 and others added 2 commits March 27, 2023 22:39
JRPan requested review from a team, Shreya-gaur and William-An, and removed the request for a team and Shreya-gaur, May 15, 2023 18:12
@William-An
Contributor

@JRPan @tgrogers For the Jenkins test cases, do we only get traces with strided memory patterns? It looks like his original modification, which removes the base_delta_compress call, should either crash or generate invalid results in the simulator.

@tgrogers
Contributor

@ammrat13 - thanks for the commit.
Did you have a use case where you saw this bug having an effect?

@William-An, I am not sure we have all kinds of workloads with varying amounts of divergence in memory. However, the long tests are only looking at the uBenches which are probably almost always strided in their memory accesses.

https://tgrogers-pc01.ecn.purdue.edu/github-ci/accel-sim/correl/git_refs/pull/162/merge_139_1/l1readaccess.QV100-SASS-accel-6c0707c-gpgpu-68e1cd3.per-kernel.html

These look great, but I think it would be good to also correlate rodinia-3.1 so we get some more variety.
@JRPan - how much longer would it take to correlate the 3.1 apps alongside the uBench in the long tests?
