Skip to content

coverage: Only merge adjacent coverage spans #139966

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

Zalathar
Copy link
Contributor

@Zalathar Zalathar commented Apr 17, 2025

For a long time, coverage instrumentation has automatically “merged” spans with the same control-flow into a smaller number of larger spans, even when the spans being merged are not overlapping or adjacent. This causes any source text between the original spans to be included in the merged span, which is then associated with an execution count when shown in coverage reports.

That approach causes a number of problems:

  • The intervening source text can contain all sorts of things that shouldn't really be marked as executable code (e.g. nested items, parts of macro invocations, long comments). In some cases we have complicated workarounds (e.g. bucketing to avoid merging spans across nested items), but in other cases there isn't much we can do.
  • Merging can have aesthetically weird effects, such as including unbalanced parentheses, because the merging process doesn't really understand what it's doing at a source code level.
  • It generally leads to an accumulation of piled-on heuristics and special cases that give decent-looking results, but are fiendishly difficult to modify or replace.

Therefore, this PR aims to abolish the merging of non-adjacent coverage spans.

The big tradeoff here is that the resulting coverage metadata (embedded in the instrumented binary) tends to become larger, because the overall number of distinct spans has increased. That's unfortunate, but I see it as the inevitable cost of cleaning up the messes and inaccuracies that were caused by the old approach. And the resulting spans do tend to be more accurate to the program's actual control-flow.


The .coverage snapshot changes give an indication of how this PR will affect user-visible coverage reports. In many cases the changes to reporting are minor or even nonexistent, despite substantial changes to the metadata (as indicated by .cov-map snapshots).


try-job: aarch64-gnu

@Zalathar Zalathar added the A-code-coverage Area: Source-based code coverage (-Cinstrument-coverage) label Apr 17, 2025
@rustbot
Copy link
Collaborator

rustbot commented Apr 17, 2025

r? @lcnr

rustbot has assigned @lcnr.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Apr 17, 2025
@rustbot
Copy link
Collaborator

rustbot commented Apr 17, 2025

Some changes occurred to MIR optimizations

cc @rust-lang/wg-mir-opt

@Zalathar
Copy link
Contributor Author

@bors try

bors added a commit to rust-lang-ci/rust that referenced this pull request Apr 17, 2025
coverage: Only merge adjacent coverage spans

For a long time, coverage instrumentation has automatically “merged” spans with the same control-flow into a smaller number of larger spans, even when the spans being merged are not overlapping or adjacent. This causes any source text between the original spans to be included in the merged span, which is then associated with an execution count when shown in coverage reports.

That approach causes a number of problems:
- The intervening source text can contain all sorts of things that shouldn't really be marked as executable code (e.g. nested items, parts of macro invocations, long comments). In some cases we have complicated workarounds (e.g. bucketing to avoid merging spans across nested items), but in other cases there isn't much we can do.
- Merging can have aesthetically weird effects, such as including unbalanced parentheses, because the merging process doesn't really understand what it's doing at a source code level.
- It generally leads to an accumulation of piled-on heuristics and special cases that give decent-looking results, but are fiendishly difficult to modify or replace.

Therefore, this PR aims to abolish the merging of non-adjacent coverage spans.

The big tradeoff here is that the resulting coverage metadata (embedded in the instrumented binary) tends to become larger, because the overall number of distinct spans has increased. That's unfortunate, but I see it as the inevitable cost of cleaning up the messes and inaccuracies that were caused by the old approach. And the resulting spans do tend to be more accurate to the program's actual control-flow.

---

The `.coverage` snapshot changes give an indication of how this PR will affect user-visible coverage reports. In many cases the changes to reporting are minor or even nonexistent, despite substantial changes to the metadata (as indicated by `.cov-map` snapshots).

---

try-job: aarch64-gnu
@bors
Copy link
Collaborator

bors commented Apr 17, 2025

⌛ Trying commit 3d65655 with merge 07db97e...

@bors
Copy link
Collaborator

bors commented Apr 17, 2025

☀️ Try build successful - checks-actions
Build commit: 07db97e (07db97e0af0034aa682d7b41f54e783fad57e611)

@lcnr
Copy link
Contributor

lcnr commented Apr 30, 2025

I looked into this now and while it all seems reasonable, I am actually not comfortable with approving these changes. I cannot really judge whether the changes to the coverage files are desirable or whether there's anything I may need to be careful with. It shouldn't take two weeks to reassign a PR. Sorry for not taking the time earlier.

r? compiler

@rustbot rustbot assigned davidtwco and unassigned lcnr Apr 30, 2025
LL| 0| |input: &str| {
LL| | |input: &str| {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldn't arguments still get covered somewhere?

Copy link
Contributor Author

@Zalathar Zalathar Apr 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've figured out a way to give closure arguments the same treatment as function signatures, which should avoid this difference in coverage report output.

(The old way inherently relied on merging a fake zero-length span at the start of the closure body, which is why the current PR causes it to no longer appear as covered.)

@oli-obk
Copy link
Contributor

oli-obk commented Apr 30, 2025

@bors r+ rollup

@bors
Copy link
Collaborator

bors commented Apr 30, 2025

📌 Commit 8e36076 has been approved by oli-obk

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Apr 30, 2025
@Zalathar
Copy link
Contributor Author

Zalathar commented May 1, 2025

@bors r-

I pushed a rebase, but I haven’t yet made the adjustments I was planning.

@bors bors added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels May 1, 2025
Zalathar added 2 commits May 1, 2025 14:23
This also removes some manipulation of the function signature span that only
made sense in the context of merging non-adjacent spans.
Because we no longer merge non-adjacent spans, there is no need to use buckets
to prevent merging across hole spans.
@Zalathar
Copy link
Contributor Author

Zalathar commented May 1, 2025

After looking into this some more, I realised that currently we don't actually mark closure arguments as coverage regions.

Instead, the execution count on lines like |input: &str| { was coming from a synthetic coverage region on the { character, representing the start of the closure body. That synthetic region is no longer added (because it was motivated by the old span-merging behaviour), which is why such lines no longer have coverage regions.

So what I think I'm going to do for now is restore that those synthetic regions to reduce the churn of this change, and worry about what to do with them (and closure arguments) later.

@Zalathar
Copy link
Contributor Author

Zalathar commented May 1, 2025

Restored (for now) the synthetic region on the { at the start of function bodies that don't have a signature (diff).

@rustbot ready

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels May 1, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-code-coverage Area: Source-based code coverage (-Cinstrument-coverage) S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants