-
Notifications
You must be signed in to change notification settings - Fork 37
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Tracing fix #128
Tracing fix #128
Conversation
Issue when running with
|
There is a question of whether the step and execution spans should be ERROR so that they are always included. Having that information is pretty much always useful. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the CR - this will debugging a lot easier, esp when using Shuttle with proptesting. And the *_awaiting_is_irrelevant
examples are really helpful!
src/lib.rs
Outdated
/// `tracing_subscriber::fmt`, appends to the span on calls to `record()` (instead of | ||
/// overwriting), which results in traces which are hard to read if the task is scheduled more | ||
/// than a few times. | ||
/// Thus: set `record_steps_in_span` to `true` if you want this behaviour, or if you are using |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By this behavior
you mean append behavior
right? Could you make that clear?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also: do we really need to provide this option? What other format subscribers are there for which we'd want to set this to true
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By this behavior you mean append behavior right? Could you make that clear?
By this_behaviour
I mean that we call record()
on the step_span
. In tracing_subscriber::fmt
this maps to appends. In The Nice Subscriber this maps to overwrites.
Also: do we really need to provide this option? What other format subscribers are there for which we'd want to set this to true?
Having the step logged is super useful. I think this will be set to true
for all my tests. I'll make a tracing_subscriber::fmt::Layer
which handles this.
src/runtime/execution.rs
Outdated
if let Some(span_id) = current_span.id().as_ref() { | ||
subscriber.exit(span_id); | ||
} | ||
state.current_mut().span = current_span; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This makes me wonder if the field should be renamed from span
to last_span
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
?
Not super important what it is named, but why?
The span hasn't changed, we have just swapped out of the task. The span
is the Span of the task, just as the continuation
is its continuation.
tests/basic/tracing.rs
Outdated
} | ||
} | ||
|
||
#[test] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since these tests are not checking anything, maybe mark them ignored
for now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. I'd like to add a Subscriber which checks them. Can ignore for now, then add that later (I'll open an issue).
- current_task -> parent_task_id - span -> top_level_span - parent_id -> parent_span_id - #[ignore] tracing tests - move test_tracing_nested_spans up
Nevermind, I did not look into the non ERROR/WARN-traces sufficiently. There is Span-stacking on |
Interestingly the issues with instrumented futures don't show up when running with Tracing 0.2.0 (the current pre-release version). |
tracing::info_span!(parent: parent_span.id(), "step", i = schedule_len, task = id.0) | ||
}; | ||
let step_span = error_span!(parent: parent_span_id.clone(), "step", task = id.0, i = field::Empty); | ||
// Note that this is slightly lazy — we are starting storing at the step_span, but could have gotten the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, if we know that storing the full stack would be better, why are we cutting corners here? Why not just do the right thing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because I don't think it is better / I don't believe this is cutting corners. This comment and the other one of a similar nature are meant as the following: "I really believe the current scheme is correct. However, spans are both weird and tricky. If there ever are any issues, then start by doing the changes in the comments, in order to definitely rule them out.". In other words: I think the current way of doing it is "just right", and that the suggestions are "too much". I'd rather do the "just right" approach, both for overall code cleanness, and for the opportunity to be proven wrong (and thus learn more about Spans), but leave the comments in case I am wrong.
// All of the tests testing tracing will either have to have weak assertions, | ||
// or be #[ignore]d. The reason for this is that they are not thread safe | ||
// (since everything tracing is managed globally), and there is no way to | ||
// ensure that the tests are run single-threaded. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is this crate: https://docs.rs/serial_test/latest/serial_test/index.html
See if it's worth using here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will do, thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems to me that it won't solve the issue, unless all other tests are marked with parallel
, which isn't worth it. This is because marking a test with serial
only guarantees that it won't be ran at the same time as other serial
or parallel
tests, but it can still run at the same time as tests which are not marked.
d1a90e8
to
8274c09
Compare
8274c09
to
f190a4d
Compare
* Tracing fix * Own Span in Task to avoid dropping the Span while Task is switched out * Respond to comments by @jorajeev - current_task -> parent_task_id - span -> top_level_span - parent_id -> parent_span_id - #[ignore] tracing tests - move test_tracing_nested_spans up * Make the execution_span and the step_span ERROR * Add instrumented futures test * Fix issue with tracing and instrumented futures * Update instrumented futures tracing test to highlight top level span failures * Clippy unneeded async * Update ResetSpanOnDrop to store an owned copy of the Span * Address comments by @jorajeev
I want to add a bit more testing. At least something fairly minimal like connecting the tests to a
tracing_subscriber::fmt
which outputs to something else than stdout and then check on that, and a simple "X threads enter and exit Y spans and assert that the current span is what they expect" test.Opening a PR in order to get feedback and suggestions. Feel free to suggest alternative test ideas as well. Would be nice to do a programmatic check against Tokio / no runtime as well.
Fixing tracing
I think I have done it right this time. One known divergence from Tokio (listed below), might be more, but I think the general scheme should work. Also adds a fix for spans stacking on test panic inside a
catch_unwind
(read: proptest minimization).Overview of the changes
record_steps_in_span
.step{i=X, task=Y}
span is nowstep{task=Y}
by default, andstep{task=Y, i=X}
ifrecord_steps_in_span
istrue
advance_to_next_task
. It is instead done instep
, by entering aTask
s span before continuing its continuation, and leaving the Span when we return from the continuation.ResetSpanOnDrop
which is constructed inrun
. This is to solve the issue of span stacking on panics inside acatch_unwind
(read: proptest minimization). I found this way of solving it much cleaner than finding all the places we could exit a Shuttle test and making sure to clean up spans there.Task
) are now emitted in (an extension of) Shuttle's span.Some notes
thread_fn
was cumbersome and thus verbose, and thus more error-prone than using whatever trace was present. This means there may be traces emitted where we would prefer to use a different span than the one we are using currently (as I have not looked into every singleEvent
emission), these can be handled as we find them.in_scope
inspawn_thread
andTask::new()
are matters of taste — I prefer for these to be emitted under Shuttle's span, but, they are reactions to actions taken by the function under test, so there is an argument for emitting them under whatever span is currently entered.Known issues / discrepancies
Below are some tests which do the same thing using either Shuttle
sync
/future
threads,std::sync
threads andtokio::spawn
threads. Shuttlesync
/future
andstd::sync
does the same thing,tokio::spawn
does something different. I think the thing does currently is sane enough (arguably better than the mess you may get when running under Tokio — this is not how to combine futures and spans).Tokio
Doing
RUST_LOG=warn cargo test tokio_awaiting
gives:As we can see, doing
t.await.unwrap();
for the firstt
makes it execute under the parent spans, and not doing it for the secondt
makes those have no parent.Shuttle
future::spawn
Doing
RUST_LOG=warn cargo test shuttle_test_awaiting
gives:As we can see, this differs from what we saw when running under Tokio.
No runtime
std::thread::spawn
Doing
RUST_LOG=warn cargo test std_thread -- --nocapture
gives:As we can see, the same as above.
Shuttle
thread::spawn
Running
RUST_LOG=warn cargo test shuttle_test_join -- --nocapture
gives:As we can see, the same as the two above.
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.