Conversation

johnaohara
Member

No description provided.

#[Buckets = 27, SubBuckets = 2048]
----------------------------------------------------------
92597 requests in 20.00s, 9.01MB read <6>
Socket errors: connect 5, read 0, write 0, timeout 40
Contributor

Socket errors? Some have timed out... is that important info?

Member Author

Good catch, and tbh I did not capture that info. It means we are likely missing measurements for 5 requests (out of ~92.5k)... but I don't think that invalidates the message in the post.

Av Throughput: 5142.425405 req/sec <6>

----
<1> The average RTT was 0.240 ms
Contributor

The order and phrasing of the content in the bullet points is slightly different from the one to compare with, i.e. https://github.com/RedHatPerf/redhatperf.github.io/pull/21/files#diff-e4c84ea749a91a7a6497412e17ff5f4973aec06e593c39f88e6c5abab5211a6aR432,

and my eyes are flipping back and forth to detect which parts are relevant/different for comparison.

A tabular/summary/chart view would help IMO.


https://github.com/johnaohara/bpf_validator/[bpf_validator] allows us to independently verify that the numbers produced by a Load Driver are not biased, and contain the full sample count.

He tested against https://hyperfoil.io/[Hyperfoil] and https://github.com/giltene/wrk2[wrk2] to confirm the results presented by Hyperfoil are an accurate representation of what happened during a load test.

We instead of He?


During our validation, we discovered:

* Hyperfoil adds 0.072ms on average to RTT, and the summary statistics report all requests sent.

Explain what RTT stands for?

A typical load test will be of the form shown above.

* **System Under Test (SUT)**: a dedicated system that contains an operating system, a hosted application and related processes to support the application. The application typically runs as a service, listening for requests on a particular port. When a request is received, the network stack processes the request and passes it to the application, which handles the request and sends a response.
* **Load Driver**: The job of the load driver is to replicate virtual users and measure their experience. Typical measurements include samples of Round Trip Time (RTT) or Throughput (req/sec)

Move the explanation for RTT to the first time it's referenced?

* **System Under Test (SUT)**: a dedicated system that contains an operating system, a hosted application and related processes to support the application. The application typically runs as a service, listening for requests on a particular port. When a request is received, the network stack processes the request and passes it to the application, which handles the request and sends a response.
* **Load Driver**: The job of the load driver is to replicate virtual users and measure their experience. Typical measurements include samples of Round Trip Time (RTT) or Throughput (req/sec)

The purpose of the load driver is to characterize the SUT, from the perspective of a virtual user.

The purpose of the load driver is to characterize the user/client, not the SUT?
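As a side note for readers of this thread, here is a minimal, hypothetical sketch of the kind of measurement loop the quoted bullets describe: one virtual user timing each request/response round trip with a monotonic clock and deriving throughput from the sample count. The address, port and 20 s duration are illustrative assumptions, and this is a closed-loop driver, whereas drivers such as Hyperfoil can instead schedule requests at a fixed rate (open model).

[source,c]
----
/* Hypothetical single-user, closed-loop load driver: send a request, wait
 * for the response, record the round trip time (RTT), repeat.
 * Names and constants are illustrative only. */
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <time.h>
#include <unistd.h>

static double now_ms(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec * 1e3 + ts.tv_nsec / 1e6;
}

int main(void)
{
    const char *req =
        "GET / HTTP/1.1\r\nHost: 127.0.0.1\r\nConnection: keep-alive\r\n\r\n";
    struct sockaddr_in addr = { .sin_family = AF_INET, .sin_port = htons(8080) };
    inet_pton(AF_INET, "127.0.0.1", &addr.sin_addr);  /* SUT address: assumption */

    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0 || connect(fd, (struct sockaddr *)&addr, sizeof(addr)) != 0)
        return 1;

    char buf[4096];
    long samples = 0;
    double total_rtt = 0.0, start = now_ms();

    while (now_ms() - start < 20000.0) {      /* 20 s run, as in the post's output */
        double t0 = now_ms();
        if (send(fd, req, strlen(req), 0) < 0)
            break;
        /* Simplification: one recv() is assumed to return the full response. */
        if (recv(fd, buf, sizeof(buf), 0) <= 0)
            break;
        total_rtt += now_ms() - t0;           /* one RTT sample */
        samples++;
    }

    double elapsed_s = (now_ms() - start) / 1e3;
    printf("%ld requests in %.2fs\n", samples, elapsed_s);
    if (samples > 0)
        printf("Av RTT: %f ms, Av Throughput: %f req/sec\n",
               total_rtt / samples, samples / elapsed_s);
    close(fd);
    return 0;
}
----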


By filtering out all packets that are sent to, and received from, a specific remote port, a timestamp is recorded when a packet is sent and also when a full HTTP response has been received.

A map of timestamps is maintained within the kernel code, which calculates the RTT for each individual request. The timestamps are sent to a user space application, that records the timestamps in a http://www.hdrhistogram.org/[hdrHistogram]

Add . to the end
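To make the quoted mechanism more concrete, here is a hedged sketch: it is not the actual bpf_validator source, it assumes libbpf-style BTF map definitions (vmlinux.h plus bpf_helpers.h), and it approximates the request/response boundary with the client's write()/read() syscalls rather than real packet filtering by remote port. A kernel hash map keyed by thread id holds the send timestamp, the RTT is computed when the response returns, and each sample is pushed to user space over a ring buffer for the hdrHistogram recorder to consume.

[source,c]
----
/* Illustrative only -- not the bpf_validator source.  The real tool filters
 * TCP packets by remote port and detects complete HTTP responses; here the
 * request/response boundary is approximated by write()/read() syscalls. */
#include "vmlinux.h"
#include <bpf/bpf_helpers.h>

struct rtt_sample {
    u64 send_ns;   /* timestamp taken when the request was written */
    u64 rtt_ns;    /* receive timestamp minus send timestamp       */
};

struct {
    __uint(type, BPF_MAP_TYPE_HASH);
    __uint(max_entries, 16384);
    __type(key, u32);      /* thread id of the virtual user */
    __type(value, u64);    /* send timestamp in ns          */
} send_ts SEC(".maps");

struct {
    __uint(type, BPF_MAP_TYPE_RINGBUF);
    __uint(max_entries, 1 << 20);
} samples SEC(".maps");    /* consumed by a user-space hdrHistogram recorder */

SEC("tracepoint/syscalls/sys_enter_write")
int request_sent(void *ctx)
{
    u32 tid = (u32)bpf_get_current_pid_tgid();
    u64 now = bpf_ktime_get_ns();
    bpf_map_update_elem(&send_ts, &tid, &now, BPF_ANY);
    return 0;
}

SEC("tracepoint/syscalls/sys_exit_read")
int response_received(void *ctx)
{
    u32 tid = (u32)bpf_get_current_pid_tgid();
    u64 *start = bpf_map_lookup_elem(&send_ts, &tid);
    if (!start)
        return 0;

    struct rtt_sample *s = bpf_ringbuf_reserve(&samples, sizeof(*s), 0);
    if (s) {
        s->send_ns = *start;
        s->rtt_ns = bpf_ktime_get_ns() - *start;
        bpf_ringbuf_submit(s, 0);   /* hand the sample to user space */
    }
    bpf_map_delete_elem(&send_ts, &tid);
    return 0;
}

char LICENSE[] SEC("license") = "GPL";
----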

@franz1981 franz1981 requested a review from Copilot June 30, 2025 14:50
@franz1981
Contributor

Let's see if @copilot is smart enough (will do on my article too)

Copilot AI left a comment

Pull Request Overview

This PR introduces a comprehensive AsciiDoc post that explains the concept of uncoordinated omission in HTTP load testing and demonstrates how to validate load generators using the bpf_validator tool. Key changes include:

  • Creating a detailed guide with performance results and comparison tests between Hyperfoil and wrk2.
  • Providing step-by-step instructions and sample command outputs for running the bpf_validator tool.
  • Including comprehensive statistical analysis and comparisons of request rates and latencies.

What can we do about it? How can we determine if our load driver is

* a) accurate
* b) reporting summary results from all the sample
Copilot AI Jun 30, 2025

Consider correcting 'sample' to 'samples' for grammatical accuracy.

Suggested change
* b) reporting summary results from all the sample
* b) reporting summary results from all the samples


99.9th Percentile: 0.841215
99.99th Percentile: 1.735679

99954 requests in 9995496s <3>
Copilot AI Jun 30, 2025

The test duration value '9995496s' appears incorrect; please verify and update it to the intended duration format (e.g., '20.00s').

Suggested change
99954 requests in 9995496s <3>
99954 requests in 99954s <3>


Contributor

It was by accident, but 9995496s is indeed suspicious (good catch, AI?!)

@franz1981
Contributor

franz1981 commented Sep 17, 2025

FYI @lampajr : https://tanelpoder.com/posts/optimizing-ebpf-biolatency-accounting/

Since https://github.com/johnaohara/bpf_validator/blob/26145dee0c3e37e55a3996190edc5ba7ac2a4dd2/accept.bpf.c#L34 is using the same map type, the article shows the benefit of moving to BPF_MAP_TYPE_PERCPU_HASH instead.
This is not relevant in all the call paths, but at https://github.com/johnaohara/bpf_validator/blob/26145dee0c3e37e55a3996190edc5ba7ac2a4dd2/accept.bpf.c#L135 and https://github.com/johnaohara/bpf_validator/blob/26145dee0c3e37e55a3996190edc5ba7ac2a4dd2/accept.bpf.c#L200-L206 a similar pattern is used, and entering and exiting the same read syscall usually doesn't need a shared map type (per-CPU maps solve that, but can only be used in specific cases).

To verify that improving this would matter, we clearly first need to know whether capturing at the point "read" is called means capturing something important; but if it does, getting there without introducing a scalability problem is a good idea IMO.
It can be tested either with a JMH program or a full-fat load gen, adding more threads (to Hyperfoil) or single-threaded too (still on Hyperfoil), attaching a profiler like async-profiler but with unconfined/unprivileged access to walk the kernel stack traces: an attached eBPF program should be visible there too AFAIK.
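For reference, the map-type change being suggested would look roughly like the following in libbpf-style declarations. This is a sketch only, not the actual bpf_validator code, with the "specific cases" caveat from the comment spelled out in the code comments.

[source,c]
----
/* Sketch of the suggested map-type change, not actual bpf_validator code.
 * Assumes vmlinux.h + libbpf's bpf_helpers.h. */
#include "vmlinux.h"
#include <bpf/bpf_helpers.h>

/* Before: a single hash map shared by all CPUs -- updates and lookups from
 * probes running on different CPUs contend on the same buckets under load. */
struct {
    __uint(type, BPF_MAP_TYPE_HASH);
    __uint(max_entries, 16384);
    __type(key, u32);    /* thread id            */
    __type(value, u64);  /* entry timestamp, ns  */
} read_start SEC(".maps");

/* After: one value slot per CPU, so entry/exit probes that run on the same
 * CPU never contend.  Caveat (the "specific cases" above): if the traced
 * read() blocks and the task wakes up on another CPU, the exit probe will
 * not see the timestamp the entry probe stored on the original CPU. */
struct {
    __uint(type, BPF_MAP_TYPE_PERCPU_HASH);
    __uint(max_entries, 16384);
    __type(key, u32);
    __type(value, u64);
} read_start_percpu SEC(".maps");
----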
