Description
A primary goal I have is to be able to compare/track results over time. To do this we need repeatable, comparable data, ideally in a standardized format that tooling can consume.
For example, one common thing folks need to include when they run benchmarks and other performance tests is the stats of the system they ran on. I think we should have a standard way to collect these and a standard format to include them in our saved results.
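To make that concrete, here is a minimal sketch (assuming Node.js, since our test harnesses are `.mjs` files) of collecting system stats with the built-in `os` module. The field names are illustrative, not a proposed schema:

```js
// Sketch: collect basic stats about the machine a test ran on.
import os from 'node:os';

export function collectSystemMetadata() {
  return {
    arch: os.arch(),
    platform: os.platform(),
    release: os.release(),
    totalmem: os.totalmem(),
    // Keep only the stable fields; os.cpus() also returns per-core times.
    cpus: os.cpus().map(({ model, speed }) => ({ model, speed })),
    nodeVersion: process.version,
  };
}
```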
Additionally, I think we want to include some "build metadata" like the repo and git ref the tests were run from, any relevant tool settings (ex. autocannon parallelization), and relevant information about the purpose of the run.
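As a rough illustration (the function and option names here are hypothetical), this metadata could be gathered from the local checkout, assuming the tests run inside a git clone with an `origin` remote:

```js
// Sketch: gather "build metadata" from the local git checkout.
import { execSync } from 'node:child_process';

const git = (args) => execSync(`git ${args}`).toString().trim();

export function collectBuildMetadata({ toolSettings = {}, purpose = '' } = {}) {
  return {
    repo: git('remote get-url origin'), // assumes a remote named "origin"
    ref: git('rev-parse HEAD'),
    toolSettings, // ex. autocannon parallelization, passed in by the runner
    purpose,      // free-form description of why the run happened
  };
}
```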
Because we are distributed across orgs and repos, I think we need a strategy for where these things will live. I propose the following:
- Tools and shared code (ex. common autocannon request groups) for perf testing should live in the `perf-wg` repo
- Load tests for `express` and all `express` dependencies should live in the `express` repo
- Load tests for middleware and non-express deps should live in their own repos
- Benchmark tests should live in the repo they test
- Historical results should live in a directory adjacent to the test code
We could go so far as to standardize where in each repo these live. For example, it would be easier to have tooling that works across all the repos if we agreed on where some things lived. With that in mind, I propose the following:
```
perf
├── bench
│   └── example
│       ├── index.mjs
│       ├── package.json
│       └── results
│           └── result-1749656765670.json
└── load
    └── example
        ├── index.mjs
        ├── package.json
        └── results
            └── result-1749614997462.json
```
In addition to where things are kept, I think we would benefit from a standard schema for storing the results. As shown above, I propose a JSON file stored like this: `results/result-<timestamp>.json`. The schema can probably be defined as a JSON Schema, but here is a rough idea I had for a load test result to get us started:
```json
{
  "serverMetadata": {
    "cpus": [...],
    "arch": "..."
  },
  "clientMetadata": {
    "cpus": [...],
    "arch": "..."
  },
  "serverResults": {
    "metrics": { ... }
  },
  "clientResults": {
    "latency": { ... }
  }
}
```
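And to sketch the "defined as a JSON Schema" idea, here is what that could look like. The `$id`, the required list, and the types are placeholders to react to, not a proposal:

```json
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "$id": "https://example.com/schemas/load-test-result.json",
  "type": "object",
  "required": ["serverMetadata", "clientMetadata", "clientResults"],
  "properties": {
    "serverMetadata": { "$ref": "#/$defs/systemMetadata" },
    "clientMetadata": { "$ref": "#/$defs/systemMetadata" },
    "serverResults": { "type": "object" },
    "clientResults": { "type": "object" }
  },
  "$defs": {
    "systemMetadata": {
      "type": "object",
      "properties": {
        "cpus": { "type": "array" },
        "arch": { "type": "string" }
      }
    }
  }
}
```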