Description
A primary goal I have is to be able to compare/track results over time. To do this we need repeatable, comparable data, ideally in a standardized format that tooling can consume.
For example, one common thing folks need to include when they run benchmarks and other performance tests is the stats of the system they ran on. I think we should have a standard way to collect these and a standard format to include them in our saved results.
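To make that concrete, here is a minimal sketch (assuming Node.js, since our test harnesses are `.mjs` files) of collecting system stats with the built-in `os` module. The field names are illustrative, not a proposed schema:

```js
// Sketch: collect basic stats about the machine a test ran on.
import os from 'node:os';

export function collectSystemMetadata() {
  return {
    arch: os.arch(),
    platform: os.platform(),
    release: os.release(),
    totalmem: os.totalmem(),
    // Keep only the stable fields; os.cpus() also returns per-core times.
    cpus: os.cpus().map(({ model, speed }) => ({ model, speed })),
    nodeVersion: process.version,
  };
}
```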
Additionally, I think we want to include some "build metadata" like the repo and git ref the tests were run from, any relevant tool settings (ex. autocannon parallelization), and relevant information about the purpose of the run.
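As a rough illustration (the function and option names here are hypothetical), this metadata could be gathered from the local checkout, assuming the tests run inside a git clone with an `origin` remote:

```js
// Sketch: gather "build metadata" from the local git checkout.
import { execSync } from 'node:child_process';

const git = (args) => execSync(`git ${args}`).toString().trim();

export function collectBuildMetadata({ toolSettings = {}, purpose = '' } = {}) {
  return {
    repo: git('remote get-url origin'), // assumes a remote named "origin"
    ref: git('rev-parse HEAD'),
    toolSettings, // ex. autocannon parallelization, passed in by the runner
    purpose,      // free-form description of why the run happened
  };
}
```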
Because we are distributed across orgs and repos, I think we need a strategy for where these things will live. I propose the following:
- Tools and shared code (ex. common autocannon request groups) for perf testing should live in the `perf-wg` repo
- Load tests for `express` and all `express` dependencies should live in the `express` repo
- Load tests for middleware and non-express deps should live in their own repos
- Benchmark tests should live in the repo they test
- Historical results should live in a directory adjacent to the test code
We could go so far as to standardize where in each repo these live. For example, it would be easier to have tooling that works across all the repos if we agreed on where some things lived. With that in mind, I propose the following:
```
perf
├── bench
│   └── example
│       ├── index.mjs
│       ├── package.json
│       └── results
│           └── result-1749656765670.json
└── load
    └── example
        ├── index.mjs
        ├── package.json
        └── results
            └── result-1749614997462.json
```
In addition to where things are kept, I think we would benefit from a standard schema for storing the results. As shown above, I propose a JSON file stored like this: `results/result-<timestamp>.json`. The schema can probably be defined as a JSON Schema, but here is a rough idea I had for a load test result to get us started:
```json
{
  "serverMetadata": {
    "cpus": [...],
    "arch": "..."
  },
  "clientMetadata": {
    "cpus": [...],
    "arch": "..."
  },
  "serverResults": {
    "metrics": { ... }
  },
  "clientResults": {
    "latency": { ... }
  }
}
```
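And to sketch the "defined as a JSON Schema" idea, here is what that could look like. The `$id`, the required list, and the types are placeholders to react to, not a proposal:

```json
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "$id": "https://example.com/schemas/load-test-result.json",
  "type": "object",
  "required": ["serverMetadata", "clientMetadata", "clientResults"],
  "properties": {
    "serverMetadata": { "$ref": "#/$defs/systemMetadata" },
    "clientMetadata": { "$ref": "#/$defs/systemMetadata" },
    "serverResults": { "type": "object" },
    "clientResults": { "type": "object" }
  },
  "$defs": {
    "systemMetadata": {
      "type": "object",
      "properties": {
        "cpus": { "type": "array" },
        "arch": { "type": "string" }
      }
    }
  }
}
```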