Mock Export Challenge

Overview

You are provided with a mock server that simulates healthcare data exports. Each export consists of one or more downloadable datasets in CSV format. Your task is to write a program that processes these datasets and computes record counts.

The goal of this exercise is not only correctness but also clarity, reasoning about trade-offs, and handling performance constraints. You are encouraged to use the internet and AI resources as part of your process. Be ready to explain and justify your approach in the follow-up discussion.

Setup

Install dependencies using uv.
Sync dependencies:
```
uv sync
```
Run the server:
```
uv run server
```
Run your code:
```
uv run cli
```
Add any additional dependencies with:
```
uv add <package>
```

Problem Statement

Each export contains one or more downloadable CSV files. Each row represents a simulated patient event, with the following columns:

patient_id
event_time
event_type
value

Your task is to build a program that:

Discovers exports and their downloads using the server API.
Processes CSV files efficiently, taking into account file size and multiple downloads.
Produces record counts per patient, plus totals across all patients, output as formatted JSON printed to stdout.

The expected JSON structure should look like this (aggregated across all downloads of an export):

{
  "patients": {
    "P001": {
      "heart_rate": 1520,
      "spo2": 1470
    }
  },
  "totals": {
    "heart_rate": 8000,
    "spo2": 6000
  }
}
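
As a rough illustration of how rows could map onto this structure, here is a minimal sketch using only the standard library. It assumes that the inner keys (heart_rate, spo2) are values of the event_type column and that each CSV starts with a header row; the function name count_events and the sample timestamps are made up for the example.

```python
import csv
import io
import json
from collections import Counter, defaultdict


def count_events(lines):
    """Count rows per patient and event type from an iterable of CSV lines."""
    patients = defaultdict(Counter)
    totals = Counter()
    # DictReader yields one row at a time, so memory grows with the number of
    # distinct patients and event types rather than with the number of rows.
    for row in csv.DictReader(lines):
        patients[row["patient_id"]][row["event_type"]] += 1
        totals[row["event_type"]] += 1
    return {
        "patients": {pid: dict(counts) for pid, counts in patients.items()},
        "totals": dict(totals),
    }


# Made-up sample rows; the real event_time format is not specified here.
sample = io.StringIO(
    "patient_id,event_time,event_type,value\n"
    "P001,2024-01-01T00:00:00Z,heart_rate,72\n"
    "P001,2024-01-01T00:01:00Z,spo2,98\n"
    "P002,2024-01-01T00:02:00Z,heart_rate,65\n"
)
result = count_events(sample)
print(json.dumps(result, indent=2, sort_keys=True))
```

The value column is never read, because only record counts are requested.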

Notes

Your CLI should accept an export ID (demo, small, or large) as an argument and run the analysis for that export.
All counts must be aggregated across all downloads belonging to the chosen export.
Download time ranges are guaranteed to be non-overlapping.
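
One possible shape for the argument handling and the cross-download aggregation described in these notes is sketched below. The names build_parser and merge_counts are hypothetical, and the sketch assumes each download has already been reduced to a partial result in the expected JSON structure. If non-overlapping time ranges also mean that no row appears in two downloads, plain addition of the partial counts should not double count.

```python
import argparse
from collections import Counter, defaultdict

EXPORT_IDS = ("demo", "small", "large")


def build_parser():
    """Build a parser that accepts exactly one of the known export IDs."""
    parser = argparse.ArgumentParser(description="Count records for one export.")
    parser.add_argument("export_id", choices=EXPORT_IDS, help="export to analyse")
    return parser


def merge_counts(partials):
    """Sum per-download results into a single result for the whole export."""
    patients = defaultdict(Counter)
    totals = Counter()
    for partial in partials:
        for patient_id, counts in partial["patients"].items():
            # Counter.update adds counts instead of replacing them.
            patients[patient_id].update(counts)
        totals.update(partial["totals"])
    return {
        "patients": {pid: dict(counts) for pid, counts in patients.items()},
        "totals": dict(totals),
    }


# An explicit argument list is passed here so the example runs without a shell.
args = build_parser().parse_args(["demo"])
merged = merge_counts([
    {"patients": {"P001": {"heart_rate": 2}}, "totals": {"heart_rate": 2}},
    {"patients": {"P001": {"heart_rate": 1, "spo2": 4}},
     "totals": {"heart_rate": 1, "spo2": 4}},
])
```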

Constraints

DO NOT use pandas or NumPy.
This exercise is designed for roughly 1-2 hours of focused work.
The full dataset may be large (millions of rows per download).
Your solution should be mindful of performance and memory usage.
Aim for readability and maintainability of code.
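
To keep memory flat on downloads with millions of rows, one option is to decode and parse the response body as a stream instead of reading it into memory first. The sketch below is one way to do that with the standard library. It assumes UTF-8 encoded CSV with a header row; the helper names are hypothetical, and the URL passed to count_event_types would have to come from the server API, which is not described here.

```python
import csv
import io
import urllib.request
from collections import Counter


def iter_rows(binary_stream):
    """Yield CSV rows as dicts from a binary file-like object, one at a time."""
    # TextIOWrapper decodes incrementally; newline="" leaves line-ending
    # handling to the csv module, as the csv documentation recommends.
    text = io.TextIOWrapper(binary_stream, encoding="utf-8", newline="")
    yield from csv.DictReader(text)


def count_event_types(url):
    """Hypothetical helper: stream one download URL and count rows by event_type."""
    with urllib.request.urlopen(url) as response:
        return Counter(row["event_type"] for row in iter_rows(response))


# Offline demonstration: an in-memory stream stands in for a response body.
fake_body = io.BytesIO(
    b"patient_id,event_time,event_type,value\n"
    b"P001,2024-01-01T00:00:00Z,heart_rate,72\n"
    b"P001,2024-01-01T00:01:00Z,spo2,98\n"
)
counts = Counter(row["event_type"] for row in iter_rows(fake_body))
```

Because iter_rows is a generator, only the current row and the running counters are held in memory at any point.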

Conclusion

The goal of this challenge is to demonstrate how you approach practical data processing: discovering data, handling performance trade-offs, producing accurate results, and presenting them clearly. There is no single “correct” solution; what matters is the reasoning behind your choices and how you communicate them. We will review and discuss your results together over a video call, so be prepared to explain and justify your decisions.

Submission Instructions

When you have completed the assessment, please submit your work as a public GitHub repository.

Ensure the repository includes all source code, supporting files, and this README.
Commit the final JSON output for each export as demo.json, small.json, and large.json.
DO NOT submit a pull request to the company’s repositories.
Provide the link to your public repository to your recruiter or hiring contact.
During the interview, you will be asked to demonstrate your solution running and to take part in an interactive code review. Be ready to share your screen and have the project set up.
