Commit 0054f3e

Authored by Abhinay1997, ardaatahan, ZachNagengast
Regression Test Pipeline (#120)
Co-authored-by: Arda Atahan Ibis <[email protected]>
Co-authored-by: ZachNagengast <[email protected]>
1 parent 3ebfa14 commit 0054f3e

24 files changed, +5295 -481 lines changed

.gitignore

+4
@@ -1,6 +1,7 @@
 .DS_Store
 /.build
 /Packages
+.vscode/
 xcuserdata/
 DerivedData/
 .swiftpm/configuration/registries.json
@@ -56,8 +57,11 @@ fastlane/report.xml
 fastlane/Preview.html
 fastlane/screenshots
 fastlane/test_output
+fastlane/benchmark_data
+fastlane/upload_folder

 ### Xcode Patch ###
+**/*.xcconfig
 *.xcodeproj/*
 !*.xcodeproj/project.pbxproj
 !*.xcodeproj/xcshareddata/

BENCHMARKS.md

+120
@@ -0,0 +1,120 @@
# WhisperKit Benchmarks

This document describes how to run the benchmarks for WhisperKit. The benchmarks can be run on a specific device or on all connected devices. The results are saved in JSON files and can be uploaded to the [argmaxinc/whisperkit-evals-dataset](https://huggingface.co/datasets/argmaxinc/whisperkit-evals-dataset) dataset on HuggingFace as a pull request. Below are the steps to run the benchmarks locally in order to reproduce the results shown in our [WhisperKit Benchmarks](https://huggingface.co/spaces/argmaxinc/whisperkit-benchmarks) space.

## Download the Source

To download the code to run the test suite, run:

```sh
git clone git@github.com:argmaxinc/WhisperKit.git
```

## Local Environment

Before running the benchmarks, you'll need to set up your local environment with the necessary dependencies. To do this, run:

```sh
make setup
```

See [Contributing](CONTRIBUTING.md) for more information.

## Xcode Environment

When running the tests, the model to test is passed from Fastlane to Xcode as an environment variable. To check that the scheme is set up for this:

1. Open the example project:

```sh
xed Examples/WhisperAX
```

2. At the top, you will see the app icon and `WhisperAX` written next to it. Click on `WhisperAX` and select `Edit Scheme` at the bottom.

3. Under `Environment Variables`, you will see an entry with `MODEL_NAME` as the name and `$(MODEL_NAME)` as the value.

## Devices

> [!IMPORTANT]
> An active developer account is required to run the tests on physical devices.

Before running tests, all external devices need to be connected and paired to your Mac, as well as registered with your developer account. Ensure the devices are in Developer Mode. If nothing appears after connecting the devices via cable, press `Command + Shift + 2` in Xcode to open the list of devices and track their progress.

## Datasets

The datasets for the test suite can be set in a global array called `datasets` in the file [`Tests/WhisperKitTests/RegressionTests.swift`](Tests/WhisperKitTests/RegressionTests.swift). It is prefilled with the datasets that are currently available.

## Models

The models for the test suite can be set in the [`Fastfile`](fastlane/Fastfile). Simply find `BENCHMARK_CONFIGS` and modify the `models` array under the benchmark you want to run.
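As a rough illustration of what editing the `models` array might involve, the sketch below assumes `BENCHMARK_CONFIGS` is a hash keyed by configuration name, with each entry holding a `models` array. That shape, the `models_for` helper, and the placeholder model names are all assumptions made for this example; check the actual `Fastfile` for the real keys and model identifiers.

```ruby
# Assumed shape only: the real BENCHMARK_CONFIGS in fastlane/Fastfile may use
# different keys. The model names are placeholders, not real model identifiers.
BENCHMARK_CONFIGS = {
  full:  { models: ["placeholder-model-large", "placeholder-model-small"] },
  debug: { models: ["placeholder-model-small"] }
}.freeze

# Hypothetical helper: return the model list for the selected configuration.
def models_for(debug:)
  BENCHMARK_CONFIGS.fetch(debug ? :debug : :full).fetch(:models)
end

puts "debug models: #{models_for(debug: true).join(', ')}"
puts "full models:  #{models_for(debug: false).join(', ')}"
```

Under this assumed layout, adding or removing a model for one benchmark is a one-line change to the matching array.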

## Makefile and Fastlane

The tests are run using [Fastlane](fastlane/Fastfile), which is controlled by a [Makefile](Makefile). The Makefile contains the following commands:

### List Connected Devices

Before running the tests, list the connected devices so that any connection issues surface early. Simply run:

```sh
make list-devices
```

The output will be a list with entries that look something like this:

```ruby
{
  :name=>"My Mac",
  :type=>"Apple M2 Pro",
  :platform=>"macOS",
  :os_version=>"15.0.1",
  :product=>"Mac14,12",
  :id=>"XXXXXXXX-1234-5678-9012-XXXXXXXXXXXX",
  :state=>"connected"
}
```

Verify that every device you plan to test is listed and that its state is `connected`.
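If the list is long, the same check can be expressed in a few lines of Ruby. This is only a sketch: it assumes the entries are available as an array of hashes shaped like the example above, and both the `partition_by_connection` helper and the second, disconnected sample entry are made up for illustration.

```ruby
# Sample entries shaped like the `make list-devices` output shown above.
# The second entry and its "disconnected" state are invented for this example.
devices = [
  { name: "My Mac", type: "Apple M2 Pro", platform: "macOS",
    os_version: "15.0.1", product: "Mac14,12",
    id: "XXXXXXXX-1234-5678-9012-XXXXXXXXXXXX", state: "connected" },
  { name: "iPhone 15 Pro Max", platform: "iOS", state: "disconnected" }
]

# Hypothetical helper: split the list into [connected, everything else].
def partition_by_connection(devices)
  devices.partition { |device| device[:state] == "connected" }
end

ready, not_ready = partition_by_connection(devices)
puts "Ready: #{ready.map { |d| d[:name] }.join(', ')}"
puts "Not ready: #{not_ready.map { |d| d[:name] }.join(', ')}" unless not_ready.empty?
```

Any device that lands in the second group needs its cable, pairing, or Developer Mode setting checked before the benchmarks are started.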

### Running Benchmarks

After completing the above steps, you can run the tests. There are two test configurations, named `full` and `debug`. To check for potential errors first, run the `debug` tests:

```sh
make benchmark-devices DEBUG=true
```

Otherwise, run the `full` tests:

```sh
make benchmark-devices
```

Optionally, for both configurations, you can restrict the run to specific devices using the `DEVICES` option:

```sh
make benchmark-devices DEVICES="iPhone 15 Pro Max,My Mac"
```

The `DEVICES` option is a comma-separated list of device names. The device names can be found by running `make list-devices` and using the value for the `:name` key.
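To make the matching rule concrete, here is a minimal sketch of how a comma-separated `DEVICES` value could be split and compared against the `:name` values. The helper names `parse_device_names` and `select_devices` are hypothetical, and the actual Fastfile may handle whitespace or unknown names differently.

```ruby
# Hypothetical helper: turn a DEVICES string into a list of trimmed names.
def parse_device_names(devices_option)
  devices_option.to_s.split(",").map(&:strip).reject(&:empty?)
end

# Hypothetical helper: keep only the devices whose :name was requested.
# An empty or missing option is treated here as "use every device".
def select_devices(available, devices_option)
  requested = parse_device_names(devices_option)
  return available if requested.empty?

  available.select { |device| requested.include?(device[:name]) }
end

available = [{ name: "My Mac" }, { name: "iPhone 15 Pro Max" }, { name: "iPad Pro" }]
selected = select_devices(available, "iPhone 15 Pro Max,My Mac")
puts selected.map { |device| device[:name] }.inspect
```

Because the comparison in this sketch is an exact string match, a name has to be copied as it appears in the `:name` field, including spaces and capitalization.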

### Results

After the tests are run, the generated results can be found under `fastlane/benchmark_data`, including the `.xcresult` file with logs and attachments for each device. There will also be a folder called `fastlane/upload_folder/benchmark_data` that contains only the JSON results from `fastlane/benchmark_data`, which can be used for further analysis.
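The relationship between the two folders can be sketched as a JSON-only copy. The example below is an illustration rather than the Fastfile's actual logic: the `copy_json_results` helper, the per-device subfolder, and the choice to skip anything inside an `.xcresult` bundle are assumptions, and it runs against a temporary directory instead of the real `fastlane` folders.

```ruby
require "fileutils"
require "tmpdir"

# Hypothetical helper: copy every .json file under source_dir into dest_dir,
# preserving relative paths. Files inside .xcresult bundles are skipped
# (an assumption about what "only the JSON results" should include).
def copy_json_results(source_dir, dest_dir)
  prefix = File.join(source_dir, "")
  json_paths = Dir.glob(File.join(source_dir, "**", "*.json")).sort
  json_paths.reject { |path| path.include?(".xcresult/") }.map do |path|
    relative = path.delete_prefix(prefix)
    target = File.join(dest_dir, relative)
    FileUtils.mkdir_p(File.dirname(target))
    FileUtils.cp(path, target)
    relative
  end
end

# Demo in a throwaway directory; the folder layout is invented for the example.
Dir.mktmpdir do |root|
  source = File.join(root, "benchmark_data")
  dest = File.join(root, "upload_folder", "benchmark_data")
  FileUtils.mkdir_p(File.join(source, "device_a"))
  File.write(File.join(source, "device_a", "results.json"), "{}")
  File.write(File.join(source, "device_a", "run.log"), "not copied")
  puts copy_json_results(source, dest).inspect
end
```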

We will periodically run these tests on a range of devices and upload the results to the [argmaxinc/whisperkit-evals-dataset](https://huggingface.co/datasets/argmaxinc/whisperkit-evals-dataset), which will propagate to the [WhisperKit Benchmarks](https://huggingface.co/spaces/argmaxinc/whisperkit-benchmarks) space and be available for comparison.

# Troubleshooting

If you encounter issues while running the tests, here are a few things to try:

1. Open the project in Xcode and run the tests directly from there.
   1. To do this, open the example app (from the command line, run `xed Examples/WhisperAX`) and run the test named `RegressionTests/testModelPerformanceWithDebugConfig` from the test navigator.
   2. If the tests run successfully, you can rule out any issues with the device or the models.
   3. If they don't run successfully, Xcode will provide more detailed error messages.
2. Try specifying a single device to run the tests on. This can be done by running `make list-devices` and then running the tests with the `DEVICES` option set to the name of the device you want to test on. For example, `make benchmark-devices DEVICES="My Mac"`. This will also enable you to see the logs for that specific device.
3. If you are still encountering issues, please reach out to us on the [Discord](https://discord.gg/G5F5GZGecC) or create an [issue](https://github.com/argmaxinc/WhisperKit/issues) on GitHub.

Examples/WhisperAX/Debug.xcconfig

+8
@@ -0,0 +1,8 @@
// For licensing see accompanying LICENSE.md file.
// Copyright © 2024 Argmax, Inc. All rights reserved.

// Configuration settings file format documentation can be found at:
// https://help.apple.com/xcode/#/dev745c5c974

CODE_SIGN_STYLE=Automatic
DEVELOPMENT_TEAM=
