Commit ffc3d62
feat: Implement TrainerClient Backends & Local Process (#33)
* Implement TrainerClient Backends & Local Process
Signed-off-by: Saad Zaher <[email protected]>
* Implement Job Cancellation
Signed-off-by: Saad Zaher <[email protected]>
* update local job to add resource limitation in k8s style
Signed-off-by: Saad Zaher <[email protected]>
* Update python/kubeflow/trainer/api/trainer_client.py
Co-authored-by: Andrey Velichkevich <[email protected]>
Signed-off-by: Saad Zaher <[email protected]>
* Fix linting issues
Signed-off-by: Saad Zaher <[email protected]>
* fix unit tests
Signed-off-by: Saad Zaher <[email protected]>
* add support wait_for_job_status
Signed-off-by: Saad Zaher <[email protected]>
* Update data types
Signed-off-by: Saad Zaher <[email protected]>
* fix merge conflict
Signed-off-by: Saad Zaher <[email protected]>
* fix unit tests
Signed-off-by: Saad Zaher <[email protected]>
* remove TypeAlias
Signed-off-by: Saad Zaher <[email protected]>
* Replace TRAINER_BACKEND_REGISTRY with TRAINER_BACKEND
Signed-off-by: Saad Zaher <[email protected]>
* Update kubeflow/trainer/api/trainer_client.py
Co-authored-by: Andrey Velichkevich <[email protected]>
Signed-off-by: Saad Zaher <[email protected]>
* Update kubeflow/trainer/api/trainer_client.py
Co-authored-by: Andrey Velichkevich <[email protected]>
Signed-off-by: Saad Zaher <[email protected]>
* Restructure training backends into separate dirs
Signed-off-by: Saad Zaher <[email protected]>
* Update kubeflow/trainer/api/trainer_client.py
Co-authored-by: Andrey Velichkevich <[email protected]>
Signed-off-by: Saad Zaher <[email protected]>
* add get_runtime_packages as not supported by local-exec
Signed-off-by: Saad Zaher <[email protected]>
* move backends and its configs to kubeflow.trainer
Signed-off-by: Saad Zaher <[email protected]>
* fix typo in delete_job
Signed-off-by: Saad Zaher <[email protected]>
* Move local_runtimes to constants
* allow list_jobs to filter by runtime
* keep runtime ref in __local_jobs
Signed-off-by: Saad Zaher <[email protected]>
* use google style docstring for LocalJob
Signed-off-by: Saad Zaher <[email protected]>
* remove debug opt from LocalProcessConfig
Signed-off-by: Saad Zaher <[email protected]>
* only use imports from kubeflow.trainer for backends
Signed-off-by: Saad Zaher <[email protected]>
* update local-exec to use only one step
While I believe that dividing this into steps keeps it simple and makes
debugging and extensibility easier, this addresses comments on this PR by
consolidating all train job scripts into one and running it as a single
step to match k8s.
Signed-off-by: Saad Zaher <[email protected]>
* optimize loops when getting runtime
Signed-off-by: Saad Zaher <[email protected]>
* add LocalRuntimeTrainer
Signed-off-by: Saad Zaher <[email protected]>
* rename cleanup config item to cleanup_venv
Signed-off-by: Saad Zaher <[email protected]>
* convert local runtime to runtime
Signed-off-by: Saad Zaher <[email protected]>
* convert runtimes before returning
Signed-off-by: Saad Zaher <[email protected]>
* fix get_job_logs to align with parent interface
Signed-off-by: Saad Zaher <[email protected]>
* rename get_runtime_trainer func
Signed-off-by: Saad Zaher <[email protected]>
* rename get_training_job_command to get_local_train_job_script
Signed-off-by: Saad Zaher <[email protected]>
* Ignore failures in Coveralls action
Signed-off-by: Andrey Velichkevich <[email protected]>
---------
Signed-off-by: Saad Zaher <[email protected]>
Signed-off-by: Andrey Velichkevich <[email protected]>
Co-authored-by: Andrey Velichkevich <[email protected]>
1 parent 6709dcf commit ffc3d62
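The "only one step" change described in the commit message (consolidating all train job scripts into one and running it as a single step) can be illustrated with a minimal sketch. This is not the code from this commit: the helper names `build_single_step_script` and `run_single_step`, the step names, and the use of `sh -c` are all assumptions made for illustration. The idea is only that several setup and training commands are joined into one script and run as one child process, so a failure in any command stops the whole job.

```python
import subprocess


def build_single_step_script(steps):
    """Join (name, command) pairs into one shell script.

    Hypothetical helper, not part of the kubeflow SDK.
    """
    lines = ["set -e"]  # stop the script at the first failing command
    for name, command in steps:
        lines.append(f"echo '[{name}]'")  # marker line for easier debugging
        lines.append(command)
    return "\n".join(lines)


def run_single_step(script):
    """Run the whole script as one child process.

    Returns (exit code, combined stdout/stderr text).
    """
    result = subprocess.run(
        ["sh", "-c", script],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
        check=False,
    )
    return result.returncode, result.stdout


# Placeholder commands stand in for environment setup, dependency
# installation, and the training entrypoint.
steps = [
    ("setup", "echo creating environment"),
    ("install", "echo installing packages"),
    ("train", "echo running training"),
]
script = build_single_step_script(steps)
code, output = run_single_step(script)
```

Keeping the per-step marker lines in the combined script preserves some of the debugging benefit of separate steps while still presenting the job as a single step.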
9 files changed: +891, -2
File tree
- .github/workflows
- kubeflow/trainer
  - api
  - backends/localprocess