- Purpose
- Features
- Model Training
- Model Evaluation
- Model Conversion to ONNX
- Deployment
- Scope of Improvement
- Resources
This repository contains an end-to-end pipeline for data processing, training, and deployment of segmentation models using the SMP library.
- End-to-End pipeline for training and deployment
- Hydra configurations for training
- [SMP](https://github.com/qubvel/segmentation_models.pytorch) models with Catalyst
- FastAPI
- Dockerized deployment script
KITTI dataset
We can use the CVAT tool to annotate the data:
1. Load images into the CVAT tool and create segmentation masks over the spinal cord portion.
2. Export the data in segmentation format. The exported masks are encoded in the specified RGB color.
3. Transform the label masks into binary masks (use [./notebook/data_utitlity.ipynb](./notebook/data_utitlity.ipynb)); the training script requires binary masks. A sketch of this conversion follows the list.
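Conceptually, the binary conversion just maps the exported label color to {0, 1}. A minimal sketch of what the notebook does, assuming a single spinal cord label (the RGB value is an assumption; use the color from your CVAT export):

```python
import numpy as np
from PIL import Image

# Assumption: the RGB color CVAT assigned to the spinal cord label.
LABEL_COLOR = (0, 128, 0)

def rgb_mask_to_binary(mask_path: str, out_path: str) -> None:
    """Convert a CVAT RGB segmentation mask into a 0/255 binary mask."""
    rgb = np.array(Image.open(mask_path).convert("RGB"))
    binary = np.all(rgb == LABEL_COLOR, axis=-1).astype(np.uint8)
    Image.fromarray(binary * 255).save(out_path)
```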
We opt for a standard data-splitting strategy:
Train | Validation | Test |
---|---|---|
0.7 | 0.2 | 0.1 |
All the data is maintained inside the data/ folder:

```
training_data
├── test
├── testannot
├── train
├── trainannot
├── val
└── valannot
```
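If you prepare custom data outside the notebook, the split and folder layout above can be scripted. A minimal sketch, assuming each image in an images/ folder has a same-named mask in a masks/ folder (both folder names are assumptions):

```python
import random
import shutil
from pathlib import Path

random.seed(42)
images = sorted(Path("images").glob("*.png"))  # assumption: PNG inputs
random.shuffle(images)

# 70/20/10 split, matching the strategy in the table above.
n = len(images)
splits = {
    "train": images[: int(0.7 * n)],
    "val": images[int(0.7 * n) : int(0.9 * n)],
    "test": images[int(0.9 * n) :],
}

for split, files in splits.items():
    img_dir = Path("data/training_data") / split
    ann_dir = Path("data/training_data") / f"{split}annot"
    img_dir.mkdir(parents=True, exist_ok=True)
    ann_dir.mkdir(parents=True, exist_ok=True)
    for img in files:
        shutil.copy(img, img_dir / img.name)
        shutil.copy(Path("masks") / img.name, ann_dir / img.name)
```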
For training the segmentation model I have used the SMP library. It has a simple interface and supports more than 10 architectures with roughly 400 encoders.
The training is facilitated by Catalyst and Hydra, and the different augmentations are implemented with the Albumentations library. Check the ./notebook/training_data_visualisation.ipynb notebook for training data visualisation.
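For orientation, building a model with SMP is a one-liner. A minimal sketch (architecture, encoder, and single-class output are assumptions; the actual values are driven by the Hydra configs):

```python
import segmentation_models_pytorch as smp

# U-Net with an ImageNet-pretrained ResNet-34 encoder and a single
# output channel for the binary spinal cord mask (all assumptions).
model = smp.Unet(
    encoder_name="resnet34",
    encoder_weights="imagenet",
    in_channels=3,
    classes=1,
    activation="sigmoid",
)
```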
- Create a virtual environment by installing the requirements listed in the requirements.txt file:
  ```
  conda create --name <env_name> --file requirements.txt
  ```
  You also need to set up a Jupyter notebook kernel for the same environment:
  ```
  python -m ipykernel install --user --name <env_name> --display-name <env_name>
  ```
- Training data is kept inside the data/ folder. If you want to train the model on custom data, it is recommended to transform your data using the ./notebook/data_utitlity.ipynb notebook and keep the training data in the same data/ folder.
- All the training configurations are kept inside different config files:
  ```
  ./configs
  ├── config.yaml
  ├── model
  │   └── default.yaml
  ├── processing
  │   └── default.yaml
  └── training
      └── default.yaml
  ```
  - model/default.yaml --> stores all the model-related configs.
  - processing/default.yaml --> stores all configs related to data processing.
  - training/default.yaml --> stores configs related to the training process.
Note: Make changes according to your training resources.
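This layout is Hydra's standard config-group pattern: config.yaml composes one default.yaml from each group into a single config object. A sketch of how train.py presumably consumes it (the key names are assumptions):

```python
import hydra
from omegaconf import DictConfig

@hydra.main(config_path="configs", config_name="config")
def main(cfg: DictConfig) -> None:
    # cfg.model, cfg.processing and cfg.training hold the merged
    # values from the corresponding default.yaml files.
    print(cfg)

if __name__ == "__main__":
    main()
```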
- The training script is kept in the ./train.py file. After tuning the configurations you can start the training process with the command below:
  ```
  python train.py
  ```
Note: As Hydra is used, you can provide runtime arguments to override the default configs (e.g. something like `python train.py training.batch_size=8`; the exact key names depend on the config files), as well as provide different combinations of arguments to start different training loops.
- Training logs will be stored in the outputs/{DATE}/{TIME}/ folder. This folder keeps all the training configs, logs, models, and TensorBoard event files.
  You can check the TensorBoard with the command below:
  ```
  tensorboard --logdir ./outputs/
  ```
Evaluate the model on test data using the ./test.py file. You need to provide the log output folder path to run this file:

```
python test.py \
    --log_dir ./outputs/xxxx/xxxx
```
Result:
Images | dice_loss | iou_score |
---|---|---|
xxxx | xxx | xxx |
For result visualisation you can use the ./notebook/result_visualisation.ipynb notebook.
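For reference, both reported metrics reduce to simple overlap measures on binary masks. A NumPy sketch (the 0.5 threshold is an assumption):

```python
import numpy as np

def iou_score(pred: np.ndarray, target: np.ndarray, eps: float = 1e-7) -> float:
    """Intersection over union between binary masks."""
    pred = (pred > 0.5).astype(np.float32)
    inter = (pred * target).sum()
    union = pred.sum() + target.sum() - inter
    return float((inter + eps) / (union + eps))

def dice_loss(pred: np.ndarray, target: np.ndarray, eps: float = 1e-7) -> float:
    """1 - Dice coefficient between binary masks."""
    pred = (pred > 0.5).astype(np.float32)
    inter = (pred * target).sum()
    return float(1 - (2 * inter + eps) / (pred.sum() + target.sum() + eps))
```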
To convert the SMP model to ONNX:

```
python utils/onnx_conversion.py \
    --log_dir ./outputs/xxxx/xxxx \
    --output_dir ./deployment/models/
```
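Under the hood, such a conversion boils down to torch.onnx.export. A minimal sketch (input resolution, tensor names, and the model restore step are assumptions; the real script loads the checkpoint from --log_dir):

```python
import segmentation_models_pytorch as smp
import torch

# Assumption: same architecture as in training; the real script would
# restore the trained weights from the checkpoint in the log dir.
model = smp.Unet(encoder_name="resnet34", in_channels=3, classes=1)
model.eval()

dummy_input = torch.randn(1, 3, 320, 320)  # assumption: 320x320 RGB input
torch.onnx.export(
    model,
    dummy_input,
    "deployment/models/model.onnx",
    input_names=["input"],
    output_names=["mask"],
    opset_version=11,
)
```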
For ONNX model results verification use the ./notebook/smp_to_onnx.ipynb notebook.
You can also run inference with the ONNX model and save the output image by using the command below:

```
python deployment/infer.py \
    --model_path ./deployment/models/model.onnx \
    --output_dir ./prediction_output/ \
    --image_path ./data/training_data/test/test.jpg
```
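infer.py wraps this into a CLI; the core ONNX Runtime call looks roughly like the following sketch (input size, normalisation, and output shape are assumptions):

```python
from pathlib import Path

import numpy as np
import onnxruntime as ort
from PIL import Image

session = ort.InferenceSession("deployment/models/model.onnx")

# Assumption: the model expects a normalised 1x3x320x320 float32 tensor.
img = Image.open("data/training_data/test/test.jpg").convert("RGB").resize((320, 320))
x = np.asarray(img, dtype=np.float32).transpose(2, 0, 1)[None] / 255.0

outputs = session.run(None, {session.get_inputs()[0].name: x})
mask = (outputs[0][0, 0] > 0.5).astype(np.uint8) * 255  # threshold sigmoid output

Path("prediction_output").mkdir(exist_ok=True)
Image.fromarray(mask).save("prediction_output/test_mask.png")
```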
For the deployment, I have used FastAPI and Docker to create the application. All the deployment code is kept inside the ./deployment folder.
For the deployment of the pipeline on Docker, all you need to do is move inside the deployment folder and run the commands below.
Note: Make sure you have the ONNX model kept at the ./deployment/models/model.onnx path.
- Build the Docker image:
  ```
  docker build -t spinex:latest .
  ```
- Start the Docker container:
  ```
  docker run -p 8000:8000 --name spinex spinex
  ```
- You can use the FastAPI dashboard to test the API (url: http://0.0.0.0:8000/docs/) or you can send a CURL request from Postman.
Request:

```
curl -X 'POST' \
  'http://0.0.0.0:8000/predict' \
  -H 'accept: application/json' \
  -H 'Content-Type: multipart/form-data' \
  -F 'file=@test.png;type=image/png'
```

(The form field name file is an assumption; check deployment/app.py for the exact name your endpoint expects.)
Response:

```
{"status": "200"}
```
Status | Description |
---|---|
200 | Success |
100 | Problem while predicting from the model |
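For orientation, the /predict endpoint in deployment/app.py is a standard FastAPI file-upload handler; a minimal sketch matching the statuses above (the field name, preprocessing, and error handling are assumptions):

```python
from fastapi import FastAPI, File, UploadFile

app = FastAPI()

@app.post("/predict")
async def predict(file: UploadFile = File(...)):
    try:
        image_bytes = await file.read()
        # ... decode the image, run the ONNX session, store the mask ...
        return {"status": "200"}
    except Exception:
        return {"status": "100"}  # problem while predicting from the model
```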
- DVC is not utilised in the current version.
- Streamlit would be a good choice for presenting model output, which is missing in the current version.
Details | Type | Link |
---|---|---|
Training Script | python file | train.py |
Evaluation Script | python file | test.py |
Inferencing Script | python file | deployment/infer.py |
ONNX Conversion Script | python file | ./model_utils/onnx_conversion.py |
Deployment Application Script | python file | ./deployment/app.py |
**Data Operations** | | |
Label Binary Conversion | notebook | ./notebook/data_utitlity.ipynb |
Data Splitting | notebook | ./notebook/data_utitlity.ipynb |
Training Data Visualisation | notebook | ./notebook/training_data_visualisation.ipynb |
ONNX Model Verification | notebook | ./notebook/smp_to_onnx.ipynb |