The project's optimization efforts have yielded impressive results:
- 65% Speedup: The model's response time is significantly faster, making it suitable for real-time applications.
- Improved Accuracy: The model's answers are more precise and contextually relevant.
- DistilBERT
- Intel Extension for PyTorch
- Intel Neural Compressor
- Python
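
A speedup figure like the 65% above can be read two ways: as a drop in latency or as a gain in throughput. The sketch below shows one way to measure and report both, using only the standard library. The helper names (`median_latency`, `latency_reduction_pct`, `throughput_gain_pct`) are hypothetical and not part of this project; the callables you time would be your own baseline and optimized inference functions.

```python
import statistics
import time
from typing import Callable


def median_latency(fn: Callable[[], object], warmup: int = 3, runs: int = 20) -> float:
    """Return the median wall-clock time of fn() in seconds.

    Warm-up calls are discarded so one-time costs (lazy initialization,
    cache fills) do not skew the measurement.
    """
    for _ in range(warmup):
        fn()
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def latency_reduction_pct(baseline_s: float, optimized_s: float) -> float:
    """Percent by which latency dropped relative to the baseline."""
    return (baseline_s - optimized_s) / baseline_s * 100.0


def throughput_gain_pct(baseline_s: float, optimized_s: float) -> float:
    """Percent by which throughput (calls per second) rose relative to the baseline."""
    return (baseline_s / optimized_s - 1.0) * 100.0
```

For example, a drop from 1.00 s to 0.35 s per query is a 65% latency reduction, while a drop from 1.65 s to 1.00 s is a 65% throughput gain. Stating which definition is used, along with the hardware and batch size, makes the reported number easier to reproduce.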