Commit

Merge pull request #361 from bbrowning/ocr-gpu-disable
Only use CPU for the docling OCR models
mergify[bot] authored Nov 12, 2024
2 parents 2766095 + 848d9c8 commit eaaccca
Showing 1 changed file with 2 additions and 0 deletions.
2 changes: 2 additions & 0 deletions src/instructlab/sdg/utils/chunkers.py
@@ -213,6 +213,8 @@ def chunk_documents(self) -> List:

         model_artifacts_path = StandardPdfPipeline.download_models_hf()
         pipeline_options = PdfPipelineOptions(artifacts_path=model_artifacts_path)
+        # Keep OCR models on the CPU instead of GPU
+        pipeline_options.ocr_options.use_gpu = False
         converter = DocumentConverter(
             format_options={
                 InputFormat.PDF: PdfFormatOption(pipeline_options=pipeline_options)
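
Note: the change above assigns `use_gpu` directly, which presumes the configured OCR options object exposes that flag. As a minimal sketch only, and not part of this commit, a defensive variant could check for the attribute first. The helper name `force_cpu_ocr` and the `SimpleNamespace` stand-ins below are hypothetical and used purely for illustration; they are not docling classes.

```python
from types import SimpleNamespace


def force_cpu_ocr(pipeline_options):
    """Set ocr_options.use_gpu to False when the OCR options expose that flag.

    Returns True if the flag was found and set, False otherwise.
    """
    ocr_options = getattr(pipeline_options, "ocr_options", None)
    if ocr_options is None or not hasattr(ocr_options, "use_gpu"):
        # Nothing to change: no OCR options, or no GPU switch on them.
        return False
    ocr_options.use_gpu = False
    return True


# Stand-in objects (hypothetical, not docling classes) to exercise the helper.
gpu_capable = SimpleNamespace(ocr_options=SimpleNamespace(use_gpu=True))
no_gpu_flag = SimpleNamespace(ocr_options=SimpleNamespace(lang=["en"]))

changed = force_cpu_ocr(gpu_capable)
skipped = force_cpu_ocr(no_gpu_flag)
```

With a real `PdfPipelineOptions` instance, the same helper would presumably be called right after the options object is constructed, which is where the commit places its assignment.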