@smouaa
Contributor

@smouaa smouaa commented Sep 30, 2025

  • Add sklearn_handler.py with support for multiple model formats (pickle, joblib, skops, cloudpickle)
  • Add xgboost_handler.py with support for multiple model formats (pickle, json/ubj, xgb)
  • Modify decode_csv to support handler use cases
  • Support both JSON and CSV input/output formats
  • Add find_model_file utility function
  • Add import_utils
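
As an illustration of the model file discovery mentioned above, here is a minimal sketch of how a shared helper might locate a single model artifact by extension. The function name `find_model_file_sketch`, the extension map, and the choice to raise on multiple matches are all assumptions made for this example; the PR's actual `find_model_file` signature and behavior are not shown in this conversation.

```python
from pathlib import Path
from typing import Iterable, Optional

# Hypothetical extension-to-format map, for illustration only.
SKLEARN_FORMATS = {
    ".pkl": "pickle",
    ".pickle": "pickle",
    ".joblib": "joblib",
    ".skops": "skops",
    ".cloudpkl": "cloudpickle",
}


def find_model_file_sketch(model_dir: str, extensions: Iterable[str]) -> Optional[Path]:
    """Return the single file in model_dir whose suffix is in extensions.

    Returns None when nothing matches. Raises ValueError when more than one
    candidate exists (an assumed policy: refuse to guess between artifacts).
    """
    wanted = {ext.lower() for ext in extensions}
    matches = sorted(
        p for p in Path(model_dir).iterdir()
        if p.is_file() and p.suffix.lower() in wanted
    )
    if not matches:
        return None
    if len(matches) > 1:
        names = ", ".join(p.name for p in matches)
        raise ValueError(f"Multiple model artifacts found: {names}")
    return matches[0]
```

A handler could then look up the format with `SKLEARN_FORMATS[path.suffix.lower()]` and dispatch to the matching loader.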

@smouaa smouaa requested review from a team and zachgk as code owners September 30, 2025 17:08
@smouaa smouaa force-pushed the sklearn-handler-csv-support branch 3 times, most recently from 7e9e644 to f7fdcdb September 30, 2025 18:23
@Lokiiiiii
Member

Note: Add tests, benchmarks

@Lokiiiiii
Member

Note: Add documentation and demo notebook

@smouaa smouaa force-pushed the sklearn-handler-csv-support branch from 09d3a83 to 81af1b6 October 6, 2025 23:34
Member

@Lokiiiiii Lokiiiiii left a comment

Let's add some tests for:

  1. CSV inputs, with and without headers
  2. Multiple model artifacts
  3. Improper environment variable setup
  4. Garbage string in SKLEARN_SKOPS_TRUSTED_TYPES
  5. Unsupported accept types
  6. Different input shapes
  7. Ragged arrays in input
- Add security controls for pickle formats via environment variables
- Add import_utils module for optional dependency management
- Update Dockerfile to include sklearn dependencies
- Make model file discovery a shared helper function in utils.py
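
The first two items in this commit message can be sketched roughly as follows: an explicit opt-in before loading pickle-based formats (which can execute arbitrary code on load), and a check for whether an optional dependency is installed. The environment variable name `EXAMPLE_TRUST_INSECURE_MODEL_FILES` and both function names are hypothetical, chosen for this illustration only; the PR's real variable names and `import_utils` API may differ.

```python
import importlib.util
import os
from typing import Mapping

# Formats built on pickle, which can run arbitrary code during deserialization.
UNSAFE_FORMATS = {"pickle", "joblib", "cloudpickle"}

# Hypothetical variable name, used only in this sketch.
TRUST_ENV_VAR = "EXAMPLE_TRUST_INSECURE_MODEL_FILES"


def is_available(module_name: str) -> bool:
    """Report whether a top-level optional module can be imported, without importing it."""
    return importlib.util.find_spec(module_name) is not None


def check_format_allowed(model_format: str, env: Mapping[str, str] = os.environ) -> None:
    """Raise PermissionError unless an unsafe format has been explicitly trusted."""
    if model_format not in UNSAFE_FORMATS:
        return
    if env.get(TRUST_ENV_VAR, "false").strip().lower() != "true":
        raise PermissionError(
            f"Loading '{model_format}' models requires {TRUST_ENV_VAR}=true"
        )
```

Defaulting to "not trusted" means a missing or misspelled variable fails closed rather than silently loading an unsafe artifact.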
@smouaa smouaa force-pushed the sklearn-handler-csv-support branch from 81af1b6 to 46c862c October 15, 2025 16:58
@smouaa smouaa force-pushed the sklearn-handler-csv-support branch from 46c862c to 548d787 October 15, 2025 17:03
@Lokiiiiii Lokiiiiii changed the title from "Add sklearn handler with CSV/json support" to "Add XGB & SKL Py handlers with CSV/json support" Oct 16, 2025
@smouaa smouaa force-pushed the sklearn-handler-csv-support branch from a577fbf to 8342035 October 18, 2025 01:10
@smouaa smouaa force-pushed the sklearn-handler-csv-support branch from 494473f to 1769295 October 18, 2025 01:28
@smouaa smouaa force-pushed the sklearn-handler-csv-support branch 3 times, most recently from bfb5054 to bcfa8c1 October 21, 2025 18:47
…ed behavior (accepting properties via serving.properties or environment variables, throwing an error when CSV files contain non-numeric data)
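
One way a handler might accept the same setting from either serving.properties or the environment is sketched below. The `OPTION_` prefix mapping, the precedence (environment first), and the names `resolve_option` and `trusted_types` are assumptions for this example, not something this conversation confirms about the merged code.

```python
import os
from typing import Mapping, Optional


def resolve_option(
    name: str,
    properties: Mapping[str, str],
    env: Mapping[str, str] = os.environ,
    default: Optional[str] = None,
) -> Optional[str]:
    """Look up a setting in the environment first, then in parsed properties.

    Assumes a property named `name` corresponds to the environment
    variable OPTION_<NAME> (an assumption made for this sketch).
    """
    env_key = "OPTION_" + name.upper()
    if env_key in env:
        return env[env_key]
    if name in properties:
        return properties[name]
    return default
```

For example, `resolve_option("trusted_types", {"trusted_types": "a.B"}, env={})` falls back to the properties value because no matching environment variable is set.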
@smouaa smouaa force-pushed the sklearn-handler-csv-support branch from bd57507 to 2fd9054 October 21, 2025 23:18
@smouaa smouaa merged commit 3e49622 into deepjavalibrary:master Oct 22, 2025
10 of 12 checks passed
@smouaa smouaa deleted the sklearn-handler-csv-support branch October 22, 2025 03:58