-
Notifications
You must be signed in to change notification settings - Fork 45
feat(trainer): add structured and configurable logging support #110
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
149c8bc to
56896c3
Compare
575c624 to
6f99044
Compare
This PR implements comprehensive logging support for the Kubeflow SDK as requested in Issue kubeflow#85. ## Features Implemented ### Core Logging Infrastructure - **NullHandler Pattern**: Prevents logging noise by default, users can override with their own configuration - **Configurable Logging**: Support for console, detailed, and JSON output formats - **Structured Logging**: JSON formatter for log aggregation systems (ELK stack, Fluentd, etc.) - **Environment-based Configuration**: Configure logging via environment variables ### Key Components - `kubeflow/trainer/logging/config.py`: Centralized logging configuration - `kubeflow/trainer/logging/formatters.py`: Custom formatters including JSON structured logging - `kubeflow/trainer/logging/logging_test.py`: Comprehensive test suite (14 tests) ### Integration Points - **SDK Integration**: Added debug logging to TrainerClient for better observability - **Package Integration**: Exposed logging utilities through kubeflow.trainer.logging - **NullHandler Setup**: Configured at package level to prevent logging noise ## Issue kubeflow#85 Requirements Fulfilled ✅ **Consistent use of Python's logging library instead of print statements** - All SDK operations now use proper Python logging - NullHandler pattern prevents unwanted output by default ✅ **Support for different logging levels (DEBUG, INFO, WARNING, ERROR)** - Full support for all standard Python logging levels - Configurable via setup_logging() function or environment variables ✅ **Ability for users to configure log formatting and destinations** - Console, detailed, and JSON output formats - File output support - Custom formatter support ✅ **Clear and actionable log messages for key SDK operations** - Debug messages in TrainerClient initialization - Backend selection logging - Job creation and ID logging ## Testing - **14 comprehensive unit tests** covering all logging functionality - **NullHandler pattern verification** with proper isolation - **SDK integration testing** with real TrainerClient usage - **Application integration examples** with file and console logging - **All tests pass** with proper linting compliance ## Usage Examples ### Basic Usage ```python from kubeflow.trainer import TrainerClient, setup_logging # Setup logging (optional - NullHandler prevents noise by default) setup_logging(level="DEBUG", format_type="console") # Use SDK - debug messages will appear if logging is configured client = TrainerClient() ``` ### JSON Logging for Production ```python setup_logging(level="INFO", format_type="json") # Logs will be in JSON format suitable for log aggregation ``` ### Environment Configuration ```bash export KUBEFLOW_LOG_LEVEL=DEBUG export KUBEFLOW_LOG_FORMAT=json ``` ## Breaking Changes None - this is a pure addition with backward compatibility. ## Migration Guide No migration required. Existing code will continue to work without changes. Users can optionally configure logging for better observability. Resolves kubeflow#85 Signed-off-by: Jaeyeon Kim <[email protected]>
6f99044 to
9642b41
Compare
Pull Request Test Coverage Report for Build 18062345345Details
💛 - Coveralls |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for this @anencore94!
/assign @kubeflow/kubeflow-sdk-team
| "get_logger", | ||
| "setup_logging", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shall we move this to a separate package (e.g. kubeflow/logging)
| ) | ||
|
|
||
| job_id = self.backend.train(runtime=runtime, initializer=initializer, trainer=trainer) | ||
| logger.debug("Successfully created TrainJob with ID: %s", job_id) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you move all of the debug statements out of backend, and place them into trainer_client.py
sdk/kubeflow/trainer/backends/kubernetes/backend.py
Lines 254 to 256 in 9642b41
| logger.debug( | |
| f"{constants.TRAINJOB_KIND} {self.namespace}/{train_job_name} has been created" | |
| ) |
@szaher @kramaranya Does it sound good ?
|
|
||
|
|
||
| def setup_logging( | ||
| level: Union[str, int] = "INFO", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As we discussed here, do we need to have INFO level for our logger ? #85 (comment)
| "formatters": { | ||
| "console": { | ||
| "format": "%(asctime)s - %(name)s - %(levelname)s - %(message)s", | ||
| "datefmt": "%Y-%m-%d %H:%M:%S", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe it is better to make fmt ISO compliant ?
| "datefmt": "%Y-%m-%d %H:%M:%S", | |
| "datefmt": "%Y-%m-%dT%H:%M:%S", |
| "format": ( | ||
| "%(asctime)s - %(name)s - %(levelname)s - %(filename)s:%(lineno)d - %(message)s" | ||
| ), | ||
| "datefmt": "%Y-%m-%d %H:%M:%S", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| "datefmt": "%Y-%m-%d %H:%M:%S", | |
| "datefmt": "%Y-%m-%dT%H:%M:%S", |
| class TestLoggingConfig: | ||
| """Test logging configuration functionality.""" | ||
|
|
||
| def test_get_logger(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please refactor these test cases, similar to this:
| @pytest.mark.parametrize( |
| ValueError: Invalid backend configuration. | ||
| """ | ||
| logger.debug("Initializing TrainerClient with backend_config=%s", backend_config) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How you propagate your loggers to the clients ? Should it be part of the TrainerClient() property ?
What this PR does / why we need it:
This PR implements comprehensive structured and configurable logging support for the Kubeflow SDK as requested in Issue #85. The implementation provides:
The logging system ensures that the SDK is quiet by default (using NullHandler) but provides rich logging capabilities when users explicitly configure logging.
Which issue(s) this PR fixes (optional, in
Fixes #<issue number>, #<issue number>, ...format, will close the issue(s) when PR gets merged):Fixes #85
Features Implemented
Core Logging Infrastructure
Key Components
kubeflow/trainer/logging/config.py: Centralized logging configurationkubeflow/trainer/logging/formatters.py: Custom formatters including JSON structured loggingkubeflow/trainer/logging/logging_test.py: Comprehensive test suite (16 tests)Integration Points
TrainerClientfor better observabilitykubeflow.trainer.loggingIssue #85 Requirements Fulfilled
✅ Consistent use of Python's logging library instead of print statements
✅ Support for different logging levels (DEBUG, INFO, WARNING, ERROR)
setup_logging()function or environment variables✅ Ability for users to configure log formatting and destinations
✅ Clear and actionable log messages for key SDK operations
Testing
Usage Examples
Basic Usage
JSON Logging for Production
Environment Configuration
Breaking Changes
None - this is a pure addition with backward compatibility.
Migration Guide
No migration required. Existing code will continue to work without changes.
Users can optionally configure logging for better observability.
Checklist: