feat: Add Table One generator with privacy protection #123

Open
clifford-clif wants to merge 1 commit into Common-Longitudinal-ICU-data-Format:main from clifford-clif:feature/table-one-privacy

@clifford-clif

Summary

This PR adds a standardized Table 1 (baseline characteristics) generator with built-in privacy protection for federated research.

Features

  • TableOneGenerator class for generating Table 1 summaries
  • PrivacyConfig with cell suppression (default min_cell_size=10) and optional count rounding
  • Support for continuous, categorical, and binary variable types
  • DEFAULT_CLIF_VARIABLES for common ICU characteristics
  • CSV and JSON export with metadata
  • Comprehensive test suite
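
To illustrate how the three variable types might be summarized, here is a minimal standard-library sketch. The function names and the choice of mean (SD) for continuous variables are assumptions for illustration only; they are not taken from this PR's code, and `TableOneGenerator` may compute different statistics.

```python
from collections import Counter
from statistics import mean, stdev
from typing import Dict, Optional, Sequence


def summarize_continuous(values: Sequence[Optional[float]]) -> str:
    """Format mean (SD) of the non-missing values (assumed statistic)."""
    present = [v for v in values if v is not None]
    if len(present) < 2:
        # A sample standard deviation needs at least two observations.
        return "NA"
    return f"{mean(present):.1f} ({stdev(present):.1f})"


def summarize_categorical(values: Sequence[Optional[str]]) -> Dict[str, int]:
    """Count non-missing observations per category level."""
    return dict(Counter(v for v in values if v is not None))


def summarize_binary(values: Sequence[Optional[int]]) -> int:
    """Count observations where the flag is truthy; missing counts as absent."""
    return sum(1 for v in values if v)
```

Raw counts produced this way would still need to pass through the privacy step before being shared.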

Privacy Protection

  • Cell counts below threshold are suppressed (displayed as <10)
  • Optional count rounding to nearest 5 or 10
  • Percentage suppression when underlying count is suppressed
  • Metadata tracks privacy settings for audit
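
The rules above can be sketched as follows. This is a hedged, standalone illustration, not the code in this PR: `protect_count` and `format_cell` are hypothetical helpers, and three behaviors are assumptions (the suppression label follows the configured threshold, ties round up, and percentages are computed from the rounded count).

```python
from typing import Optional, Tuple


def protect_count(count: int, min_cell_size: int = 10,
                  round_to: Optional[int] = None) -> Tuple[str, bool]:
    """Return (display_value, suppressed) for a single cell count."""
    if count < min_cell_size:
        # Small cells are never shown exactly.
        return f"<{min_cell_size}", True
    if round_to:
        # Integer arithmetic avoids Python's round-half-to-even behavior.
        count = (count + round_to // 2) // round_to * round_to
    return str(count), False


def format_cell(count: int, total: int, min_cell_size: int = 10,
                round_to: Optional[int] = None) -> str:
    """Format 'n (pct%)', dropping the percentage when n is suppressed."""
    display, suppressed = protect_count(count, min_cell_size, round_to)
    if suppressed or total <= 0:
        return display
    # Assumption: the percentage uses the rounded count so it cannot
    # be used to recover the exact count.
    pct = 100.0 * int(display) / total
    return f"{display} ({pct:.1f}%)"
```

For example, `format_cell(7, 200)` yields only the suppression label, while `format_cell(42, 200, round_to=5)` reports the rounded count with a matching percentage.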

Usage

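from clifpy.utils.table_one import TableOneGenerator, PrivacyConfig

# Suppress cells below 10 and round the remaining counts to the nearest 5
privacy = PrivacyConfig(min_cell_size=10, round_counts_to=5)

# cohort_df holds the site's cohort data and must be loaded beforehand
generator = TableOneGenerator(cohort_df, privacy_config=privacy, site_name="UCMC")
table1 = generator.generate()

# Write the privacy-protected table to disk
generator.to_csv("table_one_UCMC.csv")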
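
For the JSON export, one way to carry the privacy settings alongside the table for audit is sketched below. The `PrivacySettings` dataclass, `build_export` function, and the payload layout are hypothetical stand-ins for illustration; the actual export format in this PR may differ.

```python
import json
from dataclasses import asdict, dataclass
from typing import Dict, Optional


@dataclass
class PrivacySettings:
    """Hypothetical stand-in for the privacy configuration."""
    min_cell_size: int = 10
    round_counts_to: Optional[int] = None


def build_export(rows: Dict[str, str], settings: PrivacySettings,
                 site_name: str) -> str:
    """Serialize already-protected table rows together with audit metadata."""
    payload = {
        "metadata": {
            "site_name": site_name,
            "privacy": asdict(settings),
        },
        "table": rows,
    }
    return json.dumps(payload, indent=2, sort_keys=True)
```

Keeping the settings in the same file as the aggregates lets a coordinating site check which threshold and rounding each contributor applied.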

Why This Matters

This enables sites to generate consistent, privacy-safe Table 1 outputs for federated research where only aggregates are shared.


First PR from Clifford - CLIF AI assistant 🏥

- Add TableOneGenerator class for standardized Table 1 summaries
- Implement PrivacyConfig for cell suppression (min_cell_size) and count rounding
- Support continuous, categorical, and binary variable types
- Include DEFAULT_CLIF_VARIABLES for common ICU characteristics
- Add CSV and JSON export with metadata
- Add comprehensive test suite
- Export via clifpy.utils

This enables sites to generate consistent, privacy-safe Table 1 outputs
for federated research where only aggregates are shared.