Algorithm to generate synthetic tabular data such as baseline clinical trial data.

mdsol/Simulants

Simulants

To address privacy concerns around patient data and to make it possible to share clinical trial data with other organizations, we have built a system that synthesizes patient data and cross-validates the synthetic data against the real data using standard statistical techniques and machine learning algorithms. The code consists of a set of libraries for loading sample data from the UCI repository, preprocessing it, and using it to synthesize a new set of patients.

A sample dataset is downloaded from the UCI Machine Learning Repository at: https://archive.ics.uci.edu/ml/datasets/Heart+Disease
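
To give a rough idea of what checking synthetic data against real data can look like, here is a minimal sketch that compares per-column means and standard deviations. It is an illustration only and not the method implemented in this repository: the function name `compare_columns`, the column name `age`, and the toy values are all assumptions made up for the example.

```python
import random
import statistics


def compare_columns(real, synthetic):
    """Return, per column, the absolute gap in mean and standard deviation.

    Both arguments are dicts mapping a column name to a list of numbers.
    Hypothetical helper for illustration; not part of Simulants.
    """
    report = {}
    for name in real:
        real_values = real[name]
        synth_values = synthetic[name]
        report[name] = {
            "mean_diff": abs(
                statistics.mean(real_values) - statistics.mean(synth_values)
            ),
            "stdev_diff": abs(
                statistics.stdev(real_values) - statistics.stdev(synth_values)
            ),
        }
    return report


# Toy data, made up for the example (not taken from the UCI dataset).
rng = random.Random(0)
real = {"age": [rng.gauss(54, 9) for _ in range(200)]}
synthetic = {"age": [rng.gauss(54, 9) for _ in range(200)]}

report = compare_columns(real, synthetic)
for column, gaps in report.items():
    print(column, gaps)
```

Small gaps suggest the synthetic column tracks the real one on these two summary statistics. A fuller comparison would also look at correlations between columns and at how models trained on one dataset perform on the other.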

Prerequisites

Use Python 3.8 or later.

All the required packages are specified in requirements.txt.

pip install -r requirements.txt

Usage

  1. Modify uci_config.py, or use it as is to run the sample UCI heart disease dataset.

  2. Run python uci_demo.py

  3. The outputs, including the synthesized data and the cross-validation results, are written to output_uci/
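
The steps above can also be scripted. The sketch below runs the demo from Python and lists whatever files end up in the output directory. The helper name `run_demo` and its parameters are hypothetical. The README does not say which files appear in output_uci/, so the function simply returns whatever it finds.

```python
import subprocess
import sys
from pathlib import Path


def run_demo(repo_dir=".", script="uci_demo.py", output_dir="output_uci"):
    """Run the demo script inside repo_dir and return the output file names.

    Hypothetical convenience wrapper; it assumes the script writes its
    results into output_dir relative to repo_dir, as the usage steps describe.
    """
    repo = Path(repo_dir)
    # Use the current interpreter so the installed requirements are visible.
    subprocess.run([sys.executable, script], cwd=repo, check=True)
    out = repo / output_dir
    if not out.is_dir():
        return []
    return sorted(p.name for p in out.iterdir())


# Example call from the repository root:
# files = run_demo()
```

With `check=True`, a failing demo run raises `subprocess.CalledProcessError` rather than silently returning a stale file list.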

Contributing

See CONTRIBUTING.

Contributors

Jacob Aptekar (Medidata Solutions)

Mandis Beigi (Medidata Solutions)

Pierre-Louis Bourlon (Medidata Solutions)

Jason Mezey (Cornell University)

Afrah Shafquat (Medidata Solutions)

Contact

See the factbook.

Mandis Beigi at AcornAI (Medidata Solutions Inc., a Dassault Systemes Company)

[email protected]
