DP Synthetic Data #464

Open
mccalluc opened this issue May 30, 2025 · 0 comments

Requires implementation in the OpenDP library. (I asked, and we are not interested in providing a wrapper around SmartNoise or MostlyAI.)

Other notes from meeting:

  • Add “Synthetic Data” to analysis choices for a given column.
  • User specifies a list of related columns to synthesize... and that's probably it.
    • Number of rows in output should be inferred from the DP count of the input
    • Dependencies between columns should be inferred by the system: Don't prompt the user for this.
  • Supporting only float values is fine for the MVP.
  • Outputs
    • New synthetic CSV
    • Marginals are probably free?
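
To make the notes above concrete, here is a minimal sketch of one possible shape for the feature. It is not the OpenDP API and not a proposed design: the function names (`dp_count`, `dp_marginal`, `synthesize`), the even budget split, the fixed bin count, and the per-column public bounds are all assumptions made for illustration. It also samples each column independently, so it does not capture the dependencies between columns that the real implementation is supposed to infer, and its floating-point Laplace noise is illustrative rather than production-safe.

```python
import random


def laplace(scale, rng):
    # The difference of two i.i.d. exponentials is Laplace(0, scale).
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def dp_count(n_rows, epsilon, rng):
    # A row count has sensitivity 1 when neighbors differ by one added/removed row.
    return max(0, round(n_rows + laplace(1 / epsilon, rng)))


def dp_marginal(values, lower, upper, n_bins, epsilon, rng):
    # Noisy histogram over fixed bin edges derived from public bounds (assumed given).
    width = (upper - lower) / n_bins
    counts = [0] * n_bins
    for v in values:
        v = min(max(v, lower), upper)  # clamp to the public bounds
        counts[min(int((v - lower) / width), n_bins - 1)] += 1
    # Negative noisy counts are clipped to zero (post-processing).
    return [max(0.0, c + laplace(1 / epsilon, rng)) for c in counts]


def synthesize(columns, bounds, epsilon, n_bins=10, seed=None):
    """columns: {name: [float, ...]}, bounds: {name: (lower, upper)}."""
    rng = random.Random(seed)
    # Assumption: split the budget evenly, one share for the count, one per column.
    share = epsilon / (len(columns) + 1)
    first = next(iter(columns.values()))
    # Output size comes from the DP count, not the true row count.
    n_out = dp_count(len(first), share, rng)
    marginals, synthetic = {}, {}
    for name, values in columns.items():
        lower, upper = bounds[name]
        weights = dp_marginal(values, lower, upper, n_bins, share, rng)
        marginals[name] = weights  # released alongside the synthetic data
        if sum(weights) == 0:
            weights = [1.0] * n_bins  # fall back to uniform if all mass was clipped
        width = (upper - lower) / n_bins
        bins = rng.choices(range(n_bins), weights=weights, k=n_out)
        # Draw a uniform float inside each sampled bin.
        synthetic[name] = [lower + (b + rng.random()) * width for b in bins]
    return synthetic, marginals
```

In this sketch the noisy marginals are computed once and then reused to draw the synthetic rows, which is one reading of "marginals are probably free": releasing them costs no privacy budget beyond what the synthesis already spent. The `synthetic` dict could then be written out with the standard `csv` module to produce the new CSV.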
@mccalluc mccalluc added this to the v0.6 milestone May 30, 2025
@github-project-automation github-project-automation bot moved this to Pending in DP Wizard May 30, 2025