Breaking Barriers: Tabular Synthetic Data for All – Accessible, Ubiquitous, and API-Driven
Experience Level: intermediate
Synthetic data is revolutionizing AI and testing. This talk explores how an API-first, open-source SDK enables developers to create high-fidelity, privacy-safe tabular data—better than anonymization, scalable across environments, and ready for real-world use.
timeslot: Monday 7th April 2025, 09:00-10:00, Room D
tags: data
Data access should not be a privilege. Whether you are a student, researcher, or developer, the ability to work with high-quality, structured datasets should be accessible to all.
This talk explores how Python developers can generate high-fidelity synthetic tabular data that mirrors real-world patterns for AI, analytics, and software testing—without privacy risks.
Attendees will learn:
- Why real-world data is often inaccessible and how synthetic data solves this
- Why high-fidelity synthetic tabular data is better than anonymization
- The path from a UI-first product to an API-first, open-source SDK (including lessons learned)
- Live demo: Creating privacy-safe synthetic datasets using the MOSTLY AI SDK
- How to evaluate synthetic data quality
- Practical applications and the road ahead
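To give a flavour of the "evaluate synthetic data quality" point, here is a minimal sketch of one possible fidelity check: comparing the distribution of each categorical column in the real and synthetic data. It does not use the MOSTLY AI SDK or reproduce its metrics. The function names, the "accuracy = 1 - total variation distance" definition, and the toy rows are all assumptions made for illustration.

```python
from collections import Counter


def category_shares(values):
    """Return the relative frequency of each category in a column."""
    counts = Counter(values)
    total = len(values)
    return {category: count / total for category, count in counts.items()}


def total_variation_distance(real_column, synthetic_column):
    """Half the summed absolute difference between two category distributions.

    0.0 means identical marginal distributions, 1.0 means no overlap.
    """
    real = category_shares(real_column)
    synthetic = category_shares(synthetic_column)
    categories = set(real) | set(synthetic)
    return 0.5 * sum(
        abs(real.get(c, 0.0) - synthetic.get(c, 0.0)) for c in categories
    )


def univariate_accuracy(real_rows, synthetic_rows, columns):
    """Per-column score, defined here (as an assumption) as 1 - TVD."""
    return {
        col: 1.0 - total_variation_distance(
            [row[col] for row in real_rows],
            [row[col] for row in synthetic_rows],
        )
        for col in columns
    }


# Toy rows, invented for illustration only.
real_rows = [
    {"plan": "free", "country": "AT"},
    {"plan": "free", "country": "DE"},
    {"plan": "pro", "country": "AT"},
    {"plan": "pro", "country": "AT"},
]
synthetic_rows = [
    {"plan": "free", "country": "AT"},
    {"plan": "pro", "country": "DE"},
    {"plan": "pro", "country": "AT"},
    {"plan": "pro", "country": "DE"},
]

scores = univariate_accuracy(real_rows, synthetic_rows, ["plan", "country"])
for column, score in scores.items():
    print(f"{column}: {score:.2f}")
```

A check like this only covers single-column fidelity. A fuller evaluation would also look at relationships between columns and at privacy, for example whether synthetic rows sit too close to individual real rows.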
Michael Druk
- Software Engineer (surprise?)
- An expat living in Austria for over 8 years
- I enjoy heavy metal music
- Traveling is nice, too
- Languages, as well (not just programming, I promise)
- Sailing is a more recent hobby of mine
