Breaking Barriers: Tabular Synthetic Data for All – Accessible, Ubiquitous, and API-Driven

Experience Level: intermediate

Synthetic data is changing how teams build AI systems and test software. This talk explores how an API-first, open-source SDK enables developers to create high-fidelity, privacy-safe tabular data that is a better alternative to anonymization, scalable across environments, and ready for real-world use.


timeslot: Monday 7th April 2025, 09:00-10:00, Room D
tags: data

Data access should not be a privilege. Whether you are a student, researcher, or developer, the ability to work with high-quality, structured datasets should be accessible to all.
This talk explores how Python developers can generate high-fidelity synthetic tabular data that mirrors real-world patterns for AI, analytics, and software testing, without the privacy risks of working with real records.
Attendees will learn:
  • Why real-world data is often inaccessible and how synthetic data solves this
  • Why high-fidelity synthetic tabular data is better than anonymization
  • The path from a UI-first product to an API-first, open-source SDK (including lessons learned)
  • Live demo: Creating privacy-safe synthetic datasets using the MOSTLY AI SDK
  • How to evaluate synthetic data quality
  • Practical applications and the road ahead
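
As a rough illustration of the "evaluate synthetic data quality" point above, one simple fidelity check compares how often each category appears in a real column versus its synthetic counterpart. The sketch below uses only the Python standard library and computes the total variation distance between the two distributions. The function name and the toy country codes are hypothetical, and this is not necessarily how the MOSTLY AI SDK measures quality; it only shows the general idea.

```python
from collections import Counter


def total_variation_distance(real_values, synthetic_values):
    """Total variation distance between two categorical samples.

    0.0 means the category frequencies match exactly;
    1.0 means the two samples share no categories at all.
    """
    if not real_values or not synthetic_values:
        raise ValueError("both samples must be non-empty")

    real_counts = Counter(real_values)
    synthetic_counts = Counter(synthetic_values)
    categories = set(real_counts) | set(synthetic_counts)

    # Counter returns 0 for categories missing from one of the samples.
    return 0.5 * sum(
        abs(
            real_counts[c] / len(real_values)
            - synthetic_counts[c] / len(synthetic_values)
        )
        for c in categories
    )


# Toy example (made-up values): one column from a real and a synthetic table.
real_column = ["AT", "AT", "DE", "DE", "DE", "CH"]
synthetic_column = ["AT", "DE", "DE", "DE", "CH", "CH"]

tvd = total_variation_distance(real_column, synthetic_column)
print(f"Total variation distance: {tvd:.3f}")
```

A value close to 0 suggests the synthetic column reproduces the real column's category frequencies. A fuller evaluation would also look at numeric columns, correlations between columns, and privacy metrics.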

Michael Druk
  • Software Engineer (surprise?)
  • An expat living in Austria for over 8 years
  • I enjoy heavy metal music
  • Traveling is nice, too
  • Languages, as well (not just programming, I promise)
  • Sailing is a more recent hobby of mine