Practical Python data pipeline design

Experience Level: intermediate

How can you improve your data pipelines? What standards should you enforce in your project? What tools should you use? This talk is all about best practices and practical tips for building reliable, scalable and maintainable data pipelines in Python!


  • Timeslot: Sunday, 6th April 2025, 11:00-12:00, Room D
  • Tags: Data

Whether you are working on your new AI project or just trying to copy stuff into your Postgres database, you are building a data pipeline, and you might be making your life harder by doing it wrong! There are many patterns you can follow to simplify this process, and many tools that can take away hours or even days of work, all while saving you a lot of money in storage and compute costs.
In this talk I want to show you practical tips for building data pipelines. We will start by looking at the building blocks of any pipeline, sources and destinations: what key features should you look at to understand the requirements and constraints when dealing with these, and how can you connect all the parts efficiently?
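The source-and-destination framing can be sketched in a few lines of plain Python. This is a minimal, hypothetical example (the generator source and in-memory SQLite destination are illustrative choices, not tools from the talk): the source yields records lazily, and the destination consumes them in one pass.

```python
import sqlite3

def extract_users():
    """Source: yield records one at a time so nothing is held in memory."""
    for i in range(3):
        yield {"id": i, "name": f"user_{i}"}

def load_users(records, conn):
    """Destination: write an iterable of records into a SQLite table."""
    conn.execute("CREATE TABLE IF NOT EXISTS users (id INTEGER, name TEXT)")
    conn.executemany(
        "INSERT INTO users (id, name) VALUES (:id, :name)", records
    )
    conn.commit()

conn = sqlite3.connect(":memory:")
load_users(extract_users(), conn)
print(conn.execute("SELECT COUNT(*) FROM users").fetchone()[0])  # → 3
```

Because the source is a generator and `executemany` accepts any iterable, records stream from source to destination without materializing the whole dataset.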
It is very important that we consider data quality tests, and I want to walk you through some practical examples where you will see not only how you can implement these tests, but where it is most critical to have them.
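To make the idea concrete, here is a dependency-free sketch of a data quality check (the talk covers dedicated tools for this; the `check_not_null` helper and the sample rows below are my own illustration of the pattern):

```python
def check_not_null(records, column):
    """Report how many records are missing a required value."""
    bad = [r for r in records if r.get(column) is None]
    return {"check": f"{column} not null", "passed": not bad, "failures": len(bad)}

# Sample rows with one deliberate quality issue.
rows = [{"id": 1, "amount": 10.0}, {"id": 2, "amount": None}]
result = check_not_null(rows, "amount")
print(result)  # → {'check': 'amount not null', 'passed': False, 'failures': 1}
```

Running checks like this right after extraction, before any transformation, is the cheapest place to catch bad input.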
Throughout this talk, we will discuss modern tools like dlt, Great Expectations and structured logging which can make your life a lot easier.
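As a taste of structured logging, the stdlib alone is enough to emit one JSON object per log line, with context attached as fields rather than baked into the message. This is a minimal sketch, not a recommendation from the talk; the `JsonFormatter` class and the `context` attribute are my own naming:

```python
import json
import logging

class JsonFormatter(logging.Formatter):
    """Emit one JSON object per log line so pipeline runs are machine-searchable."""
    def format(self, record):
        return json.dumps({
            "level": record.levelname,
            "event": record.getMessage(),
            **getattr(record, "context", {}),
        })

handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
logger = logging.getLogger("pipeline")
logger.addHandler(handler)
logger.setLevel(logging.INFO)

# Attach structured context to the event instead of formatting it into the message.
logger.info("rows_loaded", extra={"context": {"table": "users", "rows": 3}})
```

Logging events with fields like `table` and `rows` lets you filter and aggregate pipeline runs later instead of grepping free-form text.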
Data engineering does not happen in a vacuum, so we always have to consider the value that our pipelines provide. I want to show you some ways in which really understanding the domain and requirements can shape how you design your pipeline and make everyone happier with the results.

Darius Chira
Data team lead at Valutico, interested in all things related to Python, data, AI, good food and crazy stories.
Mircea Daarius