Building Data Pipelines with Python
Understanding Pipeline Frameworks, Workflow Automation, and Python Toolsets
By Katharine Jarmul
Publisher: O'Reilly Media
Final Release Date: November 2016
Run time: 3 hours 40 minutes

This course shows you how to build data pipelines and automate workflows using Python 3. From simple task-based messaging queues to complex frameworks like Luigi and Airflow, the course delivers the essential knowledge you need to develop your own automation solutions. You'll learn the architecture basics, and receive an introduction to a wide variety of the most popular frameworks and tools.

Designed for working data professionals who are new to data pipelines and distributed solutions, the course requires intermediate-level Python experience and the ability to manage your own system setup.

  • Acquire a practical understanding of how to approach data pipelining using Python toolsets
  • Master the ability to determine when a Python framework is appropriate for a project
  • Understand workflow concepts like directed acyclic graphs, producers, and consumers
  • Learn to integrate data flows into pipelines, workflows, and task-based automation solutions
  • Understand how to parallelize data analysis, both locally and in a distributed cluster
  • Practice writing simple data tests using property-based testing
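
As a rough illustration of the directed acyclic graph idea mentioned in the list above (not taken from the course itself), the sketch below uses Python's standard-library graphlib module (Python 3.9+) to derive a run order for a few tasks. The task names and the run_order helper are hypothetical, made up for this example.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical pipeline: each task maps to the set of tasks it depends on.
dag = {
    "extract": set(),
    "clean": {"extract"},
    "aggregate": {"clean"},
    "report": {"aggregate", "clean"},
}


def run_order(graph):
    """Return one valid execution order in which dependencies come first."""
    return list(TopologicalSorter(graph).static_order())


order = run_order(dag)
print(order)

# A graph with a cycle is not a DAG, so no run order exists for it.
try:
    run_order({"a": {"b"}, "b": {"a"}})
    cycle_detected = False
except CycleError:
    cycle_detected = True
print("cycle detected:", cycle_detected)
```

Workflow frameworks such as Luigi and Airflow rest on the same principle: tasks declare what they depend on, and the scheduler works out when each one can run.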

Katharine (AKA Kjam) Jarmul is a Python developer, data consultant, and educator who has worked with Python since 2008. Kjam runs kjamistan UG, a Python consulting, training, and competitive analysis company based in Berlin, Germany. She is the author of several O'Reilly titles, including Data Wrangling with Python: Tips and Tools to Make Your Life Easier. She holds an M.A. from American University and an M.S. from Pace University.

Customer Reviews


So timely

By Phil

from Boston, MA

Verified Reviewer


  • Accurate


    Best Uses

    • Intermediate
    • Student

    Comments about O'Reilly's Building Data Pipelines with Python:

    Wanting a good primer on Celery is what drew me to Katharine's work. I got the Celery tutorial, then a firehose of other pipeline tools and even a dose of Ansible.

    Whether you are an architect or a student working on a project, this video is great for quickly seeing different Python data flow solutions at work and getting a slightly hands-on understanding of the messaging and workflow frameworks in the Python ecosystem.

    Video:  $69.99
    (Streaming, Downloadable)