📝 Notes taken by Abd @ITNA Digital

Links

🔗 Link to the Video

Keywords

Workflow Orchestration

Table of Contents


Data Pipeline

A data pipeline essentially takes data in, converts or processes it, and outputs new data, like the ingest_data.py script we wrote last week.

Our ingest_data.py wasn’t the best data pipeline, as it combined everything into one script: if one step fails, the entire pipeline fails.

It is best to split the process into multiple scripts, with proper checks that confirm the previous step executed successfully. Sequential running, execution of multiple scripts and processes, try/except clauses, and similar safeguards should all be considered when designing a data pipeline.
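A minimal sketch of this idea, assuming hypothetical file names (output.csv, clean.csv) and toy step functions: each step checks that the previous one produced its output, and failures are caught per step instead of crashing one monolithic script.

```python
# Hypothetical sketch: a monolithic ingest split into steps, where each
# step verifies the previous step's output before running.
from pathlib import Path

RAW_FILE = Path("output.csv")    # assumed download target
CLEAN_FILE = Path("clean.csv")   # assumed intermediate file


def download():
    # placeholder for the wget step
    RAW_FILE.write_text("id,value\n1,10\n2,20\n")


def transform():
    # refuse to run unless the download step actually produced its file
    if not RAW_FILE.exists():
        raise FileNotFoundError(f"{RAW_FILE} missing: run download() first")
    rows = RAW_FILE.read_text().splitlines()
    CLEAN_FILE.write_text("\n".join(rows[1:]))  # e.g. drop the header row


def main():
    # run the steps sequentially, reporting which step broke on failure
    for step in (download, transform):
        try:
            step()
            print(f"{step.__name__}: ok")
        except Exception as exc:
            print(f"{step.__name__}: failed ({exc})")
            break


main()
```

The point is not the toy logic but the shape: named steps, explicit preconditions, and try/except around each one so a failure is localized and reported.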


Make executes scripts sequentially, e.g. first wget and then ingest_data.py. It is used for smaller workflows.
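A hedged Makefile sketch of that wget-then-ingest sequence; the URL, file name, and the --csv flag on ingest_data.py are assumptions for illustration.

```makefile
# Hypothetical Makefile: `make ingest` downloads data.csv first if it is
# missing, then runs the ingest script, so the steps always run in order.
DATA_URL = https://example.com/data.csv   # placeholder URL

data.csv:
	wget -O data.csv $(DATA_URL)

ingest: data.csv
	python ingest_data.py --csv data.csv

.PHONY: ingest
```

Make's dependency rules give us the "check that the previous step ran" behavior for free: the ingest target only runs after data.csv exists.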

Slightly more complex pipeline aka Data Workflow
