📝 Notes taken by Abd @ITNA Digital
Keywords: Workflow Orchestration
A data pipeline essentially takes data in, converts/processes it, and outputs new data, like the ingest_data.py script we wrote last week.
Our ingest_data.py
wasn't the best data pipeline, because it combined everything into one script: if one step fails, the entire pipeline fails.
It is better to split the process into multiple scripts, with checks between steps that confirm the previous step executed successfully. Sequential execution of multiple scripts and processes, try/except clauses around each step, and similar safeguards should all be considered when designing the pipeline.
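As a minimal sketch of that idea, the runner below executes hypothetical steps in order, passes each step's output to the next, and stops with a clear report as soon as one step fails instead of crashing opaquely. The `download`/`ingest` functions are placeholder stand-ins for the real wget and ingest_data.py steps, not the course's actual code.

```python
def run_pipeline(steps):
    """Run (name, function) steps in order; each step receives the
    previous step's output. Stop and report the failing step instead
    of letting one error take down the whole run opaquely."""
    data = None
    for name, step in steps:
        try:
            data = step(data)
        except Exception as exc:
            return {"ok": False, "failed_step": name, "error": str(exc)}
    return {"ok": True, "result": data}

# Hypothetical stand-ins for the real download and ingest scripts.
def download(_):
    return "col1,col2\n1,2"          # pretend wget fetched a small CSV

def ingest(raw_csv):
    rows = raw_csv.splitlines()[1:]  # skip the header row
    return len(rows)                 # pretend we loaded the rows into a DB

result = run_pipeline([("download", download), ("ingest", ingest)])
print(result)
```

In a real pipeline each step would be its own script or function, but the shape is the same: a named sequence, a check after every step, and an explicit failure report.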
MAKE
Executes scripts sequentially, e.g. first wget,
then ingest_data.py
. Best suited for smaller workflows.
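A minimal sketch of what such a Makefile might look like; the URL, filenames, and the `--input` flag are placeholders, not the course's actual commands. Each target lists the previous step's output as a prerequisite, so Make runs the steps in order and skips the download if the file already exists.

```make
# Hypothetical Makefile: `make ingest` runs wget first, but only
# if data.csv.gz is not already present.

data.csv.gz:
	wget -q https://example.com/data.csv.gz -O data.csv.gz

ingest: data.csv.gz
	python ingest_data.py --input data.csv.gz

.PHONY: ingest
```

This dependency-driven sequencing is exactly the "proper checks between steps" idea above, handled by Make instead of hand-written glue code.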