Common patterns for EMR workflows?

virajparekh · December 19, 2018, 7:09pm

What are the common patterns to follow when writing external compute workflows?

virajparekh · December 19, 2018, 7:09pm

In this mockup, the job runs once a set of files arrives. If the “condition check” passes, the workflow proceeds to send a report.

If it doesn’t pass, we’ll usually see some sort of task/DAG triggered to handle that scenario (this could be anything from sending a notification to reloading source data into S3).

If it does, some sort of notification is usually sent out (our customers tend to use Slack quite liberally).

Finally, the cluster spins down.

This is just a mockup, and some customers have several levels of validation depending on the use case, but the overall structure is a good starting point.
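
To make the branching concrete, here is a rough sketch of the flow described above in plain Python rather than as an actual DAG. Everything in it is hypothetical: the `WorkflowHooks` names, the `demo` helper, and the "row count greater than zero" check are stand-ins I made up for illustration, and the explicit `start_cluster` step is an assumption (the description only mentions the cluster spinning down). In a real DAG each callable would be a task such as a sensor, an EMR step, or a Slack notification.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class WorkflowHooks:
    """Hypothetical callables standing in for real tasks."""
    files_arrived: Callable[[], bool]        # e.g. a sensor on the input files
    start_cluster: Callable[[], str]         # assumed step; returns a cluster id
    run_job: Callable[[str], dict]           # the external compute job
    condition_check: Callable[[dict], bool]  # validation of the job output
    send_report: Callable[[dict], None]
    notify: Callable[[str], None]            # e.g. a Slack message
    handle_failure: Callable[[dict], None]   # e.g. alert or reload source data
    stop_cluster: Callable[[str], None]


def run_workflow(hooks: WorkflowHooks) -> str:
    """Run one pass of the pattern and return which branch was taken."""
    if not hooks.files_arrived():
        return "waiting"
    cluster_id = hooks.start_cluster()
    try:
        result = hooks.run_job(cluster_id)
        if hooks.condition_check(result):
            hooks.send_report(result)
            hooks.notify("condition check passed, report sent")
            return "reported"
        hooks.handle_failure(result)
        return "remediated"
    finally:
        # The cluster spins down whichever branch ran, even on an exception.
        hooks.stop_cluster(cluster_id)


def demo(row_count: int) -> List[str]:
    """Wire up fake hooks that record the order in which steps run."""
    events: List[str] = []
    hooks = WorkflowHooks(
        files_arrived=lambda: True,
        start_cluster=lambda: events.append("start_cluster") or "cluster-1",
        run_job=lambda cluster_id: {"rows": row_count},
        condition_check=lambda result: result["rows"] > 0,
        send_report=lambda result: events.append("send_report"),
        notify=lambda message: events.append("notify"),
        handle_failure=lambda result: events.append("handle_failure"),
        stop_cluster=lambda cluster_id: events.append("stop_cluster"),
    )
    events.append(run_workflow(hooks))
    return events
```

Calling `demo(10)` takes the report branch and `demo(0)` takes the failure-handling branch. The `try`/`finally` is the part worth carrying over: the spin-down step should run regardless of which branch executed, which in Airflow terms usually means giving the teardown task a trigger rule that does not require every upstream task to succeed.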
