This lab walks you through using Google Cloud Composer and creating DAGs.
Duration: 60 minutes
Cloud Composer is a managed service for building, scheduling, and running workflow orchestration pipelines across hybrid and multi-cloud environments. It runs open-source Apache Airflow on the backend, and workflows are expressed as DAGs. A Cloud Composer environment comprises a GKE cluster, a Cloud Storage bucket, a web server (App Engine), and a database (Cloud SQL), along with monitoring and logging features. Although creating a Cloud Composer environment takes 10-20 minutes, it offers plenty of advantages: workflow automation, easy integration with Python libraries, and tight integration with other Google Cloud services such as BigQuery and Dataproc. The only prerequisite to excel with Cloud Composer is proficiency in the Python programming language.
This lab is an incremental version of the Creating a Composer Environment and Navigating the Airflow UI lab. Go through that lab first to explore more about Cloud Composer.
A DAG (Directed Acyclic Graph) is a collection of tasks, with defined dependencies and relationships, that you want to schedule and run. It is written in Python, and, as the name implies, a DAG must never contain a cycle.
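For concreteness, here is a minimal sketch of a DAG file, assuming Airflow 2.3 or later (where EmptyOperator is available); the dag_id, schedule, and start date are illustrative, not part of the lab:

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.empty import EmptyOperator

with DAG(
    dag_id="example_dag",             # hypothetical name
    schedule_interval="@daily",       # run once a day
    start_date=datetime(2023, 1, 1),  # illustrative date
    catchup=False,
) as dag:
    start = EmptyOperator(task_id="start")
    finish = EmptyOperator(task_id="finish")

    # Dependencies only flow forward; no task may depend on itself,
    # directly or indirectly, which is what keeps the graph acyclic.
    start >> finish
```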
A DAG comprises nodes and edges. The nodes are the tasks you want to execute; a node is often an instance of an Operator. Edges express the dependencies between tasks. A DAG can be triggered automatically on a schedule or manually, as many times as you want; each execution of the DAG is recorded as a DagRun, so the number of DagRuns is simply the number of times your DAG has run.
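The sketch below, again assuming Airflow 2.x with the standard BashOperator, shows two nodes joined by one edge; the task ids and shell commands are illustrative:

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="nodes_and_edges",         # hypothetical name
    schedule_interval=None,           # manual triggers only
    start_date=datetime(2023, 1, 1),
) as dag:
    extract = BashOperator(task_id="extract", bash_command="echo extract")
    load = BashOperator(task_id="load", bash_command="echo load")

    # The >> operator draws a directed edge: extract runs before load.
    # Every trigger (from the UI, the CLI, or a schedule) creates a
    # new DagRun instance of this DAG.
    extract >> load
```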
Each task in a DAG can represent almost anything, such as sending data to BigQuery or running a pipeline.
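As one example of a task that sends work to BigQuery, the sketch below uses BigQueryInsertJobOperator from the Google provider package (pre-installed in Cloud Composer images); the dag_id, task_id, and query are hypothetical:

```python
from datetime import datetime

from airflow import DAG
from airflow.providers.google.cloud.operators.bigquery import (
    BigQueryInsertJobOperator,
)

with DAG(
    dag_id="bigquery_example",        # hypothetical name
    schedule_interval=None,
    start_date=datetime(2023, 1, 1),
) as dag:
    # Submits a query job to BigQuery; the SQL is illustrative.
    run_query = BigQueryInsertJobOperator(
        task_id="run_query",
        configuration={
            "query": {
                "query": "SELECT CURRENT_DATE() AS today",
                "useLegacySql": False,
            }
        },
    )
```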
In this lab, you will:
- Create a Cloud Composer environment
- Navigate to the Airflow UI
- Create a DAG that prints Hello World (a sketch follows this list)
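A Hello World DAG like the one this lab builds might look like the following sketch; the dag_id, schedule, and function name are illustrative, and the lab's own version may differ:

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def say_hello():
    # The task's whole job: print a greeting to the task log.
    print("Hello World")


with DAG(
    dag_id="hello_world",             # hypothetical name
    schedule_interval="@once",        # run a single time
    start_date=datetime(2023, 1, 1),
    catchup=False,
) as dag:
    hello = PythonOperator(task_id="say_hello", python_callable=say_hello)
```

To deploy it on Cloud Composer, copy the Python file into the dags/ folder of the environment's Cloud Storage bucket; Airflow picks it up automatically, and you can then watch the run in the Airflow UI.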