This lab walks you through using Google Cloud Composer.
Duration: 60 minutes
Cloud Composer is a fully managed workflow orchestration service: Google takes care of the underlying infrastructure, and you just have to create your DAGs. With it you can create, monitor, and manage workflows. Cloud Composer is built on the popular open-source project Apache Airflow, and workflows are written in the Python programming language.
Cloud Composer helps you create Airflow environments with no installation or management overhead, although provisioning an environment takes roughly 10 minutes. You can focus on your workflows instead of your infrastructure.
Behind the scenes, Cloud Composer uses the Google Kubernetes Engine (GKE) service to create, manage, and delete the environment clusters where Airflow components run. These clusters are fully managed by Cloud Composer.
Along with the cluster, Cloud Composer also automatically creates a Cloud Storage bucket where your DAGs, logs, and other code are stored.
Before even creating your DAGs or workflows, you need to create an environment. Airflow depends on many microservices to run, so Cloud Composer provisions Google Cloud components and services to run your workflows. These components are collectively known as a Cloud Composer environment. You can create one or more environments in a single Google Cloud project.
GKE cluster: The Airflow schedulers, workers, and Redis Queue run as GKE workloads on a single cluster, and are responsible for processing and executing DAGs. The cluster also hosts other Cloud Composer components like Composer Agent and Airflow Monitoring, which help manage the Cloud Composer environment, gather logs to store in Cloud Logging, and gather metrics to upload to Cloud Monitoring.
Web server: The web server runs the Apache Airflow web interface, and Identity-Aware Proxy protects the interface. For more information, see Airflow Web Interface.
Database: The database holds the Apache Airflow metadata.
Cloud Storage bucket: Cloud Composer associates a Cloud Storage bucket with the environment. The associated bucket stores the DAGs, logs, custom plugins, and data for the environment. For more information about the storage bucket for Cloud Composer, see Data Stored in Cloud Storage.
Google offers this service in the following two environment versions:
Cloud Composer 1 environments are zonal.
Cloud Composer 2 environments have a zonal Airflow metadata database and a regional Airflow scheduling and execution layer. Airflow schedulers, workers, and web servers run in the Airflow execution layer.
Apache Airflow is an open-source platform for programmatically creating, scheduling, and monitoring workflows. Its rich user interface makes it easy to visualize pipelines running in production, monitor progress, and troubleshoot issues when needed.
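The workflows Airflow runs are DAGs: tasks ordered by their dependencies, with no cycles. As a plain-Python illustration of that idea (not Airflow code; the task names are made up), a tiny dependency graph can be resolved with a topological sort from the standard library:

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# A toy extract-transform-load workflow: each task maps to the set of
# tasks that must finish before it can start.
workflow = {
    "transform": {"extract"},   # transform depends on extract
    "load": {"transform"},      # load depends on transform
    "extract": set(),           # no upstream dependencies
}

# static_order() yields tasks so that every dependency comes first.
order = list(TopologicalSorter(workflow).static_order())
print(order)  # → ['extract', 'transform', 'load']
```

Airflow's scheduler applies the same principle at scale: it only queues a task once all of its upstream tasks have succeeded.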
Creating a Cloud Composer environment.
Navigating to the Airflow UI.