This lab walks you through Cloud BigQuery.
You will create a BigQuery dataset and load CSV data into it.
Region: us-central1
Duration: 60 minutes
BigQuery is a fully managed big data tool for companies that need a cloud-based interactive query service for massive datasets.
BigQuery is not a database; it is a query service.
BigQuery supports SQL queries, which makes it quite user-friendly. It can be accessed from the Console, the CLI, or an SDK. You can query billions of rows: a query takes only seconds to write and seconds to return results.
You can use its REST APIs and get your work done by sending a JSON request.
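As a minimal sketch of what such a JSON request can look like, the Python snippet below builds (but does not send) a call to the `jobs.query` method of the BigQuery v2 REST API. The project ID, the SQL text, and the access token are hypothetical placeholders; in practice the token would come from your own Google Cloud credentials.

```python
import json
import urllib.request

# Public endpoint of the BigQuery v2 REST API.
BASE_URL = "https://bigquery.googleapis.com/bigquery/v2"


def build_query_request(project_id: str, sql: str, access_token: str) -> urllib.request.Request:
    """Build (but do not send) a jobs.query request for the given SQL."""
    body = {
        "query": sql,
        "useLegacySql": False,  # ask for standard SQL rather than legacy SQL
    }
    return urllib.request.Request(
        url=f"{BASE_URL}/projects/{project_id}/queries",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Hypothetical values, for illustration only.
request = build_query_request("my-project", "SELECT 1 AS answer", "ACCESS_TOKEN")
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Passing the request to `urllib.request.urlopen` would actually send it; that step is left out here because it needs a real project and a valid token.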
Let's understand this with the help of an example. Suppose you are a data analyst and you need to analyze a very large volume of data. If you choose a traditional tool like MySQL, you first need infrastructure in place that can store all of this data.
Designing this infrastructure is itself a difficult task, because you have to decide on RAM size, CPU type, and other configuration details.
With BigQuery, you can focus on analysis rather than on infrastructure. The hardware is completely abstracted away.
BigQuery is mainly for big data analytics. You shouldn't confuse it with an OLTP (Online Transaction Processing) database.
Datasets: Datasets hold one or more tables of data.
Tables: Tables are row-column structures that hold the actual data.
Jobs: Operations that you perform on the data, such as loading data, running queries, or exporting data.
Log in to the GCP Console.
Create a BigQuery dataset.
Create a table.
Load the data from an external CSV file.
Read data from the table using a SQL query.
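The lab itself performs these tasks in the Console. For readers who prefer the CLI, the sketch below shows one way the same steps could be expressed as `bq` commands. It only builds and prints the commands; it does not run them. The project, dataset, table, and file names are hypothetical, and the snippet assumes the CSV has a single header row and that schema auto-detection is acceptable.

```python
import shlex

# All names below are hypothetical placeholders, not values from the lab.
PROJECT = "my-project"
DATASET = "lab_dataset"
TABLE = "lab_table"
CSV_PATH = "data.csv"
LOCATION = "us-central1"  # the region stated for this lab


def lab_commands() -> list[list[str]]:
    """Return bq invocations for: create dataset, load CSV, query table."""
    return [
        # Create the dataset in the lab region.
        ["bq", f"--location={LOCATION}", "mk", "--dataset", f"{PROJECT}:{DATASET}"],
        # Create the table by loading the CSV; skip the (assumed) header row
        # and let BigQuery infer the schema.
        ["bq", "load", "--source_format=CSV", "--skip_leading_rows=1",
         "--autodetect", f"{PROJECT}:{DATASET}.{TABLE}", CSV_PATH],
        # Read a few rows back with standard SQL.
        ["bq", "query", "--use_legacy_sql=false",
         f"SELECT * FROM `{PROJECT}.{DATASET}.{TABLE}` LIMIT 10"],
    ]


for command in lab_commands():
    # shlex.join quotes each argument so the line can be pasted into a shell.
    print(shlex.join(command))
```

Each printed line can be pasted into a shell where the `bq` tool is installed and authenticated, or passed as a list to `subprocess.run`.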