Road to Snowflake SnowPro Core Certification: Data Loading

Eighth Chapter: Data Loading

Gonzalo Fernandez Plaza

--

Eighth Chapter of the Snowflake SnowPro Core Certification Complete Course.
Eighth Chapter of the Snowflake SnowPro Core Certification Complete Course.

Hello everybody. In this chapter, we are going to study the different ways to COPY data into Snowflake tables, either using bulk load or continuous load with Snowpipe:

  1. Data Loading
  2. Bulk load using COPY INTO
  3. Continuous load using Snowpipe
  4. Typical Exam Questions

Remember that all the chapters from the course can be found in the following link.

DATA LOADING

When we load the data into Snowflake, we usually perform the following steps:

  • Source system → Snowflake Stage → Snowflake Table.

A source system (it can be, for example, your application, or not even that, it can be whatever machine that generates data, like sensors) will generate data that it will send to a stage, for example, AWS S3. Once the data is at any stage, we’ll copy it into Snowflake tables. There are two different ways to do it, Bulk Load or Continuous Load; let’s study the differences.

Data Loading Steps in Snowflake.
Data Loading Steps in Snowflake.

BULK LOAD

Bulk load is the process of loading batches of data from files already available at any stage into Snowflake tables. We use the COPY command to do that (it consumes credits from the virtual warehouses). It supports data transformation while loading, using column reordering, column omission, casting… Let’s see how the COPY INTO command works.

COPY INTO

Using this command, you can load data from staged files to an existing table. It copies the data into tables. It works for any stage (internal, external, table, and user stages) or even from an external location like S3, Azure… Some factors affect the loading time, like the physical location of the stage, GZIP compression efficiency (files are automatically compressed using gzip unless compression is explicitly disabled), or the number and types of columns.

--

--

Gonzalo Fernandez Plaza

Computer Science Engineer & Tech Lead 🖥️. Publishing AWS & Snowflake ❄️ courses & exams. https://www.fullcertified.com