Road to Snowflake SnowPro Core Certification: Clustering

Fifth Chapter: Clustering

Gonzalo Fernandez Plaza

Fifth Chapter of the Snowflake SnowPro Core Certification Complete Course.

In this chapter, we will look at one of the most critical concepts for optimizing tables in Snowflake: clustering. We will discuss the following points:

  1. Data Clustering
  2. Clustering Depth
  3. Cluster Keys
  4. Re-clustering
  5. Typical Exam Questions about Clustering

Remember that all the chapters from the course can be found in the following link.

DATA CLUSTERING

Data stored in tables is typically sorted along natural dimensions, for example, by date. This ordering is called clustering. Data that is not sorted/clustered can hurt query performance, particularly on very large tables, because Snowflake has to scan more micro-partitions to return a result. Let’s look at the following example, where the micro-partitions are ordered by date. In this case, if we queried for the date 11/2, Snowflake would skip the last two micro-partitions, which improves the performance of the query.

Clustering example (via docs.snowflake.com).
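To make the pruning idea above more concrete, here is a minimal Python sketch of a toy model. It is not Snowflake's actual implementation. It assumes each micro-partition keeps only the minimum and maximum date of its rows, and that a partition can be skipped when the queried date falls outside that range. The `MicroPartition` class, the `partitions_to_scan` helper, the partition names, and the year 2023 are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class MicroPartition:
    """Toy stand-in for a micro-partition: only min/max date metadata (assumption)."""
    name: str
    min_date: date
    max_date: date


def partitions_to_scan(partitions, target):
    """Return the partitions whose [min_date, max_date] range could contain target."""
    return [p for p in partitions if p.min_date <= target <= p.max_date]


# Hypothetical layout: four micro-partitions, well clustered by date.
clustered = [
    MicroPartition("mp1", date(2023, 11, 2), date(2023, 11, 2)),
    MicroPartition("mp2", date(2023, 11, 2), date(2023, 11, 3)),
    MicroPartition("mp3", date(2023, 11, 3), date(2023, 11, 4)),
    MicroPartition("mp4", date(2023, 11, 4), date(2023, 11, 5)),
]

# Same table, but rows landed in no particular order,
# so every partition spans the full date range.
unclustered = [
    MicroPartition(f"mp{i}", date(2023, 11, 2), date(2023, 11, 5))
    for i in range(1, 5)
]

target = date(2023, 11, 2)
scanned_clustered = [p.name for p in partitions_to_scan(clustered, target)]
scanned_unclustered = [p.name for p in partitions_to_scan(unclustered, target)]

print("clustered:  ", scanned_clustered)    # only the first two partitions
print("unclustered:", scanned_unclustered)  # all four partitions
```

In the clustered layout, only the first two partitions can contain 11/2, so the last two are skipped. In the unclustered layout, every partition's range overlaps the date, so nothing can be pruned.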

Gonzalo Fernandez Plaza

Computer Science Engineer & Tech Lead 🖥️. Publishing AWS & Snowflake ❄️ courses & exams. https://www.fullcertified.com