Road to Snowflake SnowPro Core Certification: Clustering
Fifth Chapter: Clustering
In this chapter, we will look at one of the most critical concepts for optimizing tables in Snowflake: clustering. We will discuss the following points.
Remember that all the chapters of the course can be found at the following link.
DATA CLUSTERING
Data stored in tables is typically sorted along natural dimensions, such as date. This ordering is called clustering. Data that is not sorted (clustered) may hurt query performance, particularly on very large tables, because Snowflake has to scan more micro-partitions to return a result. Consider an example in which the micro-partitions are ordered by date. If we query for the date 11/2, Snowflake does not need to scan the last two micro-partitions, which improves the performance of the query.
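To make the pruning idea concrete, here is a minimal Python sketch of it. This is a toy model, not Snowflake's actual implementation: it assumes each micro-partition is described only by the minimum and maximum date it contains, and the `MicroPartition` class, the `partitions_to_scan` helper, the partition names, and the year in the dates are all hypothetical, chosen for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class MicroPartition:
    """Toy stand-in for a micro-partition: only its min/max date metadata."""
    name: str
    min_date: date
    max_date: date


def partitions_to_scan(partitions, target):
    """Return the partitions whose [min_date, max_date] range could contain target."""
    return [p for p in partitions if p.min_date <= target <= p.max_date]


# Hypothetical clustered layout: four partitions ordered by date.
clustered = [
    MicroPartition("mp1", date(2022, 11, 1), date(2022, 11, 2)),
    MicroPartition("mp2", date(2022, 11, 2), date(2022, 11, 3)),
    MicroPartition("mp3", date(2022, 11, 3), date(2022, 11, 4)),
    MicroPartition("mp4", date(2022, 11, 4), date(2022, 11, 5)),
]

# Hypothetical unclustered layout: the same dates are scattered everywhere,
# so every partition's min/max range spans the whole period.
unclustered = [
    MicroPartition(f"mp{i}", date(2022, 11, 1), date(2022, 11, 5))
    for i in range(1, 5)
]

target = date(2022, 11, 2)
clustered_hits = [p.name for p in partitions_to_scan(clustered, target)]
unclustered_hits = [p.name for p in partitions_to_scan(unclustered, target)]

print("clustered:", clustered_hits)
print("unclustered:", unclustered_hits)
```

In the clustered layout, only the first two partitions can contain 11/2, so the last two are skipped. In the unclustered layout, no partition can be ruled out from its metadata alone, so all four have to be scanned.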