BigQuery Time-Partitioning: A Hidden Cost Revealed

Piyush Kalra

Nov 28, 2025

You have done what many teams do: partitioned your BigQuery tables to organize your data and reduce query costs. You expect your bill to drop, yet it keeps climbing. Why? It turns out a tiny detail in your SQL, something as trivial as a CAST(), can silently undo all the savings you worked so hard to set up.

This is common, and it is frustrating, but it is fixable. Let's look at why BigQuery time partitioning behaves this way and how to resolve it quickly.

What Is BigQuery Time-Partitioning and Why Does It Matter?

Before we get to the hidden cost, let's start with the fundamentals. A partitioned table in BigQuery is a table that has been sliced into smaller segments called partitions. When you run a query with a filter on the partition column, BigQuery does not have to scan every partition, only the relevant ones. This process is called partition pruning.

Think of partition pruning as opening a single drawer in a filing cabinet to find a particular file, rather than emptying the entire cabinet and rummaging through everything. By reducing the amount of data scanned, it dramatically improves both query cost and performance.

A time-partitioned table can be created on a column of type DATE, DATETIME, or TIMESTAMP. Here is a Go client example that creates a table partitioned on a DATE column (the dataset and table names are illustrative):

metadata := &bigquery.TableMetadata{
    TimePartitioning: &bigquery.TimePartitioning{
        Field:      "date",              // Partition by the 'date' column
        Expiration: 90 * 24 * time.Hour, // Partitions expire after 90 days
    },
    Schema: sampleSchema,
}

// Create the partitioned table (client, ctx, and sampleSchema are
// assumed to be defined elsewhere).
if err := client.Dataset("dataset").Table("events").Create(ctx, metadata); err != nil {
    log.Fatal(err)
}

Getting this setup right is one of the most important steps for keeping your BigQuery data cheap to query.
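If you prefer to define tables in SQL, here is a minimal DDL sketch for the kind of table used in the rest of this post, partitioned by day on a DATETIME column (the table and column names are illustrative):

CREATE TABLE `project.dataset.events` (
  event_time DATETIME,
  user_id    STRING
)
-- One partition per calendar day of event_time
PARTITION BY DATE(event_time)
-- Partitions older than 90 days are dropped automatically
OPTIONS (partition_expiration_days = 90);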

The Hidden Cost Trap: When Partition Pruning Fails

This is where things get tricky. You might assume that any query with a date filter triggers partition pruning, but that is not always the case. The function you use in the WHERE clause matters a great deal.

Let's look at two queries that appear almost identical. Assume an events table partitioned on a DATETIME column called event_time.

Query 1: The Expensive Way
This query uses CAST() to convert the DATETIME partition column to a DATE.

SELECT
   *
FROM
   `project.dataset.events`
WHERE
   CAST(event_time AS DATE) = '2025-11-12';

Bytes processed: 1.01 GB (Full table scan)

Query 2: The Optimized Way
This query uses the DATE() function instead.

SELECT
   *
FROM
   `project.dataset.events`
WHERE
   DATE(event_time) = '2025-11-12';

Bytes processed: 85 MB (Partition pruned)

The first query, which uses CAST(), scans the whole table. The second, which uses DATE(), scans only the partition for 2025-11-12. That difference in bytes processed translates directly into the cost of your query.

With the DATE(event_time) filter, BigQuery does not scan the entire dataset. It opens only the drawer for 2025-11-12, reads that single partition, and ignores the rest. This is partition pruning at work, and it is arguably one of BigQuery's best cost-optimization features.

However, pruning does not always work the way you expect. When you apply CAST() to a DATETIME partition column, BigQuery cannot 'see' the partitions behind the filter. Pruning fails, the entire table is scanned, and that is exactly what drove up the cost of the first query.

Why Your Partition Filter Might Be Failing in BigQuery

So, why does CAST() break partition pruning while DATE() works? It comes down to how BigQuery's query planner maps filters to partitions.

When you filter on a partition column directly, BigQuery knows exactly which partitions to scan. When you apply a computation to that column, the query planner has extra work to do. For DATETIME columns, CAST(event_time AS DATE) produces a "computed" value that the planner cannot map back to the hidden partition metadata; since it cannot pre-evaluate the expression against the partition boundaries, it falls back to a full table scan to be safe. DATE() on a DATETIME column, by contrast, is an expression the planner does recognize for pruning.

TIMESTAMP columns do not share this problem. When your partition column is a TIMESTAMP, CAST(), DATE(), and EXTRACT(DATE FROM col) all result in partition pruning. That is what makes the DATETIME case so easy to trip over: a filter that is perfectly safe on a TIMESTAMP column silently breaks pruning on a DATETIME one.
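For illustration, assume a second table, events_ts, partitioned on a TIMESTAMP column called event_ts (both names are hypothetical). All three of these filters prune:

SELECT COUNT(*) FROM `project.dataset.events_ts`
WHERE DATE(event_ts) = '2025-11-12';               -- prunes

SELECT COUNT(*) FROM `project.dataset.events_ts`
WHERE CAST(event_ts AS DATE) = '2025-11-12';       -- also prunes

SELECT COUNT(*) FROM `project.dataset.events_ts`
WHERE EXTRACT(DATE FROM event_ts) = '2025-11-12';  -- also prunes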

The True Cost of Ignoring Partition Pruning

The gap between scanning 85 MB and 1.01 GB may look trivial; a 1 GB scan costs well under a cent. But scale it up: at on-demand rates of roughly $6.25 per TiB, a single unpruned scan of a 10 TB table costs over $60, and if that query runs a few times a day, you are looking at thousands of dollars a month.

With GCP BigQuery, cost control is not about occasional one-off fixes; every query you run has an impact. A good starting point is a hard look at the bytes processed and bytes billed figures in the BigQuery UI. That simple habit is usually your first line of defense against hidden costs.

3 Ways to Ensure BigQuery Partition Pruning Works

Now for the solution. Here are three steps you can take to prevent this hidden cost and ensure partition pruning works every time.

1. Use the Right Functions for DATETIME Columns

When your table is partitioned on a DATETIME column, steer clear of CAST(column AS DATE). Use one of these two alternatives instead:

  • DATE(your_datetime_column)

  • EXTRACT(DATE FROM your_datetime_column)

Both functions let BigQuery correctly identify the relevant partitions and prune the rest.
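For example, the EXTRACT form of the earlier optimized query prunes just like the DATE() version:

SELECT
   *
FROM
   `project.dataset.events`
WHERE
   EXTRACT(DATE FROM event_time) = '2025-11-12';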

2. Prefer TIMESTAMP Partition Columns

If the table design is under your control, use a TIMESTAMP column as the partitioning key instead of DATETIME. As noted above, TIMESTAMP columns do not suffer from this problem: partition pruning works with CAST(), DATE(), and EXTRACT() alike. That makes it much harder for a query to accidentally incur excessive cost.
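If you are creating such a table from scratch, a minimal DDL sketch for the events_ts table used above looks like this (names are illustrative):

CREATE TABLE `project.dataset.events_ts` (
  event_ts TIMESTAMP,
  user_id  STRING
)
-- Daily partitions on the TIMESTAMP column
PARTITION BY DATE(event_ts);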

3. Test Pruning Using the Query Plan or Bytes Processed

Don't just assume your queries are optimized. Validate that partition pruning is actually working whenever you write a new query. You can do this by checking the bytes processed on the query results tab, or by inspecting the query execution plan and its scan stages.

Pro Tip: After writing a new query, always run a dry run or check the job statistics. If the data scanned approaches the total size of the table, partition pruning has failed. A check like this takes seconds and can save you thousands of dollars.
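One quick way to review your recent job stats in SQL is the JOBS_BY_USER view (the region qualifier may differ for your project):

-- Bytes scanned by your ten most recent queries
SELECT
 job_id,
 ROUND(total_bytes_processed / POWER(1024, 3), 2) AS gb_scanned,
 query
FROM
 `region-us`.INFORMATION_SCHEMA.JOBS_BY_USER
WHERE
 creation_time > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 DAY)
ORDER BY
 creation_time DESC
LIMIT 10;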

How to Build a Self-Auditing BigQuery Environment

Preventing these issues manually is good, but automating the checks is even better. A solid BigQuery FinOps strategy builds guardrails that stop expensive queries before they ever run.

Below are some suggestions for automation:

  • Enforce Partition Filters: Set the require_partition_filter option to TRUE when defining tables. This forces users to include a WHERE clause on the partition column, although it does not solve the CAST() problem by itself (see the ALTER TABLE example after this list).

  • Scan for Anti-Patterns: Create a Cloud Function or a scheduled routine that automatically inspects your query logs for CAST() applied to a DATETIME partition column.

  • Monitor with INFORMATION_SCHEMA: Use INFORMATION_SCHEMA.JOBS to flag queries that process an unusually high volume of data, and set alerts to notify you whenever one appears.
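For the first guardrail, the option can be enabled on an existing table with a one-line ALTER (table name is illustrative):

-- Reject any query on this table that lacks a filter on the partition column
ALTER TABLE `project.dataset.events`
SET OPTIONS (require_partition_filter = TRUE);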

Here's a simple SQL query you can use to start auditing your infrastructure and flag high-volume queries:

SELECT
 project_id,
 user_email,
 statement_type,
 ROUND(total_bytes_processed / POWER(1024, 3), 2) AS gb_scanned,
 ROUND(total_bytes_processed / POWER(1024, 4), 2) AS tb_scanned,
 -- Assumes the on-demand list price of $6.25 per TiB; adjust to your rate
 ROUND((total_bytes_processed / POWER(1024, 4)) * 6.25, 2) AS estimated_cost_usd,
 total_slot_ms / 1000 AS slot_seconds_used,
 creation_time,
 query
FROM
 `region-us`.INFORMATION_SCHEMA.JOBS_BY_PROJECT
WHERE
 total_bytes_processed > 1000000000  -- jobs that scanned roughly 1 GB or more
 AND creation_time > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 7 DAY)
ORDER BY
 total_bytes_processed DESC
LIMIT 50;

Conclusion

Partitioning is one of BigQuery's most effective cost-optimization levers, but it is not a set-and-forget feature. The details matter, and a single function in a WHERE clause can make or break your savings.

To recap:

  • Partitioning is a powerful tool for BigQuery cost optimization.

  • Using CAST() on a DATETIME partition column breaks partition pruning, leading to full table scans.

  • Always use DATE() or EXTRACT(DATE FROM col) instead.

  • Monitor your query performance and automate audits to catch issues early.

This is only one of several hidden cost traps in BigQuery. A strong FinOps strategy, paired with a culture of cost awareness, goes a long way toward keeping your cloud data warehouse under control.

Join Pump for Free

If you are an early-stage startup that wants to cut down on cloud costs, this is your chance. Pump helps you save up to 60% on cloud spend, and the best part is that it is completely free!

Pump provides personalized solutions to effectively manage and optimize your AWS, GCP, and Azure spending. Take complete control of your cloud expenses and make sure you get the most from what you have invested. Why pay more when you can save?

Are you ready to take control of your cloud expenses?
