How to Save 90% on BigQuery Storage Costs

Jul 24, 2025

When I first saw our customer's BigQuery storage bill, I couldn't believe the numbers. They were paying over $7,000 monthly for data storage that should have cost a fraction of that amount. The frustrating part? This massive expense was entirely preventable.

After years of working with BigQuery implementations, I've learned that most storage cost problems stem from one simple issue: default configurations. Like many powerful tools, BigQuery ships with settings designed for broad compatibility rather than cost optimization. The result? Companies unknowingly pay premium prices for features they may not even need.

This guide will walk you through the exact strategies I've used to reduce BigQuery storage costs by up to 90% for multiple customers. These aren't theoretical optimizations—they're battle-tested techniques that have saved real companies thousands of dollars monthly.

P.S. The first time I saw that $3.18k bill, I thought, “Shit, we are paying WHAT?!” My founder was pacing around the office like we’d have to start burning cash in a trash can just to keep the lights on. Someone even joked, “Is it time to raise a Series B just to cover this?” But then we made a few small tweaks, and guess what? In the last 15 days, the cost dropped to $1.11k, a 90% savings since we set it up. Let me show you how to get results like this!

Understanding BigQuery Storage Costs

Before you can cut costs, you need to understand how BigQuery actually bills for stored data. Unlike traditional databases that charge by provisioned disk size and license tier, BigQuery uses a pricing model that still surprises even seasoned data engineers.

Storage Billing Models

BigQuery provides two separate billing models for storage, and each one can affect overall costs in different ways.

Logical Storage Billing (Default)

  • Bills on the size of uncompressed data

  • Time travel and fail-safe copies are not included in the charge

  • The per-gigabyte rate is lower, but it applies to the larger uncompressed size, and the model is simpler to reason about

Physical Storage Billing (Opt-in)

  • Bills on the size of the compressed format

  • Time travel and fail-safe copies are counted in the assessment

  • For datasets that compress well, this option can yield much lower costs

The model you select can therefore move the needle on your total BigQuery storage costs.
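
Before flipping the switch, it helps to measure how well your data actually compresses. Here is a minimal sketch against the region-qualified INFORMATION_SCHEMA.TABLE_STORAGE view (project_id and region-us are placeholders; swap in your own project and region). A logical-to-physical ratio well above 2x is a strong hint that physical billing will come out cheaper, since physical rates are roughly double the logical rates in most regions:

-- Rough compression check per dataset; project and region below are placeholders.
SELECT
  table_schema AS dataset,
  ROUND(SUM(total_logical_bytes) / POW(1024, 3), 1) AS logical_gib,
  ROUND(SUM(total_physical_bytes) / POW(1024, 3), 1) AS physical_gib,
  ROUND(SUM(total_logical_bytes) / NULLIF(SUM(total_physical_bytes), 0), 1) AS compression_ratio
FROM `project_id.region-us.INFORMATION_SCHEMA.TABLE_STORAGE`
GROUP BY dataset
ORDER BY compression_ratio DESC;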

Active vs Long-term Storage

BigQuery automatically splits your data into two pricing tiers:

  • Active Storage covers tables and partitions modified within the past 90 days and is charged at the full rate.

  • Long-term Storage applies to data that has not changed for more than 90 days and costs half the active rate.

Each table, and each partition of a partitioned table, is evaluated independently: 90 days without a modification moves it to the long-term rate, and any modification moves it back to active. The transition happens automatically and can save real money, yet many companies never factor it into day-to-day operations.
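
To see how much of your data has already aged into the cheaper tier, you can break each table down by tier. A sketch using the same INFORMATION_SCHEMA.TABLE_STORAGE view as above (placeholder project and region):

-- Active vs. long-term logical bytes per table; placeholder project and region.
-- Tables dominated by active bytes are the ones whose frequent writes keep
-- resetting the 90-day clock.
SELECT
  table_schema AS dataset,
  table_name,
  ROUND(active_logical_bytes / POW(1024, 3), 1) AS active_gib,
  ROUND(long_term_logical_bytes / POW(1024, 3), 1) AS long_term_gib
FROM `project_id.region-us.INFORMATION_SCHEMA.TABLE_STORAGE`
ORDER BY active_gib DESC
LIMIT 25;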

Why BigQuery Storage Costs Spike

After working with numerous customers, I have identified the top three factors that consistently drive BigQuery storage costs upward.

Misconfigured Default Settings

BigQuery's built-in time travel capability can quickly balloon costs when the default configuration is left unchanged. Every insert, update, or delete leaves behind historical data so the table can be restored, and that history is retained for seven days unless the window is shortened. For high-churn tables, that week of history can easily exceed the live dataset's volume, adding hundreds of gigabytes or even terabytes to what you store (and, under physical billing, to what you pay for).
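
Before shortening the window (covered later in this post), you can check how much of your footprint is actually time travel and fail-safe data. A sketch against the same TABLE_STORAGE view; these bytes are only billed under the physical storage model, but they are a useful churn signal either way:

-- Time travel and fail-safe bytes per table; placeholder project and region.
SELECT
  table_schema AS dataset,
  table_name,
  ROUND(time_travel_physical_bytes / POW(1024, 3), 2) AS time_travel_gib,
  ROUND(fail_safe_physical_bytes / POW(1024, 3), 2) AS fail_safe_gib
FROM `project_id.region-us.INFORMATION_SCHEMA.TABLE_STORAGE`
ORDER BY time_travel_gib DESC
LIMIT 20;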

Poor Data Lifecycle Management

Many companies treat BigQuery as if it were a perpetual warehouse, loading data and leaving it to sit unless someone complains. Over time, this approach produces:

  • Mountains of stale dev and QA snapshots collecting dust (a cleanup sketch follows this list)

  • Raw ingested logs retained long after they have been restructured into reporting tables

  • Temporary datasets with no automated cleanup, left to accumulate indefinitely
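
A low-effort guardrail for this is a default table expiration at the dataset level, so scratch tables clean themselves up. A minimal sketch, where dev_scratch is a hypothetical dev dataset and 7 days is an arbitrary choice:

-- dev_scratch is a placeholder dataset name; pick a retention that fits your team.
ALTER SCHEMA dev_scratch
SET OPTIONS(default_table_expiration_days = 7);

Note that the default applies only to tables created after the option is set; existing tables keep whatever expiration they already had.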

Inefficient Table Design

Storage costs can also spiral when tables are architected for convenience instead of performance. Issues frequently observed include:

  • Very large monolithic tables with no partition key, which blocks automatic pruning

  • High-cardinality STRING columns used where INT64 or TIMESTAMP would do, wasting bytes on every row

  • Copies of the same dataset spread across several tables with no consistent deduplication plan

Quick Wins for Immediate Savings

Let me start with the quickest ways to drive down BigQuery storage costs: minimal work, noticeable savings.

Switch to Physical Storage Billing

This single change delivered the 90% cost reduction I mentioned earlier. To turn it on for a dataset, run:

ALTER SCHEMA my_dataset
SET OPTIONS(storage_billing_model = 'PHYSICAL');

When to Use Physical Storage Billing:

  • Your data compresses well (for example, it came in from columnar formats like Parquet, ORC, or Avro)

  • You have large tables that are rarely updated

  • Time travel requirements are minimal

Optimize Time Travel Settings

Shortening the time travel period can cut storage costs right away:

ALTER SCHEMA my_dataset
SET OPTIONS(max_time_travel_hours = 48);

For most workloads, a 48-hour window, which is also the minimum BigQuery allows, provides enough protection against accidental loss while keeping overhead low.

Set Table Expiration Policies

You can also automate cleanup by setting expiration dates:

ALTER TABLE my_dataset.my_table SET OPTIONS (
  expiration_timestamp = TIMESTAMP_ADD(CURRENT_TIMESTAMP(), INTERVAL 30 DAY)
);
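
For partitioned tables, you can expire old partitions instead of the whole table, which pairs nicely with the partitioning strategies later in this post. A sketch, where my_dataset.my_partitioned_table is a hypothetical date-partitioned table:

ALTER TABLE my_dataset.my_partitioned_table SET OPTIONS (
  -- Partitions are dropped automatically once they are 90 days old.
  partition_expiration_days = 90
);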

Audit and Delete Unused Data

Regularly query the dataset's __TABLES__ metadata view, which exposes each table's last-modified time as epoch milliseconds, to identify tables that are stale:

SELECT
  dataset_id,
  table_id,
  TIMESTAMP_MILLIS(creation_time) AS creation_time,
  TIMESTAMP_MILLIS(last_modified_time) AS last_modified_time
FROM `project_id.dataset_id.__TABLES__`
WHERE TIMESTAMP_MILLIS(last_modified_time) < TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 90 DAY)
ORDER BY last_modified_time;

Review the list, then drop any table that serves no current purpose.
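
When deciding what to drop first, size matters as much as age. This sketch sorts the same stale tables by how much storage they hold, using the size_bytes column from the __TABLES__ metadata:

SELECT
  dataset_id,
  table_id,
  ROUND(size_bytes / POW(1024, 3), 1) AS size_gib,
  TIMESTAMP_MILLIS(last_modified_time) AS last_modified_time
FROM `project_id.dataset_id.__TABLES__`
WHERE TIMESTAMP_MILLIS(last_modified_time) < TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 90 DAY)
ORDER BY size_bytes DESC;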

Advanced Storage Optimization Strategies

After applying basic cleanup, consider these advanced steps for deeper cost reductions.

Implement Intelligent Partitioning

Partitioning cuts query costs and helps storage too: each partition ages into the cheaper long-term tier on its own 90-day clock and can be expired independently:

CREATE TABLE my_dataset.sales_data
PARTITION BY DATE(transaction_date)
AS SELECT * FROM my_dataset.raw_sales;

Best Practices for Partitioning:

  • Use date/timestamp columns for true time-based partitions.

  • Consider integer-range partitioning when a numeric key is evenly distributed (see the sketch after this list).

  • Combine partitioning with clustering so queries prune as much data as possible.
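
For the integer-range case, here is a minimal sketch with hypothetical table and column names, bucketing customer_id into ranges of 1,000:

CREATE TABLE my_dataset.orders_by_customer
-- Buckets of 1,000 customer IDs from 0 to 100,000; IDs outside the range
-- land in the special __UNPARTITIONED__ partition.
PARTITION BY RANGE_BUCKET(customer_id, GENERATE_ARRAY(0, 100000, 1000))
AS SELECT * FROM my_dataset.raw_orders;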

Apply Table Clustering

When tables grow large, clustering co-locates related rows within each partition so queries scan only the relevant blocks. That speeds scans, and because similar values end up stored next to each other, it can also improve compression under physical billing:

CREATE TABLE my_dataset.user_events
PARTITION BY DATE(event_date)
CLUSTER BY user_id, event_type
AS SELECT * FROM my_dataset.raw_events;

Clustering works particularly well for:

  • High-cardinality columns that appear frequently in WHERE filters.

  • Keys used in JOIN across tables.

  • Data that arrives in a sequence that matches how people later want to read it.

Optimize Data Types

Data ends up taking more space than expected when column types are chosen loosely:

Before (Inefficient):

CREATE TABLE inefficient_table (
  id STRING,           -- 2 bytes + length (a UUID string is ~38 bytes)
  status STRING,       -- Variable length, repeated on every row
  created_at STRING    -- 2 bytes + 19+ characters for a timestamp string
);

After (Optimized):

CREATE TABLE optimized_table (
  id INT64,            -- 8 bytes
  status INT64,        -- 8 bytes (use a lookup table)
  created_at TIMESTAMP -- 8 bytes
);

Key advice:

  • Stick to native timestamp and numeric types wherever you can.

  • Swap high-cardinality STRING fields for integer codes backed by a small reference table where it makes sense, as sketched below.
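
Here is a sketch of that lookup-table pattern, with hypothetical table and column names: the bulky status string is stored once in a small reference table, and the fact table carries only an 8-byte code.

-- Hypothetical schema: status text lives once in a lookup table.
CREATE TABLE my_dataset.status_lookup (
  status_id INT64,
  status_name STRING
);

CREATE TABLE my_dataset.events (
  id INT64,
  status_id INT64,       -- joins to status_lookup instead of repeating the string
  created_at TIMESTAMP
);

-- Resolve the code back to its label only when a report needs it.
SELECT e.id, s.status_name, e.created_at
FROM my_dataset.events AS e
JOIN my_dataset.status_lookup AS s USING (status_id);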

Implement Data Archiving Strategies

To cut long-term storage costs, move old data to Google Cloud Storage and leave only the hottest records in BigQuery:

EXPORT DATA OPTIONS(
  uri='gs://my-bucket/archived-data/year=2024/*',
  format='PARQUET'
) AS
SELECT * FROM my_dataset.historical_data
WHERE year = 2024;

Archived data stored as Parquet compresses well and costs far less in Cloud Storage. You can later query it through external tables without reloading, keeping analysis flexible while freeing up BigQuery storage.
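
If you later need to query the archived files, you can expose them as an external table without loading them back into BigQuery. A sketch reusing the bucket path from the export above (the table name is hypothetical):

-- Queries against this table read the Parquet files directly from Cloud Storage,
-- so the data no longer counts toward BigQuery storage.
CREATE OR REPLACE EXTERNAL TABLE my_dataset.historical_data_2024
OPTIONS (
  format = 'PARQUET',
  uris = ['gs://my-bucket/archived-data/year=2024/*']
);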

Optimize Your BigQuery Storage with Pump

Looking to cut down your BigQuery costs? Pump makes that easier. By using our service and following the brief guide in this post, one customer sliced storage costs by 90% in only 45 days. The payoff was a happy customer and a much lower monthly rate. Work through the same steps, and you may be startled by your own savings.

Why choose Pump for optimization?

  • Save up to 60% on cloud costs

  • Seamless integration with your existing architecture

  • Results in weeks, not months


Taking Action Today

You do not need a large engineering team or a weeks-long project to change BigQuery storage bills. Start with these quick, tactical steps:

  1. Audit Current Costs: Run the storage-audit queries from this post to see where and why charges accumulate.

  2. Implement Physical Storage Billing: Change any datasets that use compressed formats to the physical billing option.

  3. Optimize Time Travel: Reduce the time travel window to 48 hours (the minimum BigQuery allows) for most datasets.

  4. Set Expiration Policies: Create rules that automatically clear out temporary and development tables after use.

The 90% savings I delivered for one customer came not from magic but from matching settings to the pricing model. Apply these fixes in order, and you can reach similar savings without sacrificing the performance your team relies on.
