From $350 to $70: Cutting Cloud Run Costs by 4x with Caching and Data Optimization

The Initial Challenge: High Costs and Slow Startups

Our team recently tackled a significant cost challenge with a Plotly Dash application, “FSA Dashboard,” deployed on Google Cloud Run. To ensure a snappy user experience, especially after periods of inactivity, the service was initially configured with min-instances=1, meaning at least one instance remained active 24/7.

While this delivered excellent responsiveness, it resulted in monthly Cloud Run costs for this specific service ranging from $300 to $350 - a figure that felt high for an internal tool with moderate, non-constant usage. Our primary goal was to drastically reduce these costs by leveraging Cloud Run’s ability to scale down to zero instances (min-instances=0) when idle.

However, simply flipping the switch to min-instances=0 wasn’t an option due to a critical bottleneck: the application’s startup process. Upon starting, the Dash application needed to load substantial amounts of data directly from numerous BigQuery data marts into memory, typically as pandas DataFrames.

This direct, on-demand data loading was incredibly slow, consistently taking over five minutes for the application to become fully initialized and ready to serve requests. Cloud Run has strict startup timeouts, and a five-minute initialization time was well beyond acceptable limits, leading to failed cold starts and an unreliable user experience.
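
In rough terms, the original startup path looked something like the sketch below. The mart names and the load_all_marts helper are illustrative, not our actual code; the point is that every DataFrame was pulled straight from BigQuery while the container was booting.

```python
# Simplified sketch of the original startup path (mart names are illustrative).
from google.cloud import bigquery

DATA_MARTS = [
    "project.marts.sales_summary",
    "project.marts.inventory_levels",
    "project.marts.customer_segments",
]

def load_all_marts() -> dict:
    client = bigquery.Client()
    frames = {}
    for table in DATA_MARTS:
        # Full-table scan pulled into a pandas DataFrame on every cold start.
        frames[table] = client.query(f"SELECT * FROM `{table}`").to_dataframe()
    return frames

# Executed at module import time, i.e. during every Cloud Run cold start.
DATAFRAMES = load_all_marts()
```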

The Solution: Introducing a Caching Layer and Data Pre-processing

To enable fast cold starts and thus min-instances=0, we engineered a significant architectural shift focused on pre-processing and caching data closer to the application. The solution involved three main components:

  1. We provisioned a Memorystore for Redis instance on Google Cloud to serve as a high-speed, in-memory caching layer for our BigQuery data.
  2. A dedicated Cloud Run Job was developed and scheduled to run periodically. This job’s sole purpose was to query, process, and transform the necessary data from BigQuery, then load it into specific keys within the Redis instance (a simplified sketch of this job follows the list).
  3. The core Dash application was refactored to no longer query BigQuery directly during startup or request processing; instead, it now reads its pre-loaded data directly from the Redis cache.
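
As a rough sketch of the refresh job in component 2, the flow boils down to: pull each mart from BigQuery, serialize it to Parquet bytes, and overwrite the corresponding Redis key. The key names, queries, and the REDIS_HOST environment variable below are placeholders, not our real configuration.

```python
# Hypothetical sketch of the scheduled cache-refresh job. Each mart is read
# from BigQuery, serialized as Parquet bytes, and written to a Redis key.
import io
import os

import redis
from google.cloud import bigquery

MARTS = {
    "cache:sales_summary": "SELECT * FROM `project.marts.sales_summary`",
    "cache:inventory_levels": "SELECT * FROM `project.marts.inventory_levels`",
}

def refresh_cache() -> None:
    bq = bigquery.Client()
    r = redis.Redis(host=os.environ["REDIS_HOST"], port=6379)
    for key, query in MARTS.items():
        df = bq.query(query).to_dataframe()
        buf = io.BytesIO()
        df.to_parquet(buf, index=False)  # compact Parquet byte array
        r.set(key, buf.getvalue())       # overwrite the cached copy in Redis

if __name__ == "__main__":
    refresh_cache()
```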

Technical Deep Dive: Efficient Data Handling

Beyond the architectural changes, the way data was handled within this new setup was crucial for performance and memory efficiency. We opted to store the data in Redis as compact Parquet byte arrays, leveraging Parquet’s efficiency for both storage size and read performance. Within the Dash application, we implemented a singleton class to manage access to this cached data, ensuring that the data was loaded from Redis into the application’s memory only once per instance (or upon a refresh trigger).
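
A minimal sketch of that singleton accessor follows, assuming the refresh job has already populated the keys; the class name, key names, and REDIS_HOST variable are illustrative.

```python
# Minimal sketch of a singleton-style accessor for the cached data.
# Parquet bytes are pulled from Redis and deserialized only once per instance.
import io
import os
import threading

import pandas as pd
import redis

class CachedData:
    _instance = None
    _lock = threading.Lock()

    def __new__(cls):
        with cls._lock:
            if cls._instance is None:
                cls._instance = super().__new__(cls)
                cls._instance._frames = {}
                cls._instance._redis = redis.Redis(
                    host=os.environ["REDIS_HOST"], port=6379
                )
        return cls._instance

    def get(self, key: str) -> pd.DataFrame:
        # First access deserializes from Redis; later calls hit application memory.
        if key not in self._frames:
            raw = self._redis.get(key)
            self._frames[key] = pd.read_parquet(io.BytesIO(raw))
        return self._frames[key]

    def refresh(self, key: str) -> None:
        # Drop the in-memory copy so the next access re-reads from Redis.
        self._frames.pop(key, None)

# Usage inside a Dash callback:
# df = CachedData().get("cache:sales_summary")
```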

This meant subsequent data requests within the application hit the fast, in-memory representation. Furthermore, to optimize processing large datasets within Dash callbacks, we implemented predicate pushdown filtering. This technique applies user-defined filters as early as possible, ideally before the full dataset is even loaded into a pandas DataFrame, dramatically reducing the amount of data processed by each callback and minimizing memory usage.
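
One way to express that pushdown against the cached Parquet payloads is to let pyarrow apply column pruning and row filters while decoding the bytes. This assumes a reasonably recent pyarrow (which accepts filters on file-like sources); the key, columns, and region filter are made up for illustration.

```python
# Illustrative predicate pushdown against the cached Parquet bytes: only the
# listed columns and the rows matching the filter become a pandas DataFrame.
import io
import os

import pandas as pd
import pyarrow.parquet as pq
import redis

r = redis.Redis(host=os.environ["REDIS_HOST"], port=6379)

def load_filtered(key: str, region: str) -> pd.DataFrame:
    raw = r.get(key)
    table = pq.read_table(
        io.BytesIO(raw),
        columns=["date", "region", "revenue"],  # column pruning
        filters=[("region", "=", region)],      # row filter applied during the read
    )
    return table.to_pandas()

# e.g. df = load_filtered("cache:sales_summary", region="EMEA")
```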

The Results: Dramatically Reduced Costs and Improved Performance

These comprehensive architectural and code changes yielded impressive results. The application’s initialization time plummeted from over five minutes to approximately 30 seconds, well within the acceptable range for Cloud Run cold starts. This also led to a substantial reduction in memory footprint, as instances no longer needed to load multiple large DataFrames directly.

With the startup time resolved, we were finally able to configure the Cloud Run service with min-instances=0, allowing it to scale down to zero when idle so we only pay for compute time while requests are actively being served. This change was the key to unlocking significant cost efficiencies: our monthly Cloud Run costs for this service dropped from the $300-$350 range to approximately $60-$80, a remarkable 4x cost reduction.