Performance Optimizations

Table of Contents

Measure the Starting Point
Uncover the Real Bottlenecks
Use Reactive Programming
Database Optimizations in Java applications
Serialization Optimizations in Java applications
Thread Pool and Connection Tuning: The Configuration Magic
Horizontal Scaling
Performance Optimizations - Compress Response Payloads
Measure the Results
Key Lessons
References

Measure the Starting Point

Before making any changes, establish clear performance baselines. This step is non-negotiable; without knowing your starting point, you can’t measure progress or identify the biggest opportunities for improvement.

We need to know what the initial metrics look like:

// Initial Performance Metrics

Maximum throughput: 50,000 requests/second
Average response time: 350ms
95th percentile response time: 850ms
CPU utilization during peak: 85-95%
Memory usage: 75% of available heap
Database connections: Often reaching max pool size (100)
Thread pool saturation: Frequent thread pool exhaustion

Use a combination of tools to gather these metrics:

JMeter: For load testing and establishing basic throughput numbers
Micrometer + Prometheus + Grafana: For real-time monitoring and visualization
JProfiler: For deep-dive analysis of hotspots in the code
Flame graphs: To identify CPU-intensive methods

With these baseline metrics in hand, we can prioritize optimizations and measure their impact.
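To make the baseline figures concrete, here is a minimal sketch of how an average and a 95th percentile could be derived from raw latency samples exported by a load-test tool. The class name, method names, and sample values are illustrative assumptions, not part of the original write-up, and the nearest-rank percentile is only one of several common definitions.

```java
import java.util.Arrays;

// Hypothetical helper (not from the source article): summarizes raw
// latency samples, in milliseconds, collected during a load test.
final class LatencyStats {

    // Arithmetic mean of all samples.
    static double average(long[] samplesMs) {
        if (samplesMs.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        return Arrays.stream(samplesMs).average().getAsDouble();
    }

    // Nearest-rank percentile: the smallest sample such that at least
    // `percentile` percent of the samples are less than or equal to it.
    static long percentile(long[] samplesMs, double percentile) {
        if (samplesMs.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        if (percentile <= 0 || percentile > 100) {
            throw new IllegalArgumentException("percentile must be in (0, 100]");
        }
        long[] sorted = samplesMs.clone(); // leave the caller's array untouched
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(percentile / 100.0 * sorted.length);
        return sorted[Math.max(rank, 1) - 1];
    }

    public static void main(String[] args) {
        // Made-up samples purely for illustration.
        long[] samples = {120, 200, 250, 300, 310, 340, 360, 400, 520, 900};
        System.out.println("average = " + average(samples) + " ms");
        System.out.println("p95     = " + percentile(samples, 95) + " ms");
    }
}
```

Tracking a percentile next to the average matters because a few slow requests can hide behind a reasonable-looking mean, which is the gap the 350ms versus 850ms baseline numbers show.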

Uncover the Real Bottlenecks

Initial profiling revealed several interesting bottlenecks:

Thread pool saturation: The default Tomcat connector was hitting its limits
Database connection contention: HikariCP configuration was not optimized for our workload
Inefficient serialization: Jackson was consuming significant CPU during request/response processing
Blocking I/O operations: Especially when calling external services
Memory pressure: Excessive object creation causing frequent GC pauses

Tackle each of these systematically.
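Some of these signals (GC activity, heap pressure, thread counts) can be sampled from inside the JVM before reaching for a full profiler. The following is a rough sketch using the standard java.lang.management API; the class and method names are hypothetical, and it is not the tooling the original article used.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

// Hypothetical helper: a point-in-time view of GC, heap, and thread
// activity read from the JVM's built-in management beans.
final class RuntimeSnapshot {

    // Total number of collections across all collectors since JVM start.
    static long totalGcCount() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount(); // -1 when the collector does not report it
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    // Approximate accumulated collection time in milliseconds.
    static long totalGcTimeMs() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long time = gc.getCollectionTime(); // -1 when undefined
            if (time > 0) {
                total += time;
            }
        }
        return total;
    }

    // Fraction of the heap currently in use (relative to max, or to the
    // committed size when no maximum is defined).
    static double heapUsedFraction() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        long limit = heap.getMax() > 0 ? heap.getMax() : heap.getCommitted();
        return (double) heap.getUsed() / limit;
    }

    // Number of live threads, including daemon threads.
    static int liveThreads() {
        return ManagementFactory.getThreadMXBean().getThreadCount();
    }

    public static void main(String[] args) {
        System.out.println("GC count     : " + totalGcCount());
        System.out.println("GC time (ms) : " + totalGcTimeMs());
        System.out.println("Heap used    : " + Math.round(heapUsedFraction() * 100) + "%");
        System.out.println("Live threads : " + liveThreads());
    }
}
```

Sampling these values periodically under load gives a first hint about whether memory pressure or thread growth deserves a deeper look with a profiler or flame graph.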

Use Reactive Programming

Database Optimizations in Java applications

Serialization Optimizations in Java applications

Thread Pool and Connection Tuning: The Configuration Magic

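With WebFlux, we needed to tune Netty’s event loop settings. The keys below are reproduced as they appear in the source article; they are not among Spring Boot's documented configuration properties, so check them against your version before relying on them (Reactor Netty itself reads system properties such as reactor.netty.ioWorkerCount and reactor.netty.pool.maxConnections):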

spring:
  reactor:
    netty:
      worker:
        count: 16  # Number of worker threads (2x CPU cores)
      connection:
        provider:
          pool:
            max-connections: 10000
            acquire-timeout: 5000

For the parts of our application still using Spring MVC, I tuned the Tomcat connector:

server:
  tomcat:
    threads:
      max: 200
      min-spare: 20
    max-connections: 8192
    accept-count: 100
    connection-timeout: 2000

These settings allowed us to handle more concurrent connections with fewer resources.
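One way to reason about thread and connection limits is Little's Law: the number of requests in flight equals the arrival rate multiplied by the time each request spends in the system. The sketch below applies it to the baseline numbers from this note. The class and method names are illustrative, and since the number of application instances is not stated here, the output only illustrates the relationship rather than the actual deployment.

```java
// Hypothetical sizing helper based on Little's Law (L = lambda * W).
final class PoolSizing {

    // Average number of requests in flight for a given arrival rate
    // (requests per second) and average latency (milliseconds).
    static double requestsInFlight(double requestsPerSecond, double avgLatencyMs) {
        return requestsPerSecond * (avgLatencyMs / 1000.0);
    }

    // Upper bound on throughput when every request holds one thread for
    // its whole duration, as in a blocking thread-per-request model.
    static double maxBlockingThroughput(int threads, double avgLatencyMs) {
        return threads / (avgLatencyMs / 1000.0);
    }

    public static void main(String[] args) {
        // Baseline figures from this note: 50,000 req/s at 350 ms average.
        System.out.printf("In flight at baseline : %.0f requests%n",
                requestsInFlight(50_000, 350));
        // A pool of 200 blocking threads at the same latency.
        System.out.printf("200 blocking threads  : %.0f req/s at most%n",
                maxBlockingThroughput(200, 350));
    }
}
```

The arithmetic shows why lowering latency and freeing threads from blocking waits both raise the throughput ceiling: a thread-per-request pool is capped by pool size divided by latency, whereas an event loop is not bound to one thread per in-flight request.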

Horizontal Scaling

Performance Optimizations - Compress Response Payloads

Measure the Results

After all optimizations, our metrics improved dramatically:

// Final Performance Metrics
Maximum throughput: 1,200,000 requests/second
Average response time: 85ms (was 350ms)
95th percentile response time: 120ms (was 850ms)
CPU utilization during peak: 60-70% (was 85-95%)
Memory usage: 50% of available heap (was 75%)
Database queries: Reduced by 70% thanks to caching
Thread efficiency: 10x improvement with reactive programming

The most satisfying result? During our Black Friday sale, the system handled 1.2 million requests per second without breaking a sweat: no alerts, no downtime, just happy customers.
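To put the before and after figures on a common scale, the following sketch computes the throughput multiplier and the latency reductions from the numbers listed in this note. The helper names are illustrative assumptions; only the input values come from the metrics above.

```java
// Hypothetical helper for comparing baseline and final metrics.
final class Improvement {

    // How many times larger the new value is than the old one.
    static double multiplier(double before, double after) {
        return after / before;
    }

    // Percentage by which a value dropped relative to its baseline.
    static double percentReduction(double before, double after) {
        return (before - after) / before * 100.0;
    }

    public static void main(String[] args) {
        System.out.printf("Throughput       : %.1fx%n", multiplier(50_000, 1_200_000));
        System.out.printf("Average latency  : %.1f%% lower%n", percentReduction(350, 85));
        System.out.printf("p95 latency      : %.1f%% lower%n", percentReduction(850, 120));
    }
}
```

With the figures from this note, that works out to a 24x increase in throughput, roughly a 76% drop in average response time, and roughly an 86% drop at the 95th percentile.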

Key Lessons

Measurement is everything: Without proper profiling, I would have optimized the wrong things.
Reactive isn’t always better: We kept some endpoints on Spring MVC where it made more sense, using a hybrid approach.
The database is usually the bottleneck: Caching and query optimization delivered some of our biggest wins.
Configuration matters: Many of our improvements came from simply tuning default configurations.
Don’t scale prematurely: We optimized the application first, then scaled horizontally, which saved significant infrastructure costs.
Test with realistic scenarios: Our initial benchmarks using synthetic tests didn’t match production patterns, leading to misguided optimizations.
Optimize for the 99%: Some endpoints were impossible to optimize further, but they represented only 1% of our traffic, so we focused elsewhere.
Balance complexity and maintainability: Some potential optimizations were rejected because they would have made the codebase too complex to maintain.
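The caching lesson above can be illustrated with a very small read-through cache. This is only a sketch of the idea: the note does not say which cache technology was used, the class name and loader function are hypothetical, and a production cache would also need eviction and expiry.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical read-through cache: the loader (for example a database
// query) runs only on the first request for each key.
final class ReadThroughCache<K, V> {

    private final Map<K, V> entries = new ConcurrentHashMap<>();
    private final Function<K, V> loader;
    private final AtomicInteger loads = new AtomicInteger();

    ReadThroughCache(Function<K, V> loader) {
        this.loader = loader;
    }

    // Returns the cached value, invoking the loader once per missing key.
    V get(K key) {
        return entries.computeIfAbsent(key, k -> {
            loads.incrementAndGet(); // count how often the backing store is hit
            return loader.apply(k);
        });
    }

    // Number of times the loader was actually invoked.
    int loadCount() {
        return loads.get();
    }

    public static void main(String[] args) {
        // The lambda stands in for an expensive query.
        ReadThroughCache<Integer, String> cache =
                new ReadThroughCache<>(id -> "product-" + id);
        for (int i = 0; i < 10; i++) {
            cache.get(42);
        }
        System.out.println("lookups = 10, loader calls = " + cache.loadCount());
    }
}
```

Repeated reads of the same key hit the backing store once, which is the mechanism behind the reduced query count reported above.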

Performance optimization isn’t about finding one magic bullet; it’s about methodically identifying and addressing bottlenecks across your entire system. With Spring Boot, the capabilities are there; you just need to know which levers to pull.

References

https://medium.com/@mohitbajaj1995/how-i-optimized-a-spring-boot-application-to-handle-1m-requests-second-0cbb2f2823ed
