Performance Optimizations

Measure the Starting Point

Before making any changes, establish clear performance baselines. This step is non-negotiable; without knowing your starting point, you can’t measure progress or identify the biggest opportunities for improvement.

We need to know what the initial metrics look like:

// Initial Performance Metrics

Maximum throughput: 50,000 requests/second
Average response time: 350ms
95th percentile response time: 850ms
CPU utilization during peak: 85-95%
Memory usage: 75% of available heap
Database connections: Often reaching max pool size (100)
Thread pool saturation: Frequent thread pool exhaustion

Use a combination of tools to gather these metrics:

  1. JMeter: For load testing and establishing basic throughput numbers
  2. Micrometer + Prometheus + Grafana: For real-time monitoring and visualization
  3. JProfiler: For deep-dive analysis of hotspots in the code
  4. Flame graphs: To identify CPU-intensive methods

With these baseline metrics in hand, we can prioritize optimizations and measure their impact.
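
As a rough illustration of how the average and 95th percentile figures above are derived from raw load-test samples, here is a minimal sketch in plain Java. The class name `LatencyStats`, the nearest-rank percentile method, and the sample values are all assumptions made for illustration; tools such as JMeter or Micrometer compute these for you and may use a different percentile definition.

```java
import java.util.Arrays;

/** Summarizes response-time samples (in milliseconds) from a load-test run. */
class LatencyStats {

    /** Arithmetic mean of the samples; NaN if there are none. */
    static double average(long[] samplesMs) {
        return Arrays.stream(samplesMs).average().orElse(Double.NaN);
    }

    /**
     * Nearest-rank percentile: the smallest sample such that at least
     * p percent of all samples are less than or equal to it.
     */
    static long percentile(long[] samplesMs, double p) {
        if (samplesMs.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        long[] sorted = samplesMs.clone();   // leave the caller's array untouched
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p / 100.0 * sorted.length); // 1-based rank
        return sorted[Math.max(rank, 1) - 1];
    }

    public static void main(String[] args) {
        // Made-up response times, not measurements from the article.
        long[] samplesMs = {120, 200, 250, 300, 310, 340, 360, 400, 620, 900};
        System.out.println("average ms: " + average(samplesMs));
        System.out.println("p95 ms:     " + percentile(samplesMs, 95));
    }
}
```

The gap between the two numbers is the reason to track both: a handful of slow requests barely moves the average but dominates the 95th percentile.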

Uncover the Real Bottlenecks

Initial profiling revealed several interesting bottlenecks:

  1. Thread pool saturation: The default Tomcat connector was hitting its limits
  2. Database connection contention: HikariCP configuration was not optimized for our workload
  3. Inefficient serialization: Jackson was consuming significant CPU during request/response processing
  4. Blocking I/O operations: Especially when calling external services
  5. Memory pressure: Excessive object creation causing frequent GC pauses

Tackle each of these systematically.

Use Reactive Programming

  1. Use Reactive Programming
  2. Reactive Programming - Convert REST endpoints to be Reactive

Database Optimizations in Java applications

Serialization Optimizations in Java applications

Thread Pool and Connection Tuning: The Configuration Magic

With WebFlux, we needed to tune Netty’s event loop settings:

spring:
  reactor:
    netty:
      worker:
        count: 16  # Number of worker threads (2x CPU cores)
      connection:
        provider:
          pool:
            max-connections: 10000
            acquire-timeout: 5000

For the parts of our application still using Spring MVC, I tuned the Tomcat connector:

server:
  tomcat:
    threads:
      max: 200
      min-spare: 20
    max-connections: 8192
    accept-count: 100
    connection-timeout: 2000

These settings allowed us to handle more concurrent connections with fewer resources.
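
To make the Tomcat numbers more concrete, the sketch below does two back-of-the-envelope calculations for a single instance. The first is roughly how many connections can be held at once (`max-connections` being processed plus `accept-count` queued). The second is the throughput ceiling of a blocking thread-per-request pool, using Little's law (throughput = concurrency / latency). The class name `TomcatCapacity` is hypothetical, and plugging in the article's 85ms average is purely illustrative: it assumes every request holds a worker thread for its full duration.

```java
/** Rough per-instance capacity estimates for a blocking Tomcat connector. */
class TomcatCapacity {

    /** Connections being processed plus connections waiting in the accept queue. */
    static int maxHeldConnections(int maxConnections, int acceptCount) {
        return maxConnections + acceptCount;
    }

    /**
     * Little's law rearranged: with `threads` requests in flight at most and
     * each taking avgLatencyMs, throughput cannot exceed threads / latency.
     */
    static double throughputCeiling(int threads, double avgLatencyMs) {
        return threads / (avgLatencyMs / 1000.0); // requests per second
    }

    public static void main(String[] args) {
        System.out.println("held connections:   " + maxHeldConnections(8192, 100));
        System.out.println("ceiling (req/s):    " + throughputCeiling(200, 85.0));
    }
}
```

Under these assumptions, 200 worker threads cap out at a few thousand requests per second per instance, which is why lowering latency or moving hot endpoints off the thread-per-request model matters more than raising `max-connections` alone.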

Horizontal Scaling

Performance Optimizations - Compress Response Payloads

Measure the Results

After all optimizations, our metrics improved dramatically:

// Final Performance Metrics
Maximum throughput: 1,200,000 requests/second (was 50,000)
Average response time: 85ms (was 350ms)
95th percentile response time: 120ms (was 850ms)
CPU utilization during peak: 60-70% (was 85-95%)
Memory usage: 50% of available heap (was 75%)
Database queries: Reduced by 70% thanks to caching
Thread efficiency: 10x improvement with reactive programming
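
For a quick sense of scale, the before and after figures can be turned into improvement factors. The sketch below is only arithmetic on the numbers reported in this note; the class and method names are hypothetical.

```java
/** Turns before/after metric pairs into improvement factors. */
class Improvement {

    /** Growth factor for a "higher is better" metric such as throughput. */
    static double gain(double before, double after) {
        return after / before;
    }

    /** Shrink factor for a "lower is better" metric such as response time. */
    static double reduction(double before, double after) {
        return before / after;
    }

    public static void main(String[] args) {
        System.out.println("throughput gain:       " + gain(50_000, 1_200_000));
        System.out.println("avg latency reduction: " + reduction(350, 85));
        System.out.println("p95 latency reduction: " + reduction(850, 120));
    }
}
```

Throughput grew 24-fold, while average latency fell by a factor of about 4 and the 95th percentile by a factor of about 7, so the tail improved more than the average.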

The most satisfying result? During our Black Friday sale, the system handled 1.2 million requests per second without breaking a sweat: no alerts, no downtime, just happy customers.

Key Lessons

  1. Measurement is everything: Without proper profiling, I would have optimized the wrong things.
  2. Reactive isn’t always better: We kept some endpoints on Spring MVC where it made more sense, using a hybrid approach.
  3. The database is usually the bottleneck: Caching and query optimization delivered some of our biggest wins.
  4. Configuration matters: Many of our improvements came from simply tuning default configurations.
  5. Don’t scale prematurely: We optimized the application first, then scaled horizontally, which saved significant infrastructure costs.
  6. Test with realistic scenarios: Our initial benchmarks using synthetic tests didn’t match production patterns, leading to misguided optimizations.
  7. Optimize for the 99%: Some endpoints were impossible to optimize further, but they represented only 1% of our traffic, so we focused elsewhere.
  8. Balance complexity and maintainability: Some potential optimizations were rejected because they would have made the codebase too complex to maintain.

Performance optimization isn’t about finding one magic bullet; it’s about methodically identifying and addressing bottlenecks across your entire system. With Spring Boot, the capabilities are there; you just need to know which levers to pull.

References

  1. https://medium.com/@mohitbajaj1995/how-i-optimized-a-spring-boot-application-to-handle-1m-requests-second-0cbb2f2823ed
