Microsoft Dev Blogs, May 22, 2023

The Art of HTTP Connection Pooling: How to Optimize Your Connections for Peak Performance

Summary

Migrating an on-premises system to the public cloud can be challenging, especially when connection issues such as timeout exceptions or HTTP 504 Gateway Timeout responses appear. Connection pooling improves application performance by reducing the overhead of establishing new connections, particularly for applications that make many HTTP requests concurrently. In a benchmarking exercise that compared load and stress tests of applications with and without connection pools, the applications using connection pools showed a significantly lower average response time for service requests. In some cases transactions completed five times faster, especially in multi-cloud or hybrid cloud environments.
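To make the connection-establishment overhead concrete, here is a minimal sketch in Python using only the standard library. It is not the benchmark described above: the local test server, `CountingHandler`, and the helper names are all assumptions made for illustration. It counts how many TCP connections the server has to accept for the same number of requests, with and without connection reuse, and does not measure latency, since localhost timings say little about a multi-cloud network path.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class CountingHandler(BaseHTTPRequestHandler):
    """Hypothetical test handler that counts accepted TCP connections."""

    protocol_version = "HTTP/1.1"  # lets the server keep connections open
    connections_opened = 0
    _lock = threading.Lock()

    def setup(self):
        # setup() runs once per accepted connection, not once per request.
        super().setup()
        with CountingHandler._lock:
            CountingHandler.connections_opened += 1

    def do_GET(self):
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo output quiet


def fetch_without_reuse(host, port, n):
    """Open and close a new connection for every request."""
    for _ in range(n):
        conn = http.client.HTTPConnection(host, port, timeout=5)
        try:
            conn.request("GET", "/")
            conn.getresponse().read()
        finally:
            conn.close()


def fetch_with_reuse(host, port, n):
    """Send every request over one persistent (keep-alive) connection."""
    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        for _ in range(n):
            conn.request("GET", "/")
            conn.getresponse().read()  # drain the body before reusing
    finally:
        conn.close()


def count_connections(fetch, n):
    """Run `fetch` against a fresh local server; return connections accepted."""
    CountingHandler.connections_opened = 0
    server = ThreadingHTTPServer(("127.0.0.1", 0), CountingHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        host, port = server.server_address
        fetch(host, port, n)
        return CountingHandler.connections_opened
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    for fetch in (fetch_without_reuse, fetch_with_reuse):
        opened = count_connections(fetch, 20)
        print(f"{fetch.__name__}: {opened} connections for 20 requests")
```

Each connection that is not reused costs at least one extra TCP handshake (plus a TLS handshake over HTTPS), and that cost grows with the round-trip time between the two services.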

Case Study

During the modernization of a clinical product called Utilization Management in the Cloud, the test team encountered intermittent 5-second delays while load testing a microservice in pre-production. The problem involved a critical integration between a Clinical Guideline Service hosted in Azure and a service hosted in AWS. The evidence pointed to client connection resets (TCP RST) caused by the different ephemeral port reuse behaviors of the Azure and AWS network infrastructure.

Connection Pool Starvation

Software development frameworks can manage HTTP connection pooling automatically. Even so, connection pool starvation occurs when the number of available connections is insufficient to meet the demand from client applications, and in such cases tuning the pool can improve system performance. With a keep-alive strategy, the client and server maintain a persistent connection, which reduces latency for subsequent requests. The keep-alive parameters should still be configured carefully to avoid excessive resource usage, particularly in high-traffic environments.
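Starvation can be illustrated with a small bounded pool. The sketch below is an assumption-laden illustration, not the pooling logic of any particular framework: `BoundedPool`, `PoolStarvedError`, `max_size`, and `acquire_timeout` are hypothetical names. Callers borrow a connection, return it for reuse, and fail fast when every connection is busy for longer than the acquire timeout.

```python
import queue
from contextlib import contextmanager


class PoolStarvedError(Exception):
    """Raised when no connection becomes free within the acquire timeout."""


class BoundedPool:
    """Hypothetical fixed-size pool; `factory` creates one connection."""

    def __init__(self, factory, max_size, acquire_timeout):
        self._factory = factory
        self._acquire_timeout = acquire_timeout
        self._slots = queue.LifoQueue(maxsize=max_size)
        for _ in range(max_size):
            # None marks a free slot whose connection is not created yet.
            self._slots.put(None)

    @contextmanager
    def connection(self):
        try:
            conn = self._slots.get(timeout=self._acquire_timeout)
        except queue.Empty:
            # Every connection is in use: this is pool starvation.
            raise PoolStarvedError("no free connection in the pool") from None
        if conn is None:
            try:
                conn = self._factory()  # create lazily on first use
            except Exception:
                self._slots.put(None)  # give the slot back if creation fails
                raise
        try:
            yield conn
        finally:
            # Return the connection for reuse. A production pool would also
            # discard connections that are broken or idle for too long.
            self._slots.put(conn)


if __name__ == "__main__":
    import http.client

    # Assumed endpoint for illustration; nothing connects until a request is sent.
    pool = BoundedPool(
        factory=lambda: http.client.HTTPConnection("127.0.0.1", 8080, timeout=5),
        max_size=2,
        acquire_timeout=0.1,
    )
    with pool.connection(), pool.connection():
        try:
            with pool.connection():
                pass
        except PoolStarvedError as exc:
            print("third caller starved:", exc)
```

In this sketch, raising `max_size` relieves starvation at the cost of more open sockets on both sides, while `acquire_timeout` bounds how long a caller waits before the problem surfaces as an error. This is the same trade-off the keep-alive tuning above has to balance.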

Conclusion

Scaling up the service by adding more CPUs and memory is a common quick fix when network bottlenecks occur in web applications deployed in the cloud. However, connection pooling is a more efficient technique for improving application performance, particularly in multi-cloud or hybrid cloud environments.