docs/grpc_channel_pool_guide.md
## Background
For applications with high throughput or concurrency demands, a single gRPC connection can potentially become a performance bottleneck due to limits on concurrent streams. Users may experience a spike in latency as requests queue on the gRPC connection when the limit is reached. Google middleware enforces a limit of 100 streams per connection. To overcome this, the Google Cloud Java client libraries use **Channel Pooling** via the Gax-Java (Google API Extensions for Java) library.
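As a sketch of how the pool size can be set explicitly, the fragment below configures a statically sized pool through Gax-Java. `InstantiatingGrpcChannelProvider` and `ChannelPoolSettings` are real gax-grpc classes, but the exact builder methods can differ across library versions, and the pool size of 4 is purely illustrative:

```java
import com.google.api.gax.grpc.ChannelPoolSettings;
import com.google.api.gax.grpc.InstantiatingGrpcChannelProvider;

// Illustrative only: a channel provider whose pool holds a fixed number
// of identical gRPC connections (4 here). Verify the builder methods
// against the gax-grpc version your client library depends on.
InstantiatingGrpcChannelProvider channelProvider =
    InstantiatingGrpcChannelProvider.newBuilder()
        .setChannelPoolSettings(ChannelPoolSettings.staticallySized(4))
        .build();
```

The resulting provider would typically be passed into a client's settings builder (for example via a `setTransportChannelProvider`-style method on the client's `StubSettings`), so that all RPCs from that client share the pool.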
GAX is the internal transport layer shared by all Google Cloud Java client libraries. It handles connection management, retries, and configuration, including channel pools.
Channel pooling spreads the outbound RPC load across multiple identical gRPC connections, which can help achieve higher throughput and better resilience for demanding workloads.
If a pool has **too few** connections for its workload, in-flight requests might queue on the client side, which can show up as high latency. Conversely, having **too many** connections may lead to idle connections being dropped by GFE, potentially inducing tail latency spikes as connections need to be re-established when traffic returns.
### Why Do We Account for Concurrency?
gRPC sends multiple requests over the same connection simultaneously using HTTP/2.
- **Serial throughput of one stream**: If each request takes 20ms, a single stream can complete 1,000ms / 20ms = **50 requests per second**.
- **Concurrent streams needed**: If your application must serve 5,000 requests per second, one stream handling 50 req/s is not enough. You need enough simultaneous streams to collectively handle 5,000 req/s. That requires 5,000 / 50 = **100 streams open at the same time**.
108
108
109
-
If you size your pool without accounting for this concurrency demand, you risk **client-side queuing** (too few channels, too few streams) or **idle connection dropouts** (more channels than your traffic needs).
109
+
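The arithmetic above can be sketched as a small, self-contained calculation. The class and method names here are hypothetical, and the 100-streams-per-connection cap is the middleware limit described in the background section:

```java
public class PoolSizing {
    // Concurrent streams needed = target QPS * per-request latency in
    // seconds (Little's law), rounded up.
    static int streamsNeeded(int qps, double latencySeconds) {
        return (int) Math.ceil(qps * latencySeconds);
    }

    // Channels needed, given the ~100 concurrent-stream cap per connection.
    static int channelsNeeded(int streams, int streamsPerConnection) {
        return (int) Math.ceil((double) streams / streamsPerConnection);
    }

    public static void main(String[] args) {
        int streams = streamsNeeded(5_000, 0.020); // 5,000 req/s at 20 ms each
        int channels = channelsNeeded(streams, 100);
        System.out.println(streams + " streams, " + channels + " channel(s)");
        // → 100 streams, 1 channel(s)
    }
}
```

Note that this yields the bare minimum: one channel running at 100% stream utilization leaves no headroom, so in practice you would size above this floor so that traffic bursts do not immediately cause client-side queuing.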
If the pool is sized without considering these concurrency factors, it may lead to **client-side queuing** (if sized too small) or **idle connection dropouts** (if sized too large).