Skip to content

Fix intermittent read hangs on reused HTTP/2 and pooled HTTP/1.1 connections#884

Merged
benoitc merged 1 commit into
masterfrom
fix/pooled-read-hangs
Jun 17, 2026
Merged

Fix intermittent read hangs on reused HTTP/2 and pooled HTTP/1.1 connections#884
benoitc merged 1 commit into
masterfrom
fix/pooled-read-hangs

Conversation

@benoitc

@benoitc benoitc commented Jun 17, 2026

Copy link
Copy Markdown
Owner

HTTP/2 (primary)

A response that signals END_STREAM with a trailing HEADERS frame (trailers, or an empty trailing HEADERS, as proxies/ALBs emit for no-content-length streamed bodies) left the body reader parked forever. The h2 lib delivers it as a {trailers,...} event with no terminal DATA frame, and handle_h2_event ignored it, so the read blocked until the connection died (~30s) and returned nothing. h2-only because curl handles trailers fine.

This is not the issue-#544 socket active/passive path: h2 connections never traverse the pool is_ready/available socket path, they are reused via checkout_h2 and the h2 lib owns the socket.

Fix: route {trailers, StreamId, _} to the END_STREAM path so the read completes on fresh and reused connections.

Also adds a per-stream recv_timeout watchdog for sync h2 reads, which previously parked on an infinity gen_statem call with no timer, so any other lost-frame condition fails fast with {error, timeout} instead of hanging.

HTTP/1.1

The #544 idle close-detection sets pooled sockets to {active, once}, so data can land in the connection mailbox while idle. At checkout the old code discarded it with a best-effort receive ... after 0, which could strand bytes (passive recv blocks on an empty buffer) or corrupt the next read.

Fix: at checkout, drain the mailbox and peek the socket buffer, and refuse to reuse the connection on any pending data, close, or error. hackney does not pipeline, so idle data can never belong to the next response. Healthy idle connections still reuse normally, preserving keep-alive and the #544 stale-connection detection.

Tests

  • hackney_http2_trailers_tests: deterministic trailered-response repro, a 40-request pooled-reuse stress alternating trailered and END_STREAM-on-DATA responses, and a never-ending response that must fail fast on recv_timeout.
  • hackney_pool_tests: a connection fed unsolicited idle data is refused at checkout, while a clean idle connection stays reusable.

Full eunit suite (964 tests) and dialyzer pass.

…ections

HTTP/2: a response that signals END_STREAM with a trailing HEADERS frame
(trailers, or an empty trailing HEADERS, as proxies emit for no-content-length
streamed bodies) left the body reader parked forever. The h2 lib delivers it as
a {trailers,...} event with no terminal DATA frame, and handle_h2_event ignored
it, so the read blocked until the connection died (~30s). Route trailers to the
END_STREAM path so the read completes on fresh and reused connections.

Add a per-stream recv_timeout watchdog for sync h2 reads, which previously
parked on an infinity gen_statem call with no timer, so any other lost-frame
condition fails fast with {error, timeout}.

HTTP/1.1: at checkout, refuse to reuse a pooled connection that received
unsolicited data while idle instead of discarding the bytes. hackney does not
pipeline, so idle data cannot belong to the next response; reusing the socket
stranded or corrupted the next read. Drain the mailbox and peek the socket
buffer, and drop the connection on any pending data, close, or error. Healthy
idle connections still reuse normally, preserving keep-alive and the #544
stale-connection detection.
@benoitc benoitc merged commit fd13f1f into master Jun 17, 2026
6 checks passed
@benoitc benoitc deleted the fix/pooled-read-hangs branch June 17, 2026 08:49
@benoitc benoitc mentioned this pull request Jun 17, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant