Read InputStream bodies straight into the buffer, no staging array#2233
Read InputStream bodies straight into the buffer, no staging array#2233pavel-ptashyts wants to merge 1 commit into
Conversation
InputStreamBodyGenerator's Body.transferTo allocated a fresh byte[] (sized to the target's writable region) on every call, read the stream into it, then copied those bytes into the target ByteBuf — a per-chunk allocation plus a redundant copy. Read directly into the target via ByteBuf.writeBytes(InputStream, int) instead (the same approach FileLikeMultipartPart uses), dropping both the staging array and the copy; the per-instance chunk field is gone. Behaviour is unchanged: it writes the bytes read this call and returns CONTINUE while data remains, STOP at EOF or on an I/O error (logged, as before). The '- 10' writable margin is preserved. Reachability note: AsyncHttpClient's own request path routes an InputStreamBodyGenerator to NettyInputStreamBody (NettyRequestFactory), so this Body is reached only through the public InputStreamBodyGenerator.createBody() API (external/custom callers); the change improves that path rather than pruning a public type. Adds InputStreamBodyGeneratorTest covering byte-for-byte transfer across multiple reads, a single read draining a small stream, and immediate STOP on an empty stream. No public API change. Fixes finding AsyncHttpClient#8.
| int read; | ||
| try { | ||
| read = inputStream.read(chunk); | ||
| read = target.writeBytes(inputStream, target.writableBytes() - 10); |
There was a problem hiding this comment.
Can we drop the - 10 here instead of keeping it? I dug into the history, and it comes from an old commit literally titled "More safe guess, still need a better solution", so it doesn't appear to have been an intentional requirement. None of the current consumers (BodyChunkedInput, NettyBodyBody, AuthenticatorUtils) write anything into the buffer after transferTo, and InputStreamMultipartPart already uses target.writeBytes(inputStream, target.writableBytes()) with no reserved margin.
Keeping it also preserves two edge-case bugs. If writableBytes() < 10, the computed length becomes negative and Netty throws IllegalArgumentException, which isn't caught by the existing IOException handler. (The old code failed here too, but with NegativeArraySizeException.) If writableBytes() == 10, the computed length is 0, read() returns 0, and we return STOP without writing anything, silently truncating a stream that still has data.
This is reachable via something like new InputStreamBodyGenerator(stream, 10L) through BodyChunkedInput, since its chunk size is min(contentLength, 8192). Only external callers can hit this today, but since this PR is cleaning up exactly this legacy code, it seems like the right opportunity to remove the workaround rather than preserve its behavior. A small test covering contentLength <= 10 would lock the fix in.
|
|
||
| int read = -1; | ||
| boolean write = false; | ||
| // Read straight from the stream into the target buffer: no per-call staging byte[] and no extra |
There was a problem hiding this comment.
Two corrections are needed in this comment.
First, "no per-call staging byte[] and no extra copy" is only true for heap buffers. InputStream.read() can only read into a byte[], so for direct buffers Netty still has to stage through heap memory internally. For an 8 KB chunk, threadLocalTempArray allocates a new array on every call because the size exceeds the 1024-byte thread-local cache, and the unsafe path acquires a pooled heap buffer and performs a copyMemory() on every call. BodyChunkedInput allocates from the channel allocator, which prefers direct buffers by default, so on that path this change is roughly allocation-for-allocation equivalent to the previous implementation. The improvement is real for heap buffers, which is what the new test exercises, so I'd reword the comment to make that distinction.
Second, FileLikeMultipartPart is the wrong reference. It's abstract and never calls writeBytes(InputStream, int). The class that actually uses this pattern is InputStreamMultipartPart, and it passes the full writableBytes() without reserving any margin.
| public class InputStreamBodyGeneratorTest { | ||
|
|
||
| private static final int CHUNK_SIZE = 1024 * 8; | ||
|
|
There was a problem hiding this comment.
Minor: the existing tests in this package (ByteArrayBodyGeneratorTest and FeedableBodyGeneratorTest) use @RepeatedIfExceptionsTest(repeats = 5) on every test method, so using plain @Test breaks the local convention.
ByteArrayBodyGeneratorTest also already has the same single-read and multi-read scaffold (same chunk size, same 3 * CHUNK_SIZE + 42 sizing, and the same drain loop), while FeedableBodyGeneratorTest provides a readFromBody helper that covers the short-stream cases. It would be worth aligning with the existing annotation and reusing the existing test patterns rather than introducing a second drain idiom to the package.
InputStreamBodyGenerator's Body.transferTo allocated a fresh byte[] (sized to the target's writable region) on every call, read the stream into it, then copied those bytes into the target ByteBuf — a per-chunk allocation plus a redundant copy. Read directly into the target via ByteBuf.writeBytes(InputStream, int) instead (the same approach FileLikeMultipartPart uses), dropping both the staging array and the copy; the per-instance chunk field is gone.
Behaviour is unchanged: it writes the bytes read this call and returns CONTINUE while data remains, STOP at EOF or on an I/O error (logged, as before). The '- 10' writable margin is preserved.
Reachability note: AsyncHttpClient's own request path routes an InputStreamBodyGenerator to NettyInputStreamBody (NettyRequestFactory), so this Body is reached only through the public InputStreamBodyGenerator.createBody() API (external/custom callers); the change improves that path rather than pruning a public type.
Adds InputStreamBodyGeneratorTest covering byte-for-byte transfer across multiple reads, a single read draining a small stream, and immediate STOP on an empty stream. No public API change.