Avoid string intermediates in MCP transport read path#1319
Avoid string intermediates in MCP transport read path#1319
Conversation
Co-authored-by: ericstj <8918108+ericstj@users.noreply.github.com>
|
@copilot both of these implementations are very similar -- do you think it would be possible to factor our the functionality of reading a line from stream in UTF-8 as either a helper method, or an extension method on PipeReader? |
…elper Co-authored-by: ericstj <8918108+ericstj@users.noreply.github.com>
Done in b530d53. The common functionality is now in
Both transports now call |
src/ModelContextProtocol.Core/Client/StreamClientSessionTransport.cs
Outdated
Show resolved
Hide resolved
…ove NoBomUtf8Encoding from StreamClientSessionTransport Co-authored-by: ericstj <8918108+ericstj@users.noreply.github.com>
Done in 09cf5eb. All four review items addressed:
|
|
@copilot This is looking good. Can you add any tests around this? Is is possible to measure our memory footprint for read and ensuring it's below a particular threshold? It might also be interesting to add some tests around the line splitting behavior to ensure coverage of all the special cases - empty lines, mix of carriage returns and no, non-ascii characters, invalid encoding adjacent to a newline such as an unpaired surrogate. Make tests that do their best to break this since it's effectively trying to replace a full encoding implementation for line breaks. |
Co-authored-by: ericstj <8918108+ericstj@users.noreply.github.com>
Added 15 tests in
On memory measurement: adding a precise allocation bound in xUnit tests is not straightforward without a benchmark harness (e.g. BenchmarkDotNet with |
|
@copilot can you also add a case for a multi-byte character sequence that's interrupted with a |
Co-authored-by: ericstj <8918108+ericstj@users.noreply.github.com>
Added in d050187 as The test uses
Both old and new paths raise |
|
Another instance of this problem in StreamableHttpClientSessionTransport: There are some string intermediates in Sse types, but not for the payload so it's not worth changing unless it shows up through usage. |
…-sdk into copilot/fix-string-encoding-issue
|
I needed to tune the buffer size of the PipeReader to show a performance improvement. With the default value of 4KB it regressed perf on large payload sizes (1MB) because the number of buffer segments. Benchmarks are here. The results for these show Baseline:
Fix:
|
The read side of
StreamServerTransportandStreamClientSessionTransportwas decoding bytes →stringviaTextReader.ReadLineAsync(), then re-encoding back to UTF-8 internally duringJsonSerializer.Deserialize(string, ...). This round-trip was unnecessary sinceSystem.Text.Jsoncan deserialize directly from bytes.Changes
TextReaderwithPipeReaderon the input side of both transports.PipeReader.Create(stream)works across all targets; theSystem.IO.Pipelinesdependency was already present.JsonSerializer.Deserialize(ReadOnlySpan<byte>, typeInfo); multi-segment usenew Utf8JsonReader(ReadOnlySequence<byte>, ...)+JsonSerializer.Deserialize(ref reader, typeInfo).EncodingUtilities.GetUtf8String()(for trace logging) only allocates whenLogger.IsEnabled(LogLevel.Trace). For single-segment sequences it callsEncoding.UTF8.GetString(span)directly; multi-segment falls back toToArray().EndsWithCarriageReturn()checks the last byte of the last segment directly rather than creating an intermediate slice.encodingconstructor parameter fromStreamClientSessionTransport— it was only used to configure the now-removedTextReaderwrapper. Both call sites already passednull.PipeReaderExtensions— the buffer management loop, newline detection, and CRLF trimming are now in a singlePipeReaderExtensionsclass (inProtocol/). Both transports callawait pipeReader.ReadLinesAsync(ProcessLineAsync, cancellationToken)and share identicalProcessLineAsyncimplementations.GetUtf8StringtoEncodingUtilities— with an explicit empty-sequence fast path, following the existing utility pattern.NoBomUtf8EncodingtoStdioClientTransport— removed fromStreamClientSessionTransport(no longer needed there) and added as a privatestatic readonlyfield inStdioClientTransport, its only consumer.PipeReaderExtensionsTests— 16 tests covering line-splitting edge cases viaStreamServerTransport: empty/blank lines, LF and CRLF termination (single and multi-segment), mixed line endings, standalone\rpreservation, CRLF split across pipe segments, non-ASCII and multi-byte UTF-8 content, multi-byte character split across segments, invalid JSON line recovery, unterminated data not delivered, and a multi-byte UTF-8 sequence interrupted by\n(verifying both resulting lines raiseJsonExceptionand are silently skipped, matching the behavior of the previousStreamReader-based implementation).Original prompt
💬 We'd love your input! Share your thoughts on Copilot coding agent in our 2 minute survey.