-
Notifications
You must be signed in to change notification settings - Fork 873
Multi Part Download: Use Chunked Stream #4231
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: feature/transfermanager
Are you sure you want to change the base?
Conversation
1602121 to
5c59042
Compare
5c59042 to
0bf170a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR refactors the S3 multipart download buffering implementation to use a chunked stream approach instead of single large ArrayPool buffers. The change addresses potential issues with very large parts (>2GB) and Large Object Heap allocations while maintaining the same functional behavior.
Key Changes:
- Replaces
StreamPartBufferandBufferedDataSourcewithChunkedBufferStreamandChunkedPartDataSourcethat use multiple small 80KB ArrayPool chunks instead of single large buffers - Removes the obsolete
AddBufferoverload methods fromIPartBufferManagerinterface, consolidating onAddDataSource - Removes unused
configparameter fromBufferedMultipartStreamconstructor
Reviewed changes
Copilot reviewed 15 out of 15 changed files in this pull request and generated 9 comments.
Show a summary per file
| File | Description |
|---|---|
| sdk/src/Services/S3/Custom/Transfer/Internal/ChunkedBufferStream.cs | New core stream implementation that manages data across multiple 80KB ArrayPool chunks to avoid LOH and support parts >2GB |
| sdk/src/Services/S3/Custom/Transfer/Internal/ChunkedPartDataSource.cs | New IPartDataSource wrapper for ChunkedBufferStream with part number tracking and disposal management |
| sdk/src/Services/S3/Custom/Transfer/Internal/BufferedPartDataHandler.cs | Updated to create ChunkedPartDataSource instead of StreamPartBuffer for out-of-order parts |
| sdk/src/Services/S3/Custom/Transfer/Internal/BufferedMultipartStream.cs | Removed unused config parameter from constructor |
| sdk/src/Services/S3/Custom/Transfer/Internal/PartBufferManager.cs | Removed obsolete AddBuffer overload methods |
| sdk/src/Services/S3/Custom/Transfer/Internal/IPartBufferManager.cs | Removed obsolete AddBuffer method declarations from interface |
| sdk/src/Services/S3/Custom/Transfer/Internal/StreamPartBuffer.cs | Deleted - replaced by ChunkedBufferStream |
| sdk/src/Services/S3/Custom/Transfer/Internal/BufferedDataSource.cs | Deleted - replaced by ChunkedPartDataSource |
| sdk/test/Services/S3/UnitTests/Custom/StreamPartBufferTests.cs | Deleted - tests for removed class |
| sdk/test/Services/S3/UnitTests/Custom/BufferedDataSourceTests.cs | Deleted - tests for removed class |
| sdk/test/Services/S3/UnitTests/Custom/PartBufferManagerTests.cs | Updated all test cases to use ChunkedPartDataSource instead of StreamPartBuffer |
| sdk/test/Services/S3/UnitTests/Custom/BufferedPartDataHandlerTests.cs | Updated mocks and assertions to expect ChunkedPartDataSource instead of StreamPartBuffer |
| sdk/test/Services/S3/UnitTests/Custom/BufferedMultipartStreamTests.cs | Updated constructor calls to remove config parameter |
sdk/src/Services/S3/Custom/Transfer/Internal/ChunkedPartDataSource.cs
Outdated
Show resolved
Hide resolved
sdk/src/Services/S3/Custom/Transfer/Internal/ChunkedBufferStream.cs
Outdated
Show resolved
Hide resolved
sdk/src/Services/S3/Custom/Transfer/Internal/ChunkedBufferStream.cs
Outdated
Show resolved
Hide resolved
0bf170a to
7ec7d2f
Compare
7ec7d2f to
3ac6179
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Copilot reviewed 13 out of 13 changed files in this pull request and generated 1 comment.
sdk/test/Services/S3/UnitTests/Custom/ChunkedPartDataSourceTests.cs
Outdated
Show resolved
Hide resolved
| #endregion | ||
|
|
||
| #region Logger | ||
| private readonly Logger _logger = Logger.GetLogger(typeof(PartBufferManager)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixing this to be consistent with the other classes
Description
This PR refactors the buffered multipart download data handling by replacing the single large ArrayPool buffer approach with a chunked buffer stream design. The key changes include:
ChunkedBufferStreamclass: A writable/readable stream that stores data across multiple small (80KB) ArrayPool buffers, avoiding Large Object Heap (LOH) allocationsChunkedPartDataSourceclass: AnIPartDataSourceimplementation that wrapsChunkedBufferStreamfor seamless integration with thePartBufferManagerBufferedDataSourceandStreamPartBufferclasses: These are replaced by the new chunked approachBufferedPartDataHandler: Modified to use the newChunkedBufferStreaminstead ofStreamPartBufferPartBufferManagerandIPartBufferManager: Simplified interface and implementation to work with the new data source patternMotivation and Context
The previous buffering implementation used single large ArrayPool buffers which had two issues:
The new chunked approach:
#3806
Testing
ChunkedBufferStream(878 lines) covering:ChunkedPartDataSource(602 lines) covering:BufferedPartDataHandlerTestsandPartBufferManagerTeststo work with the new implementationBufferedDataSourceandStreamPartBufferPerformance Testing
I did notice some slowdown from the older implementation
New
vs
Old
50TB Testing
I also tested with the 50TB file and didnt see any issues
Screenshots (if appropriate)
Ran using visual studio performance analyzer
before change
after change
no LOH allocations
Types of changes
Checklist
License