
@varadarajan-tw varadarajan-tw commented Oct 31, 2025

This PR

  • Adds support for fastPut from a Buffer for the SFTP destination. This allows us to upload chunks concurrently, which speeds up uploads. The library we use only supports fastPut from a file, so we had to write our own version that does the same thing from a buffer. Manual load tests confirmed that fastPut is 3x faster than put.
  • Adds a new setting called uploadStrategy, which defaults to standard. If a customer's SFTP server doesn't support concurrent uploads, they can keep using standard.

[PS: I might have to do some more research. On digging further, it seems that fastput/fastget is not really a capability that the server has to support.]

From tests, we are seeing 3x improvement in upload speeds with fastPutFromBuffer method.
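The idea behind uploading from a buffer with concurrent writes can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `PositionalWriter`, `uploadFromBuffer`, `MemoryWriter`, and the 32 KiB default chunk size are hypothetical names and values chosen for the example. The default of 64 concurrent chunks mirrors the batch size mentioned in the review discussion. The sketch accepts a `Uint8Array`, so a Node `Buffer` (a `Uint8Array` subclass) can be passed as well.

```typescript
// Hypothetical minimal interface for a positional remote write (an assumption,
// not the API of ssh2 or ssh2-sftp-client).
interface PositionalWriter {
  write(data: Uint8Array, position: number): Promise<void>
}

// Upload `data` in fixed-size chunks, issuing up to `concurrency` writes at once
// and waiting for the whole batch before starting the next one.
async function uploadFromBuffer(
  writer: PositionalWriter,
  data: Uint8Array,
  chunkSize = 32 * 1024, // assumed value, for illustration only
  concurrency = 64
): Promise<void> {
  if (chunkSize <= 0 || concurrency <= 0) {
    throw new Error('chunkSize and concurrency must be positive')
  }
  const batchBytes = chunkSize * concurrency
  for (let batchStart = 0; batchStart < data.length; batchStart += batchBytes) {
    const batchEnd = Math.min(batchStart + batchBytes, data.length)
    const writes: Promise<void>[] = []
    for (let pos = batchStart; pos < batchEnd; pos += chunkSize) {
      // subarray creates a view, so no bytes are copied per chunk.
      const chunk = data.subarray(pos, Math.min(pos + chunkSize, batchEnd))
      writes.push(writer.write(chunk, pos))
    }
    await Promise.all(writes)
  }
}

// In-memory stand-in for a remote file, used only to exercise the sketch.
class MemoryWriter implements PositionalWriter {
  readonly file: Uint8Array
  maxInFlight = 0
  private inFlight = 0

  constructor(size: number) {
    this.file = new Uint8Array(size)
  }

  async write(data: Uint8Array, position: number): Promise<void> {
    this.inFlight++
    this.maxInFlight = Math.max(this.maxInFlight, this.inFlight)
    await Promise.resolve() // simulate an asynchronous round trip
    this.file.set(data, position)
    this.inFlight--
  }
}
```

Because each chunk is written at its own offset, the writes within a batch do not depend on each other and can be in flight at the same time, which is where the speed-up over a sequential put would come from.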

Video Walkthrough of this change

Testing

New Setting to select upload strategy

image

Test Authentication continues to work

image

A 10k upload completed successfully in ~6 seconds with fastput enabled. Previously, the same upload took 25-30 seconds.

image image

Regression tested with useConcurrent writes set to false and a batch size of 10k. Uploads were successful.
image

  • Added unit tests for new functionality
  • Tested end-to-end using the local server
  • [If destination is already live] Tested for backward compatibility of destination. Note: New required fields are a breaking change.
  • [Segmenters] Tested in the staging environment
  • [Segmenters] [If applicable for this change] Tested for regression with Hadron.

Contributor

Copilot AI left a comment

Pull Request Overview

This PR introduces a custom SFTP client wrapper (SFTClientCustom) to replace the standard ssh2-sftp-client for file uploads. The new implementation provides a custom fastPutFromBuffer method with lower-level control over the upload process using the ssh2 library's SFTPWrapper directly.

Key changes:

  • New SFTClientCustom class that wraps ssh2-sftp-client with custom upload logic
  • Updated uploadSFTP function to use the new custom client instead of executeSFTPOperation
  • Custom abort signal handling implementation for upload operations

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| packages/destination-actions/src/destinations/sftp/sftp-wrapper.ts | New custom SFTP client class with manual chunked upload implementation |
| packages/destination-actions/src/destinations/sftp/client.ts | Modified uploadSFTP to use custom client and implement new abort handling logic |


codecov bot commented Oct 31, 2025

Codecov Report

❌ Patch coverage is 92.92929% with 7 lines in your changes missing coverage. Please review.
✅ Project coverage is 80.38%. Comparing base (3c3295e) to head (99f171e).
⚠️ Report is 13 commits behind head on main.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| ...tion-actions/src/destinations/sftp/sftp-wrapper.ts | 92.72% | 4 Missing ⚠️ |
| ...estination-actions/src/destinations/sftp/client.ts | 90.90% | 2 Missing and 1 partial ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3387      +/-   ##
==========================================
+ Coverage   79.98%   80.38%   +0.40%     
==========================================
  Files        1211     1268      +57     
  Lines       22356    25306    +2950     
  Branches     4407     5229     +822     
==========================================
+ Hits        17881    20343    +2462     
- Misses       3695     4161     +466     
- Partials      780      802      +22     


Contributor

Copilot AI left a comment

Pull Request Overview

Copilot reviewed 14 out of 15 changed files in this pull request and generated 6 comments.

@segmentio segmentio deleted a comment from Copilot AI Nov 4, 2025
@varadarajan-tw varadarajan-tw changed the title Adds fastput from buffer for SFTP destination [SFTP destination] Adds support for fastput upload from buffer Nov 4, 2025
@varadarajan-tw varadarajan-tw marked this pull request as ready for review November 4, 2025 10:18
@varadarajan-tw varadarajan-tw requested a review from a team as a code owner November 4, 2025 10:18
@varadarajan-tw varadarajan-tw changed the title [SFTP destination] Adds support for fastput upload from buffer [SFTP destination] Adds support for fast upload from buffer Nov 4, 2025
})
}

const processWrites = async () => {
Contributor

Alternatively, could we have used the "async" library, which would have further simplified the implementation?

Contributor Author

Could you help me understand how an async library would help? It's only 5-6 lines of code anyway.

Contributor

My main concern is that we execute the first 64 chunks and only then move on to the next 64 chunks. If one chunk out of the 64 is stuck, we don't move ahead.

Contributor

The async library would handle this out of the box, and our code would be much simpler.
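To make the difference between the two approaches concrete, here is a hand-rolled sketch of a sliding-window limiter, where a new task starts as soon as any slot frees up instead of waiting for a whole batch of 64 to finish. It is neither the PR's code nor the API of the async library; `runWithLimit` is a hypothetical helper written for this illustration.

```typescript
// Run `tasks` with at most `limit` in flight at any time.
// Results are returned in the same order as the input tasks.
async function runWithLimit<T>(tasks: Array<() => Promise<T>>, limit: number): Promise<T[]> {
  if (limit <= 0) throw new Error('limit must be positive')
  const results: T[] = new Array<T>(tasks.length)
  let next = 0 // index of the next task that has not been started yet

  // Each worker keeps pulling the next unstarted task until none are left.
  // JavaScript runs this on a single thread, so `next++` needs no locking.
  const worker = async (): Promise<void> => {
    while (next < tasks.length) {
      const index = next++
      results[index] = await tasks[index]()
    }
  }

  const workerCount = Math.min(limit, tasks.length)
  await Promise.all(Array.from({ length: workerCount }, () => worker()))
  return results
}
```

With this shape, one slow chunk occupies a single slot while the remaining slots keep draining the queue. The upload as a whole still cannot finish until the slow chunk completes or fails.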

Contributor

  1. Can we add retries for common errors such as "Connection reset by peer"?

Contributor Author

@varadarajan-tw varadarajan-tw Nov 5, 2025

That's a good point. However, I do want to point out that Centrifuge should be the source of truth for any retries in our platform. So we should leave retries to Centrifuge by indicating that the delivery failed with a retryable status code. Retries within destination code could increase processing time and lead to timeout issues.

  1. There is no library that handles this low-level chunking detail for us. Even the sftp/ssh2 library that we use has a [fastput](https://github.com/mscdex/ssh2/blob/844f1edfc41589737671f96a4f4e76afdf46abd4/lib/protocol/SFTP.js#L508) method that does the same concurrent upload. I don't think the SFTP protocol or any library has this feature out of the box. This is consistent with how our platform works today: all deliveries are retried as a whole, even for HTTP or S3 uploads. The current approach (even with the old synchronous put method) can result in incomplete uploads to SFTP. The technique generally followed in this scenario is to first upload the file under a different extension or name and then rename it once all chunks are uploaded. This is something we plan to do in subsequent PRs.
  2. "Connection reset by peer" will be retried by Centrifuge, because right now we throw a 500 (retryable) error for any error other than a file-not-found error.
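The upload-then-rename technique mentioned in point 1 could look roughly like the sketch below. It is an illustration under stated assumptions, not the planned implementation: `RemoteFs`, `uploadViaTempName`, `MemoryFs`, and the `.part` suffix are all hypothetical.

```typescript
// Hypothetical minimal remote file system interface (an assumption for this sketch).
interface RemoteFs {
  put(path: string, data: Uint8Array): Promise<void>
  rename(from: string, to: string): Promise<void>
  remove(path: string): Promise<void>
}

// Upload under a temporary name and rename only after the upload succeeded,
// so consumers never see a partially written file at `finalPath`.
async function uploadViaTempName(fs: RemoteFs, finalPath: string, data: Uint8Array): Promise<void> {
  const tempPath = `${finalPath}.part` // assumed naming scheme
  try {
    await fs.put(tempPath, data)
  } catch (err) {
    // Best-effort cleanup; the original error is what the caller should see.
    await fs.remove(tempPath).catch(() => undefined)
    throw err
  }
  await fs.rename(tempPath, finalPath)
}

// In-memory stand-in used only to exercise the sketch.
class MemoryFs implements RemoteFs {
  readonly files = new Map<string, Uint8Array>()
  failNextPut = false

  async put(path: string, data: Uint8Array): Promise<void> {
    if (this.failNextPut) {
      this.failNextPut = false
      // Simulate a connection drop after only half of the bytes arrived.
      this.files.set(path, data.slice(0, Math.floor(data.length / 2)))
      throw new Error('Connection reset by peer')
    }
    this.files.set(path, data.slice())
  }

  async rename(from: string, to: string): Promise<void> {
    const data = this.files.get(from)
    if (data === undefined) throw new Error(`No such file: ${from}`)
    this.files.delete(from)
    this.files.set(to, data)
  }

  async remove(path: string): Promise<void> {
    this.files.delete(path)
  }
}
```

Since the error is rethrown unchanged, the caller can still map it to a retryable failure and let the platform retry the whole delivery, as described above.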

itsarijitray previously approved these changes Nov 4, 2025
@joe-ayoub-segment joe-ayoub-segment merged commit 1804c0d into main Nov 5, 2025
14 checks passed
@joe-ayoub-segment joe-ayoub-segment deleted the fastput-from-buffer branch November 5, 2025 09:39
@joe-ayoub-segment
Contributor

PR deployed
