
Conversation

@0marperez (Contributor) commented Oct 17, 2025

Issue #

N/A

Description of changes

  • Project setup
  • Publishing config
  • Transfer manager client
  • Business metric
  • Transfer interceptors
  • Upload file operation

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@github-actions

A new generated diff is ready to view.

  • No codegen difference in the AWS SDK

@0marperez 0marperez added the no-changelog Indicates that a changelog entry isn't required for a pull request. Use sparingly. label Oct 17, 2025

@0marperez 0marperez marked this pull request as ready for review October 17, 2025 15:14
@0marperez 0marperez requested a review from a team as a code owner October 17, 2025 15:14
@lauzadis (Member) left a comment


Nice start

public enum class AwsBusinessMetric(public override val identifier: String) : BusinessMetric {
S3_EXPRESS_BUCKET("J"),
DDB_MAPPER("d"),
S3_TRANSFER("G"),
Member

There are two new features, S3_TRANSFER_UPLOAD_DIRECTORY and S3_TRANSFER_DOWNLOAD_DIRECTORY, that we should add here.
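A minimal sketch of the suggested additions, extending the quoted enum. The single-character identifiers for the two new entries are placeholders; the real codes come from the SDK's business-metrics registry and are not shown in this thread:

```kotlin
public enum class AwsBusinessMetric(public override val identifier: String) : BusinessMetric {
    S3_EXPRESS_BUCKET("J"),
    DDB_MAPPER("d"),
    S3_TRANSFER("G"),
    S3_TRANSFER_UPLOAD_DIRECTORY("?"),   // placeholder identifier
    S3_TRANSFER_DOWNLOAD_DIRECTORY("?"), // placeholder identifier
}
```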

subprojects {
group = "aws.sdk.kotlin"
version = hllPreviewVersion
version = if (name == "s3-transfer-manager") sdkVersion else hllPreviewVersion
Member

You can override the version in the subproject rather than hacking it in here

Member

Or better, override / set hllPreviewVersion in the submodules which need it, and use sdkVersion as the default
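A hedged sketch of that suggestion, assuming `sdkVersion` and `hllPreviewVersion` are properties visible to the build scripts involved:

```kotlin
// root build.gradle.kts: sdkVersion becomes the default for every subproject
subprojects {
    group = "aws.sdk.kotlin"
    version = sdkVersion
}

// build.gradle.kts of a module still in preview: opt back in to the preview version
version = hllPreviewVersion
```

This keeps the special-casing local to the modules that need it instead of hardcoding module names in the root build script.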

*/

description = "S3 Transfer Manager for the AWS SDK for Kotlin"
extra["displayName"] = "AWS :: SDK :: Kotlin :: HLL :: S3TransferManager"
Member

You can use spaces in the display name, S3 Transfer Manager

commonMain {
dependencies {
implementation(project(":aws-runtime:aws-http"))
implementation(libs.s3)
Member

Instead of depending on an already-published version of S3, we should require S3 to be bootstrapped locally for S3TM to build, like we did for the DDB Mapper.

Contributor Author

This was an intentional choice to help prevent changes to S3 from breaking the S3 TM. We don't need to have the latest APIs from S3.

@0marperez (Contributor Author) commented Oct 20, 2025

I don't think it would happen often, but I think either approach is fine. The current one is just slightly safer.

Contributor

Doesn't that mean that S3TM will always depend on the n-1 S3 client? So S3TM-1.2.3 uses S3-1.2.2? That doesn't seem right.

We want to catch changes to S3 which break S3TM and fix them as soon as possible. All of our modules are a comprehensive platform and should use the same, unified set of dependency versions.

S3TransferManager.Companion {
client = s3Client
}.uploadFile {
bucket = "aoperez"
Member

These tests should either be made generic or not committed yet

Member

Oh I see they have an @Ignore annotation, that seems fine too

* Builds a low-level S3 upload part request from a high-level upload file request
* and data from the S3 Transfer Manager.
*/
internal fun buildUploadPartRequest(
Member

suggestion: This and the function below might be better as extension functions
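A hedged sketch of the extension-function shape. The parameter list and the `UploadPartRequest` DSL builder usage here are assumptions for illustration, not the PR's actual signature:

```kotlin
// Before: internal fun buildUploadPartRequest(request: UploadFileRequest, ...)
// After: an extension on the high-level request type
internal fun UploadFileRequest.toUploadPartRequest(
    mpuUploadId: String, // hypothetical parameters for illustration
    part: Int,
): UploadPartRequest = UploadPartRequest {
    bucket = this@toUploadPartRequest.bucket
    key = this@toUploadPartRequest.key
    uploadId = mpuUploadId
    partNumber = part
}
```

Call sites then read naturally as `uploadFileRequest.toUploadPartRequest(...)`.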

/**
* An interceptor that emits the S3 Transfer Manager business metric
*/
internal object BusinessMetricInterceptor : HttpInterceptor {
Member

Naming: add "S3" or "S3 Transfer Manager" somewhere in the name for better IDE search discoverability.


internal fun build(): S3TransferManager =
S3TransferManager(
client = client?.withConfig { interceptors += BusinessMetricInterceptor } ?: error("client must be set"),
Member

correctness: the underlying client will emit a business metric for every request, whether the S3TM was used or not. Is that what we want?

Contributor Author

withConfig makes a copy of the client with some overrides as far as I understand
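A small sketch of why this resolves the concern: since `withConfig` returns a copied client, the interceptor is scoped to the copy the transfer manager holds, not to the client the user passed in:

```kotlin
// Only tmClient carries the S3TM business-metric interceptor;
// the user's original s3Client is left untouched
val tmClient = s3Client.withConfig { interceptors += BusinessMetricInterceptor }
```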

Member

Oh right

public suspend fun uploadFile(uploadFileRequest: UploadFileRequest): Deferred<UploadFileResponse> = coroutineScope {
val multiPartUpload = uploadFileRequest.contentLength >= multipartUploadThreshold
val uploadedParts = mutableListOf<CompletedPart>()
var mpuUploadId = "null"
Member

this feels better as a lateinit var
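A minimal sketch of the suggestion: `lateinit` removes the `"null"` sentinel string, and an accidental read before assignment fails fast with `UninitializedPropertyAccessException` instead of propagating a bogus value:

```kotlin
lateinit var mpuUploadId: String
// ... assigned later, only on the multipart path:
// mpuUploadId = createResponse.uploadId ?: error("missing upload ID")
```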

operationHook(TransferInitiated) {
if (multiPartUpload) {
context.response = client.createMultipartUpload(context.request as CreateMultipartUploadRequest)
mpuUploadId = (context.response as CreateMultipartUploadResponse).uploadId ?: throw Exception("Missing upload id in create multipart upload response")
Member

Let's throw a better Exception here and elsewhere
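One possible shape for this, using the stdlib `error` helper (which throws `IllegalStateException` with a descriptive message) rather than a bare `Exception`:

```kotlin
mpuUploadId = (context.response as CreateMultipartUploadResponse).uploadId
    ?: error("Missing upload ID in CreateMultipartUpload response")
```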

Contributor

This isn't a scenario the user can cause themselves, is it? This could only happen because of a bug in our S3TM implementation it seems. If so, I'm generally fine with non-specific errors since we cannot expect users to catch them and act on them.

public val checksumCrc32: String?,
public val checksumCrc32C: String?,
public val checksumCrc64Nvme: String?,
public val checksumSha1: String?,
Contributor

Correctness: The builder shouldn't be passing all of its values to the class's constructor. The constructor should accept a builder object and set its own fields that way. This results in less boilerplate:

public class UploadFileRequest(builder: Builder) {
    public val acl = builder.acl
    public val body = builder.body
    public val bucket = builder.bucket
    // ...etc...

    public companion object {
        public operator fun invoke(block: Builder.() -> Unit): UploadFileRequest =
            Builder().apply(block).build()
    }

    public class Builder {
        public var acl: ObjectCannedAcl? = null
        public var body: ByteStream? = null
        public var bucket: String? = null
        // ...etc...

        internal fun build(): UploadFileRequest = UploadFileRequest(this)
    }
}

We follow this pattern in our codegenned shapes and many of our handwritten ones.

Comment applies to other types in this PR.

public var tagging: String? = null
public var websiteRedirectLocation: String? = null

public fun build(): UploadFileRequest =
Contributor

Nit: Generally our build methods are internal because we don't want users to think they have to invoke it manually inside of a DSL block:

val req = UploadFileRequest {
    bucket = "foo"
    key = "bar"
    build() // <-- this is unnecessary
}

import aws.smithy.kotlin.runtime.content.ByteStream
import aws.smithy.kotlin.runtime.time.Instant

public class UploadFileRequest private constructor(
Contributor

Question: These classes, their builders, their converters, etc. all seem like a lot of repeated boilerplate. Why aren't we code-generating these based on the list of fields from the spec?

Comment on lines 23 to 32
/**
* Determines the actual part size to use for a multipart S3 upload.
*
* This function calculates the part size based on the total size
* of the file and the requested part size. If the requested part size is
* too small to allow the upload to fit within S3's 10,000-part limit, the
* part size will be automatically increased so that exactly 10,000 parts
* are uploaded.
*/
internal fun resolvePartSize(uploadFileRequest: UploadFileRequest, tm: S3TransferManager, logger: Logger): Long {
Contributor

Nit: This function doesn't feel like it belongs in this file. Nothing else in UploadFile.kt uses it, only S3TransferManager.kt.

val targetNumberOfParts = uploadFileRequest.contentLength / tm.targePartSize
return if (targetNumberOfParts > MAX_NUMBER_PARTS) {
ceilDiv(uploadFileRequest.contentLength, MAX_NUMBER_PARTS).also {
logger.debug { "Target part size is too small to meet the 10,000 S3 part limit. Increasing part size to $it" }
Contributor

Nit: 10000 is already a constant in this file. Reuse that constant in this string rather than duplicating it. This'll reduce the likelihood that the values get out of sync.
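A minimal sketch, assuming `MAX_NUMBER_PARTS` is the existing constant the comment refers to:

```kotlin
logger.debug { "Target part size is too small to meet the $MAX_NUMBER_PARTS S3 part limit. Increasing part size to $it" }
```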

Comment on lines +39 to +42
/**
* High level utility for managing transfers to Amazon S3.
*/
public class S3TransferManager private constructor(
Contributor

Style: This class has too much logic in it, which won't be sustainable once we add more operations and variants. We should refactor it, possibly so that each operation lives in its own file and maybe is even represented by its own class.

* all parts to fit within S3's limit of 10,000 parts, the part size will be
* automatically increased so that exactly 10,000 parts are uploaded.
*/
public suspend fun uploadFile(uploadFileRequest: UploadFileRequest): Deferred<UploadFileResponse> = coroutineScope {
Contributor

Style: Method is too long and should be refactored.

* all parts to fit within S3's limit of 10,000 parts, the part size will be
* automatically increased so that exactly 10,000 parts are uploaded.
*/
public suspend fun uploadFile(uploadFileRequest: UploadFileRequest): Deferred<UploadFileResponse> = coroutineScope {
Contributor

Correctness: This method shouldn't return Deferred<UploadFileResponse>, it should return UploadFileResponse just like our low-level APIs do. If users want this to happen asynchronously, they can wrap it in async { } themselves.

Note that coroutineScope does not return until all the work inside is completed (including child coroutines) so the Deferred value would already be filled anyway.
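A hedged sketch of the suggested signature, with the implementation elided and the `async` opt-in shown as the caller's responsibility:

```kotlin
// Return the response directly, like the low-level APIs do
public suspend fun uploadFile(uploadFileRequest: UploadFileRequest): UploadFileResponse =
    coroutineScope {
        // ... upload parts concurrently, then build the response ...
        TODO("implementation elided")
    }

// Callers who want asynchrony wrap it themselves:
// val deferred: Deferred<UploadFileResponse> = async { tm.uploadFile(request) }
```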

Comment on lines +26 to +50
public interface TransferInterceptor {
// Transfer initialization hooks
public fun readBeforeTransferInitiated(context: TransferContext) {}
public fun modifyBeforeTransferInitiated(context: TransferContext): TransferContext = context
public fun readAfterTransferInitiated(context: TransferContext) {}
public fun modifyAfterTransferInitiated(context: TransferContext): TransferContext = context

// Byte transferring hooks
public fun readBeforeBytesTransferred(context: TransferContext) {}
public fun modifyBeforeBytesTransferred(context: TransferContext): TransferContext = context
public fun readAfterBytesTransferred(context: TransferContext) {}
public fun modifyAfterBytesTransferred(context: TransferContext): TransferContext = context

// File transfer hooks
public fun readBeforeFileTransferred(context: TransferContext) {}
public fun modifyBeforeFileTransferred(context: TransferContext): TransferContext = context
public fun readAfterFileTransferred(context: TransferContext) {}
public fun modifyAfterFileTransferred(context: TransferContext): TransferContext = context

// Transfer completion hooks
public fun readBeforeTransferCompleted(context: TransferContext) {}
public fun modifyBeforeTransferCompleted(context: TransferContext): TransferContext = context
public fun readAfterTransferCompleted(context: TransferContext) {}
public fun modifyAfterTransferCompleted(context: TransferContext): TransferContext = context
}
Contributor

Question: Why do all of these hooks have the same TransferContext parameter? Our other interceptors have unique context parameters per hook because we gradually accumulate more context over the lifetime of the request. For instance, readBeforeFileTransferred can never have an upload ID but readAfterFileTransferred can.
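A hypothetical sketch of per-hook context types in the style this comment describes; none of these names are from the PR:

```kotlin
// Context available before any transfer work has started
public interface TransferContext {
    public val request: UploadFileRequest
}

// Context once a multipart upload exists; the upload ID is now known
public interface TransferContextWithUploadId : TransferContext {
    public val uploadId: String
}

public interface TransferInterceptor {
    public fun readBeforeFileTransferred(context: TransferContext) {}
    public fun readAfterFileTransferred(context: TransferContextWithUploadId) {}
    // ... remaining hooks follow the same accumulation pattern ...
}
```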

Comment on lines 18 to 24
jvmTest {
dependencies {
implementation(libs.smithy.kotlin.test.jvm)
implementation(libs.smithy.kotlin.testing.jvm)
implementation(libs.smithy.kotlin.aws.signing.common)
}
}
Contributor

Correctness: We should not have JVM-only tests in most cases. S3TM must work across all targets and our test dependencies should be available on all targets.



Labels

no-changelog Indicates that a changelog entry isn't required for a pull request. Use sparingly.


3 participants