Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remote IO: S3 support #479
Remote IO: S3 support #479
Changes from 10 commits
48110ee
ae033ed
351169a
8e57584
488c060
70dc29c
5f48899
66460f8
465bf17
84007c1
e553f98
6104106
8595c01
b50835a
c6a416f
f81de01
718f291
34ffb03
de151c0
9489bef
fc37192
52f0595
376c918
3e71a1c
7136a0c
938151a
0a90950
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we only use
https
please and rejecthttp
? Or do you want that for testing?It looks like yes. I would like this interface to be "safe" by default, and so I would like the user to have to explicitly opt in to using an unencrypted link, given that we send secrets over the wire.
Also, how does
parse_s3_url
help directly? That returns astd::pair
not astd::string
. Should one useurl_from_bucket_and_object
on the result?Should we error-check and raise if the URL doesn't start with
https://
orhttp://
?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We also wants http for high performance access to public data.
NB: only a time specific signature are send over the wire, curl uses
aws_secret_access_key
to generate the AWS authentication signature V4. Of cause, the payload is send unencrypted.I think it is reasonable to use https by default and accept http if the user overwrite the endpoint url explicitly?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be nice to have std::format in C++20...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a lot of logic repeated in these three functions and it would benefit from adding a single helper with most of the logic that you could just pass the endpoint created since the creation of
ep
is really the only difference.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried, but I cannot find a way to implement a
from_unique_ptr
factory function without it getting very complicated.Basically, I want something like this:
But I cannot find a nice way to call
from_unique_ptr()
with a derived class instances likeunique_ptr[cpp_HttpEndpoint]
. Any suggestions?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps like this. I am not sure it is much cleaner:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The fundamental problem is that Cython does not natively understand smart pointer polymorphism in the same way that it understands raw pointer polymorphism, so you cannot pass a unique_ptr to a child class where it expects you to pass a unique_ptr to a base class or vice versa. It's a clear sign of the C rather than C++ roots of the project. Here's a patch that builds for me locally and is a bit cleaner IMHO than what Lawrence proposed, with the caveat that you temporarily have a raw pointer that you construct a unique_ptr from rather than using make_unique. Up to you two if you like it or not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could this just be a templated C++ factory function that we
extern
to Cython and call here?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could do part of it in C++ but RemoteFile is a Python class so you'd still need a wrapper around the C++ function to handle instantiation and assignment to attributes of that object.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Think that is fine. Basically what we would be saying is here is a factory function that takes some arguments and returns a
std::unique_ptr<cpp_RemoteHandle>
, which we justmove
toret._handle
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup that would be fine. Probably just use inline C++ to define that in this file itself.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using a C++ cast function as @jakirkham suggest: 6104106
It is a bit more verbose than @vyasr's raw pointer approach, but it enforces the pointer uniqueness.