Replace HTTPlib client with Curl #58

Open: Tmonster wants to merge 38 commits into base: main

Tmonster
Contributor

First steps toward using Curl as the HTTP client. One open question is when to call curl_global_cleanup(), since it should only be called once curl is completely done.

I also might not be initializing the options correctly, so I'm looking for feedback on that.
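One possible way to handle the cleanup question is to count the live clients and only run the global cleanup when the last one goes away. The sketch below is a minimal, hypothetical illustration of that idea and is not code from this PR: `GlobalLifetime` and `GlobalLifetimeUser` are made-up names, and the init and cleanup steps are passed in as callables so the sketch compiles without libcurl. In the real client those callables would presumably wrap `curl_global_init(CURL_GLOBAL_DEFAULT)` and `curl_global_cleanup()`.

```cpp
#include <cassert>
#include <functional>
#include <mutex>
#include <utility>

// Hypothetical helper (not part of this PR): runs `init` when the first user
// appears and `cleanup` when the last user goes away.
class GlobalLifetime {
public:
    GlobalLifetime(std::function<void()> init, std::function<void()> cleanup)
        : init_(std::move(init)), cleanup_(std::move(cleanup)) {
    }

    void Acquire() {
        std::lock_guard<std::mutex> guard(lock_);
        if (users_++ == 0) {
            init_(); // first user: perform the global initialization
        }
    }

    void Release() {
        std::lock_guard<std::mutex> guard(lock_);
        if (users_ > 0 && --users_ == 0) {
            cleanup_(); // last user gone: safe to run the global cleanup
        }
    }

private:
    std::mutex lock_;
    int users_ = 0;
    std::function<void()> init_;
    std::function<void()> cleanup_;
};

// RAII handle: each HTTP client object would own one of these.
class GlobalLifetimeUser {
public:
    explicit GlobalLifetimeUser(GlobalLifetime &lifetime) : lifetime_(lifetime) {
        lifetime_.Acquire();
    }
    ~GlobalLifetimeUser() {
        lifetime_.Release();
    }
    GlobalLifetimeUser(const GlobalLifetimeUser &) = delete;
    GlobalLifetimeUser &operator=(const GlobalLifetimeUser &) = delete;

private:
    GlobalLifetime &lifetime_;
};
```

With this shape, creating two clients triggers a single init, and the cleanup only runs after both have been destroyed. Whether a process-wide static or an extension-level hook is the better owner of the `GlobalLifetime` object is still an open design question.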

Collaborator

@samansmink samansmink left a comment

Looks great! Left some initial comments.

auto header_collector = make_uniq<HeaderCollector>();
{
// Set URL
curl_easy_setopt(*curl, CURLOPT_URL, request_info->url.c_str());
Collaborator

@samansmink samansmink May 20, 2025

There's now quite a bit of repetition in these curl configurations between the different request methods. Could we perhaps move the config they share into a function?

mode skip

query IIIIII rowsort
SELECT * from read_csv_auto('https://github.com/duckdb/duckdb/raw/9cf66f950dde0173e1a863a7659b3ecf11bf3978/data/csv/customer.csv');
Collaborator

Maybe it would be nice to use logging to ensure we get the expected HTTP headers?

For example:

pragma enable_logging('HTTP');
-- do query
select response.headers['Connection'] from duckdb_logs_parsed('HTTP');

We could verify, for example, that the __RESPONSE_STATUS__ header is properly parsed into the response headers: select response.headers['__RESPONSE_STATUS__'] from duckdb_logs_parsed('HTTP');

}

// If header starts with HTTP/... curl has followed a redirect and we have a new Header,
// so we clear all of the current header_collection
Collaborator

Not sure I understand this: the comment suggests this is going to clear the header_collection, but then the code doesn't?
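For reference, here is a minimal sketch of the behavior the code comment describes, assuming the header callback receives one raw header line per call (as libcurl's `CURLOPT_HEADERFUNCTION` callback does, status lines included). `HeaderCollectorSketch` and `CollectHeaderLine` are hypothetical stand-ins, not the PR's actual `HeaderCollector`, and storing the status line under `__RESPONSE_STATUS__` is an assumption based on the header name mentioned earlier in this review.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical stand-in for the PR's HeaderCollector; the real layout may differ.
struct HeaderCollectorSketch {
    std::map<std::string, std::string> headers;
};

// Strip leading and trailing spaces, tabs, and CR/LF.
static std::string Trim(const std::string &text) {
    const char *blanks = " \t\r\n";
    const std::size_t first = text.find_first_not_of(blanks);
    if (first == std::string::npos) {
        return "";
    }
    const std::size_t last = text.find_last_not_of(blanks);
    return text.substr(first, last - first + 1);
}

// Handle one raw header line as it would arrive in a header callback.
void CollectHeaderLine(HeaderCollectorSketch &collector, const std::string &raw_line) {
    const std::string line = Trim(raw_line);
    if (line.empty()) {
        return; // blank line: end of one header block
    }
    if (line.compare(0, 5, "HTTP/") == 0) {
        // A status line means a new response has started (for example after a
        // redirect), so the headers of the previous response are dropped here.
        collector.headers.clear();
        collector.headers["__RESPONSE_STATUS__"] = line; // assumed key name
        return;
    }
    const std::size_t colon = line.find(':');
    if (colon == std::string::npos) {
        return; // not a "Name: value" line, ignore it in this sketch
    }
    collector.headers[Trim(line.substr(0, colon))] = Trim(line.substr(colon + 1));
}
```

Feeding this a 302 response followed by a 200 response leaves only the headers of the final response in the collector. If the headers of intermediate responses should be kept, the alternative would be to push a new collection per status line instead of clearing.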

@carlopi
Collaborator

carlopi commented May 20, 2025

Just a concern of mine, to keep in mind: could we have a branch (that I assumed was main, but could be another one) that stays in sync with what we want to release on 1.3.0?

That would mean either creating a separate v1.3-ossivalis branch or, the other way around, having a curl branch.

@Tmonster
Contributor Author

that stays in sync with what we want to release on 1.3.0?

There is already a v1.3-ossivalis branch https://github.com/duckdb/duckdb-httpfs/tree/v1.3-ossivalis
Is that what you mean? I think the plan is to merge this to main once it is green and approved, then ask customers who need CURL to download from Nightly.

@carlopi
Collaborator

carlopi commented May 21, 2025

Maybe we should sync in a bit: the v1.3-ossivalis branch here is not in sync with https://github.com/duckdb/duckdb/blob/v1.3-ossivalis/.github/config/out_of_tree_extensions.cmake, and I assume it's not compatible with the v1.3-ossivalis branch of duckdb/duckdb.

Happy to have v1.3-ossivalis here track what gets released, but then it needs to be bumped.

@Tmonster
Contributor Author

@carlopi I have a PR here (#63) to update the v1.3 branch to current main. We can then use that for bug fixes after this merges.
