Enable Support for Custom Session+Proxy Configurations #644
Description
This PR introduces the ability for users to pass custom `requests.Session` objects to the `SharingClient` in the Delta Sharing Python library. This enhancement allows users to configure more complex session settings that cannot be achieved using environment variables alone, such as authenticated proxies, custom headers, SSL configurations, timeout settings, and other session-related configurations. This provides users with greater flexibility when working in complex network environments or when specific session configurations are required.
This PR also updates the Delta Sharing File System in Spark to support proxy configurations. This means users can now define proxy settings, including authenticated proxies, custom headers, SSL configurations, and timeout settings, through the Spark configuration.
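As a rough illustration of the kind of session this change is meant to accept, the sketch below builds a `requests.Session` with an authenticated proxy, a custom header, and an optional CA bundle. The helper name `build_custom_session`, the proxy URL, and the header are hypothetical placeholders, not part of the PR. The commented-out lines show how such a session might be handed to `SharingClient` through the `session` parameter described in this PR.

```python
import requests


def build_custom_session(proxy_url, extra_headers=None, ca_bundle=None):
    """Return a requests.Session configured for a proxied network.

    Hypothetical helper for illustration only; it is not part of delta-sharing.
    """
    session = requests.Session()
    # Route both HTTP and HTTPS traffic through the proxy.
    # Proxy credentials can be embedded in the URL (user:password@host:port).
    session.proxies.update({"http": proxy_url, "https": proxy_url})
    if extra_headers:
        # These headers are sent with every request made through the session.
        session.headers.update(extra_headers)
    if ca_bundle is not None:
        # Path to a custom CA bundle used for TLS verification.
        session.verify = ca_bundle
    return session


# Placeholder values; substitute the proxy and headers for your environment.
session = build_custom_session(
    "http://user:password@proxy.example.com:8080",
    extra_headers={"X-Custom-Header": "example-value"},
)

# Passing the session to the client requires the delta-sharing package
# and a real profile file, so it is left commented out here:
# import delta_sharing
# client = delta_sharing.SharingClient("<profile-file-path>", session=session)
```

Note that `requests` applies timeouts per request rather than per session, so a session-wide timeout typically needs a custom transport adapter or a `Session` subclass.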
Key Changes
1. `SharingClient` Class Update (Python)
The `SharingClient` class now accepts an optional `session` parameter in its constructor. This allows users to pass a custom `requests.Session` object when creating a `SharingClient`. If no session is provided, a new `requests.Session` will be created as before.
2. `DataSharingRestClient` Class Update (Python)
The `DataSharingRestClient` class now accepts an optional `session` parameter. The custom session is passed from `SharingClient` to `DataSharingRestClient`, ensuring all HTTP requests utilize the custom session.
The `__auth_session` method uses the provided session or creates a new one if none is provided.
3. High-Level Function Updates (Python)
The `load_as_pandas` and `load_table_changes_as_pandas` functions now accept an optional `sharing_client` parameter. If a `sharing_client` is provided, these functions will use its `rest_client` for making HTTP requests, ensuring the custom session is used.
4. Proxy Configuration Support (Spark)
The `DeltaSharingFileSystem` class now supports proxy configurations through its configuration settings. Users can define proxy hosts, ports, and other related settings through properties in the Spark configuration.
5. `ConfUtils` Updates (Spark)
The `ConfUtils` utility object has been updated to handle the retrieval and validation of proxy-related configurations, including custom headers and SSL configurations.
6. HTTP Client Configuration (Spark)
The `DeltaSharingFileSystem.createHttpClient` method has been enhanced to configure the HTTP client with proxy settings, custom headers, SSL configurations, and timeout settings.
Example Usage
Python
The updated `SharingClient` can be used with a custom `requests.Session` to configure an authenticated proxy, custom headers, and other settings.
Spark