Commit 09919c1

#151: Add option resolve_hostnames (#152)
Co-authored-by: Nicola Coretti <[email protected]>
1 parent ff9b7f2 commit 09919c1

12 files changed: +303 -30 lines changed

CHANGELOG.md

Lines changed: 6 additions & 3 deletions
@@ -2,7 +2,10 @@
 
 ## [Unreleased]
 
+## [0.27.0] - 2024-09-09
+
 - Relocked dependencies (Internal)
+- [#151](https://github.com/exasol/pyexasol/issues/151): Added option to deactivate hostname resolution
 
 ## [0.26.0] - 2024-07-04
 
@@ -12,9 +15,9 @@
 
 This driver facade should only be used if one is certain that using the dbapi2 is the right solution for their scenario, taking all implications into account. For more details on why and who should avoid using dbapi2, please refer to the [DBAPI2 compatibility section](/docs/DBAPI_COMPAT.md) in our documentation.
 
-- Droped support for python 3.7
-- Droped support for Exasol 6.x
-- Droped support for Exasol 7.0.x
+- Dropped support for python 3.7
+- Dropped support for Exasol 6.x
+- Dropped support for Exasol 7.0.x
 - Relocked dependencies (Internal)
 - Switched packaging and project workflow to poetry (internal)
 

README.md

Lines changed: 1 addition & 1 deletion
@@ -41,6 +41,7 @@ PyEXASOL provides API to read & write multiple data streams in parallel using se
 - [DB-API 2.0 compatibility](/docs/DBAPI_COMPAT.md)
 - [Optional dependencies](/docs/DEPENDENCIES.md)
 - [Changelog](/CHANGELOG.md)
+- [Developer Guide](/docs/DEVELOPER_GUIDE.md)
 
 
 ## PyEXASOL main concepts
@@ -116,4 +117,3 @@ Enjoy!
 
 ## Maintained by
 [Exasol](https://www.exasol.com) 2023 — Today
-

docs/DEVELOPER_GUIDE.md

Lines changed: 27 additions & 0 deletions
@@ -0,0 +1,27 @@
+# Developer Guide
+
+This guide explains how to develop `pyexasol` and run tests.
+
+## Initial Setup
+
+Create a virtual environment and install dependencies:
+
+```sh
+poetry install --all-extras
+```
+
+Run the following to enter the virtual environment:
+
+```sh
+poetry shell
+```
+
+## Running Integration Tests
+
+To run integration tests first start a local database:
+
+```sh
+nox -s db-start
+```
+
+Then you can run tests as usual with `pytest`.

docs/REFERENCE.md

Lines changed: 7 additions & 0 deletions
@@ -114,6 +114,7 @@ Open new connection and return `ExaConnection` object.
 | `udf_output_connect_address` | `('udf_host', 8580)` | Specific SCRIPT_OUTPUT_ADDRESS value to connect from Exasol to UDF script output server (Default: inherited from TCP server) |
 | `udf_output_dir` | `/tmp` | Path or path-like object pointing to directory for script output log files (Default: `tempfile.gettempdir()`) |
 | `http_proxy` | `http://myproxy.com:3128` | HTTP proxy string in Linux [`http_proxy`](https://www.shellhacks.com/linux-proxy-server-settings-set-proxy-command-line/) format (Default: `None`) |
+| `resolve_hostnames` | `False` | Explicitly resolve host names to IP addresses before connecting. Deactivating this will let the operating system resolve the host name (Default: `True`) |
 | `client_name` | `MyClient` | Custom name of client application displayed in Exasol sessions tables (Default: `PyEXASOL`) |
 | `client_version` | `1.0.0` | Custom version of client application (Default: `pyexasol.__version__`) |
 | `client_os_username` | `john` | Custom OS username displayed in Exasol sessions table (Default: `getpass.getuser()`) |
@@ -122,6 +123,12 @@ Open new connection and return `ExaConnection` object.
 | `access_token` | `...` | OpenID access token to use for the login process |
 | `refresh_token` | `...` | OpenID refresh token to use for the login process |
 
+### Host Name Resolution
+
+By default pyexasol resolves host names to IP addresses, randomly shuffles the IP addresses and tries to connect until connection succeeds. See the [design documentation](/docs/DESIGN.md#automatic-resolution-and-randomization-of-connection-addresses) for details.
+
+If host name resolution causes problems, you can deactivate it by specifying argument `resolve_hostnames=False`. This may be required when connecting through a proxy that allows connections only to defined host names. In all other cases we recommend to omit the argument.
+
 ## connect_local_config()
 Open new connection and return `ExaConnection` object using local .ini file (usually `~/.pyexasol.ini`) to read credentials and connection parameters. Please read [local config](/docs/LOCAL_CONFIG.md) page for more details.
 
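The `resolve_hostnames` option documented in the `docs/REFERENCE.md` change might be used roughly as follows. This is a minimal sketch, not code from the commit: the helper names `build_connect_kwargs` and `open_connection`, the DSN, and the credentials are all hypothetical, and actually opening a connection assumes pyexasol 0.27.0 or later and a reachable database. `pyexasol` is imported lazily so the argument-building part runs without it.

```python
from typing import Any, Dict


def build_connect_kwargs(dsn: str, user: str, password: str,
                         behind_hostname_proxy: bool = False) -> Dict[str, Any]:
    """Build keyword arguments for pyexasol.connect() (hypothetical helper)."""
    kwargs: Dict[str, Any] = {"dsn": dsn, "user": user, "password": password}
    if behind_hostname_proxy:
        # Override the default (True) only when the proxy requires host names.
        kwargs["resolve_hostnames"] = False
    return kwargs


def open_connection(**kwargs: Any):
    """Open a connection; needs pyexasol >= 0.27.0 and a running database."""
    import pyexasol  # lazy import keeps the helper above dependency-free
    return pyexasol.connect(**kwargs)


# Placeholder values, not a real cluster.
kwargs = build_connect_kwargs("exasol.example.com:8563", "sys", "secret",
                              behind_hostname_proxy=True)
print(kwargs["resolve_hostnames"])
```

Leaving the argument out in the default case follows the recommendation in the documentation change: the option is only set when a proxy restricts connections to defined host names.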

pyexasol/connection.py

Lines changed: 55 additions & 24 deletions
@@ -16,6 +16,10 @@
 
 from . import callback as cb
 
+from typing import (
+    NamedTuple,
+    Optional
+)
 from .exceptions import *
 from .statement import ExaStatement
 from .logger import ExaLogger
@@ -27,6 +31,13 @@
 from .version import __version__
 
 
+class Host(NamedTuple):
+    """This represents a resolved host name with its IP address and port number."""
+    hostname: str
+    ip_address: Optional[str]
+    port: int
+    fingerprint: Optional[str]
+
 class ExaConnection(object):
     cls_statement = ExaStatement
     cls_formatter = ExaFormatter
@@ -69,6 +80,7 @@ def __init__(self
                  , udf_output_connect_address=None
                  , udf_output_dir=None
                  , http_proxy=None
+                 , resolve_hostnames=True
                  , client_name=None
                  , client_version=None
                  , client_os_username=None
@@ -104,6 +116,7 @@ def __init__(self
         :param udf_output_connect_address: Specific SCRIPT_OUTPUT_ADDRESS value to connect from Exasol to UDF script output server (default: inherited from TCP server)
         :param udf_output_dir: Directory to store captured UDF script output logs, split by <session_id>_<statement_id>/<vm_num>
         :param http_proxy: HTTP proxy string in Linux http_proxy format (default: None)
+        :param resolve_hostnames: Explicitly resolve host names to IP addresses before connecting. Deactivating this will let the operating system resolve the host name (default: True)
         :param client_name: Custom name of client application displayed in Exasol sessions tables (Default: PyEXASOL)
         :param client_version: Custom version of client application (Default: pyexasol.__version__)
         :param client_os_username: Custom OS username displayed in Exasol sessions table (Default: getpass.getuser())
@@ -144,6 +157,7 @@ def __init__(self
             'udf_output_dir': udf_output_dir,
 
             'http_proxy': http_proxy,
+            'resolve_hostnames': resolve_hostnames,
 
             'client_name': client_name,
             'client_version': client_version,
@@ -652,30 +666,17 @@ def _init_ws(self):
         """
         dsn_items = self._process_dsn(self.options['dsn'])
         failed_attempts = 0
-
-        ws_prefix = 'wss://' if self.options['encryption'] else 'ws://'
-        ws_options = self._get_ws_options()
-
         for hostname, ipaddr, port, fingerprint in dsn_items:
-            self.logger.debug(f"Connection attempt [{ipaddr}:{port}]")
-
-            # Use correct hostname matching IP address for each connection attempt
-            if self.options['encryption']:
-                ws_options['sslopt']['server_hostname'] = hostname
-
             try:
-                self._ws = websocket.create_connection(f'{ws_prefix}{ipaddr}:{port}', **ws_options)
+                self._ws = self._create_websocket_connection(hostname, ipaddr, port)
             except Exception as e:
-                self.logger.debug(f'Failed to connect [{ipaddr}:{port}]: {e}')
-
                 failed_attempts += 1
-
                 if failed_attempts == len(dsn_items):
-                    raise ExaConnectionFailedError(self, 'Could not connect to Exasol: ' + str(e))
+                    raise ExaConnectionFailedError(self, 'Could not connect to Exasol: ' + str(e)) from e
             else:
                 self._ws.settimeout(self.options['socket_timeout'])
 
-                self.ws_ipaddr = ipaddr
+                self.ws_ipaddr = ipaddr or hostname
                 self.ws_port = port
 
                 self._ws_send = self._ws.send
@@ -686,6 +687,32 @@ def _init_ws(self):
 
                 return
 
+    def _create_websocket_connection(self, hostname:str, ipaddr:str, port:int) -> websocket.WebSocket:
+        ws_options = self._get_ws_options()
+        # Use correct hostname matching IP address for each connection attempt
+        if self.options['encryption'] and self.options["resolve_hostnames"]:
+            ws_options['sslopt']['server_hostname'] = hostname
+
+        connection_string = self._get_websocket_connection_string(hostname, ipaddr, port)
+        self.logger.debug(f"Connection attempt {connection_string}")
+        try:
+            return websocket.create_connection(connection_string, **ws_options)
+        except Exception as e:
+            self.logger.debug(f'Failed to connect [{connection_string}]: {e}')
+            raise e
+
+    def _get_websocket_connection_string(self, hostname:str, ipaddr:Optional[str], port:int) -> str:
+        host = hostname
+        if self.options["resolve_hostnames"]:
+            if ipaddr is None:
+                raise ValueError("IP address was not resolved")
+            host = ipaddr
+        if self.options["encryption"]:
+            return f"wss://{host}:{port}"
+        else:
+            return f"ws://{host}:{port}"
+
+
     def _get_ws_options(self):
         options = {
             'timeout': self.options['connection_timeout'],
@@ -729,13 +756,13 @@ def _get_login_attributes(self):
 
         return attributes
 
-    def _process_dsn(self, dsn):
+    def _process_dsn(self, dsn: str) -> list[Host]:
         """
         Parse DSN, expand ranges and resolve IP addresses for all hostnames
         Return list of (hostname, ip_address, port) tuples in random order
         Randomness is required to guarantee proper distribution of workload across all nodes
         """
-        if len(dsn.strip()) == 0:
+        if dsn is None or len(dsn.strip()) == 0:
             raise ExaConnectionDsnError(self, 'Connection string is empty')
 
         current_port = constant.DEFAULT_PORT
@@ -787,24 +814,28 @@ def _process_dsn(self, dsn):
                     result.extend(self._resolve_hostname(hostname, current_port, current_fingerprint))
             # Just a single hostname or single IP address
             else:
-                result.extend(self._resolve_hostname(m.group('hostname_prefix'), current_port, current_fingerprint))
+                hostname = m.group('hostname_prefix')
+                if self.options["resolve_hostnames"]:
+                    result.extend(self._resolve_hostname(hostname, current_port, current_fingerprint))
+                else:
+                    result.append(Host(hostname, None, current_port, current_fingerprint))
 
         random.shuffle(result)
 
         return result
 
-    def _resolve_hostname(self, hostname, port, fingerprint):
+    def _resolve_hostname(self, hostname: str, port: int, fingerprint: Optional[str]) -> list[Host]:
         """
         Resolve all IP addresses for hostname and add port
         It also implicitly checks that all hostnames mentioned in DSN can be resolved
         """
         try:
-            hostname, alias_list, ipaddr_list = socket.gethostbyname_ex(hostname)
-        except OSError:
+            hostname, _, ipaddr_list = socket.gethostbyname_ex(hostname)
+        except OSError as e:
             raise ExaConnectionDsnError(self, f'Could not resolve IP address of hostname [{hostname}] '
-                                              f'derived from connection string')
+                                              f'derived from connection string') from e
 
-        return [(hostname, ipaddr, port, fingerprint) for ipaddr in ipaddr_list]
+        return [Host(hostname, ipaddr, port, fingerprint) for ipaddr in ipaddr_list]
 
     def _validate_fingerprint(self, provided_fingerprint):
         server_fingerprint = hashlib.sha256(self._ws.sock.getpeercert(True)).hexdigest().upper()
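The interplay of `Host`, `_resolve_hostname` and `_get_websocket_connection_string` in the diff above can be approximated outside the class. The following is a simplified sketch rather than the library's implementation: `expand_host`, `connection_string`, `fake_resolver`, and the example addresses are hypothetical names introduced for illustration, fingerprints are left as `None`, and the DNS lookup is injected so the example runs offline. Note that in the hunk shown, the resolution bypass appears only in the single-hostname branch of `_process_dsn`; the range branch still calls `_resolve_hostname`.

```python
import random
import socket
from typing import Callable, List, NamedTuple, Optional, Tuple

# Same shape as socket.gethostbyname_ex: name -> (hostname, aliases, ip_addresses)
Resolver = Callable[[str], Tuple[str, List[str], List[str]]]


class Host(NamedTuple):
    hostname: str
    ip_address: Optional[str]
    port: int
    fingerprint: Optional[str]


def expand_host(hostname: str, port: int, resolve_hostnames: bool = True,
                resolver: Resolver = socket.gethostbyname_ex) -> List[Host]:
    """Return one Host per resolved IP address, or a single unresolved Host."""
    if not resolve_hostnames:
        return [Host(hostname, None, port, None)]
    canonical, _, ip_addresses = resolver(hostname)
    hosts = [Host(canonical, ip, port, None) for ip in ip_addresses]
    random.shuffle(hosts)  # spread connection attempts across nodes
    return hosts


def connection_string(host: Host, encryption: bool, resolve_hostnames: bool) -> str:
    """Pick the IP address when resolution is on, otherwise the host name."""
    target = host.hostname
    if resolve_hostnames:
        if host.ip_address is None:
            raise ValueError("IP address was not resolved")
        target = host.ip_address
    scheme = "wss" if encryption else "ws"
    return f"{scheme}://{target}:{host.port}"


def fake_resolver(name: str) -> Tuple[str, List[str], List[str]]:
    """Stand-in for DNS so the example runs offline (made-up addresses)."""
    return name, [], ["10.0.0.1", "10.0.0.2"]


resolved = expand_host("exasol.example.com", 8563, resolver=fake_resolver)
unresolved = expand_host("exasol.example.com", 8563, resolve_hostnames=False)
print(sorted(connection_string(h, True, True) for h in resolved))
print(connection_string(unresolved[0], True, False))
```

With resolution disabled, the websocket URL carries the host name itself, so the operating system (or a proxy) performs the lookup at connect time.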

pyexasol/version.py

Lines changed: 1 addition & 1 deletion
@@ -1 +1 @@
-__version__ = '0.26.0'
+__version__ = '0.27.0'

pyproject.toml

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 [tool.poetry]
 name = "pyexasol"
-version = "0.26.0"
+version = "0.27.0"
 license = "MIT"
 readme = "README.md"
 description = "Exasol python driver with extra features"
