@@ -10,23 +10,17 @@ The Browser domain represents the highest level of Pydoll's automation hierarchy
 The Browser domain sits at the intersection of process management, protocol communication, and resource coordination. It orchestrates multiple specialized components to provide a unified interface for browser automation:
 - **Memory efficiency**: Prevents duplicate Tab instances for same target
 - **Event routing**: Ensures events route to correct Tab instance

-### Proxy and Context Authentication
+### Proxy Authentication Architecture
+
+Pydoll implements **automatic proxy authentication** via the Fetch domain to avoid exposing credentials in CDP commands. The implementation uses **two distinct mechanisms** depending on proxy scope:

-Pydoll implements **automatic proxy authentication** via the Fetch domain to avoid exposing credentials in CDP commands:
|**Scope**| All tabs in default context | Only tabs in that context |
+|**Efficiency**| One listener for all tabs | One listener per tab |
+|**Isolation**| No context separation | Each context has different credentials |
+
+**Design rationale for tab-level auth:**

-1. Enables Fetch domain on the Tab
-2. Registers `Fetch.requestPaused` handler (continues normal requests)
-3. Registers `Fetch.authRequired` handler (provides credentials, then disables Fetch)
+- **Context isolation**: Each context can have a **different proxy** with **different credentials**
+- **CDP limitation**: Fetch domain cannot be scoped to a specific context at browser level
+- **Tradeoff**: Slightly less efficient (one listener per tab), but necessary for per-context proxy support

 This architecture ensures **credentials never appear in CDP logs** and authentication is handled transparently.
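The enable / continue / authenticate / disable sequence described in this section can be sketched roughly as follows. This is a minimal illustration, not Pydoll's implementation: `FakeConnection` and the handler functions are hypothetical names that only record the commands that would be sent, and the event dictionaries assume the usual CDP shape (`{"params": {"requestId": ...}}`). The CDP method names and the `handleAuthRequests` / `authChallengeResponse` parameters come from the Fetch domain.

```python
import asyncio
from typing import Any, Dict, List, Optional


class FakeConnection:
    """Hypothetical stand-in for a CDP connection; it only records commands."""

    def __init__(self) -> None:
        self.sent: List[Dict[str, Any]] = []

    async def send(self, method: str, params: Optional[Dict[str, Any]] = None) -> None:
        self.sent.append({"method": method, "params": params or {}})


async def enable_auth_interception(conn: FakeConnection) -> None:
    # handleAuthRequests=True asks the browser to emit Fetch.authRequired
    # whenever a request hits an authentication challenge.
    await conn.send("Fetch.enable", {"handleAuthRequests": True})


async def on_request_paused(conn: FakeConnection, event: Dict[str, Any]) -> None:
    # Ordinary paused requests are resumed unchanged.
    await conn.send(
        "Fetch.continueRequest", {"requestId": event["params"]["requestId"]}
    )


async def on_auth_required(
    conn: FakeConnection, event: Dict[str, Any], username: str, password: str
) -> None:
    # Answer the challenge, then turn interception off again.
    await conn.send(
        "Fetch.continueWithAuth",
        {
            "requestId": event["params"]["requestId"],
            "authChallengeResponse": {
                "response": "ProvideCredentials",
                "username": username,
                "password": password,
            },
        },
    )
    await conn.send("Fetch.disable")


async def demo() -> List[str]:
    conn = FakeConnection()
    await enable_auth_interception(conn)
    await on_request_paused(conn, {"params": {"requestId": "req-1"}})
    await on_auth_required(conn, {"params": {"requestId": "req-2"}}, "user", "secret")
    return [cmd["method"] for cmd in conn.sent]
```

In the browser-level case the handlers would be registered once on the browser connection; in the tab-level case the same pair would be registered on each tab of the context, which is the per-tab listener cost noted above.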

 !!! warning "Fetch Domain Side Effects"
-    Enabling Fetch for proxy auth temporarily pauses **all requests** in that Tab until the auth callback fires. This is a CDP limitation - Fetch enables global request interception. After authentication completes, Fetch is disabled to minimize overhead.
+    - **Browser-level Fetch**: Temporarily pauses **all requests across all tabs** in the default context until auth completes
+    - **Tab-level Fetch**: Temporarily pauses **all requests in that specific tab** until auth completes
+
+    This is a CDP limitation - Fetch enables request interception. After authentication completes, Fetch is disabled to minimize overhead.
Pydoll's proxy authentication uses Fetch domain to hide credentials:
+Pydoll's proxy authentication uses two different Fetch domain strategies:

+**Browser-Level (Global Proxy):**
 - **Security benefit**: Credentials never logged in CDP traces
-- **Performance cost**: Fetch pauses **all requests** until auth completes
+- **Performance cost**: Fetch pauses **all requests across all tabs** until auth completes
+- **Efficiency**: Single listener for all tabs in default context
 - **Mitigation**: Fetch is disabled after first auth, minimizing overhead

+**Tab-Level (Per-Context Proxy):**
+- **Security benefit**: Credentials never logged in CDP traces
+- **Performance cost**: Fetch pauses **all requests in that tab** until auth completes
+- **Efficiency**: Separate listener per tab (less efficient, but necessary for isolation)
+- **Isolation benefit**: Each context can have different proxy credentials
+- **Mitigation**: Fetch is disabled after first auth per tab
+
 **Why not use Browser.setProxyAuth?** This CDP command doesn't exist. Fetch is the only mechanism for programmatic auth.

+**Why tab-level for contexts?** CDP's Fetch domain cannot be scoped to a specific BrowserContext. Since each context can have a different proxy with different credentials, Pydoll must handle auth at the tab level to respect context boundaries.
+
 ### Port Randomization Strategy

 Random CDP ports (9223-9322) prevent collisions when running parallel browser instances:
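A minimal sketch of the idea, assuming the range is inclusive on both ends. The function names are hypothetical, and the bind-based availability check is an addition for illustration that the text does not describe; it is not necessarily how Pydoll picks its port.

```python
import random
import socket

# Inclusive bounds taken from the range quoted above.
CDP_PORT_MIN = 9223
CDP_PORT_MAX = 9322


def pick_cdp_port(rng=random) -> int:
    """Return a random port in the CDP range (both ends inclusive)."""
    return rng.randint(CDP_PORT_MIN, CDP_PORT_MAX)


def is_port_free(port: int, host: str = "127.0.0.1") -> bool:
    """Best-effort check: try to bind the port and release it immediately."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
        return True


def pick_free_cdp_port(attempts: int = 20) -> int:
    """Retry random picks until one binds; an assumed extra safety step."""
    for _ in range(attempts):
        port = pick_cdp_port()
        if is_port_free(port):
            return port
    raise RuntimeError("no free CDP port found in the configured range")
```

With 100 candidate ports, two instances started independently pick the same port about 1% of the time, which is why a retry or availability check can be worth adding when many browsers run in parallel.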
+    for cb_id, cb_data in self._event_callbacks.items():
+        if cb_data['event'] == event_name:
+            if asyncio.iscoroutinefunction(cb_data['callback']):
+                await cb_data['callback'](event_data)
+            else:
+                cb_data['callback'](event_data)
 ```

-This ensures callbacks run in background tasks to prevent blocking the event loop.
+Asynchronous callbacks are awaited sequentially. This means each callback completes before the next one executes, which is important for:
+
+- **Predictable Execution Order**: Callbacks execute in registration order
+- **Error Handling**: Exceptions in one callback don't prevent others from executing
+- **State Consistency**: Callbacks can rely on sequential state changes
+
+!!! info "Sequential vs Concurrent Execution"
+    Callbacks execute sequentially within the same event. However, different events can be processed concurrently since the event loop handles multiple connections simultaneously.

 ## Event Flow and Lifecycle

@@ -294,35 +301,45 @@ The `ConnectionHandler` is the central component managing WebSocket communicatio
 ```python
 class ConnectionHandler:
     def __init__(self, ...):
-        self._callbacks = {}  # Event name -> list of callbacks
-        for callback_id, callback, temporary in self._callbacks[event_name]:
-            await callback(event_data)
-
-            if temporary:
-                callbacks_to_remove.append(callback_id)
+        for cb_id, cb_data in self._event_callbacks.items():
+            if cb_data['event'] == event_name:
+                # Execute callback (await if async, call directly if sync)
+                if asyncio.iscoroutinefunction(cb_data['callback']):
+                    await cb_data['callback'](event_data)
+                else:
+                    cb_data['callback'](event_data)
+
+                # Mark temporary callbacks for removal
+                if cb_data['temporary']:
+                    callbacks_to_remove.append(cb_id)

-        for callback_id in callbacks_to_remove:
-            await self.remove_callback(callback_id)
+        # Remove temporary callbacks after all callbacks executed
+        for cb_id in callbacks_to_remove:
+            self.remove_callback(cb_id)
 ```
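For readers who want to run the dispatch pattern in isolation, here is a self-contained sketch of it. `MiniDispatcher`, `register`, and `trigger` are hypothetical names, not Pydoll's API. The `try`/`except` around each callback is an assumption added here so that one failing callback cannot stop the rest; the excerpt above does not show how errors are actually handled.

```python
import asyncio
from typing import Any, Callable, Dict, List


class MiniDispatcher:
    """Hypothetical, simplified stand-in for the callback registry shown above."""

    def __init__(self) -> None:
        self._event_callbacks: Dict[int, Dict[str, Any]] = {}
        self._next_id = 0
        self.errors: List[Exception] = []

    def register(self, event: str, callback: Callable, temporary: bool = False) -> int:
        self._next_id += 1
        self._event_callbacks[self._next_id] = {
            "event": event,
            "callback": callback,
            "temporary": temporary,
        }
        return self._next_id

    def remove_callback(self, cb_id: int) -> bool:
        return self._event_callbacks.pop(cb_id, None) is not None

    async def trigger(self, event_name: str, event_data: Dict[str, Any]) -> None:
        callbacks_to_remove: List[int] = []
        # Iterate over a snapshot so a callback may safely register new ones.
        for cb_id, cb_data in list(self._event_callbacks.items()):
            if cb_data["event"] != event_name:
                continue
            try:
                # Await coroutine functions, call plain functions directly.
                if asyncio.iscoroutinefunction(cb_data["callback"]):
                    await cb_data["callback"](event_data)
                else:
                    cb_data["callback"](event_data)
            except Exception as exc:  # assumed isolation, see lead-in
                self.errors.append(exc)
            if cb_data["temporary"]:
                callbacks_to_remove.append(cb_id)
        # Temporary callbacks are dropped only after every callback has run.
        for cb_id in callbacks_to_remove:
            self.remove_callback(cb_id)


async def demo() -> List[str]:
    order: List[str] = []
    dispatcher = MiniDispatcher()

    def sync_cb(_data: Dict[str, Any]) -> None:
        order.append("sync")

    def failing_cb(_data: Dict[str, Any]) -> None:
        raise ValueError("boom")

    async def async_cb(_data: Dict[str, Any]) -> None:
        await asyncio.sleep(0)
        order.append("async")

    dispatcher.register("Page.loadEventFired", sync_cb)
    dispatcher.register("Page.loadEventFired", failing_cb)
    dispatcher.register("Page.loadEventFired", async_cb, temporary=True)

    await dispatcher.trigger("Page.loadEventFired", {})
    await dispatcher.trigger("Page.loadEventFired", {})  # async_cb is gone by now
    assert len(dispatcher.errors) == 2
    return order
```

Because each callback is awaited before the next starts, a slow async callback delays every later callback for the same event; that is the cost of the predictable ordering.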

 This architecture ensures:
@@ -374,9 +391,12 @@ Each tab maintains its own:

 - Event domain enablement state
 - Callback registry
-- WebSocket connection (or shared connection with target ID)
+- Event communication channel
 - Network logs (if network events enabled)

+!!! info "Communication Architecture"
+    Each tab has its own event communication channel to the browser. For technical details on how WebSocket connections and target IDs work at the protocol level, see [Browser Domain Architecture](./browser-domain.md).