-A Python address parser built for Ryan Data that uses `usaddress` to parse US addresses into structured Pydantic models with ZIP code and state validation.
+Parse and validate US addresses with Pydantic models, ZIP/state validation, pandas integration, and semantic-release powered CI.

-## Installation
+## Highlights

-### Using pip
+- Structured parsing of US addresses into 26 components with Pydantic models
+- ZIP and state validation backed by authoritative datasets
+- Pandas-friendly parsing for batch workloads
+- Custom errors (`RyanDataAddressError`, `RyanDataValidationError`) with package context
+- Builder API for programmatic address construction
+- Semantic-release CI for automated tagging and releases
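
The highlights describe custom errors that carry package context. A minimal sketch of that idea follows, assuming a plain `ValueError` base as a stand-in for the Pydantic error types the package builds on. The class names `SketchAddressError` and `SketchValidationError`, the `PACKAGE_NAME` constant, the `context` attribute, and the message format are all hypothetical and are not the package's real API.

```python
# Illustrative only: stand-in classes, not the real ryandata_address_utils API.
PACKAGE_NAME = "ryandata_address_utils"  # hypothetical constant


class SketchAddressError(ValueError):
    """Base error that prefixes every message with the originating package."""

    def __init__(self, message, context=None):
        self.package = PACKAGE_NAME
        self.context = dict(context or {})  # hypothetical extra details
        super().__init__(f"[{self.package}] {message}")


class SketchValidationError(SketchAddressError):
    """Raised when a parsed component fails validation."""


def check_zip(zip_code):
    """Toy check: accept only 5-digit strings (real validation uses a dataset)."""
    if not (len(zip_code) == 5 and zip_code.isdigit()):
        raise SketchValidationError(
            f"invalid ZIP code: {zip_code!r}", context={"field": "ZipCode"}
        )
    return zip_code


try:
    check_zip("7874")
except SketchAddressError as exc:
    caught = exc        # keep a reference for inspection after the handler
    print(exc)          # message is prefixed with the package name
    print(exc.context)  # {'field': 'ZipCode'}
```

Tagging the message at construction time means the package name survives even when the error is logged far from where it was raised.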

 from ryandata_address_utils import AddressService, parse

-# Simple parsing
 result = parse("123 Main St, Austin TX 78749")
-
 if result.is_valid:
-    print(result.address.StreetName)  # "Main"
-    print(result.address.ZipCode)  # "78749"
-    print(result.to_dict())  # All fields as dict
+    print(result.address.ZipCode)  # "78749"
+    print(result.to_dict())  # full address dict
 else:
-    print(result.validation.errors)
+    print(result.validation.errors)  # custom errors with context

-# Or use the full service
 service = AddressService()
-result = service.parse("456 Oak Ave, Dallas TX 75201")
+service.parse("456 Oak Ave, Dallas TX 75201")
 ```

-## Key Features
-
-- **Parse US addresses** into 26 structured components
-- **Validate ZIP codes** against real US ZIP code database (~33,000 ZIPs)
-- **Validate states** - abbreviations and full names
-- **Pandas integration** for batch processing
-- **Extensible architecture** - swap parsers, data sources, validators
-- **Builder pattern** for programmatic address construction
-
-## Pandas Integration
+## Pandas integration

 ```python
 import pandas as pd
 from ryandata_address_utils import AddressService

-df = pd.DataFrame({
-    "address": [
-        "123 Main St, Austin TX 78749",
-        "456 Oak Ave, Dallas TX 75201",
-    ]
-})
-
+df = pd.DataFrame({"address": ["123 Main St, Austin TX 78749", "456 Oak Ave, Dallas TX 75201"]})
 service = AddressService()
-result = service.parse_dataframe(df, "address") # <-- This is where your named address column goes, and then it'll parse and add the split cols to the dataframe

+- **Enhanced Error Handling**: Added `RyanDataAddressError` and `RyanDataValidationError` classes that inherit from Pydantic's error types while including package identification for better error tracing
+- **Automatic Address Formatting**: Implemented automatic Address1, Address2, and FullAddress property computation using Pydantic model validators
+- **Raw Input Preservation**: Added RawInput field to Address model to capture original input strings
+- **Automated Releases**: Reinstated GitHub Actions release workflow with semantic-release for automated versioning and releases
+
+## Fixes
+- **Pandas Integration**: Fixed validation error handling in pandas integration methods when `errors='coerce'` is used
+- **Workflow Issues**: Resolved GitHub Actions workflow failures and cache problems
+- **Import Compatibility**: Cleaned up imports for Python 3.9+ compatibility
+- **Version Handling**: Made version reading more robust to prevent import errors
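
The pandas fix listed above concerns `errors='coerce'`. As a rough illustration of the convention pandas itself uses for that option (invalid values become missing instead of raising), here is a plain-Python sketch. The `toy_parse` and `parse_many` functions and the returned dict keys are hypothetical stand-ins; they do not reproduce the package's `parse_dataframe` implementation.

```python
import re

# Hypothetical toy parser: reads a 2-letter state and 5-digit ZIP from the end.
_TAIL = re.compile(r"\b([A-Z]{2})\s+(\d{5})$")


def toy_parse(address):
    """Return a small dict of components or raise ValueError if none found."""
    match = _TAIL.search(address.strip())
    if match is None:
        raise ValueError(f"could not parse address: {address!r}")
    return {"state": match.group(1), "zip": match.group(2)}


def parse_many(addresses, errors="raise"):
    """Parse each address; with errors='coerce', failures become None."""
    if errors not in ("raise", "coerce"):
        raise ValueError("errors must be 'raise' or 'coerce'")
    rows = []
    for address in addresses:
        try:
            rows.append(toy_parse(address))
        except ValueError:
            if errors == "raise":
                raise
            rows.append(None)  # coerce: keep row alignment, mark as missing
    return rows


rows = parse_many(
    ["123 Main St, Austin TX 78749", "not an address"], errors="coerce"
)
print(rows)
```

Appending `None` for a failed row keeps the output the same length as the input, which is what lets a batch result line up with the original column.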