Releases: Abstract-Data/RyanData-Address-Utils

Release v0.7.2

13 Dec 03:37

Release Notes - v0.7.2

New Features

ValidationRunner Integration Fix

  • Added loc property to RyanDataAddressError - Validation errors now include Pydantic-compatible field location information, fixing the "unknown" field name issue in ValidationRunner reports. Errors for ZIP code and state validation now correctly display StateName or ZipCode instead of unknown.
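
The effect of the loc property can be illustrated with a minimal sketch. This is not the library's implementation: AddressErrorSketch and field_name_for_report are hypothetical names, and the sketch assumes a report takes the last element of a Pydantic-style loc tuple and falls back to "unknown" when no location is available.

```python
from typing import Optional, Tuple


class AddressErrorSketch(ValueError):
    """Hypothetical stand-in for an error that carries a field location."""

    def __init__(self, message: str, field: Optional[str] = None) -> None:
        super().__init__(message)
        self._field = field

    @property
    def loc(self) -> Tuple[str, ...]:
        # Pydantic-style location: a tuple of path segments leading to the field.
        return (self._field,) if self._field else ()


def field_name_for_report(error: Exception) -> str:
    """Return the field name a report might display (assumed behaviour)."""
    loc = getattr(error, "loc", ())
    return str(loc[-1]) if loc else "unknown"
```

An error created with field="ZipCode" is reported under ZipCode, while an exception without a loc attribute still falls back to "unknown".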

Enhanced Cleaning Metrics Tracking

  • New OperationType constants - Standard constants for categorizing transformation operations:

    • NORMALIZATION - Format standardization (abbreviations, ZIP formats)
    • FORMATTING - Whitespace, punctuation, case changes
    • EXPANSION - Abbreviation expansion (via libpostal)
    • CLEANING - Removal of invalid data
    • PARSING - Component extraction from raw input
  • New tracking methods in TransformationTracker:

    • track_case_normalization() - Detects case changes in street names and cities
    • track_street_type_changes() - Detects street type abbreviations (Street→St, Avenue→Ave)
    • track_direction_changes() - Detects directional abbreviations (North→N, Southeast→SE)
    • track_unit_type_changes() - Detects unit type abbreviations (Apartment→Apt, Suite→Ste)
    • track_punctuation_removal() - Detects period and punctuation removal
    • track_component_parsing() - Records what components were extracted from raw input
  • New constant mappings for detecting transformations:

    • STREET_TYPE_TO_ABBREV - 40+ street type mappings
    • DIRECTION_TO_ABBREV - 8 directional mappings
    • UNIT_TYPE_TO_ABBREV - 18 unit type mappings
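
As a rough illustration of how a mapping can drive change detection, the sketch below compares an original and a cleaned address string. It is an assumption-laden example, not the TransformationTracker code: STREET_TYPES is a hypothetical three-entry subset of a mapping like STREET_TYPE_TO_ABBREV, and detect_street_type_changes is an invented helper name.

```python
from typing import List, Tuple

# Hypothetical subset; the release notes describe the real mapping as having 40+ entries.
STREET_TYPES = {"street": "St", "avenue": "Ave", "boulevard": "Blvd"}


def detect_street_type_changes(original: str, cleaned: str) -> List[Tuple[str, str]]:
    """Return (full form, abbreviation) pairs that appear to have been abbreviated."""
    original_words = [w.strip(".,").lower() for w in original.split()]
    cleaned_words = [w.strip(".,").lower() for w in cleaned.split()]
    changes = []
    for full, abbrev in STREET_TYPES.items():
        was_full = full in original_words and full not in cleaned_words
        if was_full and abbrev.lower() in cleaned_words:
            changes.append((full.title(), abbrev))
    return changes
```

The same pattern would apply to directional and unit type mappings by swapping the dictionary.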

Bug Fixes

  • Fixed validation report displaying "unknown" as the field name for ZIP and state validation errors when using ValidationRunner from abstract_validation_base

Tests

  • Added 16 new tests for enhanced tracking functionality

Full Changelog: v0.7.1...v0.7.2

Release v0.7.1

12 Dec 08:55

v0.7.1 (2025-12-12)

Documentation

  • Add AGENTS.md - Comprehensive AI coding assistant guide with codebase overview, key files reference, coding conventions, common tasks, and development commands for AI agents (Claude, GPT, Copilot, Cursor)

  • Add llms.txt - Structured context file following the llms.txt specification for LLM discoverability, including quick facts, main entry points, key classes, and common operations

  • Update README.md:

    • Update Python version badge from 3.9+ to 3.12+ to reflect actual requirements
    • Add ProcessLog system and abstract-validation-base to Highlights section
    • Add new "Transformation tracking" section with code example
    • Expand APIs section with ProcessLog, ProcessEntry, ValidationBase, and RyanDataValidationBase
    • Add links to CHANGELOG.md and AGENTS.md in Documentation section

Full Changelog: v0.7.0...v0.7.1

Release v0.7.0

12 Dec 08:43

🚀 Major Changes

Integration with abstract-validation-base

This release integrates the abstract-validation-base library as the foundation for validation infrastructure, providing a more robust and standardized validation framework.

✨ New Features

  • New Composable Validators: Added Zip5FormatValidator and Zip4FormatValidator for granular ZIP code format validation
  • ValidatorPipelineBuilder: create_default_validators() now uses ValidatorPipelineBuilder for flexible validator composition
  • Automatic Tag Creation: New version-check.yml workflow automatically creates tags when versions are updated
  • Enhanced Release Workflow: Support for alpha, beta, rc pre-release versions in addition to patch, minor, major
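
The idea of composing small, single-purpose ZIP validators can be sketched without the library. Everything here is hypothetical: the function names, the dictionary keys, and the build_pipeline helper are illustrative and do not reflect the signatures of Zip5FormatValidator, Zip4FormatValidator, or ValidatorPipelineBuilder.

```python
import re
from typing import Callable, Dict, List

Fields = Dict[str, str]
# A validator inspects the fields and returns error messages (empty list = valid).
Validator = Callable[[Fields], List[str]]


def zip5_format(fields: Fields) -> List[str]:
    value = fields.get("zip5", "")
    return [] if re.fullmatch(r"[0-9]{5}", value) else ["zip5 must be exactly 5 digits"]


def zip4_format(fields: Fields) -> List[str]:
    # The +4 extension is optional; an empty value is treated as "not provided".
    value = fields.get("zip4", "")
    if value == "" or re.fullmatch(r"[0-9]{4}", value):
        return []
    return ["zip4 must be exactly 4 digits"]


def build_pipeline(*validators: Validator) -> Validator:
    """Combine validators into one that runs each in order and collects all errors."""
    def run(fields: Fields) -> List[str]:
        errors: List[str] = []
        for validator in validators:
            errors.extend(validator(fields))
        return errors
    return run


validate_zip = build_pipeline(zip5_format, zip4_format)
```

Keeping each check separate means a caller can drop or reorder validators without touching the others.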

🔄 Breaking Changes

  • Python 3.12+ Required: Minimum Python version is now >=3.12.8 (required by abstract-validation-base)
  • Removed support for Python 3.9, 3.10, 3.11

🏗️ Internal Changes

  • Replaced local validation classes with library imports:
    • BaseValidator, CompositeValidator, ValidatorProtocol
    • ValidationResult, ValidationError
    • ProcessLog, ProcessEntry
  • Deleted redundant local files: core/validation/*.py, core/process_log.py, core/results.py
  • Updated to Python 3.12+ type parameter syntax (class Foo[T] instead of Generic[T])
  • Simplified Codecov integration
  • All CI workflows now use Python 3.12/3.13

📦 Dependencies

  • Added: abstract-validation-base (git dependency)
  • Requires: Python >=3.12.8, <=3.13

Full Changelog: v0.6.0...v0.7.0

v0.6.0

12 Dec 04:50

Full Changelog: v0.5.0...v0.6.0

v0.5.0

10 Dec 08:49

Release Notes - v0.5.0

🚀 Major Features

International Address Parsing Support

  • New parse_auto() method: Automatically routes between US and international parsing

    • Attempts US parsing first, falls back to libpostal for international addresses
    • Smart detection of international addresses based on content analysis
    • Maintains backward compatibility with existing US parsing
  • InternationalAddress Model: New Pydantic model for structured international address components

    • Fields: HouseNumber, Road, City, State, PostalCode, Country, CountryCode
    • Compatible with libpostal's international parsing capabilities
  • Enhanced Service Methods:

    • parse_international(): Direct libpostal-based parsing
    • parse_auto(): Intelligent routing between US and international parsers
    • Automatic fallback on US validation failure
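
The US-first, international-fallback routing described above can be sketched as a small function. This is only an illustration under stated assumptions: route_parse and both stub parsers are hypothetical, the US parser is assumed to signal failure by raising ValueError, and the content-based detection mentioned in the notes is not modelled.

```python
from typing import Callable, Dict, Tuple

Parsed = Dict[str, str]


def route_parse(
    text: str,
    parse_us: Callable[[str], Parsed],
    parse_international: Callable[[str], Parsed],
) -> Tuple[str, Parsed]:
    """Try the US parser first; fall back to the international parser on failure."""
    try:
        return "us", parse_us(text)
    except ValueError:
        # Assumed failure signal: the US parser raises ValueError on invalid input.
        return "international", parse_international(text)


def stub_us_parser(text: str) -> Parsed:
    # Hypothetical stub: rejects anything that mentions London.
    if "london" in text.lower():
        raise ValueError("not a US address")
    return {"raw": text, "parser": "us"}


def stub_international_parser(text: str) -> Parsed:
    # Hypothetical stub standing in for a libpostal-backed parser.
    return {"raw": text, "parser": "international"}
```

Passing the parsers in as arguments keeps the routing logic testable without libpostal installed.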

🛠️ CI/CD & Quality Improvements

Codecov Integration

  • Coverage Reporting: Automated coverage collection and upload

    • pytest-cov integration with XML and terminal output
    • Coverage badge in README showing live percentages
    • Support for all Python versions (3.9-3.13)
  • Test Analytics: Comprehensive test execution insights

    • Test failure tracking and flaky test detection
    • Performance analytics and historical trends
    • JUnit XML output with detailed test metadata

Build Optimization

  • Libpostal Caching: Dramatically improved CI build times

    • Cache libpostal installation (~5-10 min → ~10-30 sec)
    • Persistent caching across workflow runs
    • Conditional installation (rebuild only on cache miss)
  • Workflow Enhancements:

    • Protected branch support with CODECOV_TOKEN
    • Separate coverage and test analytics uploads
    • Improved error handling and resilience

🧪 Testing Enhancements

International Address Test Suite

  • Comprehensive Coverage: 40+ international address formats tested

    • European, Asian, Middle Eastern, and global address formats
    • Unicode support and special character handling
    • Complex multi-line international addresses
  • Advanced Test Cases:

    • International parsing edge cases
    • Fallback behavior validation
    • Cross-format compatibility testing

📚 Documentation & Developer Experience

README Updates

  • Codecov Badge: Live coverage percentage display
  • International Parsing Guide: Complete setup and usage instructions
  • Enhanced Examples: International address parsing demonstrations

Code Quality

  • Linting Consistency: Ruff formatting applied across codebase
  • Type Safety: Enhanced mypy coverage and error handling

🔧 Technical Details

Dependencies

  • New Optional Dependencies: libpostal integration (when available)
  • CI Dependencies: pytest-cov, codecov-cli for analytics
  • Build Dependencies: Enhanced libpostal installation scripts

Breaking Changes

  • Deprecation Notice: parse_auto_route() deprecated in favor of parse_auto()
  • Warning: Deprecation warnings added for smooth migration
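
A deprecation of this kind is commonly implemented as a thin alias that warns and forwards. The sketch below shows that general pattern with the standard warnings module; the function bodies are placeholders, and the library exposes these as service methods rather than the module-level functions used here.

```python
import warnings
from typing import Dict


def parse_auto(text: str) -> Dict[str, str]:
    """Placeholder for the new entry point (hypothetical body)."""
    return {"raw": text}


def parse_auto_route(text: str) -> Dict[str, str]:
    """Deprecated alias that warns, then forwards to parse_auto()."""
    warnings.warn(
        "parse_auto_route() is deprecated; use parse_auto() instead",
        DeprecationWarning,
        stacklevel=2,  # attribute the warning to the caller, not to this alias
    )
    return parse_auto(text)
```

Existing callers keep working and see the warning when DeprecationWarning is enabled, for example under pytest or with python -W default.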

Performance

  • CI Speed: 80-90% reduction in build times with caching
  • Memory: Optimized international parsing with lazy loading
  • Compatibility: Maintained across Python 3.9-3.13

🐛 Bug Fixes & Stability

International Parsing

  • Robust Fallback: Graceful degradation when libpostal unavailable
  • Error Handling: Improved validation error messages
  • Edge Cases: Better handling of malformed international addresses

CI/CD Reliability

  • Cache Management: Smart cache invalidation and rebuilding
  • Upload Resilience: Guaranteed analytics upload even on test failures
  • Cross-Platform: Consistent behavior across all supported Python versions

📈 Metrics & Analytics

Coverage Tracking

  • Baseline: 77% overall code coverage achieved
  • Breakdown: Detailed per-module coverage reporting
  • Trends: Historical coverage tracking and alerts

Test Analytics

  • Execution Times: Performance monitoring for test suites
  • Failure Patterns: Automated detection of flaky tests
  • Suite Health: Comprehensive test suite metrics

Migration Guide

For Existing Users

  1. No Action Required: All existing APIs remain functional
  2. Optional Enhancement: Use parse_auto() for international address support
  3. Deprecation: Replace parse_auto_route() with parse_auto()

For CI/CD

  1. Automatic: Codecov integration works out-of-the-box
  2. Token Required: Add CODECOV_TOKEN to repository secrets
  3. Caching: Builds automatically benefit from libpostal caching

This release significantly expands the library's capabilities for global address processing while maintaining robust CI/CD infrastructure and comprehensive testing coverage.

v0.4.0

10 Dec 04:55

Release Notes (v0.4.0) — libpostal remote & Windows-friendly usage

What’s new

  • Remote libpostal API: FastAPI service (Docker image) with /health, /parse, /parse_international, /parse_auto, matching local parse shapes.
  • Python remote client: parse_remote and LibpostalRemoteClient map API responses into existing models; optional Docker auto-start.
  • Extras:
    • [api] for FastAPI/uvicorn/libpostal bindings
    • [remote] for Docker auto-start + HTTP client
    • [pandas] for DataFrame helpers

How to use the libpostal remote

  1. Run the API container (recommended):

    docker run -p 8000:8000 ghcr.io/abstract-data/ryandata-addr-utils-libpostal:latest

    Check health:

    curl http://localhost:8000/health

  2. Call from Python:

    • Auto-start (Docker required):

      from ryandata_address_utils import parse_remote
      result = parse_remote("10 Downing St, London")

    • Use an existing service (no auto-start, e.g., Windows without Docker):

      from ryandata_address_utils.remote import LibpostalRemoteClient

      client = LibpostalRemoteClient(base_url="http://localhost:8000", auto_start=False)
      result = client.parse_auto("10 Downing St, London")

    • Reuse one client to avoid repeated health checks and startup delay.

  3. Env toggles:
    • RYANDATA_LIBPOSTAL_URL: point at an existing service; disables auto-start.
    • RYANDATA_LIBPOSTAL_AUTOSTART: 0/1 to disable/enable Docker auto-start.
    • RYANDATA_LIBPOSTAL_IMAGE, RYANDATA_LIBPOSTAL_CONTAINER, RYANDATA_LIBPOSTAL_PORT: override image/name/port.
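
The precedence between the URL and auto-start toggles can be sketched as follows. This is a hedged illustration rather than the client's actual logic: RemoteSettings and resolve_settings are hypothetical names, and the default of auto-start being enabled when RYANDATA_LIBPOSTAL_AUTOSTART is unset is an assumption.

```python
import os
from typing import Mapping, NamedTuple, Optional


class RemoteSettings(NamedTuple):
    base_url: Optional[str]
    auto_start: bool


def resolve_settings(env: Mapping[str, str] = os.environ) -> RemoteSettings:
    """Resolve connection settings from environment variables."""
    url = env.get("RYANDATA_LIBPOSTAL_URL") or None
    if url:
        # An explicit URL points at an existing service, so auto-start is disabled.
        return RemoteSettings(base_url=url, auto_start=False)
    # Assumption: auto-start is on by default and "0" turns it off.
    auto_start = env.get("RYANDATA_LIBPOSTAL_AUTOSTART", "1") == "1"
    return RemoteSettings(base_url=None, auto_start=auto_start)
```

Accepting the environment as a mapping makes the precedence rules easy to check with a plain dictionary.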

Benefits for Windows/PC users

  • No local libpostal install needed: run the Docker image and consume via HTTP.
  • Works without Docker: point RYANDATA_LIBPOSTAL_URL (or base_url) to a hosted service.
  • Same ParseResult-shaped responses as local parsing; existing integrations stay consistent.

Performance tips

  • Start the container once and keep it running.
  • Reuse a single LibpostalRemoteClient to skip repeated health checks.
  • If a service is already running, set base_url/RYANDATA_LIBPOSTAL_URL and auto_start=False to reduce overhead.

Install quick ref

  • Base: uv add git+https://github.com/Abstract-Data/RyanData-Address-Utils.git
  • With pandas: uv add "ryandata-address-utils[pandas] @ git+https://github.com/Abstract-Data/RyanData-Address-Utils.git"
  • With API + remote: uv add "ryandata-address-utils[api,remote] @ git+https://github.com/Abstract-Data/RyanData-Address-Utils.git"

v0.3.1

10 Dec 01:19

What’s new

  • Added explicit ZIP fields: ZipCode5, ZipCode4, and ZipCodeFull.
  • Normalized ZIP input (5-digit or ZIP+4 with/without dash) into the new fields and kept legacy ZipCode populated for compatibility.
  • FullAddress now uses ZipCodeFull for formatting.
  • Zip validator now uses the normalized ZIP fields and state checks.
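
The normalization described above can be sketched with a single regular expression. This is an illustrative example, not the library's code: normalize_zip is a hypothetical helper, the dashed ZipCodeFull format is an assumption, and the legacy ZipCode field is left out because the notes do not say what value it holds.

```python
import re
from typing import Dict, Optional

# Five digits, optionally followed by a four-digit extension with or without a dash.
_ZIP_PATTERN = re.compile(r"([0-9]{5})(?:-?([0-9]{4}))?")


def normalize_zip(raw: str) -> Dict[str, Optional[str]]:
    """Split a ZIP or ZIP+4 string into ZipCode5, ZipCode4 and ZipCodeFull."""
    match = _ZIP_PATTERN.fullmatch(raw.strip())
    if match is None:
        raise ValueError(f"invalid ZIP code: {raw!r}")
    zip5, zip4 = match.group(1), match.group(2)
    # Assumed output format: "12345-6789" when an extension is present.
    full = f"{zip5}-{zip4}" if zip4 else zip5
    return {"ZipCode5": zip5, "ZipCode4": zip4, "ZipCodeFull": full}
```

Both "78701-1234" and "787011234" yield the same three fields, and any other length is rejected.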

Fixes & validation

  • Strict validation for ZIP5 (5 digits) and ZIP4 (4 digits when present).
  • ZIP+4 parsing supported both with a dash and as 9 contiguous digits.

Tests & quality

  • Added tests for ZIP+4 parsing/formatting and invalid length cases.
  • All checks passing: pytest, ruff, mypy.

Artifacts

  • ryandata_address_utils-0.3.1-py3-none-any.whl
  • ryandata_address_utils-0.3.1.tar.gz

Changelog: v0.3.0...v0.3.1

v0.3.0

10 Dec 00:49

Release Notes for v0.3.0

Features

  • Enhanced Error Handling: Added RyanDataAddressError and RyanDataValidationError classes that inherit from Pydantic's error types while including package identification for better error tracing
  • Automatic Address Formatting: Implemented automatic Address1, Address2, and FullAddress property computation using Pydantic model validators
  • Raw Input Preservation: Added RawInput field to Address model to capture original input strings
  • Automated Releases: Reinstated GitHub Actions release workflow with semantic-release for automated versioning and releases
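
To illustrate the idea of computing formatted address lines once at construction time, here is a minimal sketch. It swaps in a standard-library dataclass with __post_init__ in place of a Pydantic model validator, and every field name and the output format are hypothetical rather than taken from the Address model.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class AddressSketch:
    street: str
    city: str
    state: str
    zip_code: str
    unit: Optional[str] = None
    raw_input: Optional[str] = None  # original input string, kept unchanged
    address1: str = field(init=False)
    address2: Optional[str] = field(init=False)
    full_address: str = field(init=False)

    def __post_init__(self) -> None:
        # Derived fields are computed once, right after the inputs are set.
        self.address1 = self.street
        self.address2 = self.unit
        parts = [self.address1]
        if self.address2:
            parts.append(self.address2)
        parts.append(f"{self.city}, {self.state} {self.zip_code}")
        self.full_address = ", ".join(parts)
```

Computing the derived fields eagerly means they are stored on the instance and appear in its repr, unlike properties that are recalculated on each access.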

Fixes

  • Pandas Integration: Fixed validation error handling in pandas integration methods when errors='coerce' is used
  • Workflow Issues: Resolved GitHub Actions workflow failures and cache problems
  • Import Compatibility: Cleaned up imports for Python 3.9+ compatibility
  • Version Handling: Made version reading more robust to prevent import errors
  • Git Configuration: Fixed release workflow git configuration issues

Documentation

  • UV Support: Added comprehensive UV installation and development instructions
  • Badge Updates: Updated README badges to reference correct GitHub Actions workflows

Refactoring

  • Model Validators: Replaced property-based address formatting with Pydantic model validators for better performance and consistency

Chores

  • Code Formatting: Applied ruff formatting across entire codebase
  • Dependency Updates: Updated uv.lock and project dependencies
  • CI/CD: Configured semantic-release for automated releases

Full Changelog: v0.2.0...v0.3.0