diff --git a/.github/workflows/python-package.yml b/.github/workflows/python-package.yml index 3bbe091..a5053d6 100644 --- a/.github/workflows/python-package.yml +++ b/.github/workflows/python-package.yml @@ -8,7 +8,7 @@ jobs: runs-on: ubuntu-latest strategy: matrix: - python-version: ['3.8', '3.9', '3.10', '3.11'] + python-version: ['3.8', '3.9', '3.10', '3.11', '3.12'] steps: - uses: actions/checkout@v2 diff --git a/README.md b/README.md index 1d11f9d..13d8433 100644 --- a/README.md +++ b/README.md @@ -1,181 +1,390 @@ + # Placekey-py + + [](https://badge.fury.io/py/placekey) + [](https://pypistats.org/packages/placekey) + [](LICENSE) -A Python library for working with [Placekeys](https://placekey.io). Documentation for this package can be found [here](https://placekey.github.io/placekey-py/), and documentation for the Placekey service API can be found [here](https://docs.placekey.io/). The Plackey design specification is available [here](https://docs.placekey.io/Placekey_Technical_White_Paper.pdf). The details in Placekey encoding is [here](https://docs.placekey.io/Placekey_Encoding_Specification%20White_Paper.pdf). We welcome your feedback. + + +A Python library for working with [Placekeys](https://placekey.io). Documentation for this package can be found [here](https://placekey.github.io/placekey-py/), and documentation for the Placekey service API can be found [here](https://docs.placekey.io/). The Placekey design specification is available [here](https://docs.placekey.io/Placekey_Technical_White_Paper.pdf). Details of the Placekey encoding are available [here](https://docs.placekey.io/Placekey_Encoding_Specification%20White_Paper.pdf). We welcome your feedback. + + ## Installation + + This package can be installed from [PyPI](https://pypi.org/project/placekey/) by + + ```shell script -pip install placekey + +pip install placekey + ``` + + On macOS Big Sur, you may need to run `brew install geos` if the installation of the `shapely` dependency fails.
+ + ## Usage + + The basic functionality of the Placekey library is conversion between Placekeys and latitude-longitude coordinates. + + ```python + >>> import placekey as pk + >>> lat, long = 0.0, 0.0 + >>> pk.geo_to_placekey(lat, long) + '@dvt-smp-tvz' + ``` + + ```python + >>> pk.placekey_to_geo('@dvt-smp-tvz') + (0.00018033323813810344, -0.00018985758738881587) + ``` + + The library also allows for conversion between Placekeys and [H3 indices](https://github.com/uber/h3-py). + + ```python + >>> pk.placekey_to_h3('@dvt-smp-tvz') + '8a754e64992ffff' + ``` + + ```python + >>> pk.h3_to_placekey('8a754e64992ffff') + '@dvt-smp-tvz' + ``` + + The distance in meters between two Placekeys can be found with the following function. + + ```python + >>> pk.placekey_distance('@dvt-smp-tvz', '@5vg-7gq-tjv') + 12795124.895573696 + ``` + + An upper bound on the maximal distance in meters between two Placekeys based on the length of their shared prefix is provided by `placekey.get_prefix_distance_dict()`. + + ```python + >>> pk.get_prefix_distance_dict() + {0: 20040000.0, - 1: 20040000.0, - 2: 2777000.0, - 3: 1065000.0, - 4: 152400.0, - 5: 21770.0, - 6: 8227.0, - 7: 1176.0, - 8: 444.3, - 9: 63.47} + +1: 20040000.0, + +2: 2777000.0, + +3: 1065000.0, + +4: 152400.0, + +5: 21770.0, + +6: 8227.0, + +7: 1176.0, + +8: 444.3, + +9: 63.47} + ``` + + Placekeys found in a data set can be partially validated by + + ```python + >>> pk.placekey_format_is_valid('222-227@dvt-smp-tvz') + True + ``` + + ```python + >>> pk.placekey_format_is_valid('@123-456-789') + False + ``` + You can now access the locations of placekey’s free datasets in S3 using placekey-py! Use these two functions: - ```python -print(pk.list_free_datasets()) + + +```python +print(pk.list_free_datasets()) print(pk.return_free_datasets_location_by_name('chipotle-locations')) + ``` -1. List Free Datasets: Returns a list of all names of Placekey’s available free datasets - -2. 
Return Free Datasets Location By Name: Using one of the names from List Free Datasets above, returns the publicly accessible S3 URI of said dataset. + + +1. List Free Datasets: Returns a list of all names of Placekey’s available free datasets + +2. Return Free Datasets Location By Name: Using one of the names from List Free Datasets above, returns the publicly accessible S3 URI of said dataset. + + You can use these locations to download files programmatically (with boto3) or directly in Spark. + ## API Client + + This package also includes a client for the Placekey API. The methods in the client are automatically rate limited. + + ```python + >>> from placekey.api import PlacekeyAPI + >>> placekey_api_key = "..." + >>> pk_api = PlacekeyAPI(placekey_api_key) + ``` + + The `PlacekeyAPI.lookup_placekey` method can be used to lookup the Placekey for a single place. + + ```python + >>> pk_api.lookup_placekey(latitude=37.7371, longitude=-122.44283) + {'query_id': '0', 'placekey': '@5vg-82n-kzz'} + ``` + + ```python + >>> place = { ->>> "location_name": "Twin Peaks Petroleum", ->>> "street_address": "598 Portola Dr", ->>> "city": "San Francisco", ->>> "region": "CA", ->>> "postal_code": "94131", ->>> "iso_country_code": "US" + +>>> "location_name": "Twin Peaks Petroleum", + +>>> "street_address": "598 Portola Dr", + +>>> "city": "San Francisco", + +>>> "region": "CA", + +>>> "postal_code": "94131", + +>>> "iso_country_code": "US" + >>> } ->>> pk_api.lookup_placekey(**place, fields=["building_placekey","address_placekey","confidence_score","gers"]) -{'query_id': '0', - 'placekey': '227-223@5vg-82n-pgk', - 'address_placekey': '227@5vg-82n-pgk', - 'building_placekey': '227@5vg-82n-pgk', - 'confidence_score': 'HIGH', - 'gers': None} + +>>> pk_api.lookup_placekey(**place, fields=["building_placekey","address_placekey","confidence_score","gers", "address_confidence_score"]) + +{ +'query_id': '0', +'placekey': '227-223@5vg-82n-pgk', +'address_placekey': '227@5vg-82n-pgk',
+'building_placekey': '227@5vg-82n-pgk', +'confidence_score': 'HIGH', +'address_confidence_score': 'HIGH', +'gers': None +} + ``` + + The `PlacekeyAPI.lookup_placekeys` method can be used to lookup Placekeys for multiple places. + + ```python + >>> places = [ ->>> { ->>> "street_address": "1543 Mission Street, Floor 3", ->>> "city": "San Francisco", ->>> "region": "CA", ->>> "postal_code": "94105", ->>> "iso_country_code": "US" ->>> }, ->>> { ->>> "query_id": "thisqueryidaloneiscustom", ->>> "location_name": "Twin Peaks Petroleum", ->>> "street_address": "598 Portola Dr", ->>> "city": "San Francisco", ->>> "region": "CA", ->>> "postal_code": "94131", ->>> "iso_country_code": "US" ->>> }, ->>> { ->>> "latitude": 37.7371, ->>> "longitude": -122.44283 ->>> } + +>>> { + +>>> "street_address": "1543 Mission Street, Floor 3", + +>>> "city": "San Francisco", + +>>> "region": "CA", + +>>> "postal_code": "94105", + +>>> "iso_country_code": "US" + +>>> }, + +>>> { + +>>> "query_id": "thisqueryidaloneiscustom", + +>>> "location_name": "Twin Peaks Petroleum", + +>>> "street_address": "598 Portola Dr", + +>>> "city": "San Francisco", + +>>> "region": "CA", + +>>> "postal_code": "94131", + +>>> "iso_country_code": "US" + +>>> }, + +>>> { + +>>> "latitude": 37.7371, + +>>> "longitude": -122.44283 + +>>> } + >>> ] + >>> pk_api.lookup_placekeys(places, fields=["building_placekey","address_placekey","confidence_score","gers"]) + [{'query_id': 'place_0', - 'placekey': '0rsdbudq45@5vg-7gq-5mk', - 'address_placekey': '0rsdbudq45@5vg-7gq-5mk', - 'building_placekey': '22g@5vg-7gq-5mk', - 'confidence_score': 'HIGH', - 'gers': None}, - {'query_id': 'thisqueryidaloneiscustom', - 'placekey': '227-223@5vg-82n-pgk', - 'address_placekey': '227@5vg-82n-pgk', - 'building_placekey': '227@5vg-82n-pgk', - 'confidence_score': 'HIGH', - 'gers': None}, - {'query_id': 'place_2', - 'placekey': '@5vg-82n-kzz', - 'confidence_score': 'HIGH', - 'gers': None}] + +'placekey': '0rsdbudq45@5vg-7gq-5mk', + 
+'address_placekey': '0rsdbudq45@5vg-7gq-5mk', + +'building_placekey': '22g@5vg-7gq-5mk', + +'confidence_score': 'HIGH', + +'gers': None}, + +{'query_id': 'thisqueryidaloneiscustom', + +'placekey': '227-223@5vg-82n-pgk', + +'address_placekey': '227@5vg-82n-pgk', + +'building_placekey': '227@5vg-82n-pgk', + +'confidence_score': 'HIGH', + +'gers': None}, + +{'query_id': 'place_2', + +'placekey': '@5vg-82n-kzz', + +'confidence_score': 'HIGH', + +'gers': None}] + +``` + +You can submit a Pandas dataset and have it come back placekey'd: +```python +df = pd.DataFrame({ +"address": ["1543 Mission Street, Floor 3", "598 Portola Dr", None], +"city": ["San Francisco", "San Francisco", None], +"region": ["CA", "CA", None], +"postal": ["94105", "94131", None], +"country": ["US", "US", None], +"latitude": [None, None, 37.7371], +"longitude": [None, None, -122.44283] +}) + +column_mappings = { +"street_address": "address", +"city": "city", +"region": "region", +"postal_code": "postal", +"iso_country_code": "country", +"latitude": "latitude", +"longitude": "longitude" +} + +df_with_placekeys = pk_api._placekey_pandas_df(df, column_mappings, fields=['address_placekey', 'address_confidence_score']) +print(df_with_placekeys) +``` + ``` + address city region postal country latitude longitude placekey address_placekey +0 1543 Mission Street, Floor 3 San Francisco CA 94105 US NaN NaN 22g@5vg-7gq-5mk 22g@5vg-7gq-5mk +1 598 Portola Dr San Francisco CA 94131 US NaN NaN 227@5vg-82n-pgk 227@5vg-82n-pgk +2 None None None None None 37.7371 -122.44283 @5vg-82n-kzz NaN +``` + +You can also join two pandas datasets together (placekey'd or not). +```python +join = pk_api._join_pandas_df(df_1, column_mappings_1, df_2, column_mappings_2, on='address_placekey', how='outer') ``` Full details on how to query the API and how to get an API key can be found [here](https://docs.placekey.io/). 
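The responses shown in this section return Placekeys such as `'227-223@5vg-82n-pgk'`, `'227@5vg-82n-pgk'` and `'@5vg-82n-kzz'`: an optional "what" part before the `@` and a "where" part after it. When post-processing API results, it can help to separate the two. The following is a minimal sketch using only the standard library; `PlacekeyParts`, `split_placekey` and `same_location` are hypothetical names chosen for illustration and are not part of the placekey package.

```python
from typing import NamedTuple, Optional


class PlacekeyParts(NamedTuple):
    """Hypothetical container for the two halves of a Placekey string."""
    what: Optional[str]  # text before '@'; None when the Placekey has no what part
    where: str           # text after '@'


def split_placekey(placekey: str) -> PlacekeyParts:
    """Split a Placekey string at '@' (illustrative helper, not a library function)."""
    what, sep, where = placekey.partition("@")
    if not sep or not where:
        raise ValueError(f"not a Placekey: {placekey!r}")
    return PlacekeyParts(what or None, where)


def same_location(a: str, b: str) -> bool:
    """Return True when two Placekeys share the same where part."""
    return split_placekey(a).where == split_placekey(b).where


# Values copied from the API responses shown above.
print(split_placekey("227-223@5vg-82n-pgk"))
print(same_location("227-223@5vg-82n-pgk", "227@5vg-82n-pgk"))
```

This only splits the string and does not check that either part is well formed; for validation, `pk.placekey_format_is_valid` shown earlier is the better choice.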
+ + ## Notebooks + + Jupyter notebooks demonstrating various Placekey functionality are contained in the [placekey-notebooks](https://github.com/Placekey/placekey-notebooks) repository. + + ## Support -This package runs on Python 3. + + +This package runs on Python 3. \ No newline at end of file diff --git a/placekey/__version__.py b/placekey/__version__.py index 557f49c..9faa928 100644 --- a/placekey/__version__.py +++ b/placekey/__version__.py @@ -1 +1 @@ -__version__ = '0.0.35' \ No newline at end of file +__version__ = '0.0.36' \ No newline at end of file