geolocation lookup

Build the "prepare" and "locate" command line tools:

Make build directory:

mkdir out

Generate the makefile:

cd out
cmake ../geolocation

Compile:

make

Prepare the database

Unpack and preprocess the database into binary format:

cd geolocation
unzip database.zip
cd ../out
./prepare ../geolocation/database.csv

The "prepare" command preprocesses the CSV file to create a database in binary format. The resulting database contains only the needed fields: the IP range start (CSV column 1), the country code (CSV column 3) and the city (CSV column 6). CSV parsing is done with the header-only library https://github.com/ben-strasser/fast-cpp-csv-parser.
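The column selection described above can be sketched roughly as follows. This is not the project's code: it swaps the fast-cpp-csv-parser library for a naive hand-rolled splitter so the example stays self-contained, and the names `GeoRecord`, `split_csv_line` and `to_record` are hypothetical. It also assumes that fields may be double-quoted and that column 1 stores the range start as a decimal integer.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical in-memory form of one kept record (not the project's actual type).
struct GeoRecord {
    std::uint32_t range_start;  // csv column 1
    std::string country_code;   // csv column 3
    std::string city;           // csv column 6
};

// Naive splitter for one CSV line with optionally double-quoted fields.
// Assumption: no escaped quotes inside a field.
std::vector<std::string> split_csv_line(const std::string& line) {
    std::vector<std::string> fields;
    std::string current;
    bool in_quotes = false;
    for (char c : line) {
        if (c == '"') {
            in_quotes = !in_quotes;      // toggle quoted state, drop the quote itself
        } else if (c == ',' && !in_quotes) {
            fields.push_back(current);   // unquoted comma ends the field
            current.clear();
        } else {
            current.push_back(c);
        }
    }
    fields.push_back(current);           // last field has no trailing comma
    return fields;
}

// Keep only columns 1, 3 and 6 (1-based); everything else is discarded.
// Assumption: column 1 is a decimal integer that fits in 32 bits.
// at() throws std::out_of_range if the row has fewer than six columns.
GeoRecord to_record(const std::vector<std::string>& fields) {
    return GeoRecord{
        static_cast<std::uint32_t>(std::stoul(fields.at(0))),
        fields.at(2),
        fields.at(5),
    };
}
```

Dropping the unused columns at this stage is what keeps the binary file small enough to map into memory cheaply later on.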

The binary format of the resulting database file is as follows:

+--------------------------------------------------------+
|                                                        |
|             4 bytes (DB_COUNT_RECORD_SIZE)             | db_record_count
|                                                        |
+--------------------------------------------------------+
|                                                        |
|                                                        |
|     (db_record_count * DB_INDEX_RECORD_SIZE) bytes     | db_index (ip range start)
|                                                        |
|                                                        |
+--------------------------------------------------------+
|                                                        |
|                                                        |
|                                                        |
|                                                        |
|    (db_record_count * DB_LOCATION_RECORD_SIZE) bytes   | db_records (country code and city)
|                                                        |
|                                                        |
|                                                        |
|                                                        |
+--------------------------------------------------------+
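As a rough illustration of the layout above, the byte offset of any entry follows from the three size constants. Only the 4-byte count is stated in the diagram; the values used here for `DB_INDEX_RECORD_SIZE` and `DB_LOCATION_RECORD_SIZE` are assumptions chosen for the sketch, and the helper names `index_offset`, `record_offset` and `file_size` are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>

constexpr std::size_t DB_COUNT_RECORD_SIZE = 4;      // stated in the layout diagram
constexpr std::size_t DB_INDEX_RECORD_SIZE = 4;      // assumption: one 32-bit range start
constexpr std::size_t DB_LOCATION_RECORD_SIZE = 32;  // assumption: fixed-width country code + city

// Byte offset of the i-th entry of db_index, which follows the record count.
constexpr std::size_t index_offset(std::size_t i) {
    return DB_COUNT_RECORD_SIZE + i * DB_INDEX_RECORD_SIZE;
}

// Byte offset of the i-th entry of db_records, which follows the whole index.
constexpr std::size_t record_offset(std::uint32_t record_count, std::size_t i) {
    return DB_COUNT_RECORD_SIZE
         + static_cast<std::size_t>(record_count) * DB_INDEX_RECORD_SIZE
         + i * DB_LOCATION_RECORD_SIZE;
}

// Total size of a database file holding record_count records.
constexpr std::size_t file_size(std::uint32_t record_count) {
    return record_offset(record_count, record_count);
}
```

Because both sections use fixed-size entries, the position found in db_index is also the position of the matching entry in db_records, so no per-record pointers need to be stored.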

Run the program:

cd out
./locate ../geolocation/database.csv

The "locate" program maps the binary database file into the process address space, and every lookup is done with pointer arithmetic directly on the mapped bytes. No additional data structure is built.
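One way such a lookup could work is a binary search over the index section of the mapped file. The sketch below is an assumption about the approach, not the project's actual code: `find_location` is a hypothetical name, the entry sizes are guesses, the range starts are assumed to be sorted ascending and stored in host byte order, and `base` is assumed to be the start of the mapping (for example the pointer returned by `mmap`).

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

constexpr std::size_t COUNT_SIZE = 4;      // 4-byte record count, as in the layout
constexpr std::size_t INDEX_SIZE = 4;      // assumption: 32-bit range start
constexpr std::size_t LOCATION_SIZE = 32;  // assumption: fixed-width location record

// Read a 32-bit value from raw bytes without alignment or aliasing concerns.
static std::uint32_t read_u32(const unsigned char* p) {
    std::uint32_t v;
    std::memcpy(&v, p, sizeof v);
    return v;
}

// Returns a pointer to the location record whose range contains `ip`,
// or nullptr if `ip` is smaller than the first range start.
const unsigned char* find_location(const unsigned char* base, std::uint32_t ip) {
    const std::uint32_t count = read_u32(base);
    const unsigned char* index = base + COUNT_SIZE;
    const unsigned char* records = index + static_cast<std::size_t>(count) * INDEX_SIZE;

    // Binary search for the number of range starts that are <= ip.
    std::size_t lo = 0;
    std::size_t hi = count;
    while (lo < hi) {
        const std::size_t mid = lo + (hi - lo) / 2;
        if (read_u32(index + mid * INDEX_SIZE) <= ip) {
            lo = mid + 1;   // range mid starts at or before ip, look right
        } else {
            hi = mid;       // range mid starts after ip, look left
        }
    }
    if (lo == 0) {
        return nullptr;     // ip precedes every range
    }
    // The last range start <= ip sits at position lo - 1; the same position
    // selects the entry in the records section.
    return records + (lo - 1) * LOCATION_SIZE;
}
```

Since the search reads the mapped bytes in place, the only memory touched per lookup is the handful of index entries visited by the search plus one location record.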

Run the tests on a specific CPU to improve cache usage:

cd geolocation
taskset --cpu-list 1 ./geolocation_test.py --executable ../out/locate --database ../geolocation/database.csv
Database loaded Memory usage: 3.13mb Load time: 40μs
OK    1.0.0.0 US Los Angeles Memory usage: 3.13mb Lookup time: 42μs
OK    71.6.28.0 US San Jose Memory usage: 3.13mb Lookup time: 25μs
OK    71.6.28.255 US San Jose Memory usage: 3.13mb Lookup time: 13μs
OK    71.6.29.0 US Concord Memory usage: 3.13mb Lookup time: 11μs
OK    53.103.144.0 DE Stuttgart Memory usage: 3.13mb Lookup time: 24μs
OK    53.255.255.255 DE Stuttgart Memory usage: 3.13mb Lookup time: 10μs
OK    54.0.0.0 US Rahway Memory usage: 3.13mb Lookup time: 12μs
OK    223.255.255.255 AU Brisbane Memory usage: 3.13mb Lookup time: 32μs
OK    5.44.16.0 GB Hastings Memory usage: 3.13mb Lookup time: 18μs
OK    8.24.99.0 US Hastings Memory usage: 3.13mb Lookup time: 15μs
Final points for 10 measurements:  52.237044000000004

To achieve the best performance, the "taskset" command pins execution to the specified CPU, which maximises cache hits.

min/max/avg score from 100 consecutive test runs:

cd geolocation
./test.sh 100
min: 47.2458545
max: 52.6927515
avg: 48,24

Development and test environment:

Developed and tested on Ubuntu 20.04, 64-bit, Intel Core i3.
