This repository hosts the analytical code and pointers to datasets resulting from a project to generate a continent-wide set of crop field labels for Africa covering the years 2017-2023. The data are intended for training and assessing machine learning models that can be used to map agricultural fields over large areas and multiple years.
The project was funded by the Lacuna Fund, and led by Farmerline, in collaboration with Spatial Collective and the Agricultural Impacts Research Group at Clark University.
Please refer to the technical report for more details on the methods used to develop the dataset, an analysis of label quality, and usage guidelines. The report, additional documents, analyses, and the demonstration code used to develop the labels can be obtained by cloning the repository:
git clone git@github.com:agroimpacts/lacunalabels.git
cd lacunalabels
pip install -e .
Please see the next sections for details on accessing the imagery and label data.
The imagery and labels can be obtained either from Zenodo or the Registry of Open Data on AWS, and may be used in accordance with Planet’s participant license agreement for the NICFI contract. The code used to annotate and analyze the data is available under an Apache 2.0 license.
The data are in the bucket s3://africa-field-boundary-labels in the us-west-2 region, and are organized as follows:
Prefix/key | Description |
---|---|
imagery/ | Contains 4-band Planet image chips in geotiff format, named as follows: XX1234567890_YYYY-MM.tif, representing a grid identifier and the year and month of image acquisition. |
labels/ | A set of 3-class labels in geotiff format, named as follows: XX1234567890_123456_YYYY-MM.tif, representing a grid identifier, a labelling assignment identifier, and the year and month of the imagery being labelled. These labels represent one possible set from a larger number of labelling assignments, and were created using the demonstration notebook provided in this repository. |
label-catalog-filtered.csv | Details of each labeling assignment for the subset developed using the demonstration notebook, including information on label quality. |
mapped_fields_final.parquet | The original digitized field boundaries collected on the Planet imagery, in geoparquet format. These can be used together with the image chips, the code provided in the demonstration notebook, the full labelling assignment (label_catalog_allclasses.csv) and the image chip catalog (image_chip_catalog.csv) to make different subsets of labels. |
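The naming conventions in the table can be unpacked programmatically, for example to match each label to its image chip. The following is a minimal sketch, not code from this repository. It assumes that file names follow the templates above exactly: a two-letter prefix plus digits for the grid identifier, digits for the assignment identifier, and that a label and its chip share the same grid identifier and year-month. The names `parse_name`, `chip_for_label`, and `ChipName` are hypothetical.

```python
import re
from typing import NamedTuple, Optional

# Assumed patterns, inferred from the naming templates in the table:
#   imagery: XX1234567890_YYYY-MM.tif
#   labels:  XX1234567890_123456_YYYY-MM.tif
CHIP_RE = re.compile(
    r"^(?P<grid>[A-Za-z]{2}\d+)_(?P<year>\d{4})-(?P<month>\d{2})\.tif$"
)
LABEL_RE = re.compile(
    r"^(?P<grid>[A-Za-z]{2}\d+)_(?P<assignment>\d+)"
    r"_(?P<year>\d{4})-(?P<month>\d{2})\.tif$"
)


class ChipName(NamedTuple):
    grid_id: str
    assignment_id: Optional[str]  # None for image chips
    year: int
    month: int


def parse_name(filename: str) -> ChipName:
    """Split an image chip or label file name into its components."""
    match = LABEL_RE.match(filename) or CHIP_RE.match(filename)
    if match is None:
        raise ValueError(f"Unrecognized file name: {filename}")
    parts = match.groupdict()
    return ChipName(
        grid_id=parts["grid"],
        assignment_id=parts.get("assignment"),  # absent for image chips
        year=int(parts["year"]),
        month=int(parts["month"]),
    )


def chip_for_label(label_filename: str) -> str:
    """Return the image chip file name assumed to correspond to a label."""
    parsed = parse_name(label_filename)
    if parsed.assignment_id is None:
        raise ValueError(f"Not a label file name: {label_filename}")
    return f"{parsed.grid_id}_{parsed.year:04d}-{parsed.month:02d}.tif"
```

With these helpers, a label file name can be paired with the image chip in `imagery/` that shares its grid identifier and date, provided the assumptions above hold for the files in the bucket.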
To access the data, use the AWS CLI to download it to a local directory. We recommend making a new directory called “data” in your home directory, changing into it, and then using the sync function, as follows (note: this assumes a *nix-based terminal):
cd ~
mkdir data
cd data
aws s3 sync s3://africa-field-boundary-labels/ . --dryrun
aws s3 sync s3://africa-field-boundary-labels/ .
If the second to last line previews a successful download, run the last line, which will download all data on the bucket into your directory.
The AWS S3 console can also be used to download the data.
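If only part of the bucket is needed (for example, the labels without the imagery), the same sync command can be pointed at a single prefix. The sketch below wraps the AWS CLI from Python. It assumes the AWS CLI is installed and on the path; the helper names `build_sync_command` and `run_sync` are hypothetical. The optional `anonymous` argument adds the CLI's `--no-sign-request` flag, which only works if the bucket permits unauthenticated reads, and that is an assumption not confirmed here.

```python
import subprocess
from typing import List

BUCKET = "s3://africa-field-boundary-labels"
REGION = "us-west-2"


def build_sync_command(prefix: str, dest: str, dryrun: bool = True,
                       anonymous: bool = False) -> List[str]:
    """Build an `aws s3 sync` command for one prefix of the bucket.

    An empty prefix syncs the whole bucket, as in the commands above.
    """
    prefix = prefix.strip("/")
    source = f"{BUCKET}/{prefix}/" if prefix else f"{BUCKET}/"
    cmd = ["aws", "s3", "sync", source, dest, "--region", REGION]
    if dryrun:
        # Preview the transfer without downloading anything
        cmd.append("--dryrun")
    if anonymous:
        # Assumption: the bucket allows unauthenticated reads
        cmd.append("--no-sign-request")
    return cmd


def run_sync(prefix: str, dest: str, dryrun: bool = True,
             anonymous: bool = False) -> None:
    """Run the sync command, raising an error if the AWS CLI fails."""
    subprocess.run(build_sync_command(prefix, dest, dryrun, anonymous),
                   check=True)
```

Calling `run_sync("labels", "labels")` would preview a download of the `labels/` prefix into a local `labels` directory, and repeating the call with `dryrun=False` would perform it. Note that `sync` operates on prefixes; individual files such as the catalogs are copied with `aws s3 cp` instead.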
The image chips and geoparquet file can be downloaded directly from the Zenodo link, or from the Registry of Open Data on AWS.
Please cite the dataset as follows:
Estes, L. D., Wussah, A., Asipunu, M., Gathigi, M., Kovačič, P., Muhando, J., Yeboah, B. V., Addai, F. K., Akakpo, E. S., Allotey, M. K., Amkoya, P., Amponsem, E., Donkoh, K. D., Ha, N., Heltzel, E., Juma, C., Mdawida, R., Miroyo, A., Mucha, J., Mugami, J., Mwawaza, F., Nyarko, D. A., Oduor, P., Ohemeng, K. N., Segbefia, S. I. D., Tumbula, T., Wambua, F., Xeflide, G. H., Ye, S., Yeboah, F. (2024). A region-wide, multi-year set of crop field boundary labels for Africa. arXiv:2412.18483.