This is the official repository for the dataset used in the paper "Extreme Multi-Label Skill Extraction Training using Large Language Models". The dataset is a list of vacancy sentences, each tagged with an ESCO skill, and it is hosted on HuggingFace. The full evaluation scores reported in Table 1 of the paper are available in this repository under the eval directory; the filenames correspond to the rows of the table.
If you use this dataset in a scientific publication, we would appreciate it if you cited the paper as follows:
@inproceedings{01HDNKFAFWHB08J12A55DCN7GD,
author = {Decorte, Jens-Joris and Verlinden, Severine and Van Hautte, Jeroen and Deleu, Johannes and Develder, Chris and Demeester, Thomas},
booktitle = {{AI4HR \& PES, ECML-PKDD 2023 Workshop, Proceedings}},
language = {{eng}},
location = {{Torino, Italy}},
pages = {{1--10}},
title = {{Extreme multi-label skill extraction training using large language models}},
year = {{2023}},
}