Prerequisites: Python 3.8.6 and the following packages:
scipy==1.6.3
numpy==1.20.2
pandas==1.2.4
nltk==3.6.2
matplotlib==3.4.2
tqdm==4.49.0
transformers==4.6.0.dev0
torch==1.4.0
nlp==0.4.0
activations==0.1.0
brokenaxes==0.4.2
easydict==1.9
file_utils==0.0.1
scikit_learn==0.24.2
utils==1.0.1
xgboost==1.4.2
To ease the installation of dependencies, we suggest using pip with the provided requirements.txt:
$ pip install -r requirements.txt
Optionally, you can first create a Python virtual environment and install all the dependencies inside it:
$ python3 -m venv venv
$ source venv/bin/activate
$ pip install -r requirements.txt
The vulnerability dataset is obtained from the National Vulnerability Database (NVD), a United States government repository of standards-based vulnerability management data. We obtain the information through their Application Programming Interface (API), from index 0 to 152,000, representing data collected until April 2021. We filter the data to consider only descriptions related to version 3 of CVSS. We divide them into train and test sets, composed of 63,848 and 15,962 instances, respectively, both found in the data folder.
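The filtering step above can be sketched as follows. This is a minimal illustration, not the repository's actual collection code: the field names follow the NVD 1.1 JSON feed schema, and the record shape shown is an assumption.

```python
# Sketch: given CVE records in the (legacy) NVD 1.1 JSON shape, keep only
# entries carrying CVSS v3 metrics and pull out the English description.

def extract_v3_entries(cve_items):
    """Return (description, baseMetricV3) pairs for CVSS v3 entries."""
    out = []
    for item in cve_items:
        impact = item.get("impact", {})
        if "baseMetricV3" not in impact:
            continue  # drop records with no CVSS v3 score (e.g. v2-only)
        descs = item["cve"]["description"]["description_data"]
        text = next((d["value"] for d in descs if d["lang"] == "en"), None)
        if text:
            out.append((text, impact["baseMetricV3"]))
    return out

# Minimal example record in the assumed NVD 1.1 shape
sample = [{
    "cve": {"description": {"description_data": [
        {"lang": "en",
         "value": "Buffer overflow in ExampleApp allows remote code execution."}
    ]}},
    "impact": {"baseMetricV3": {"cvssV3": {
        "vectorString": "CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H"}}}
}]
pairs = extract_v3_entries(sample)
print(len(pairs))  # 1
```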
We process the collected data to retrieve vulnerability descriptions and the classes for each of the eight categories analyzed: Attack Vector, Attack Complexity, Privileges Required, User Interaction, Scope, Confidentiality, Integrity, and Availability. A visual representation of the class proportions for each category of the dataset is displayed in the following figure:
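The eight category labels can be read directly off a CVSS v3 vector string. The small parser below, a sketch rather than the repository's processing code, maps the abbreviated metric values to their class names as defined in the CVSS v3.1 specification:

```python
# CVSS v3 base metrics and their possible values (per the CVSS v3.1 spec).
METRICS = {
    "AV": ("Attack Vector", {"N": "Network", "A": "Adjacent", "L": "Local", "P": "Physical"}),
    "AC": ("Attack Complexity", {"L": "Low", "H": "High"}),
    "PR": ("Privileges Required", {"N": "None", "L": "Low", "H": "High"}),
    "UI": ("User Interaction", {"N": "None", "R": "Required"}),
    "S":  ("Scope", {"U": "Unchanged", "C": "Changed"}),
    "C":  ("Confidentiality", {"N": "None", "L": "Low", "H": "High"}),
    "I":  ("Integrity", {"N": "None", "L": "Low", "H": "High"}),
    "A":  ("Availability", {"N": "None", "L": "Low", "H": "High"}),
}

def parse_vector(vector):
    """Map a CVSS v3 vector string to labels for the eight categories."""
    labels = {}
    for part in vector.split("/")[1:]:   # skip the "CVSS:3.x" prefix
        key, value = part.split(":")
        name, values = METRICS[key]
        labels[name] = values[value]
    return labels

print(parse_vector("CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H"))
```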
We compare the performance of BERT, RoBERTa, ALBERT, BART, DeBERTa, and DistilBERT models on the created dataset. The following table displays the hyperparameters used for fine-tuning, based on the original authors' methodology for each model. All pretrained models are obtained from the HuggingFace repository.
| Model | Learning Rate | Training Epochs | Batch Size | Weight Decay |
|---|---|---|---|---|
| BERT | 3e-05 | 3 | 4 | 0 |
| RoBERTa | 1.5e-05 | 2 | 4 | 0.01 |
| ALBERT | 3e-05 | 3 | 8 | 0 |
| DeBERTa | 3e-05 | 10 | 4 | 0 |
| DistilBERT | 5e-05 | 3 | 8 | 0 |
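The table above can be expressed as the keyword arguments one would pass to HuggingFace `TrainingArguments` when fine-tuning each model. This is a hypothetical sketch; the repository's train.py may organize its configuration differently.

```python
# Per-model fine-tuning hyperparameters from the table above, keyed by the
# argument names used by transformers.TrainingArguments.
FINETUNE_CONFIG = {
    "bert":       dict(learning_rate=3e-5,   num_train_epochs=3,  per_device_train_batch_size=4, weight_decay=0.0),
    "roberta":    dict(learning_rate=1.5e-5, num_train_epochs=2,  per_device_train_batch_size=4, weight_decay=0.01),
    "albert":     dict(learning_rate=3e-5,   num_train_epochs=3,  per_device_train_batch_size=8, weight_decay=0.0),
    "deberta":    dict(learning_rate=3e-5,   num_train_epochs=10, per_device_train_batch_size=4, weight_decay=0.0),
    "distilbert": dict(learning_rate=5e-5,   num_train_epochs=3,  per_device_train_batch_size=8, weight_decay=0.0),
}

# Usage (requires transformers):
#   from transformers import TrainingArguments
#   args = TrainingArguments(output_dir="out", **FINETUNE_CONFIG["distilbert"])
```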
Our experiments use the scripts train.sh, train_specific_model.sh, and infer.sh. train_specific_model.sh trains a specific model on the different categories with varying hyperparameters (learning rate, epochs, batch size, and weight decay), while train.sh trains DistilBERT (the default in train.py, line 161) considering different text pre-processing approaches.
If our work or code helped you in your research, please use the following BibTeX entry.
@ARTICLE{9786831,
author={Costa, Joana Cabral and Roxo, Tiago and Sequeiros, João B. F. and Proença, Hugo and Inácio, Pedro R. M.},
journal={IEEE Access},
title={Predicting CVSS Metric via Description Interpretation},
year={2022},
volume={10},
number={},
pages={59125-59134},
doi={10.1109/ACCESS.2022.3179692}}