
pyvers

A Python package and app for training and running claim verification models.

Claim verification is a task in natural language processing (NLP) with applications ranging from fact-checking to verifying the accuracy of scientific citations. The models used in this package are based on the transformer deep-learning architecture.

Features

Running the app

There is no need to install pyvers to run the app. The pip install command below takes care of the app's requirements. Then run the two python commands in separate terminals.

pip install torch transformers litserve gradio
python app/server.py
python app/app.py

App usage:

  • Browse to the URL generated by the last command.
  • Input a claim and evidence (example).
  • Hit "Enter" or press the Submit button to run inference.
  • The probabilities predicted by the model are printed in the Classification text box and visualized in the bar chart.
  • Change the model using the dropdown at the top; this automatically re-runs inference with the selected model.

Screenshot:

[Screenshot of the pyvers app]
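You can also query the inference server directly, bypassing the Gradio UI. A minimal sketch, assuming LitServe's default port (8000) and /predict route; the payload keys "claim" and "evidence" are hypothetical, so check app/server.py for the actual request schema:

import requests

# Send one claim-evidence pair to the running server (started with python app/server.py)
response = requests.post(
    "http://127.0.0.1:8000/predict",
    json={
        "claim": "Mitochondria produce ATP.",  # hypothetical payload key
        "evidence": "ATP synthesis occurs in mitochondria.",  # hypothetical payload key
    },
)
print(response.json())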

Installation

Install pyvers if you want to fine-tune models or use the data modules.

Run these commands in the root directory of the repository.

  • The first command installs the requirements.
  • The second command installs the pyvers package in development mode.
    • Remove the -e for a standard installation.
pip install -r requirements.txt
pip install -e .
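
To confirm that the installation worked, try importing the modules used in the examples below (a quick sanity check, not part of the package's own tests):

python -c "from pyvers.data import ToyDataModule; from pyvers.model import PyversClassifier; print('pyvers imports OK')"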

Loading data

pyvers.data.FileDataModule

  • This class loads data from local data files in JSON lines format (jsonl).
  • Supported datasets include SciFact and Citation-Integrity.
  • The schema for the data files is described here.
  • Get data files for SciFact and Citation-Integrity with labels used in pyvers here.
  • The data module can be used to shuffle training data from both datasets.
from pyvers.data import FileDataModule
# Set the model used for the tokenizer
model_name = "bert-base-uncased"

# Load data from one dataset
dm = FileDataModule("data/scifact", model_name)

# Shuffle training data from two datasets
dm = FileDataModule(["data/scifact", "data/citint"], model_name)

# Get some tokenized data
dm.setup("fit")
next(iter(dm.train_dataloader()))

pyvers.data.NLIDataModule

from pyvers.data import NLIDataModule
model_name = "bert-base-uncased"

# Load data from HuggingFace datasets
dm = NLIDataModule("facebook/anli", model_name)

# Get some tokenized data
dm.prepare_data()
dm.setup("fit")
next(iter(dm.train_dataloader()))

pyvers.data.ToyDataModule

  • This is a small handmade toy dataset.
  • There are no data files; the dataset is hard-coded in the class definition.
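
Since ToyDataModule implements the same LightningDataModule interface as the other data modules, you can inspect its tokenized batches the same way (a minimal sketch, reusing the tokenizer model from the fine-tuning example below):

from pyvers.data import ToyDataModule

# Tokenize the hard-coded toy dataset and look at one training batch
dm = ToyDataModule("bert-base-uncased")
dm.setup("fit")
next(iter(dm.train_dataloader()))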

Fine-tuning example

This takes about a minute on a CPU.

# Import required modules
import pytorch_lightning as pl
from pyvers.data import ToyDataModule
from pyvers.model import PyversClassifier

# Initialize data and model
dm = ToyDataModule("bert-base-uncased")
model = PyversClassifier(dm.model_name)

# Train model
trainer = pl.Trainer(enable_checkpointing=False, max_epochs=20)
trainer.fit(model, datamodule=dm)

# Test model
trainer.test(model, datamodule=dm)

# Show predictions
predictions = trainer.predict(model, datamodule=dm)
print(predictions)

This is what we get (results vary between runs):

┏━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃        Test metric        ┃       DataLoader 0        ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│        AUROC Macro        │          0.963            │
│      AUROC Weighted       │          0.963            │
│         Accuracy          │           88.9            │
│         F1 Macro          │           88.6            │
│         F1 Micro          │           88.9            │
│          F1_NEI           │          100.0            │
│         F1_REFUTE         │           80.0            │
│        F1_SUPPORT         │           85.7            │
└───────────────────────────┴───────────────────────────┘

[['SUPPORT', 'SUPPORT', 'SUPPORT', 'NEI', 'NEI', 'NEI', 'REFUTE', 'REFUTE', 'SUPPORT']]

# Ground-truth labels are:
# [['SUPPORT', 'SUPPORT', 'SUPPORT', 'NEI', 'NEI', 'NEI', 'REFUTE', 'REFUTE', 'REFUTE']]
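
To keep the fine-tuned weights, Lightning's standard checkpoint methods should work, since the model is trained with pl.Trainer (a sketch; the filename is hypothetical, and reloading assumes PyversClassifier stores its hyperparameters in the checkpoint):

# Save the fine-tuned model (hypothetical filename)
trainer.save_checkpoint("pyvers_toy.ckpt")

# Reload it later; if hyperparameters are not stored in the checkpoint,
# the model's init arguments must be passed again
model = PyversClassifier.load_from_checkpoint("pyvers_toy.ckpt")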

Zero-shot example

This uses a DeBERTa model trained on MultiNLI, Fever-NLI and Adversarial-NLI (ANLI) for zero-shot classification of claim-evidence pairs.

import pytorch_lightning as pl
from pyvers.model import PyversClassifier
from pyvers.data import ToyDataModule
dm = ToyDataModule("MoritzLaurer/DeBERTa-v3-base-mnli-fever-anli")
model = PyversClassifier(dm.model_name)
trainer = pl.Trainer()
dm.setup(stage="test")
predictions = trainer.predict(model, datamodule=dm)
print(predictions)
# [['SUPPORT', 'SUPPORT', 'SUPPORT', 'REFUTE', 'REFUTE', 'REFUTE', 'REFUTE', 'REFUTE', 'REFUTE']]

The pretrained model successfully distinguishes between SUPPORT and REFUTE on the toy dataset but misclassifies NEI as REFUTE. This can be improved with fine-tuning.

When using a pre-trained model for zero-shot classification, check the mapping between labels and IDs.

from transformers import AutoConfig

model_name = "answerdotai/ModernBERT-base"
config = AutoConfig.from_pretrained(model_name, num_labels=3)
print(config.to_dict()["id2label"])
# {0: 'LABEL_0', 1: 'LABEL_1', 2: 'LABEL_2'}

model_name = "MoritzLaurer/DeBERTa-v3-base-mnli-fever-anli"
config = AutoConfig.from_pretrained(model_name, num_labels=3)
print(config.to_dict()["id2label"])
# {0: 'entailment', 1: 'neutral', 2: 'contradiction'}

For zero-shot classification, we would choose the pretrained DeBERTa model rather than ModernBERT because its labels are consistent with the NLI categories listed below. However, fine-tuning either model for text classification should work (see this page for information on fine-tuning ModernBERT).

Label to ID mapping

ID   pyvers    Fever*             MultiNLI, ANLI
0    SUPPORT   SUPPORTS           entailment
1    NEI       NOT ENOUGH INFO    neutral
2    REFUTE    REFUTES            contradiction

* Text labels only
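
When fine-tuning a model that ships with generic labels (like ModernBERT above), the mapping can be set explicitly so that the model's outputs line up with this table. A minimal sketch using the standard transformers id2label/label2id configuration options:

from transformers import AutoConfig, AutoModelForSequenceClassification

model_name = "answerdotai/ModernBERT-base"

# Map class IDs to the pyvers labels from the table above
id2label = {0: "SUPPORT", 1: "NEI", 2: "REFUTE"}
label2id = {v: k for k, v in id2label.items()}

config = AutoConfig.from_pretrained(
    model_name, num_labels=3, id2label=id2label, label2id=label2id
)
model = AutoModelForSequenceClassification.from_pretrained(model_name, config=config)
print(model.config.id2label)
# {0: 'SUPPORT', 1: 'NEI', 2: 'REFUTE'}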
