InterSent

Code for our paper "Bridging Continuous and Discrete Spaces: Interpretable Sentence Representation Learning via Compositional Operations" (EMNLP 2023).

Requirements

transformers == 4.18.0
pytorch-lightning == 1.6.1
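
As a quick sanity check before training, the pinned versions above can be compared against what is installed in the current environment. The following is a minimal sketch using only the Python standard library; `check_requirements` and `REQUIRED` are hypothetical names introduced here for illustration and are not part of this repository.

```python
from importlib.metadata import PackageNotFoundError, version

# Version pins copied from the Requirements list above.
REQUIRED = {"transformers": "4.18.0", "pytorch-lightning": "1.6.1"}


def check_requirements(required=REQUIRED):
    """Return {package: (expected, installed)} for each package that is
    missing or whose installed version differs from the pin.
    `installed` is None when the package is not installed."""
    mismatches = {}
    for name, expected in required.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None
        if installed != expected:
            mismatches[name] = (expected, installed)
    return mismatches


if __name__ == "__main__":
    problems = check_requirements()
    for name, (expected, installed) in problems.items():
        print(f"{name}: expected {expected}, found {installed or 'not installed'}")
    if not problems:
        print("All pinned requirements match.")
```

An empty result means both packages are installed at exactly the pinned versions.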

Data

ParaNMT
Discofuse Wikipedia balanced portion
Wikisplit
Google Sentence Compression

The data folder should be structured as follows:

└── data
    ├── paranmt
    │   └── para-nmt-5m-processed.txt
    ├── discofuse
    │   ├── discofuse-train-balanced.txt
    │   ├── discofuse-valid-balanced.txt
    │   └── discofuse-test-balanced.txt
    ├── wikisplit
    │   ├── wikisplit-train.txt
    │   ├── wikisplit-valid.txt
    │   └── wikisplit-test.txt
    └── google
        ├── sent-comp-train.txt
        └── sent-comp-test.txt
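
The layout above can be verified before launching training. The sketch below lists any expected file that is missing. It assumes the `data` folder sits in the current working directory, which this README does not state explicitly; `missing_data_files` and `EXPECTED_FILES` are hypothetical names used only for illustration and are not part of this repository.

```python
from pathlib import Path

# Relative paths copied from the directory tree above.
EXPECTED_FILES = [
    "paranmt/para-nmt-5m-processed.txt",
    "discofuse/discofuse-train-balanced.txt",
    "discofuse/discofuse-valid-balanced.txt",
    "discofuse/discofuse-test-balanced.txt",
    "wikisplit/wikisplit-train.txt",
    "wikisplit/wikisplit-valid.txt",
    "wikisplit/wikisplit-test.txt",
    "google/sent-comp-train.txt",
    "google/sent-comp-test.txt",
]


def missing_data_files(data_dir="data", expected=EXPECTED_FILES):
    """Return the expected relative paths that are not regular files
    under data_dir (assumed location; adjust to your setup)."""
    root = Path(data_dir)
    return [rel for rel in expected if not (root / rel).is_file()]


if __name__ == "__main__":
    missing = missing_data_files()
    if missing:
        print("Missing data files:")
        for rel in missing:
            print(f"  data/{rel}")
    else:
        print("All expected data files are present.")
```

Pass a different `data_dir` if the datasets are stored elsewhere.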

Training

To train InterSent from scratch, run the following:

bash train.sh

Evaluation

To evaluate InterSent on interpretability, run the following with your checkpoint path:

bash test.sh

To evaluate InterSent on STS, run the following with your checkpoint path:

bash stseval.sh
