This repository contains the corpora, model weights, and code for training and evaluation for the paper: Personalized Dialogue Generation with Persona-Adaptive Attention
Here is our overall architecture (PAA):
Automatic evaluation results of our implemented approaches on the ConvAI2 dataset. Boldface indicates the best result for each metric.
Comparison with GPT2 under a low-resource scenario: we sampled 10% to 90% of the training data to train GPT2-SMALL, GPT2-MEDIUM, and PAA.
We use Anaconda to manage our environment, and we recommend using it to install the dependencies. Assuming Anaconda is installed on your system, you can create a new environment with:
conda env create -f environment.yml
Then activate the environment with:
conda activate PersonaGeneration
We use the gpt2 weight from Hugging Face. Download it into the downloaded_LM directory as:
downloaded_LM/gpt2-pytorch_model.bin
Due to the GitHub file size limit, we zipped the training and testing text files under the data/convAI2 directory. Please unzip them so that the layout is:
- data
  - convAI2
    - train_self_original.txt
    - valid_self_original.txt
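Before launching training, it can help to confirm that the files from the setup steps above are where the scripts expect them. The following is a minimal sketch, not part of the repository: the helper name `missing_files` is hypothetical, and the paths are simply the ones named in this README, taken relative to the repository root.

```python
from pathlib import Path

# Paths taken from the setup instructions in this README,
# relative to the repository root.
EXPECTED_FILES = [
    "downloaded_LM/gpt2-pytorch_model.bin",
    "data/convAI2/train_self_original.txt",
    "data/convAI2/valid_self_original.txt",
]


def missing_files(repo_root="."):
    """Return the expected files that are not present under repo_root."""
    root = Path(repo_root)
    return [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]


if __name__ == "__main__":
    missing = missing_files()
    if missing:
        print("Missing files:")
        for rel in missing:
            print("  " + rel)
    else:
        print("All expected files are in place.")
```

Run it from the repository root; an empty result means the weight file and both ConvAI2 text files were found.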
python train.py \
--config=config/gated_transformer/final_gated_transformer-small.yml \
--dataset=convai2 \
--lr=1e-6 \
--gated=yes \
--fusion_mode=pr-cr \
--auto_tau=accurate \
--auto_tau_numerator=persona \
--response_gated=no \
--shared_enc=no \
--shared_crossattention=no \
--add_persona_to_decoder=yes \
--add_persona_indicator=yes \
--add_role_indicator=yes
- config: the config YAML file; some pre-defined YAML files are provided under the config directory
- dataset: the dataset name; we support convai2
- lr: the learning rate
- gated: whether to use the extra cross-attention mechanism; we support yes and no
- fusion_mode: the way to fuse cross-attention, detailed in the list of fusion modes later in this document; for vanilla PAA, use pr-cr
- tau: the tau value in the paper; it can be assigned manually (from 0.0 to 1.0)
- auto_tau: the automatically computed tau (accurate)
- auto_tau_numerator: the numerator of the auto tau; we support persona and context
- response_gated: whether to fuse the response into the weighted cross-attended result from PAA; the default is no, and enabling it drastically reduces performance
- shared_enc: whether to share the encoder between persona and context
- shared_crossattention: whether to share the cross-attention between persona and context
- add_persona_to_decoder: whether to add the persona to the decoder input
- add_persona_indicator: whether to add the persona indicator to the decoder input and encoder input
- add_role_indicator: whether to add the role indicator to the context encoder input
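When running several configurations, it may be convenient to assemble the training command from a dictionary and reject values this README does not list as supported. The sketch below is an illustration only: `build_train_command` and `ALLOWED` are hypothetical names, the checks cover only the values documented in the option list above, and train.py may accept or reject other values on its own.

```python
import shlex

# Values the option list above documents as supported.
ALLOWED = {
    "dataset": {"convai2"},
    "gated": {"yes", "no"},
    "auto_tau_numerator": {"persona", "context"},
}


def build_train_command(options, script="train.py"):
    """Turn an options dict into an argv list for the training script."""
    for name, allowed in ALLOWED.items():
        if name in options and options[name] not in allowed:
            raise ValueError(
                f"{name}={options[name]!r} is not one of {sorted(allowed)}"
            )
    # A manually assigned tau is documented as lying between 0.0 and 1.0.
    if "tau" in options and not 0.0 <= float(options["tau"]) <= 1.0:
        raise ValueError("tau must lie in [0.0, 1.0]")
    return ["python", script] + [f"--{k}={v}" for k, v in options.items()]


options = {
    "config": "config/gated_transformer/final_gated_transformer-small.yml",
    "dataset": "convai2",
    "lr": "1e-6",
    "gated": "yes",
    "fusion_mode": "pr-cr",
    "auto_tau": "accurate",
    "auto_tau_numerator": "persona",
}
command = build_train_command(options)
print(shlex.join(command))
```

The resulting list can be passed to `subprocess.run(command, check=True)` to start training from the repository root.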
After model training, use the checkpoint file for evaluation:
- F1: To evaluate F1, run run_test_f1.py with --model_path set to your own checkpoint.
- Text Decoding: To decode the evaluation text file, run run_test_decoding.py with --model_path set to your own checkpoint.
- BLEU: To evaluate BLEU, run validate_bleu.py on the decoded text file.
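For readers unfamiliar with the F1 metric commonly reported on ConvAI2, the sketch below shows a word-overlap F1 between a generated response and a reference. It is an illustration under stated assumptions: lowercasing and whitespace tokenization are choices made here, and run_test_f1.py may normalize or tokenize text differently.

```python
from collections import Counter


def unigram_f1(prediction, reference):
    """Token-overlap F1 between two lowercased, whitespace-tokenized strings."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        return 0.0
    # Multiset intersection counts each shared token at most as often
    # as it appears in both strings.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


score = unigram_f1("i like my two cats", "i like cats")
print(f"{score:.3f}")  # prints 0.750 (precision 3/5, recall 3/3)
```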
We saved the decoded text from the PAA and GPT2 models under the generated_text/ directory.
The fusion_mode option supports the following modes:
- pr-cr
  - The vanilla PAA design described in the paper
- cr-pr
  - The Context-Adaptive Attention (c) design in the above figure
- random
  - The persona mask and the context mask are randomly assigned
- prr-crr
  - The Dual Weights Attention (a) design
- skipc-pr
  - The Skipped Weight Attention (b) design described in the above figure
- param_gate
  - The Parametric Attention (e) design described in the above figure
- condition_bias
  - The Condition-bias re-implementation
  - Paper: A Simple and Efficient Multi-Task Learning Approach for Conditioned Dialogue Generation
- attention_routing
  - The Attention Routing re-implementation
  - Paper: A Pre-Training Based Personalized Dialogue Generation Model with Persona-Sparse Data
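To compare the fusion modes listed above, one option is to generate one training command per mode. This is a rough sketch under assumptions: `FUSION_MODES`, `BASE_ARGS`, and `sweep_commands` are hypothetical names, the base flags are copied from the example training command in this README, and this README does not say whether any mode needs a different config file or additional flags.

```python
# Fusion modes listed in this README, with its short descriptions.
FUSION_MODES = {
    "pr-cr": "vanilla PAA",
    "cr-pr": "Context-Adaptive Attention (c)",
    "random": "randomly assigned persona and context masks",
    "prr-crr": "Dual Weights Attention (a)",
    "skipc-pr": "Skipped Weight Attention (b)",
    "param_gate": "Parametric Attention (e)",
    "condition_bias": "Condition-bias re-implementation",
    "attention_routing": "Attention Routing re-implementation",
}

# Assumption: the same base flags are reused for every mode.
BASE_ARGS = [
    "--config=config/gated_transformer/final_gated_transformer-small.yml",
    "--dataset=convai2",
    "--lr=1e-6",
    "--gated=yes",
]


def sweep_commands(modes=FUSION_MODES):
    """Return one shell command string per fusion mode."""
    return [
        " ".join(["python", "train.py", *BASE_ARGS, f"--fusion_mode={mode}"])
        for mode in modes
    ]


for mode, cmd in zip(FUSION_MODES, sweep_commands()):
    print(f"# {mode}: {FUSION_MODES[mode]}")
    print(cmd)
```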
If you find our work or code useful, please cite us with the following BibTeX entry:
@article{Huang_Zhang_Ko_Liu_Wu_Wang_Tang_2023,
title={Personalized Dialogue Generation with Persona-Adaptive Attention},
volume={37},
url={https://ojs.aaai.org/index.php/AAAI/article/view/26518},
DOI={10.1609/aaai.v37i11.26518},
number={11},
journal={Proceedings of the AAAI Conference on Artificial Intelligence},
author={Huang, Qiushi and Zhang, Yu and Ko, Tom and Liu, Xubo and Wu, Bo and Wang, Wenwu and Tang, H},
year={2023},
month={Jun.},
pages={12916-12923}
}