Make training code available? #2
+1 @JohnnyC08 It'd also be interesting to see if including context (conversation history / last utterances) improves the accuracy of predictions.
@creatorrr That's interesting. How would you go about doing that? My first thought is to use a rolling window: make a single block of text out of the elements of the window and assign the label of the last element to that block. How would you do it?
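A minimal sketch of that rolling-window idea (pure Python; the window size and the utterance/label structure are assumptions for illustration, not something from this repo):

```python
def rolling_window_examples(utterances, labels, window=3):
    """Build (text_block, label) training pairs: each block concatenates
    the last `window` utterances, labelled with the final utterance's act."""
    examples = []
    for i in range(len(utterances)):
        start = max(0, i - window + 1)
        block = " ".join(utterances[start:i + 1])
        examples.append((block, labels[i]))
    return examples

convo = ["Do you want to grab lunch?", "Not really.", "Oh okay."]
acts = ["Yes-No-Question", "Dispreferred-answers", "Response-Acknowledgement"]
print(rolling_window_examples(convo, acts, window=2))
```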
@JohnnyC08 I was thinking of something simpler: prepending the dialog act labels of the last three utterances to the input vector when fine-tuning. For example, take this conversation:

A: Do you want to grab lunch? [Yes-No-Question]
B: Not really. [Dispreferred-answers]
A: Oh okay. [Response-Acknowledgement]
B: How about tomorrow? <<TO PREDICT>>

Then the input vector would be:
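That scheme could be sketched like this (a sketch only: the bracketed label format and the helper name are assumptions, and the string would still need to go through the model's tokenizer):

```python
def build_context_input(history, current_utterance, n_labels=3):
    """Prepend the dialog act labels of the last `n_labels` utterances
    (history is a list of (utterance, label) pairs) to the utterance
    whose act we want to predict."""
    recent_labels = [label for _, label in history[-n_labels:]]
    prefix = " ".join(f"[{label}]" for label in recent_labels)
    return f"{prefix} {current_utterance}".strip()

history = [
    ("Do you want to grab lunch?", "Yes-No-Question"),
    ("Not really.", "Dispreferred-answers"),
    ("Oh okay.", "Response-Acknowledgement"),
]
print(build_context_input(history, "How about tomorrow?"))
# -> [Yes-No-Question] [Dispreferred-answers] [Response-Acknowledgement] How about tomorrow?
```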
@creatorrr @JohnnyC08 Did either of you end up creating a context-dependent model using previous utterances? Also, were you able to successfully predict on a GPU? Loading the model allocates nearly all of my card's memory, which suggests a leak in how the downloaded model is loaded.
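One possibility worth ruling out: by default TensorFlow reserves nearly all GPU memory at startup, which looks like a leak but is just its allocation strategy. If this repo loads the model through TensorFlow, enabling memory growth may help (a config sketch, assuming TF 2.x; run it before the model is loaded):

```python
import tensorflow as tf

# Ask TensorFlow to allocate GPU memory on demand instead of grabbing
# (almost) all of it up front, which is its default behaviour.
for gpu in tf.config.list_physical_devices("GPU"):
    tf.config.experimental.set_memory_growth(gpu, True)
```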
@bhavitvyamalik Thanks again for publishing the model. I think some comments on how training was conducted would really make this repo more complete. What are the inputs for training on the SBWA corpus? Are they single sentences or sequences of sentences? What training scripts were used to train this model? Are there any utilities to customize this for another dataset? What parameters were used for fine-tuning? Which outputs of the DistilBert encoding are used for the classification task? I am attempting to use this for DA labeling on a conversational dataset, and it gives varied and poor results for the same simple sentence "Okay." I assume this is because of dropout and also over- or under-fitting. Overall, I'm not sure this model gives me the confidence required to use it for my project as is. If the training scripts and the data were released, that would be awesome!
Haven't gotten around to it yet; I've been really busy, but I will give it a try one of these weekends @argideritzalpea
Ever get a chance to try this out, @JohnnyC08?
Hi,
@JohnnyC08 I ended up training a DeBERTa-based dialog act classifier on the silicone-merged dataset using sentence pairs (previous utterance, current utterance), and it performs better than single utterances. You can take a look here.
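The sentence-pair setup can be sketched as follows (pure Python; in practice each `(previous, current)` pair would be passed to the tokenizer as two segments, and the helper name here is illustrative, not from the linked repo):

```python
def make_utterance_pairs(utterances, labels):
    """Turn a labelled conversation into ((previous, current), label) pairs,
    using an empty string as the 'previous' turn for the first utterance."""
    pairs = []
    for i, (utt, label) in enumerate(zip(utterances, labels)):
        prev = utterances[i - 1] if i > 0 else ""
        pairs.append(((prev, utt), label))
    return pairs

convo = ["Do you want to grab lunch?", "Not really.", "Oh okay."]
acts = ["Yes-No-Question", "Dispreferred-answers", "Response-Acknowledgement"]
for (prev, curr), act in make_utterance_pairs(convo, acts):
    print(repr(prev), "->", repr(curr), "=>", act)
```

Feeding both segments lets the encoder attend across the turn boundary, which is presumably where the accuracy gain over single utterances comes from.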
I'm interested in reproducing the model in PyTorch and am curious how you preprocessed the data and trained it. I didn't see any metrics reported and want to see what those look like as well! So the training script would be nice to have, too!
Great repo by the way!