ExtroyaIntro

NLU project

Let it Go! Let it goooo!

Files in this repository(Update as it is put in)

1)preprocess.py (Contains the preprocessed text data and the batches for training, validation and test sets

I am adding the preprocessing file for now.

One improvement all of us should try to implement is using 4 separate two class classifiers. Since each label has a label of 4 tags of which each tag can be one of two entities so Each of 4 classifiers will try to predict one entity for each tag e.g. for ,,, each position is a tag and here the first tag takes values 'E' or 'N'

I will upload the preprocessing file later where there will be four binary columns indicating one of the two binary entities for each of four tags. Update(sayambhu): I have now added the preprocessing where the four different binary columns for four different label types are given. #E for 1 and I for 0 #N for 1 and S for 0 #T for 1 and F for 0 #J for 1 and P for 0

Edit: I have uploaded the pickle files and I made a mistake. Each row is a 500 length vector

Aniket will try to implement the RNN model in the kernel for that facebook data. #The improvement point is using biRNNs. Try to do the same code using biRNNs

https://www.kaggle.com/prnvk05/rnn-mbti-predictor/notebook

I will try to implement the Bag of words based classifier according to this kernel #I know there is no improvement on using these methods which have tfidf or bag of words.

But mostly I will improve preprocessing as much as possible. You two should not focus on preprocessing. I will do that.

https://www.kaggle.com/depture/multiclass-and-multi-output-classification

Rishi will try to implement the CNN based classifier according to this code on github or his own code if he wants #The improvement part might be to get multilayer CNNs or different pooling methods

https://github.com/aboyker/convnet-document-classification

Name		Name	Last commit message	Last commit date
Latest commit History 31 Commits
Attention Model		Attention Model
README.md		README.md
data_analysis.py		data_analysis.py
lstner.py		lstner.py
lstnerc.py		lstnerc.py
multilayerthi.py		multilayerthi.py
personality-detection-written.pdf		personality-detection-written.pdf
pos_tags.pckl		pos_tags.pckl
preprocess.py		preprocess.py
preprocess1.py		preprocess1.py
preprocess2.py		preprocess2.py
preprocess3.py		preprocess3.py
test_x.pickle		test_x.pickle
test_y.pickle		test_y.pickle
train_x.pickle		train_x.pickle
train_y.pickle		train_y.pickle
val_x.pickle		val_x.pickle
val_y.pickle		val_y.pickle
word2v.py		word2v.py
wordcl.py		wordcl.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

ExtroyaIntro

Edit: I have uploaded the pickle files and I made a mistake. Each row is a 500 length vector

But mostly I will improve preprocessing as much as possible. You two should not focus on preprocessing. I will do that.

About

Releases

Packages

Contributors 3

Languages

testerpce/ExtroyaIntro

Folders and files

Latest commit

History

Repository files navigation

ExtroyaIntro

Edit: I have uploaded the pickle files and I made a mistake. Each row is a 500 length vector

But mostly I will improve preprocessing as much as possible. You two should not focus on preprocessing. I will do that.

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Contributors 3

Languages

Packages