This is the homepage for the AI4ALL 2018 NLP research project. Here you can find links to all class materials used for the research project.
Instructors: Rob Voigt, Bingbin Liu
- Week 1 - Fri / Lesson 0: Introduction to NLP
- Week 2 - Tue / Lesson 1: Rule-based classifiers
- Week 2 - Wed / Lesson 2: Evaluation metrics (Exercise sheet here)
- Week 2 - Thu / Lesson 3: Probability theory and Bayes rule (Exercise sheet here)
- Week 2 - Fri / Lesson 4: Naive Bayes classifier
- Week 3 - Mon / Lesson 5: More NLP
- Week 3 - Tue / Lesson 6: Naive Bayes classifier for Twitter project
- Week 3 - Wed / Lesson 7: Neural Networks
- 5-minute talk at the banquet
- Lesson 0: Data exploration spreadsheet
- Lecture on text processing (e.g. regular expressions, tokenization, lemmatization/stemming) from Stanford CS 124 by Professor Dan Jurafsky
- Python cheat sheet: feel free to put comments or things you'd like to know about in the slides!
- Naive Bayes cheat sheet
- LaTeX to make our slides / poster pretty
- Next Steps: Resources for after AI4ALL
We will go through this together on July 3rd, but feel free to start on your own! :)
-
Install Anaconda.
Anaconda is a Python distribution that makes it easy to install additional Python packages and manage different Python versions. You can download Anaconda from https://www.anaconda.com/download/. Make sure to download the Python 3.6 version! This should also automatically install Jupyter Notebook, which you'll need to run the notebooks.
-
Install NLTK, NumPy, and pandas:
Open a Terminal window and type
conda install nltk numpy pandas
-
Copy ("clone") the GitHub repository to your computer:
Open a Terminal window and type
git clone https://github.com/ClaraBing/AI4ALL2018
This will copy all the notebooks to your computer.
-
Change into the directory:
In the same Terminal window, type
cd AI4ALL2018
-
Download the tokenizer models:
Start a Python console by typing
python
in the Terminal window. Then run the following commands:
import nltk
nltk.download("punkt")
exit()
-
Start Jupyter Notebook. In the same Terminal window, type
jupyter notebook
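Before opening the notebooks, it can help to confirm that the packages from the installation step are importable. The following is a minimal sketch and not part of the course materials: the helper name missing_packages is hypothetical, and it only checks that each package can be found, not that its version matches what the notebooks expect.

```python
import importlib.util


def missing_packages(names):
    """Return the subset of `names` that Python cannot find for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Packages installed with "conda install" in the setup steps.
    required = ["nltk", "numpy", "pandas"]
    missing = missing_packages(required)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages were found.")
```

If the script reports a missing package, rerunning the conda install command from the setup steps is the likely fix.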
The filled directory contains versions of the IPython notebooks with the solutions filled in, which will be released at the end of each day. If you would like to run these, you need to copy them to the main directory (i.e. AI4ALL2018), overwriting the blank versions of the notebooks that are currently there. Then run jupyter notebook and you should be able to access the completed versions of the notebooks.
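The copy step can also be done with a short Python script instead of by hand. This is a rough sketch under stated assumptions: the function name copy_filled_notebooks is hypothetical, the solutions are assumed to sit directly inside the filled directory as .ipynb files, and the script is assumed to be run from inside the AI4ALL2018 directory.

```python
import shutil
from pathlib import Path


def copy_filled_notebooks(repo_dir, filled_name="filled"):
    """Copy every .ipynb file from repo_dir/filled_name into repo_dir.

    Existing notebooks with the same file name are overwritten.
    Returns the sorted list of copied file names.
    """
    repo = Path(repo_dir)
    copied = []
    for notebook in sorted((repo / filled_name).glob("*.ipynb")):
        # copy2 replaces the blank notebook of the same name in the main directory.
        shutil.copy2(notebook, repo / notebook.name)
        copied.append(notebook.name)
    return copied


if __name__ == "__main__":
    # Assumption: the current directory is the cloned AI4ALL2018 repository.
    repo = Path(".")
    if (repo / "filled").is_dir():
        for name in copy_filled_notebooks(repo):
            print("Copied", name)
    else:
        print("No 'filled' directory found here.")
```

Because the blank notebooks are overwritten, any work you have saved in them is lost; consider copying your own versions somewhere else first.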