ECS 171 Final Project: Predicting Restaurant Ratings with Sentiment Analysis

Authors

Nicholas Martinez
Olivia Shen
Jonathan Tran

Project Overview

The goal of this project was to create a tool for predicting restaurant ratings given a review input, as well as to determine what kind of sentiments are most important for predicting the success of a restaurant.

Installation

To install the required libraries, please run:

pip install -r requirements.txt

To install the model needed to extract the key topics from the reviews, run:

wget https://github.com/explosion/sense2vec/releases/download/v1.0.0/s2v_reddit_2015_md.tar.gz
tar -xvzf s2v_reddit_2015_md.tar.gz
rm s2v_reddit_2015_md.tar.gz
rm ._s2v_old

Note: The downloaded tar file is 573 MB so it may take a while. If the wget takes too long to run click that github link and download it in your browser, into the ecs-171-final-project folder. Then run the other 3 lines (after the wget) above.

Folders

inital-data-cleaning

These files converted the large file containing the reviews of every business in california to the reviews of every restaurant in Sacramento and Yolo county.

The files here are not very important to run (and also take 4+ hours). The files that they generate are only used in eda.ipynb. These files may be obtained at https://drive.google.com/drive/u/1/folders/1WY9swnKV1Ky-ZU1z6Gsyd9w3QZr7C9lU put these files in a folder named data

The files do require a download of the large file because it is too big to upload to github https://datarepo.eng.ucsd.edu/mcauley_group/gdrive/googlelocal/#subsets
California -> both reviews and metadata put in folder named data

model-testing

These files train, evaulate, and save the models to be used on the website. This is the most related to our ECS 171 course.

linear_model.ipynb
- based on weighted sentiments
- generates linear.pt
neural_model.ipynb
- based on weighted sentiments
- generates nn_model.pt
poly_model.ipynb
- we ended up not using this one and the file takes 1 hour to run so its not important
total_sent_neural_model.ipynb
- based on the total sentiment counts
- generates nn4_40.pt
neural_test.py
- contains functions to train, test, and evaulate the models

models

The saved pytorch models to be loaded for the website

processed_data

Contains files that are used to train the model. The json files are the training data. The csv files are for initial data cleaning and aren't very important.

topics-sentiments

This is the code that ends up generating the files that contain the features used in the models.

eda.ipynb
- mostly graphs and tables of the data thats generated
- requires ys-reviews-with-categories.json and ys-reviews-restaurants.json, which may be downloded at the link below
sentiment_analysis.ipynb
- runs a sentiment analysis model on every review and assigns positive or negative food/service/location/clean/price to every sentence in the review.
- it then appends these reviews to their corresponding restaurants
- this code takes up to 24 hours so probably don't run this
- you can download the files this produces here: https://drive.google.com/drive/u/1/folders/1WY9swnKV1Ky-ZU1z6Gsyd9w3QZr7C9lU put these files in a folder named data
topic-detection.ipynb
- uses a model to assign a list of topics to every sentence (food/service/location/clean/price)
- also takes a few hours so don't run
- the files generated here are needed for sentiment_analysis.ipynb
topic-sentiment.ipynb
- requires the files produced by sentiment_analysis.ipynb (or just download from link above)
- generates the final files to be used in the models
- each row represents a business
- the columns represent the types/counts of sentiments the business has
- generates restaurant-topic-sentiment.json and restaurant-topic-sentiment2.json

website

Has the code needed to run the website

To run:

cd ./website
flask --app app.py run

app.py
- contains the actual flask backend
predict.py
- a file that contains classes to help abstract the models and their methods
process_text.py
- this is code that requires installation of the topic detection models. This code will also download a sentiment analysis model.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

ECS 171 Final Project: Predicting Restaurant Ratings with Sentiment Analysis

Authors

Project Overview

Installation

Folders

inital-data-cleaning

model-testing

models

processed_data

topics-sentiments

website

About

Uh oh!

Releases

Packages

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 129 Commits
intial-data-cleaning		intial-data-cleaning
model-testing		model-testing
models		models
processed_data		processed_data
topics-sentiments		topics-sentiments
website		website
.gitignore		.gitignore
README.md		README.md
requirements.txt		requirements.txt

Folders and files

Latest commit

History

Repository files navigation

ECS 171 Final Project: Predicting Restaurant Ratings with Sentiment Analysis

Authors

Project Overview

Installation

Folders

inital-data-cleaning

model-testing

models

processed_data

topics-sentiments

website

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Packages