Web-Scraping-Reviews-from-Goodreads

Introduction

Web scraping, also called web harvesting or web data extraction, is the automated extraction of data from websites by way of their HTML structure. This project starts by introducing the common steps involved in web scraping with Beautiful Soup.

Beautiful Soup is a Python library for pulling data out of HTML and XML files. It works with your favorite parser to provide idiomatic ways of navigating, searching, and modifying the parse tree. It commonly saves programmers hours or days of work.
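To make the navigating and searching concrete, here is a minimal sketch that parses a small inline HTML snippet. The HTML, the `id="main"` attribute, and the `/book/...` links are invented for illustration and are not taken from the notebook.

```python
from bs4 import BeautifulSoup

# Made-up HTML used only for illustration.
html = """
<html><head><title>Sample page</title></head>
<body>
  <h1 id="main">Historical Fiction</h1>
  <a href="/book/1">First book</a>
  <a href="/book/2">Second book</a>
</body></html>
"""

# Build the parse tree with Python's built-in HTML parser.
soup = BeautifulSoup(html, "html.parser")

# Navigate directly to a tag by name.
title = soup.title.string

# Search for a single tag by name and attribute.
heading = soup.find("h1", id="main").get_text(strip=True)

# Search for every matching tag and read an attribute from each.
links = [a["href"] for a in soup.find_all("a")]

print(title, heading, links)
```

`find` returns the first match (or `None`), while `find_all` returns a list, so the second is the usual choice when a page repeats the same element many times.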

The second part of the project deals with extracting reviews from the Goodreads site. Specifically, novels in the historical section were targeted.
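The extraction step could look roughly like the sketch below. It is not the notebook's code: the CSS selector `div.review-text`, the `User-Agent` header value, and the helper names `fetch_html` and `extract_reviews` are all assumptions, and the real Goodreads markup may differ and change over time.

```python
import requests
from bs4 import BeautifulSoup

# Hypothetical selector; inspect the live page to find the real one.
REVIEW_SELECTOR = "div.review-text"


def fetch_html(url, timeout=10):
    """Download a page and return its HTML, raising on HTTP errors."""
    response = requests.get(
        url,
        headers={"User-Agent": "Mozilla/5.0"},  # assumed header value
        timeout=timeout,
    )
    response.raise_for_status()
    return response.text


def extract_reviews(html, selector=REVIEW_SELECTOR):
    """Return the text of every element matching the selector."""
    soup = BeautifulSoup(html, "html.parser")
    return [node.get_text(" ", strip=True) for node in soup.select(selector)]


# Offline demonstration on made-up markup.
sample_html = (
    '<div class="review-text">A <b>gripping</b> tale.</div>'
    '<div class="other">Not a review.</div>'
)
sample_reviews = extract_reviews(sample_html)
print(sample_reviews)
```

Keeping the download and the parsing in separate functions lets the parsing be checked against saved HTML without touching the network.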

Once the extraction is complete, the extracted reviews are analysed. The most frequently used words in the reviews are compared, and a collage (word cloud) is then constructed to showcase this data.
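One way to find the most frequent words is sketched below using only the standard library. The sample reviews, the small stop-word set, and the helper name `word_frequencies` are placeholders chosen for illustration, not the data or code used in the notebook.

```python
import re
from collections import Counter

# Tiny illustrative stop-word set; a real analysis would use a fuller list.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "is", "it", "in", "this", "was", "i"}


def word_frequencies(reviews, stopwords=STOPWORDS):
    """Count lowercase words across all reviews, skipping stop words."""
    counts = Counter()
    for review in reviews:
        words = re.findall(r"[a-z']+", review.lower())
        counts.update(w for w in words if w not in stopwords)
    return counts


# Made-up reviews used only to exercise the function.
reviews = [
    "The plot was gripping and the characters were vivid.",
    "A gripping plot, vivid setting.",
]
counts = word_frequencies(reviews)
print(counts.most_common(3))
```

A mapping like `counts` can then be handed to the `wordcloud` package, for example via `WordCloud().generate_from_frequencies(counts)`, to draw the collage.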

Requirements

Python
BeautifulSoup
requests
WordCloud

An Example of a Review Collage
