poverty149/Web-Scraping-Reviews-from-Goodreads

Web-Scraping-Reviews-from-Goodreads

Introduction

Web scraping (also called web harvesting or web data extraction) is the extraction of data from websites by way of their HTML structure. This project starts by introducing the common steps involved in web scraping with Beautiful Soup.

Beautiful Soup is a Python library for pulling data out of HTML and XML files. It works with your favorite parser to provide idiomatic ways of navigating, searching, and modifying the parse tree. It commonly saves programmers hours or days of work.
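As a minimal sketch of what navigating and searching the parse tree looks like, the snippet below parses a small inline HTML string. The markup, the class names (`review`, `author`, `text`) and the helper `extract_reviews` are made up for illustration; they are not Goodreads' actual page structure and not the code used in this project.

```python
from bs4 import BeautifulSoup

# Hypothetical markup for illustration only; NOT Goodreads' real page structure.
SAMPLE_HTML = """
<html><body>
  <div class="review">
    <span class="author">Ann</span>
    <p class="text">A vivid, moving story.</p>
  </div>
  <div class="review">
    <span class="author">Ben</span>
    <p class="text">Slow start, strong ending.</p>
  </div>
</body></html>
"""


def extract_reviews(html):
    """Return a list of {'author', 'text'} dicts, one per review block."""
    # "html.parser" is the parser bundled with Python, so no extra install is needed.
    soup = BeautifulSoup(html, "html.parser")
    reviews = []
    # find_all searches the whole tree for matching tags.
    for block in soup.find_all("div", class_="review"):
        author = block.find("span", class_="author").get_text(strip=True)
        text = block.find("p", class_="text").get_text(strip=True)
        reviews.append({"author": author, "text": text})
    return reviews


reviews = extract_reviews(SAMPLE_HTML)
for review in reviews:
    print(review["author"], "wrote:", review["text"])
```

For a live page, the HTML string would typically come from `requests.get(url).text`, and the tag names and classes would have to be read off the page's real source.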

The second part of the project deals with extracting reviews from the Goodreads site. Specifically, novels listed under the historical section were targeted.

Once the extraction is complete, the extracted reviews are analysed. A comparison illustrates the most frequently used words in the reviews, and a collage is then constructed to showcase this data.
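The word-frequency step can be sketched with the standard library alone. The sample reviews, the small stop-word set and the helper `word_frequencies` below are assumptions for illustration; the project's own analysis may tokenize and filter differently.

```python
import re
from collections import Counter

# A deliberately tiny stop-word set, assumed for this sketch.
STOPWORDS = {"a", "the", "but", "is", "with", "and", "of", "to", "in", "it"}


def word_frequencies(reviews, stopwords=STOPWORDS):
    """Count how often each non-stop-word appears across all reviews."""
    counts = Counter()
    for review in reviews:
        # Lowercase, then keep runs of letters and apostrophes as words.
        words = re.findall(r"[a-z']+", review.lower())
        counts.update(word for word in words if word not in stopwords)
    return counts


# Made-up review text, not scraped data.
sample_reviews = [
    "A gripping story with rich historical detail.",
    "The story drags, but the historical setting is superb.",
]

frequencies = word_frequencies(sample_reviews)
top_words = frequencies.most_common(2)
print(top_words)
```

A mapping like `frequencies` can be passed to the `wordcloud` package's `WordCloud().generate_from_frequencies(...)` to render the collage.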

Requirements

  • Python
  • BeautifulSoup
  • requests
  • WordCloud

An Example of a Review Collage

reviews
