arpitv424/Scraped-News

Scraped News




About

This project scrapes news from several well-known news websites in order to anticipate the virality of news headlines. The dataset is stored in news-scrape.csv and contains around 9,000 news items. The scraped websites are InShorts, BBC News, ABC News, Washington Post, Daily Mail, Google News, and FOX News.


Procedure to run the code

Install all the required Python libraries:

Beautiful Soup

pip install bs4

lxml

pip install lxml

Requests

pip install requests

Pandas (for data handling)

pip install pandas

Files with the .py extension can be run directly with Python. Files with the .ipynb extension are notebooks and need Jupyter Notebook to run.
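To show how the libraries listed above fit together, here is a minimal sketch of one scraping step. It is not the repository's actual code: the function names, the `h2.title` selector, the `source` and `headline` column names, and the sample markup are all assumptions made for illustration. Each real site needs its own selector.

```python
import pandas as pd
import requests
from bs4 import BeautifulSoup


def fetch_html(url, timeout=10):
    """Download a page and return its HTML text (raises on HTTP errors)."""
    response = requests.get(url, timeout=timeout)
    response.raise_for_status()
    return response.text


def extract_headlines(html, selector, source, parser="lxml"):
    """Return one {"source", "headline"} row per element matching a CSS selector."""
    soup = BeautifulSoup(html, parser)
    rows = []
    for tag in soup.select(selector):
        text = tag.get_text(strip=True)
        if text:  # skip elements with no visible text
            rows.append({"source": source, "headline": text})
    return rows


# Inline sample markup (hypothetical) so the sketch runs without network access.
SAMPLE_HTML = """
<html><body>
  <h2 class="title">First headline</h2>
  <h2 class="title">  Second headline </h2>
  <h2 class="title"></h2>
  <p>Not a headline</p>
</body></html>
"""

rows = extract_headlines(SAMPLE_HTML, "h2.title", source="example")
df = pd.DataFrame(rows, columns=["source", "headline"])
print(df)
```

For a live page, the HTML would come from `fetch_html(url)` instead of the inline string, and the resulting frame could be written out with `df.to_csv(..., index=False)`.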


Important: The final size of the dataset may differ from the figure mentioned above, because it depends on how much content is available on each website at the time of scraping.
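Since the row count can change from run to run, a quick size check after scraping may be useful. The sketch below is only an illustration: the helper name `summarize_dataset` and the sample columns are assumptions, and the real columns in news-scrape.csv may differ.

```python
import io

import pandas as pd


def summarize_dataset(csv_source):
    """Read a CSV and report its row count, de-duplicated row count, and columns."""
    df = pd.read_csv(csv_source)
    return {
        "rows": len(df),
        "unique_rows": len(df.drop_duplicates()),
        "columns": list(df.columns),
    }


# Small in-memory stand-in for the real file (hypothetical columns and values).
SAMPLE_CSV = io.StringIO(
    "source,headline\n"
    "example,First headline\n"
    "example,Second headline\n"
    "example,First headline\n"
)

summary = summarize_dataset(SAMPLE_CSV)
print(summary)
```

On the real dataset the call would be `summarize_dataset("news-scrape.csv")`, assuming the file is in the working directory.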


Sample Data

[Image: sample of the scraped data]
