expertlab-sprint2-scraping

This repository serves as a proof of concept (PoC) of web scraping.

💻 App: Torfs scraper

This app uses the web scraping tool Puppeteer to scrape items from the Torfs website. Users choose which category to scrape by entering the corresponding URL. A development mode is also available: when enabled, the user can limit the scrape to a chosen number of pages instead of the whole category, which speeds up testing.
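
The development-mode page limit described above can be sketched roughly as follows. This is a minimal illustration and not the repository's actual code: the function names, the `page` query parameter used for pagination, and the CSS selectors are all assumptions. The `page` argument is expected to behave like a Puppeteer `Page` (only `goto` and `$$eval` are used).

```javascript
// Hypothetical sketch: none of these names come from the repository.

// Assumed pagination scheme: a `page` query parameter on the category URL.
function buildPageUrl(categoryUrl, pageNumber) {
  const url = new URL(categoryUrl);
  url.searchParams.set("page", String(pageNumber));
  return url.toString();
}

// `page` should behave like a Puppeteer Page object.
async function scrapeCategory(page, categoryUrl, { devMode = false, maxPages = 1 } = {}) {
  const items = [];
  for (let pageNumber = 1; ; pageNumber++) {
    // In development mode, stop once the requested number of pages is reached.
    if (devMode && pageNumber > maxPages) break;

    await page.goto(buildPageUrl(categoryUrl, pageNumber), { waitUntil: "networkidle2" });

    // ".product-tile" and ".product-name" are placeholder selectors.
    const pageItems = await page.$$eval(".product-tile", (tiles) =>
      tiles.map((tile) => ({
        name: tile.querySelector(".product-name")?.textContent.trim() ?? "",
      }))
    );

    // An empty page is treated as the end of the category.
    if (pageItems.length === 0) break;
    items.push(...pageItems);
  }
  return items;
}
```

With Puppeteer installed, `page` would typically come from `browser.newPage()` after `puppeteer.launch()`. Passing the page in as an argument keeps the loop easy to exercise with a stub.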

The app displays all scraped items in a table containing the names, types, number of colors, prices, images and a link to the original item. The user can also download this data as a JSON file.
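
To make the table columns and the JSON export more concrete, here is a small sketch of how a scraped item might be represented and serialized. The field names (`name`, `type`, `colorCount`, `price`, `image`, `link`) and both helper functions are assumptions for illustration; the app's real keys may differ.

```javascript
// Hypothetical item shape; the field names are assumptions, not the app's actual keys.
const exampleItem = {
  name: "Example sneaker",
  type: "Sneakers",
  colorCount: 3,
  price: 59.99,
  image: "https://example.com/images/sneaker.jpg",
  link: "https://example.com/items/sneaker",
};

// One entry per table column.
const COLUMNS = ["name", "type", "colorCount", "price", "image", "link"];

// Keep only the known columns so each exported record matches the table.
// Missing values become null instead of being dropped.
function toRecord(item) {
  return Object.fromEntries(COLUMNS.map((key) => [key, item[key] ?? null]));
}

// Serialize the items as pretty-printed JSON, ready to be offered as a download.
function itemsToJson(items) {
  return JSON.stringify(items.map(toRecord), null, 2);
}

console.log(itemsToJson([exampleItem]));
```

In a browser, the resulting string could be wrapped in a `Blob` and offered through a download link; that part is omitted here.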

Screenshots

[Screenshot 1]

Dev mode enabled: [Screenshot 2]

JSON download: [Screenshot 3, JSON]

🔧 Installation

👤 Manual installation

  1. Clone this repository and navigate to the folder
     git clone https://github.com/SandroBarillaPXL/expertlab-sprint2-scraping
     cd expertlab-sprint2-scraping
  2. Install the dependencies
     npm install
  3. Start the backend API server, accessible at http://localhost:3000
     node scripts/api.js
  4. Serve the frontend with a simple HTTP server of your choice, such as the "Live Server" extension for Visual Studio Code.
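
If no editor extension is at hand, the frontend could also be served with a few lines of Node.js. The sketch below is one possible option and is not part of the repository; the function names and the MIME table are illustrative. It only uses Node's built-in `http`, `fs` and `path` modules.

```javascript
const http = require("node:http");
const fs = require("node:fs");
const path = require("node:path");

// Minimal MIME table; extend as needed.
const MIME_TYPES = {
  ".html": "text/html",
  ".js": "text/javascript",
  ".css": "text/css",
  ".json": "application/json",
};

// Map a request URL to a file inside rootDir, or null if it would escape rootDir.
function resolveFile(rootDir, requestUrl) {
  const pathname = decodeURIComponent(new URL(requestUrl, "http://localhost").pathname);
  const relative = pathname.endsWith("/") ? pathname + "index.html" : pathname;
  const root = path.resolve(rootDir);
  const filePath = path.resolve(root, "." + relative);
  return filePath.startsWith(root + path.sep) ? filePath : null;
}

// Build (but do not start) a static file server for rootDir.
function createStaticServer(rootDir) {
  return http.createServer((request, response) => {
    let filePath = null;
    try {
      filePath = resolveFile(rootDir, request.url);
    } catch {
      // Malformed URL or percent-encoding: handled as a bad request below.
    }
    if (!filePath) {
      response.writeHead(400).end("Bad request");
      return;
    }
    fs.readFile(filePath, (error, data) => {
      if (error) {
        response.writeHead(404).end("Not found");
        return;
      }
      const type = MIME_TYPES[path.extname(filePath)] ?? "application/octet-stream";
      response.writeHead(200, { "Content-Type": type }).end(data);
    });
  });
}
```

To try it, call `createStaticServer("./frontend").listen(8080)`. Both the directory and the port are placeholders, so point it at wherever the frontend files live in your checkout.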

💡 Note: Puppeteer requires a Chromium browser to be installed on your system.


🐳 Docker installation

  1. Clone this repository and navigate to the directory
     git clone https://github.com/SandroBarillaPXL/expertlab-sprint2-scraping
     cd expertlab-sprint2-scraping
  2. (Optional) Build the Docker images for both the frontend and the backend
     docker build -t <username>/<imagename-frontend>:<tag> -f docker/Dockerfile-fe .
     docker build -t <username>/<imagename-backend>:<tag> -f docker/Dockerfile-be .
  3. Run the Docker containers
     docker run -d -p 3000:3000 <username>/<imagename-backend>:<tag>
     docker run -d -p <port>:80 <username>/<imagename-frontend>:<tag>

Alternatively, you can use the docker-compose.yml file to run the containers. By default, the app is available at http://localhost:8500.

docker compose -f ./docker/docker-compose.yml up -d
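
Once the containers are up, a quick reachability check can confirm that the services answer. The helper below is a hypothetical sketch, not something shipped with the repository. It assumes Node.js 18 or later for the built-in `fetch`, and the URLs you pass in depend on your own port mappings.

```javascript
// Hypothetical helper: reports the HTTP status for each URL, or "unreachable".
// fetchFn defaults to the global fetch (Node.js 18+) and can be swapped for a stub.
async function checkReachable(urls, fetchFn = fetch) {
  const results = {};
  for (const url of urls) {
    try {
      const response = await fetchFn(url);
      // Any HTTP status means the server answered, even if it is a 404.
      results[url] = response.status;
    } catch (error) {
      // Connection refused, DNS failure, and similar network errors end up here.
      results[url] = "unreachable";
    }
  }
  return results;
}

// Example call (not executed here):
// checkReachable(["http://localhost:8500"]).then(console.log);
```

A status code only shows that something is listening on the port; it does not prove that scraping works end to end.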
