SG Property Hub is a comprehensive platform for managing and analyzing property data. It consists of multiple components including a back-end API, a data pipeline, a front-end application, and a property crawler worker. Each component is containerized using Docker for easy deployment and orchestration.
The data pipeline component handles data processing tasks using Apache Spark and Apache Airflow.
- DAGs: Contains Airflow DAGs for orchestrating data workflows.
- Spark: Contains Spark jobs for data processing.
- MinIO: Used for object storage.
- Dependencies: managed using `requirements.txt`.
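As an illustration of how a DAG in this layout might orchestrate a processing step, here is a minimal sketch; the DAG id, schedule, and task below are hypothetical placeholders, not names taken from this repository:

```python
# Minimal Airflow DAG sketch. The DAG id, schedule, and the
# process_listings callable are illustrative placeholders, not
# names taken from this repository.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def process_listings():
    # Placeholder for a task that would hand data to a Spark job.
    print("processing property listings")


with DAG(
    dag_id="property_pipeline",  # hypothetical DAG id
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    PythonOperator(
        task_id="process_listings",
        python_callable=process_listings,
    )
```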
The back-end component provides APIs and database management functionalities.
- API: Contains the API logic.
- DB: Contains database-related scripts and configurations.
- Dependencies: managed using `requirements.txt`.
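The README does not state which framework powers the API; as a rough sketch assuming a FastAPI-style service (whose default uvicorn port matches the http://localhost:8000 address used below), an endpoint might look like this. The route and model are hypothetical, not the repo's actual schema:

```python
# Minimal API sketch. FastAPI is an assumption; the /properties route
# and the Property model are hypothetical, not the repo's real schema.
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()


class Property(BaseModel):
    id: int
    name: str
    price: float


@app.get("/properties")
def list_properties() -> list[Property]:
    # In the real service this would query the database.
    return [Property(id=1, name="Example Condo", price=1_250_000.0)]
```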
The front-end application is built with Next.js.
- Components: Reusable UI components.
- Utilities: Helper functions and utilities.
- Configuration: Next.js and Tailwind CSS configurations.
- Dependencies: managed using `package.json`.
The property crawler worker is responsible for scraping property data from the web.
- Scripts: Contains scripts for crawling different property websites.
- Dependencies: managed using `requirements.txt`.
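As a rough sketch of what one of these scripts could look like, assuming a requests + BeautifulSoup approach (the URL and CSS selector below are placeholders, not taken from this repo):

```python
# Minimal crawler sketch. The URL and the .listing-title selector are
# placeholders; each real script targets a specific property website.
import requests
from bs4 import BeautifulSoup


def crawl(url: str) -> list[str]:
    response = requests.get(url, timeout=30)
    response.raise_for_status()
    soup = BeautifulSoup(response.text, "html.parser")
    # Hypothetical selector: extract listing titles from the page.
    return [tag.get_text(strip=True) for tag in soup.select(".listing-title")]


if __name__ == "__main__":
    for title in crawl("https://example.com/listings"):
        print(title)
```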
Before you begin, make sure the following are installed:

- Docker
- Docker Compose
- Node.js (for the Next.js application)
- Clone the repository:

  ```bash
  git clone https://github.com/vietdoo/sg-property-hub.git
  cd sg-property-hub
  ```

- Back-End:

  ```bash
  cd back-end
  docker-compose up --build
  ```

- Data Pipeline:

  ```bash
  cd data-pipeline
  docker-compose up --build
  ```

- Next-Hub:

  ```bash
  cd next-hub
  npm install
  npm run dev
  ```

- Property Crawler Worker:

  ```bash
  cd property-crawler-worker
  docker-compose up --build
  ```
- Access the API at http://localhost:8000.
- Access the Airflow UI at http://localhost:8080.
- Access the front-end application at http://localhost:3000.
- Run the crawler scripts as needed.
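Once the back-end is up, a quick smoke test from Python might look like this (the /properties route is a hypothetical placeholder; check the back-end code for the real endpoints):

```python
# Quick smoke test against the running API. The /properties route is
# a placeholder; substitute an actual endpoint from the back-end.
import requests

response = requests.get("http://localhost:8000/properties", timeout=10)
response.raise_for_status()
print(response.json())
```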
Contributions are welcome! Please open an issue or submit a pull request.
This project is licensed under the MIT License.