(**Note:** the codebase for the initial prototype app is now archived.)
LeadsDB is a lead-generation system. On the back end, it ingests company data and Newly Registered Domains (NRDs) daily, attempts to identify companies that may serve as candidate leads, and presents the best results to the user (see below).
- ✓ Lead data can be retrieved through a REST API.
- 🚧 Users receive a weekly email blast of a few companies matching their preferences. (Under construction: approx. 85% complete)
- 🔴 Users can integrate LeadsDB into their Hubspot or Salesforce projects, and leads will flow directly into their existing system. (Not started)
This repository is home to the user-facing front end, as well as the user-facing REST API. All code related to data ingestion exists in a separate, private micro-client.
This repository integrates a Next.js frontend with a Flask backend. For data storage, the app uses an external Cassandra database, which forwards data to external data warehouses through CDC trigger procedures. Features include automatic NRD ingestion, company data enrichment via the Abstract API, and subscriber management via the Notion API.
The app is automatically deployed to GCP Cloud Run, where it runs as a stateless app. Importantly, that means contributors should not add code that saves state locally (e.g. writing to the local filesystem). We are loosely following a trunk-based branching strategy, which will be formalized once Phase 2 is complete.
- Node.js (with the pnpm package manager)
- Python 3.x
- pip package manager
```bash
git clone https://github.com/IsaacBell/leads-db.git
cd leads-db
pnpm install
pip install -r requirements.txt
python -m spacy download en_core_web_md
```

Note that `pip install -r requirements.txt` must run before the spaCy model download, since the download command requires spaCy to be installed.
- Create a `.env` file in the root directory.
- Add the following variables to the `.env` file:
  - `ASTRA_DB_API_ENDPOINT`: Astra DB API endpoint URL
  - `ASTRA_DB_APPLICATION_TOKEN`: Astra DB application token
  - `PULSAR_STREAMING_API_TOKEN`: Pulsar streaming API token
  - `ASTRA_DB_STREAMING_URL`: Astra DB streaming URL
  - `ABSTRACT_API_COMPANY_ENRICHMENT_API_URL`: Abstract API company enrichment URL
  - `ABSTRACT_API_COMPANY_ENRICHMENT_API_KEY`: Abstract API company enrichment API key
  - `ABSTRACT_API_SCRAPE_URL`: Abstract API scrape URL
  - `ABSTRACT_API_SCRAPE_API_KEY`: Abstract API scrape API key
  - `NOTION_TOKEN`: Notion API token
  - `NOTION_DB_ID`: Notion database ID
  - `OPENAI_API_KEY`: OpenAI API key
  - `KAFKA_URL`: Kafka broker address
  - `KAFKA_USERNAME`: Kafka username
  - `KAFKA_PASSWORD`: Kafka password
  - `MOESIF_APP_ID`: Moesif API monetization platform app ID
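For reference, a `.env` file might take the following shape. Every value below is a placeholder, not a real credential or endpoint:

```env
ASTRA_DB_API_ENDPOINT=https://your-db-id-your-region.apps.astra.datastax.com
ASTRA_DB_APPLICATION_TOKEN=AstraCS:placeholder
NOTION_TOKEN=secret_placeholder
NOTION_DB_ID=placeholder-database-id
OPENAI_API_KEY=sk-placeholder
KAFKA_URL=broker.example.com:9092
KAFKA_USERNAME=placeholder
KAFKA_PASSWORD=placeholder
MOESIF_APP_ID=placeholder
```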
```bash
pnpm run dev
```
This will concurrently start the Next.js frontend and the Flask backend.
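The exact script definition lives in `package.json`; one common shape for a combined dev script is sketched below. This is illustrative only — the actual script names, port, and tooling in this repo may differ:

```json
{
  "scripts": {
    "dev": "concurrently \"next dev\" \"flask --app api/index run -p 5328\""
  }
}
```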
The frontend is built using Next.js and React. It provides a user interface for entering email, country, and industry preferences to subscribe for weekly updates.
- `app/page.tsx`: The main page.
- `utils/staticData.ts`: Contains static data for countries and industries.
The backend is built with Flask and provides various API endpoints for company data management and subscriber management.
- `api/index.py`: The main Flask application file that defines the API routes and schedules background tasks.
- `api/models`: All model classes.
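As an illustration of how `api/index.py` might schedule a recurring background task such as the daily merge, here is a hypothetical sketch using only the standard library. The real app may use a scheduling library instead; the function and variable names below are not from this codebase:

```python
import threading
import time

def schedule_repeating(interval_seconds: float, task) -> threading.Timer:
    """Run `task` every `interval_seconds` seconds on a daemon background timer."""
    def runner():
        task()
        # Re-arm the timer so the task keeps repeating.
        nxt = threading.Timer(interval_seconds, runner)
        nxt.daemon = True
        nxt.start()

    timer = threading.Timer(interval_seconds, runner)
    timer.daemon = True
    timer.start()
    return timer

# Demo: a stand-in for the daily merge task (the real interval would be 24 hours).
merges = []
schedule_repeating(0.01, lambda: merges.append("daily_merge"))
time.sleep(0.1)  # give the timer a moment to fire (demo only)
```

Because Cloud Run instances are stateless and may be scaled to zero, in-process timers like this are best-effort; the repo also relies on a GitHub Actions workflow for the daily update (see below in this README).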
- `/api/heartbeat`: Returns a heartbeat response to check if the server is running.
- `/api/v1/company-enrichment`: Retrieves company data from the Abstract API using a provided domain.
- `/api/v1/scrape`: Scrapes a web page using the Abstract API.
- `/api/v1/_system/daily_merge`: Performs a daily merge of data.
- `/api/v1/_system/ingestions`: Returns daily newly registered domains (NRDs).
- `/api/v1/companies/<id>`: Retrieves a company by its ID.
- `/api/v1/companies_by_name/<name>`: Retrieves a company by its name.
- `/api/v1/companies`: Inserts a new company.
- `/api/v1/subscribe`: Adds a new subscriber using the Notion API.
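A minimal Python client for a couple of these endpoints might look like the following. This is a sketch using only the standard library; the base URL, port, and request/response shapes are assumptions for illustration, not part of the documented API:

```python
import json
import urllib.request

BASE_URL = "http://localhost:5328"  # assumed local dev address; adjust as needed

def endpoint(path: str) -> str:
    """Build a full URL for an API path."""
    return f"{BASE_URL}{path}"

def get_company(company_id: str) -> dict:
    """Fetch a company by ID from /api/v1/companies/<id>."""
    with urllib.request.urlopen(endpoint(f"/api/v1/companies/{company_id}")) as resp:
        return json.load(resp)

def subscribe(email: str, country: str, industry: str) -> dict:
    """POST a new subscriber to /api/v1/subscribe.

    The JSON field names here are guesses based on the frontend's
    email/country/industry preference form.
    """
    body = json.dumps(
        {"email": email, "country": country, "industry": industry}
    ).encode()
    req = urllib.request.Request(
        endpoint("/api/v1/subscribe"),
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```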
The repository includes a GitHub Actions workflow for daily data updates. The workflow is defined in `.github/workflows/daily-updater.yml` and runs on a scheduled basis or can be triggered manually.
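A workflow with that trigger pattern typically looks like the sketch below. This is illustrative only — the actual schedule, Python version, and job steps in `daily-updater.yml` may differ, and the final `run` step is a hypothetical entry point:

```yaml
name: Daily Updater

on:
  schedule:
    - cron: "0 6 * * *"  # every day at 06:00 UTC
  workflow_dispatch: {}  # allow manual runs

jobs:
  update:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"
      - run: pip install -r requirements.txt
      # Hypothetical entry point for the daily update; the real step may differ.
      - run: python -m api.daily_merge
```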
- Use GCloud Build secret keys for the service account credential file
- Finish building API governance system using Moesif
- Set MOESIF_APP_ID in GCP
- Migrate Cloud Build Python version to 3.12.1
This project is licensed under the MIT License.