A serverless Python backend for tracking competitor pricing and generating AI-powered competitive intelligence. Built with AWS Lambda, PostgreSQL, and a flexible web scraping architecture.
- **Flexible Web Scraping**: Choose between Playwright (FREE) and ScrapingBee (PAID)
- **AI Battle Cards**: GPT-4 powered competitive analysis and positioning
- **Intelligent URL Discovery**: Optimized workflow with confidence validation (NEW)
- **Confidence Validation**: Prevents wrong results for lesser-known companies (NEW)
- **Social Media Integration**: Automated tracking of LinkedIn, Twitter, Instagram, and TikTok
- **Serverless Architecture**: AWS Lambda + RDS PostgreSQL for scalability
- **Automated Scheduling**: Regular competitor monitoring every 6 hours
- **Historical Tracking**: Store and analyze pricing trends over time
- **Multi-tenant Support**: Built for SaaS with user isolation
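
As an illustration of the flexible scraping feature, the sketch below shows one way a backend could be selected at runtime. It is not taken from this codebase: the `SCRAPER_BACKEND` environment variable, `get_scraper`, and both fetcher functions are hypothetical names, and the fetchers are stubs standing in for real Playwright and ScrapingBee clients.

```python
import os
from typing import Callable, Dict, Optional

# Hypothetical stubs: real implementations would drive Playwright or call ScrapingBee.
def fetch_with_playwright(url: str) -> str:
    return f"<html><!-- playwright stub for {url} --></html>"

def fetch_with_scrapingbee(url: str) -> str:
    return f"<html><!-- scrapingbee stub for {url} --></html>"

# Registry mapping a backend name to the function that fetches a page.
SCRAPER_BACKENDS: Dict[str, Callable[[str], str]] = {
    "playwright": fetch_with_playwright,
    "scrapingbee": fetch_with_scrapingbee,
}

def get_scraper(name: Optional[str] = None) -> Callable[[str], str]:
    """Resolve a backend by explicit name, else SCRAPER_BACKEND, else Playwright."""
    choice = (name or os.environ.get("SCRAPER_BACKEND", "playwright")).lower()
    if choice not in SCRAPER_BACKENDS:
        raise ValueError(f"Unknown scraper backend: {choice!r}")
    return SCRAPER_BACKENDS[choice]
```

Keeping both backends behind the same `str -> str` callable means the rest of the pipeline would not need to know which one is active.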
┌──────────────────┐    ┌──────────────────┐    ┌──────────────────┐
│   API Gateway    │────│ Lambda Functions │────│  RDS PostgreSQL  │
│    (REST API)    │    │    (Handlers)    │    │    (Database)    │
└──────────────────┘    └──────────────────┘    └──────────────────┘
         │                       │                        │
         │              ┌──────────────────┐              │
         └──────────────│  External APIs   │──────────────┘
                        │ • ScrapingBee    │
                        │ • OpenAI GPT-4   │
                        │ • Cohere AI      │
                        │ • Social Media   │
                        │ • Search APIs    │
                        └──────────────────┘
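
To make the API Gateway → Lambda → database path concrete, here is a minimal handler sketch for an API Gateway proxy integration. The route parameter `competitor_id`, the function `fetch_latest_prices`, and the returned record are placeholders invented for illustration; a real handler would query RDS PostgreSQL instead of returning a stub.

```python
import json
from typing import Any, Dict, List

def fetch_latest_prices(competitor_id: str) -> List[Dict[str, Any]]:
    """Hypothetical data-access stub; placeholder data, not real pricing."""
    return [{"competitor_id": competitor_id, "plan": "example", "price_usd": 0.0}]

def _response(status: int, payload: Dict[str, Any]) -> Dict[str, Any]:
    # API Gateway proxy integrations expect statusCode, headers, and a string body.
    return {
        "statusCode": status,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(payload),
    }

def handler(event: Dict[str, Any], context: Any) -> Dict[str, Any]:
    """Lambda entry point: validate the path parameter, then return price rows."""
    params = event.get("pathParameters") or {}
    competitor_id = params.get("competitor_id")
    if not competitor_id:
        return _response(400, {"error": "competitor_id is required"})
    return _response(200, {"prices": fetch_latest_prices(competitor_id)})
```

The handler can be exercised locally by passing a dictionary shaped like a proxy event, for example `handler({"pathParameters": {"competitor_id": "acme"}}, None)`.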
URL discovery now follows a streamlined three-step process:
- **Search Engines Do Implicit Categorization**
  - Search "Company pricing" → pricing URLs
  - Search "Company features" → features URLs
  - Search "Company blog" → blog URLs
- **LLM Ranks Top 10 Most Relevant URLs**
  - Analyzes URL paths, titles, descriptions
  - Returns URLs ordered by relevance with confidence scores
- **LLM Selects Single Best URL**
  - Chooses most valuable for competitive analysis
  - Returns final URL per category with confidence validation
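
The three steps above can be sketched as a small pipeline. This is an assumption-laden outline rather than the project's actual code: `Candidate`, `discover_urls`, and the three callables are hypothetical, and the search and LLM calls are injected as plain functions so the flow can run with stubs.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

CATEGORIES = ("pricing", "features", "blog")

@dataclass
class Candidate:
    url: str
    title: str = ""
    confidence: float = 0.0

SearchFn = Callable[[str], List[Candidate]]                       # query -> results
RankFn = Callable[[str, List[Candidate]], List[Candidate]]        # category, results -> ranked
SelectFn = Callable[[str, List[Candidate]], Optional[Candidate]]  # category, ranked -> best

def discover_urls(company: str, search: SearchFn, rank: RankFn,
                  select: SelectFn, top_n: int = 10) -> Dict[str, Optional[str]]:
    """Return one URL (or None) per category."""
    results: Dict[str, Optional[str]] = {}
    for category in CATEGORIES:
        candidates = search(f"{company} {category}")  # step 1: the query categorizes
        ranked = rank(category, candidates)[:top_n]   # step 2: keep the top N by relevance
        best = select(category, ranked)               # step 3: single best URL, or None
        results[category] = best.url if best else None
    return results

# Stubs standing in for a search API and the two LLM calls.
def fake_search(query: str) -> List[Candidate]:
    slug = query.split()[-1]
    return [Candidate("https://example.com/about"), Candidate(f"https://example.com/{slug}")]

def rank_by_path(category: str, candidates: List[Candidate]) -> List[Candidate]:
    for c in candidates:
        c.confidence = 0.9 if category in c.url else 0.2
    return sorted(candidates, key=lambda c: c.confidence, reverse=True)

def pick_first(category: str, ranked: List[Candidate]) -> Optional[Candidate]:
    return ranked[0] if ranked else None
```

With the stubs, `discover_urls("Acme", fake_search, rank_by_path, pick_first)` maps each category to the matching `example.com` path; swapping in real search and LLM clients would not change the control flow.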
Problem solved: confidence validation prevents wrong results for lesser-known companies and startups.
Multi-layer validation:
- Brand Recognition: An AI check validates whether the company is well known enough to produce reliable results
- Domain Validation: Ensures that discovered domains actually belong to the company
- URL Confidence: The LLM can return `NO_RELEVANT_URLS` when none of the candidates is suitable
- Configurable Thresholds: Adjust precision to suit the use case
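
A minimal sketch of how these layers might be combined is shown below. The threshold values, the `Thresholds` class, and the function names are assumptions chosen for illustration; only the `NO_RELEVANT_URLS` sentinel comes from the description above.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import urlparse

NO_RELEVANT_URLS = "NO_RELEVANT_URLS"  # sentinel the LLM may return

@dataclass(frozen=True)
class Thresholds:
    # Assumed defaults; tune per use case.
    brand_recognition: float = 0.6
    url_confidence: float = 0.7

def domain_matches(url: str, company_domain: str) -> bool:
    """True if the URL's host is the company domain or one of its subdomains."""
    host = (urlparse(url).hostname or "").lower()
    domain = company_domain.lower()
    return host == domain or host.endswith("." + domain)

def validate_selection(llm_answer: str, url_confidence: float, brand_score: float,
                       company_domain: str,
                       thresholds: Thresholds = Thresholds()) -> Optional[str]:
    """Return the URL only if every validation layer passes, else None."""
    answer = llm_answer.strip()
    if brand_score < thresholds.brand_recognition:    # brand recognition layer
        return None
    if answer == NO_RELEVANT_URLS:                    # LLM declined to pick a URL
        return None
    if url_confidence < thresholds.url_confidence:    # URL confidence layer
        return None
    if not domain_matches(answer, company_domain):    # domain validation layer
        return None
    return answer
```

Returning `None` rather than a low-confidence URL keeps a wrong result from being stored; raising either threshold trades coverage for precision.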