Dog API is a fun, open-source Flask application that fetches and serves random dog images from Reddit, with built-in content filtering, security features, and high-performance optimizations.
- Fetch random dog images from multiple subreddits
- Simple text-based content filtering
- Enhanced security with proper headers and input validation
- Advanced rate limiting to prevent abuse
- Parallel background image prefetching
- Response caching for improved performance
- Response compression for faster delivery
- Connection pooling for optimized network performance
- Thread pool for concurrent operations
- Metrics and monitoring endpoints
- Simple, easy-to-use REST API
- Python 3.8+
- Reddit API Credentials
- Redis (optional, for enhanced caching)
- Clone the repository
git clone https://github.com/scanash00/dog-api.git
cd dog-api
- Create a virtual environment
python -m venv venv
source venv/bin/activate # On Windows, use `venv\Scripts\activate`
- Install dependencies
pip install -r requirements.txt
- Create a `.env` file with the following:
REDDIT_CLIENT_ID=your_reddit_client_id
REDDIT_CLIENT_SECRET=your_reddit_client_secret
REDDIT_USER_AGENT=your_user_agent
REDIS_URL=redis://localhost:6379/0 # Optional
# Performance Tuning
THREAD_POOL_SIZE=10 # Number of worker threads for concurrent operations
CACHE_TTL=3600 # Cache time-to-live in seconds
REQUEST_TIMEOUT=5 # Timeout for external requests in seconds
COMPRESSION_THRESHOLD=1024 # Minimum size in bytes for response compression
# Prefetching Configuration
PREFETCH_BATCH_SIZE=5 # Number of subreddits to prefetch in parallel
PREFETCH_INTERVAL=600 # Base interval between prefetch operations in seconds
MIN_IMAGES_PER_SUBREDDIT=5 # Minimum number of images to maintain per subreddit
MAX_PREFETCH_ERRORS=3 # Maximum number of consecutive errors before temporary blacklisting
PREFETCH_RETRY_DELAY=30 # Base delay for retry after prefetch errors
# Security
ALLOWED_ORIGINS=https://example.com,https://anotherdomain.com # Optional, comma-separated
RATE_LIMIT=60 # Rate limit in requests per minute
LOG_LEVEL=INFO # Logging level
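As an illustration of how these variables might be consumed, here is a minimal sketch that reads a few of them into a typed object. The `Settings` class and `load_settings` function are hypothetical names, not part of the project, and using the sample values above as fallbacks is an assumption.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    """Typed view of a few of the variables listed above."""
    thread_pool_size: int
    cache_ttl: int
    request_timeout: float
    compression_threshold: int
    redis_url: Optional[str]


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    # Fallback values mirror the sample .env; treating them as defaults is an assumption.
    return Settings(
        thread_pool_size=int(env.get("THREAD_POOL_SIZE", "10")),
        cache_ttl=int(env.get("CACHE_TTL", "3600")),
        request_timeout=float(env.get("REQUEST_TIMEOUT", "5")),
        compression_threshold=int(env.get("COMPRESSION_THRESHOLD", "1024")),
        # Redis is optional, so an unset or empty URL maps to None.
        redis_url=env.get("REDIS_URL") or None,
    )
```

Passing a plain dictionary instead of `os.environ` makes the loader easy to exercise in tests.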
python app.py
gunicorn -c gunicorn_config.py app:app
Fetches a random dog image from Reddit.
`subreddit` (optional): Specify a particular subreddit to fetch from. Must be one of the supported subreddits.
{
"title": "Cute dog sitting on a windowsill",
"url": "https://i.redd.it/cute-dog-image.jpg",
"subreddit": "dogs",
"upvotes": 1245,
"source": "reddit",
"response_time_ms": 120
}
{
"error": "No safe dog images found after multiple attempts",
"status": 404
}
{
"error": "Invalid request parameters",
"details": {"subreddit": ["Invalid value"]},
"status": 400
}
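On the client side, the success and error shapes above can be told apart by the presence of the `error` key. The sketch below assumes every error body carries `error` and `status` as in the samples; `DogApiError` and `parse_dog_response` are hypothetical names used only for illustration.

```python
import json
from typing import Any, Dict


class DogApiError(Exception):
    """Raised when the API returns an error body (hypothetical client-side type)."""

    def __init__(self, status: int, message: str, details: Any = None):
        super().__init__(f"{status}: {message}")
        self.status = status
        self.message = message
        self.details = details


def parse_dog_response(body: str) -> Dict[str, Any]:
    """Return the decoded image payload, or raise DogApiError for an error body."""
    payload = json.loads(body)
    if "error" in payload:
        # Assumption: error bodies look like the 404 and 400 samples above.
        raise DogApiError(
            payload.get("status", 0), payload["error"], payload.get("details")
        )
    return payload
```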
Health check endpoint. Returns application status and service health information.
Public statistics endpoint that provides basic usage metrics and performance information.
{
"uptime": 3600,
"requests": {
"total": 1500,
"random_dog": 1200
},
"performance": {
"threads": {
"active": 3,
"total": 10,
"utilization": 30.0
}
},
"cache": {
"enabled": true,
"prefetched_images": 45,
"prefetch_age_seconds": 300,
"prefetch_distribution": {
"dogs": 12,
"beachdogs": 15,
"DogsStandingUp": 8,
"dogpics": 10
},
"prefetch_config": {
"batch_size": 5,
"interval": 600,
"min_images_per_subreddit": 5
}
},
"subreddits": {
"total": 20,
"popular": {
"dogs": 156,
"beachdogs": 98,
"dogpics": 67,
"DogsStandingUp": 45,
"SupermodelDogs": 32
},
"errors": {
"DogsBeingDogs": 1
}
},
"timestamp": "2025-03-18T20:35:22.123456",
"response_time_ms": 15
}
Protected metrics endpoint for monitoring systems (only accessible from localhost or private networks).
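One way such a restriction could be checked is with the standard `ipaddress` module. This is a sketch, not the project's actual check, and `is_internal_address` is a hypothetical helper name.

```python
import ipaddress


def is_internal_address(remote_addr: str) -> bool:
    """Return True for loopback or private-range client addresses."""
    try:
        ip = ipaddress.ip_address(remote_addr)
    except ValueError:
        # Malformed addresses are treated as external.
        return False
    return ip.is_loopback or ip.is_private
```

Note that `is_private` covers more than the RFC 1918 ranges (for example link-local addresses), so a stricter deployment might compare against an explicit list of networks instead.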
The API uses connection pooling for HTTP requests to improve network performance and reduce connection overhead.
- Thread pool for concurrent operations
- Parallel prefetching of images from multiple subreddits
- Asynchronous health checks
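The parallel prefetching idea can be sketched with `concurrent.futures`. The `prefetch_parallel` function and its `fetch` callback are hypothetical; the real application's fetch logic is not shown in this document.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Dict, Iterable, List, Tuple


def prefetch_parallel(
    subreddits: Iterable[str],
    fetch: Callable[[str], List[str]],
    max_workers: int = 10,
) -> Tuple[Dict[str, List[str]], Dict[str, str]]:
    """Run fetch(name) for each subreddit concurrently.

    Returns (results, errors): image lists keyed by subreddit, and error
    messages keyed by subreddit for the calls that raised.
    """
    results: Dict[str, List[str]] = {}
    errors: Dict[str, str] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(fetch, name): name for name in subreddits}
        for future in as_completed(futures):
            name = futures[future]
            try:
                results[name] = future.result()
            except Exception as exc:
                # One failing subreddit should not abort the whole batch.
                errors[name] = str(exc)
    return results, errors
```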
The API implements multiple levels of caching:
- In-memory caching for prefetched images
- Redis-based response caching (when configured) for API responses
- Optimized cache key generation for faster lookups
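A minimal in-memory sketch of time-limited caching with deterministic keys follows. `TTLCache` and `make_cache_key` are hypothetical names, and the key format is an assumption; the project's actual cache (including the Redis path) may work differently.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class TTLCache:
    """Tiny in-memory cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._items: Dict[str, Tuple[float, Any]] = {}

    def set(self, key: str, value: Any) -> None:
        self._items[key] = (self._clock() + self._ttl, value)

    def get(self, key: str) -> Optional[Any]:
        entry = self._items.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Drop expired entries lazily on read.
            del self._items[key]
            return None
        return value


def make_cache_key(endpoint: str, params: Dict[str, str]) -> str:
    """Build a key that does not depend on parameter order."""
    query = "&".join(f"{name}={params[name]}" for name in sorted(params))
    return f"{endpoint}?{query}" if query else endpoint
```

Sorting the parameter names means two requests that differ only in parameter order share one cache entry.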
Responses are automatically compressed using gzip when they exceed a configurable threshold and the client supports compression.
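The threshold rule can be sketched with the standard `gzip` module. `maybe_compress` is a hypothetical helper; it treats the threshold as a minimum size (matching the `COMPRESSION_THRESHOLD` comment) and, as a simplification, ignores quality values such as `gzip;q=0` in `Accept-Encoding`.

```python
import gzip
from typing import Dict, Tuple


def maybe_compress(
    body: bytes, accept_encoding: str, threshold: int = 1024
) -> Tuple[bytes, Dict[str, str]]:
    """Gzip the body when the client accepts it and the body is large enough.

    Returns the (possibly compressed) body and any extra response headers.
    """
    offered = [token.split(";")[0].strip().lower() for token in accept_encoding.split(",")]
    if "gzip" not in offered or len(body) < threshold:
        return body, {}
    headers = {"Content-Encoding": "gzip", "Vary": "Accept-Encoding"}
    return gzip.compress(body), headers
```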
The API implements a smart, adaptive prefetching system:
- Smart Subreddit Selection
  - Prioritizes subreddits with low image counts
  - Favors popular subreddits based on request frequency
  - Includes random subreddits for exploration
  - Temporarily avoids subreddits with repeated errors
- Dynamic Scheduling
  - Adjusts prefetch intervals based on demand and performance
  - Triggers immediate prefetching when popular subreddits run low on images
  - Implements exponential backoff for error recovery
- Efficient Image Management
  - Maintains minimum thresholds of images per subreddit
  - Avoids duplicate images in the prefetch cache
  - Tracks image usage patterns to optimize prefetching
- Error Resilience
  - Tracks error rates by subreddit
  - Implements retry logic with exponential backoff
  - Automatically recovers from temporary API failures
- Performance Monitoring
  - Detailed metrics on prefetch operations
  - Subreddit popularity tracking
  - Error rate monitoring by subreddit
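The exponential backoff and temporary blacklisting described above can be sketched as follows. Both function names are hypothetical, the default values mirror `PREFETCH_RETRY_DELAY` and `MAX_PREFETCH_ERRORS` from the sample configuration, and the delay cap and the "at or above the maximum" comparison are assumptions.

```python
def retry_delay(
    consecutive_errors: int, base_delay: float = 30.0, max_delay: float = 600.0
) -> float:
    """Delay before the next attempt: base, 2x base, 4x base, ... up to max_delay."""
    if consecutive_errors <= 0:
        return 0.0
    return min(base_delay * 2 ** (consecutive_errors - 1), max_delay)


def is_blacklisted(consecutive_errors: int, max_errors: int = 3) -> bool:
    """Assumption: a subreddit is skipped once it reaches max_errors in a row."""
    return consecutive_errors >= max_errors
```

With the defaults, the first three failures wait 30, 60, and 120 seconds, and the cap keeps a long failure streak from pushing the delay beyond ten minutes.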
All external API calls have configurable timeouts to prevent slow operations from blocking the API.
The API implements a simple text-based content filtering system to ensure that only appropriate dog images are served:
- Pattern-based filtering that checks for potentially unsafe words or phrases
- Filtering applied to post titles to screen out inappropriate content
- Configurable patterns that can be easily updated or extended
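Pattern-based title filtering of this kind might look like the sketch below. The function names are hypothetical and the word list in the usage example is a placeholder, not the project's actual pattern set.

```python
import re
from typing import Iterable, List, Pattern


def compile_patterns(words: Iterable[str]) -> List[Pattern[str]]:
    """Compile case-insensitive, whole-word patterns for the given words or phrases."""
    return [re.compile(rf"\b{re.escape(word)}\b", re.IGNORECASE) for word in words]


def is_title_safe(title: str, patterns: Iterable[Pattern[str]]) -> bool:
    """A title passes only if none of the patterns match it."""
    return not any(pattern.search(title) for pattern in patterns)


# Placeholder patterns for illustration only.
patterns = compile_patterns(["nsfw", "gore"])
print(is_title_safe("Cute dog sitting on a windowsill", patterns))
print(is_title_safe("NSFW: do not open", patterns))
```

Matching on whole words avoids rejecting harmless titles that merely contain a blocked word as a substring.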
All API endpoints validate input parameters to prevent injection attacks and ensure proper data formatting.
- 100 requests per day per IP address
- 30 requests per hour per IP address
- 10 requests per minute per IP address
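The three per-IP limits above can be enforced together with a sliding window. This is an in-memory sketch under the assumption of a single process; `RateLimiter` is a hypothetical class, and a real deployment would more likely keep the counters in shared storage such as Redis.

```python
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict, Sequence, Tuple

# (window in seconds, maximum requests), taken from the limits listed above.
LIMITS: Tuple[Tuple[int, int], ...] = ((60, 10), (3600, 30), (86400, 100))


class RateLimiter:
    """Sliding-window limiter that checks every configured window per IP."""

    def __init__(
        self,
        limits: Sequence[Tuple[int, int]] = LIMITS,
        clock: Callable[[], float] = time.monotonic,
    ):
        self._limits = limits
        self._clock = clock
        self._longest = max(window for window, _ in limits)
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, ip: str) -> bool:
        now = self._clock()
        hits = self._hits[ip]
        # Forget requests older than the longest window.
        while hits and now - hits[0] >= self._longest:
            hits.popleft()
        for window, max_requests in self._limits:
            recent = sum(1 for stamp in hits if now - stamp < window)
            if recent >= max_requests:
                return False
        hits.append(now)
        return True
```

Rejected requests are not recorded, so a blocked client does not extend its own lockout by retrying.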
The API implements the following security headers:
- Content-Security-Policy
- Strict-Transport-Security
- X-Content-Type-Options
- X-Frame-Options
- X-XSS-Protection
- Referrer-Policy
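As an illustration, the headers listed above could be applied to a response's header mapping as shown here. The header values are common defaults chosen for this sketch, not the values the application actually sends, and `apply_security_headers` is a hypothetical helper.

```python
from typing import Dict, MutableMapping

# Assumed example values; the project's real policy strings are not documented here.
SECURITY_HEADERS: Dict[str, str] = {
    "Content-Security-Policy": "default-src 'self'",
    "Strict-Transport-Security": "max-age=31536000; includeSubDomains",
    "X-Content-Type-Options": "nosniff",
    "X-Frame-Options": "DENY",
    "X-XSS-Protection": "1; mode=block",
    "Referrer-Policy": "strict-origin-when-cross-origin",
}


def apply_security_headers(headers: MutableMapping[str, str]) -> MutableMapping[str, str]:
    """Add each security header unless the response already sets it."""
    for name, value in SECURITY_HEADERS.items():
        headers.setdefault(name, value)
    return headers
```

In a Flask application, a function like this would typically be called from an `after_request` hook with `response.headers`.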
The API supports Cross-Origin Resource Sharing (CORS) with configurable origins.
- By default, requests from any origin are allowed
- For production, configure the `ALLOWED_ORIGINS` environment variable to restrict access
- Preflight requests are automatically handled
Note for Production: Always restrict CORS origins in a production environment for enhanced security.
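The origin check implied by `ALLOWED_ORIGINS` can be sketched as below. Both helpers are hypothetical, and returning `*` when no origins are configured is an assumption based on the stated default of allowing any origin.

```python
from typing import List, Optional


def parse_allowed_origins(raw: Optional[str]) -> List[str]:
    """Split the comma-separated ALLOWED_ORIGINS value, ignoring blanks."""
    return [origin.strip() for origin in (raw or "").split(",") if origin.strip()]


def cors_allow_origin(request_origin: Optional[str], allowed: List[str]) -> Optional[str]:
    """Value for Access-Control-Allow-Origin, or None to omit the header."""
    if not allowed:
        # No restriction configured: any origin is allowed.
        return "*"
    if request_origin in allowed:
        return request_origin
    return None
```

Echoing back only an exact match from the list keeps unlisted origins from receiving a permissive header.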
Contributions are welcome! Please use common sense when contributing and keep the code clean.
MIT License
Images are sourced from Reddit and filtered for safety. However, inappropriate content may occasionally slip through the filter.