A web application for comparing search results across multiple academic search engines, including ADS/SciX, Google Scholar, Semantic Scholar, and Web of Science.

## Features

- Compare search results from multiple academic search engines
- Analyze similarities and differences between result sets
- Experiment with boosting factors to improve search rankings
- Perform A/B testing of different search algorithms
- Debug tools for API testing and diagnostics
- Direct Solr proxy for ADS/SciX queries (no API key required)
## Quepid Integration

The application now integrates with Quepid, a search relevance testing platform. This integration allows you to:

- Connect to your Quepid cases containing relevance judgments
- Evaluate search results using industry-standard metrics such as nDCG@10
- Compare relevance performance across different search engines
- Test how changes to search algorithms affect relevance scores
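As a reference for the metric mentioned above, here is a minimal sketch of how nDCG@10 is typically computed from graded relevance judgments. The function names are illustrative, not the backend's actual code:

```python
import math

def dcg(gains):
    """Discounted cumulative gain: each gain is discounted by log2(rank + 1)."""
    return sum(g / math.log2(i + 2) for i, g in enumerate(gains))

def ndcg_at_k(ranked_gains, k=10):
    """Normalized DCG: DCG of the top-k results divided by the ideal DCG."""
    top = ranked_gains[:k]
    ideal = sorted(ranked_gains, reverse=True)[:k]
    ideal_dcg = dcg(ideal)
    return dcg(top) / ideal_dcg if ideal_dcg > 0 else 0.0

# Graded judgments (0 = not relevant, 3 = highly relevant) in retrieved order:
score = ndcg_at_k([3, 2, 3, 0, 1, 2], k=10)
```

A perfectly ordered result list scores 1.0; a list with no judged-relevant documents scores 0.0.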
To use the Quepid integration, set the following environment variables:

```
QUEPID_API_URL=https://app.quepid.com/api/
QUEPID_API_KEY=your_api_key_here
```
The following endpoint has been added:

- `POST /experiments/quepid-evaluation`: Evaluate search results against Quepid judgments

Example request:

```json
{
  "query": "katabatic wind",
  "sources": ["ads", "scholar", "semantic_scholar"],
  "case_id": 123,
  "max_results": 20
}
```
Example response:

```json
{
  "query": "katabatic wind",
  "case_id": 123,
  "case_name": "Atmospheric Sciences",
  "source_results": [
    {
      "source": "ads",
      "metrics": [
        {
          "name": "ndcg@10",
          "value": 0.85,
          "description": "Normalized Discounted Cumulative Gain at 10"
        },
        {
          "name": "p@10",
          "value": 0.7,
          "description": "Precision at 10"
        }
      ],
      "judged_retrieved": 15,
      "relevant_retrieved": 12,
      "results_count": 20
    }
  ],
  "total_judged": 25,
  "total_relevant": 18
}
```
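Assuming the backend is running locally on port 8000 (as in the setup instructions below), the endpoint could be exercised from Python like this. `build_evaluation_request` and `run_evaluation` are hypothetical helpers for illustration, not part of the codebase:

```python
import json
import urllib.request

def build_evaluation_request(query, sources, case_id, max_results=20):
    """Assemble the JSON body expected by POST /experiments/quepid-evaluation."""
    return {
        "query": query,
        "sources": sources,
        "case_id": case_id,
        "max_results": max_results,
    }

def run_evaluation(base_url="http://localhost:8000"):
    # base_url assumes a local run; adjust for a deployed instance.
    body = build_evaluation_request("katabatic wind", ["ads", "scholar"], 123)
    req = urllib.request.Request(
        f"{base_url}/experiments/quepid-evaluation",
        data=json.dumps(body).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```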
## Query Intent Detection

The repository now includes a feature that uses lightweight open-source LLMs to interpret user search queries, detect intent, and transform queries to be more effective. This feature is accessible through the "Query Intent" tab in the UI.

- Query analysis using local LLM models via Ollama
- Automatic query transformation based on detected intent
- Support for multiple lightweight models (Llama 2, Mistral, Gemma)
- Rule-based fallbacks when intent is clear
- Docker Compose setup for easy deployment

To use this feature:

1. Set up the backend service following the instructions in `backend/README.md`
2. Use the "Query Intent" tab in the UI for semantic query transformation

For details, see the backend documentation.
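A minimal sketch of how a local Ollama model might be asked to classify and rewrite a query. The prompt wording, intent labels, and helper names are assumptions for illustration; Ollama's HTTP API (`POST /api/generate`) must be running locally:

```python
import json
import urllib.request

# Illustrative prompt template; the backend's actual prompt may differ.
INTENT_PROMPT = (
    "Classify the intent of this academic search query as one of: "
    "author_search, topic_search, citation_lookup. "
    "Then rewrite the query for a literature search engine.\n\nQuery: {query}"
)

def build_prompt(query: str) -> str:
    """Fill the intent-classification prompt template."""
    return INTENT_PROMPT.format(query=query)

def ask_ollama(query: str, model: str = "llama2", host: str = "http://localhost:11434"):
    # Ollama serves local models at /api/generate; stream=False returns one JSON object.
    body = {"model": model, "prompt": build_prompt(query), "stream": False}
    req = urllib.request.Request(
        f"{host}/api/generate",
        data=json.dumps(body).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["response"]
```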
## Project Structure

```
backend/            # FastAPI backend with search services
  app/              # Application code
    api/            # API routes and models
    core/           # Core configuration and utilities
    services/       # Search engine integration services
    utils/          # Utility functions
  tests/            # Backend tests
frontend/           # React frontend application
  public/           # Static files
  src/              # React source code
    components/     # React components
    services/       # API service functions
```
## Prerequisites

- Python 3.9+
- Node.js 14+
- API keys for academic search engines (optional)
## Backend Setup

1. Navigate to the backend directory:

   ```
   cd backend
   ```

2. Create and activate a virtual environment:

   ```
   python -m venv venv
   source venv/bin/activate  # On Windows: venv\Scripts\activate
   ```

3. Install dependencies:

   ```
   pip install -r requirements.txt
   ```

4. Create a `.env.local` file in the project root with your API keys:

   ```
   ADS_API_TOKEN=your_ads_token
   SEMANTIC_SCHOLAR_API_KEY=your_ss_key
   WEB_OF_SCIENCE_API_KEY=your_wos_key
   ```
### Solr Proxy Configuration

The application supports querying ADS/SciX directly through a Solr proxy, which offers faster results and doesn't require an API key. Configure this in your environment file:

```
# Solr proxy URL (default: https://scix-solr-proxy.onrender.com/solr/select)
ADS_SOLR_PROXY_URL=https://scix-solr-proxy.onrender.com/solr/select

# Query method (options: solr_only, api_only, solr_first)
# - solr_only: Only use the Solr proxy
# - api_only: Only use the ADS API
# - solr_first: Try Solr first, fall back to the API if needed (default)
ADS_QUERY_METHOD=solr_first
```
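The routing behavior can be pictured roughly as follows. `query_ads` and the injected fetcher callables are illustrative, not the actual service code in `backend/app/services`:

```python
import os

def query_ads(query, solr_fetch, api_fetch):
    """Dispatch an ADS/SciX query according to ADS_QUERY_METHOD.

    solr_fetch / api_fetch are callables returning a list of results,
    injected here so the routing logic is easy to test in isolation.
    """
    method = os.getenv("ADS_QUERY_METHOD", "solr_first")
    if method == "solr_only":
        return solr_fetch(query)
    if method == "api_only":
        return api_fetch(query)
    # solr_first: try the proxy, fall back to the authenticated API
    # if the proxy errors out or returns nothing.
    try:
        results = solr_fetch(query)
        if results:
            return results
    except Exception:
        pass
    return api_fetch(query)
```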
## Frontend Setup

1. Navigate to the frontend directory:

   ```
   cd frontend
   ```

2. Install dependencies:

   ```
   npm install
   ```

## Running the Application

1. Start both the frontend and backend servers:

   ```
   ./start_local.sh
   ```

   Or run them separately:

   - Backend:

     ```
     cd backend
     python -m uvicorn app.main:app --reload --host 0.0.0.0 --port 8000
     ```

   - Frontend:

     ```
     cd frontend
     npm start
     ```

2. Open your browser and navigate to http://localhost:3000
Run backend tests:

```
cd backend
pytest
```
## Deployment

This application is configured for deployment on Render.com using the `render.yaml` configuration file.

## Environments

The application supports different environments:

- `local`: For local development
- `development`: For development deployment
- `staging`: For staging deployment
- `production`: For production deployment

Environment-specific configuration is loaded from:

- `.env.local`
- `.env.dev`
- `.env.staging`
- `.env.prod`

## API Documentation

When running locally, the API documentation is available at:

- Swagger UI: http://localhost:8000/api/docs
- ReDoc: http://localhost:8000/api/redoc
## Deploying to Render

### Prerequisites

- A Render account
- An ADS API token
- A Git repository containing the code
### Backend Service

1. Create a new Web Service on Render
2. Connect your Git repository
3. Configure the following settings:
   - Name: `search-comparisons-backend`
   - Environment: Python
   - Build Command: `pip install -r requirements.txt`
   - Start Command: `uvicorn app.main:app --host 0.0.0.0 --port $PORT`
   - Environment Variables:
     - `LLM_PROVIDER`: ollama
     - `LLM_MODEL_NAME`: llama2
     - `LLM_TEMPERATURE`: 0.7
     - `LLM_MAX_TOKENS`: 1000
     - `ADS_API_TOKEN`: (your ADS API token)
     - `SOLR_URL`: https://api.adsabs.harvard.edu/v1/search/query
     - `CORS_ORIGINS`: https://search-comparisons-frontend.onrender.com
     - `ENVIRONMENT`: production
### Frontend Service

1. Create a new Web Service on Render
2. Connect your Git repository
3. Configure the following settings:
   - Name: `search-comparisons-frontend`
   - Environment: Node
   - Build Command: `npm install && npm run build`
   - Start Command: `npm start`
   - Environment Variables:
     - `REACT_APP_API_URL`: https://search-comparisons-backend.onrender.com
     - `NODE_ENV`: production
### Docker Deployment

Alternatively, you can deploy using Docker:

1. Create a new Web Service on Render
2. Select "Docker" as the environment
3. Point to your Dockerfile
4. Configure the same environment variables as above
### Health Checks

The backend service includes a health check endpoint at `/api/health`. Render will automatically monitor this endpoint.
### Environment Variables

Make sure to set up the following environment variables in your Render dashboard:

- `ADS_API_TOKEN`: Your ADS API token
- `LLM_PROVIDER`: The LLM provider to use (default: ollama)
- `LLM_MODEL_NAME`: The model name to use (default: llama2)
- `LLM_TEMPERATURE`: The temperature for LLM generation (default: 0.7)
- `LLM_MAX_TOKENS`: Maximum tokens to generate (default: 1000)
- `SOLR_URL`: The Solr API endpoint
- `CORS_ORIGINS`: Allowed CORS origins
- `ENVIRONMENT`: Set to "production" for production deployment
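One way the backend might read the LLM-related variables with the defaults listed above. This is a sketch with a hypothetical helper name; the project's actual configuration presumably lives in `backend/app/core`:

```python
import os

def load_llm_settings() -> dict:
    """Read the LLM environment variables, applying the documented defaults."""
    return {
        "provider": os.getenv("LLM_PROVIDER", "ollama"),
        "model_name": os.getenv("LLM_MODEL_NAME", "llama2"),
        "temperature": float(os.getenv("LLM_TEMPERATURE", "0.7")),
        "max_tokens": int(os.getenv("LLM_MAX_TOKENS", "1000")),
    }
```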
### Monitoring

Render provides built-in monitoring and logging. You can view:

- Application logs
- Build logs
- Health check status
- Resource usage

### Scaling

The service can be scaled horizontally by:

1. Going to the service settings
2. Adjusting the instance count
3. Setting up auto-scaling rules if needed

### Custom Domains

You can set up custom domains for both services:

1. Go to the service settings
2. Click on "Custom Domains"
3. Follow the instructions to add your domain
## Local Development

1. Clone the repository
2. Create a virtual environment:

   ```
   python -m venv venv
   ```

3. Activate the virtual environment:

   ```
   source venv/bin/activate
   ```

4. Install dependencies:

   ```
   pip install -r requirements.txt
   ```

5. Set up environment variables in a `.env` file
6. Run the development server:

   ```
   uvicorn app.main:app --reload
   ```
## Testing

Run tests with:

```
pytest
```

## Code Quality

The project uses:

- Ruff for linting
- Black for code formatting
- MyPy for type checking

Run these tools with:

```
ruff check .
black .
mypy .
```