Clinical Trial Matching: Transcript to Trial Pipeline

Problem Overview and Scope of Deliverables

Matching patients to clinical trials is challenging because of the complex eligibility criteria associated with each study. Eligibility criteria are often more specific than the structured data available in EHR systems, so matching typically requires significant manual review and input.

The scope of this project is a clinical trial matching tool that helps bridge this specificity gap between EHR data and clinical trial eligibility requirements. It processes unstructured, natural-language patient-doctor conversations into structured search criteria and delivers a curated selection of potentially relevant trials, with an intuitive UI/UX that helps users understand and navigate patient eligibility. The final product lets stakeholders accelerate the matching process by acting as a context-driven search engine.

Assumptions, Core Challenges, and Design Philosophy

  • Pertinent EHR data that is not typically stated aloud during a routine visit (weight, age, height, medications, current conditions, etc.) is assumed to be available separately. I felt it would not be realistic for all of this information to be exchanged verbally in a typical visit.
  • The ClinicalTrials.gov API returns eligibility criteria (both inclusion and exclusion) as plain text, so effective natural language processing of this text to evaluate patient eligibility is the critical feature that sets this tool apart from simply querying the ClinicalTrials.gov API well (see the sketch after this list).
  • Relevancy and eligibility are key: the goal is not to recommend that the patient enroll specifically in trial {X}, but to build software-driven tools for navigating complex eligibility criteria and surfacing patient eligibility/relevancy across trials {X, Y, Z...}.
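
To make the second point concrete, the plain-text eligibility blob can be pulled from the public ClinicalTrials.gov v2 /studies endpoint roughly as below. This is a minimal sketch rather than the project's actual query code; the parameter and response field names (query.cond, filter.overallStatus, protocolSection.eligibilityModule.eligibilityCriteria) follow the public v2 API.

import requests

CTGOV_API = "https://clinicaltrials.gov/api/v2/studies"

def fetch_eligibility_texts(condition: str, status: str = "RECRUITING", page_size: int = 10) -> list[dict]:
    # Query ClinicalTrials.gov for studies matching a condition and collect each study's
    # plain-text eligibility criteria (inclusion and exclusion arrive as one text blob).
    params = {
        "query.cond": condition,          # free-text condition/disease query
        "filter.overallStatus": status,   # e.g. only actively recruiting trials
        "pageSize": page_size,
    }
    resp = requests.get(CTGOV_API, params=params, timeout=30)
    resp.raise_for_status()

    results = []
    for study in resp.json().get("studies", []):
        protocol = study.get("protocolSection", {})
        results.append({
            "nct_id": protocol.get("identificationModule", {}).get("nctId"),
            "title": protocol.get("identificationModule", {}).get("briefTitle"),
            "eligibility_text": protocol.get("eligibilityModule", {}).get("eligibilityCriteria", ""),
        })
    return results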

Frameworks and Deployment:

  • Frontend: Vite/TS/React -> Vercel
  • Backend: Python/FastAPI/GPT-4o mini API -> Railway

Overall Flow:

The backend delivers a transcript + EHR -> eligibility-ranked trials pipeline and serves it over an API. Relevant trials, rankings (with ranking reasoning), and extracted patient context are rendered in the frontend after the user submits the necessary context.
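
As a rough illustration of the request/response shape only (not the repository's actual code; the /match route, MatchRequest model, and placeholder body are assumptions for this sketch):

from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class MatchRequest(BaseModel):
    transcript: str   # raw doctor-patient conversation text
    ehr: dict         # structured EHR fields (age, weight, medications, ...)

@app.post("/match")
def match_trials(req: MatchRequest) -> dict:
    # In the real pipeline this would extract context with GPT-4o mini, query
    # ClinicalTrials.gov, and score eligibility; here the body is a placeholder
    # that only shows the payload shape the frontend consumes.
    patient_context = {"ehr": req.ehr, "transcript_chars": len(req.transcript)}
    return {"patient_context": patient_context, "trials": [], "fallback_flags": []}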

Backend Features:

  • Extracts structured context from the transcript using an LLM (GPT-4o mini); see the first sketch after this list.
  • Context to query mapping: constructs queries using patient conditions, location, intervention preference, status (is the trial actively recruiting?), and filters (e.g., patient age, prognosis, risk tolerance, ...).
  • Progressive fallback trial querying: if the query is too specific or restrictive (no results returned), query fields are progressively relaxed until a minimum number of trials is returned. Fallback flags are tracked so that users can be notified that the results may not entirely reflect the initially specified search criteria (e.g., proximity to the patient's hometown); see the second sketch after this list.
  • Eligibility ranking: for each returned trial, an LLM (GPT-4o mini) is prompted to evaluate the patient's eligibility for that trial on a scale of 0-3 (minimizing output tokens). The LLM receives the extracted transcript context, EHR data, and the transcript itself as a static prefix in the prompt (enabling prompt caching for all of the patient context), and the specific trial's eligibility criteria are passed as the prompt suffix; see the third sketch after this list.
  • REST API serving the frontend for matching, ranking, and providing detailed context for all returned trials.
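
First sketch: transcript-to-structured-context extraction. A minimal sketch assuming the OpenAI Python SDK with JSON-mode output; the extracted field list and function name are illustrative, not the repository's actual prompt.

import json
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

EXTRACTION_INSTRUCTIONS = (
    "Extract the patient's conditions, medications, location, intervention preferences, "
    "and stated risk tolerance from the conversation. Reply with a single JSON object."
)

def extract_patient_context(transcript: str, ehr: dict) -> dict:
    # Ask GPT-4o mini for structured JSON rather than free text so the result
    # can feed query construction directly.
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        response_format={"type": "json_object"},
        messages=[
            {"role": "system", "content": EXTRACTION_INSTRUCTIONS},
            {"role": "user", "content": f"EHR data: {json.dumps(ehr)}\n\nTranscript:\n{transcript}"},
        ],
    )
    return json.loads(response.choices[0].message.content)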
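
Second sketch: progressive fallback querying. The relaxation order, field names, and the fetch_studies callable are illustrative assumptions; the returned list of dropped fields is what feeds the user-facing warnings.

from typing import Callable

def query_with_fallback(
    fetch_studies: Callable[[dict], list[dict]],   # e.g. a thin wrapper around the ClinicalTrials.gov call
    base_params: dict,
    min_results: int = 5,
) -> tuple[list[dict], list[str]]:
    # Relax the least essential constraints first, keeping a record of what was
    # dropped so the frontend can warn the user.
    relax_order = ["query.locn", "query.intr", "filter.overallStatus"]  # illustrative priority
    params = dict(base_params)
    dropped: list[str] = []

    studies = fetch_studies(params)
    for field in relax_order:
        if len(studies) >= min_results:
            break
        if params.pop(field, None) is not None:
            dropped.append(field)
            studies = fetch_studies(params)
    return studies, dropped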
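
Third sketch: the 0-3 eligibility scoring call. The static patient context goes first in the prompt (so repeated calls over many trials can benefit from automatic prompt caching) and only the trial-specific criteria vary per call; the exact prompt wording here is an assumption.

from openai import OpenAI

client = OpenAI()

def score_eligibility(patient_prefix: str, trial_criteria: str) -> int:
    # patient_prefix = extracted context + EHR + transcript, identical across all trials;
    # trial_criteria = the only piece that changes per call.
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": "Reply with a single digit 0-3, where 0 = clearly ineligible and 3 = clearly eligible."},
            {"role": "user", "content": patient_prefix + "\n\nTrial eligibility criteria:\n" + trial_criteria},
        ],
        max_tokens=1,   # the score is a single token, keeping output cost minimal
    )
    return int(response.choices[0].message.content.strip())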

Frontend Features:

  • Search construction: guided input for patient EHR data and the doctor-patient transcript, used to send trial match/rank requests to the API. A number of example patients are provided as drop-down fill-ins.
  • Results display: compact cards show relevant information for each trial, such as study status, location, eligibility score, and eligibility ranking. Clicking on a card expands it to show the full details of the study.
  • User warnings: warns users when the search terms had to be loosened to find relevant trials, meaning the returned trials may be farther away or involve interventions the patient did not originally prefer.
  • Patient context sidebar: the extracted transcript context + patient data are displayed in a sidebar alongside the trial results, allowing quick manual comparison of patient information against eligibility criteria and intervention types.

Accessing the Deployed Application:

Backend: API Docs

Local Setup:

Frontend:

Copy the .env.example as the .env file. From the project root:

cd frontend
npm i
npm run dev

Backend:

Copy the .env.example as the .env file; fill in an OpenAI API key with GPT-4o mini access. From the project root (the activation command below is for Windows; on macOS/Linux use source venv/bin/activate):

cd backend
python -m venv venv
.\venv\Scripts\activate
pip install -r requirements.txt
uvicorn src.main:app --host 0.0.0.0 --port 8000

Once the Vite dev server and the API server are both up and running, the web app should work as intended!
