projects/find-a-specialist.md
text/markdown · 11 KB
Find a specialist
2026-01-15
A friend of mine was recently diagnosed with endometriosis.
Endometriosis is a chronic condition where tissue similar to the lining of the uterus (endometrium) grows outside of it, typically on pelvic organs like the ovaries and fallopian tubes. This causes severe pain, among several other symptoms, and there is no known cure to the condition.
Apart from the gravity of the condition itself, I was struck by her journey getting to this point. Despite several years of emergency doctors visits and consultations, it was only when they found an endometrioma (cyst that forms in later stages of endometriosis) through an ultrasound that the narrative shifted from 'this is a painful period' or 'have you ever heard of a UTI?' to 'this is what you can do to feel better'.
In my opinion, there are several reasons why her journey manifested in this way.
- Endometriosis does not have any definitive, non-invasive gold-standard diagnostic tests. Confirmation of the condition usually requires a surgical biopsy.
- There is a high variance in how people report symptoms, and symptoms are often conflated with other, less-serious conditions.
- Superficial endometriosis usually cannot be identified via MRI or ultrasound. These imaging techniques are mostly only helpful for finding cysts.
- She has a life, and her instinct was to trust the doctor's recommendations.
- She is a woman, and women's health is systematically under-researched and underfunded.
Upon hearing about her endometrioma she was at a loss. The next step, according to her OB-GYN, was an MRI to confirm the cyst. After the MRI (highly likely to be confirmed), she had several options to manage the condition:
- Surgery - ablation / excision.
- Hormonal regulation - birth control.
Both options with their own set of lifestyle ramifications and considerations. She elected for the surgery.
Given the understudied nature of endometriosis, naturally, she wanted a specialist for endometriosis to advise her care journey. Unfortunately, the provider she was seeing was not suitable for this.
Another twist - for every period she continues to have, there's a risk of the cyst getting larger. At a certain point, that cyst would be much more at risk of rupturing. She was quickly approaching both the size at which that risk becomes concerning, and her next cycle.
She started searching around. First, as most people would do, tried her insurance in-network portal for providers. Next, Google searches. Through all of these attempts, she ran into the same fundamental issues:
- Provider availability - her insurance portal had no options to schedule with the provider, she had to call / email their office.
- Provider network status - for providers that showed up on Google, it was variable as to whether their website detailed the provider's contracted network, and again, she would've had to call / email their office.
- Provider specialty - most providers on her insurance portal had a high level specialty (let's say OB-GYN), and a swath of conditions/areas treated. There was nothing indicating whether they were truly a specialist in the area of endometriosis.
Given the urgency of the matter and the emotional toll of the diagnosis, she was at a complete loss - overwhelmed, scared, and deeply frustrated with her experience.
I was desparate to help, and thought this would be an interesting task for an AI agent.
First, I wanted to see what I could do manually to help.
I already understand that provider directories are abysmally inaccurate, so it would likel be futile to use her insurance portal recommendations, or plain-text descriptions of in network coverage from provider websites. Furthermore, those solutions didn't provide any notion of provider availability, and since this was urgent, we needed that.
Enter ZocDoc. I actually really like ZocDoc as a product, however, there were a bunch of issues that made me question whether we could trust it entirely for this use case.
The good:
- ZocDoc makes it easy to verify your insurance coverage, and surface in-network doctors accordingly (shout-out Availity).
- ZocDoc has a simple interface for booking appointments.
- ZocDoc is free for patients to use.
The bad:
- Specialty sprawl - the same issue with insurance portals saying a provider treats 60+ conditions exists on ZocDoc.
- Review depth - most provider reviews were anonymous, with little to no prose.
What I found myself doing was going to ZocDoc, searching for 'Endometriosis', setting the location to New York, then going to each doctor's profile, Googling them with some keywords, trying to find any signal that they were truly an endometriosis expert (e.g. articles, reviews about endometriosis care, etc.).
It was super painful and I was quickly striking out.
I started thinking about how an agent could approach this.
The task as an agent problem
Looking at what I was doing manually, the workflow was clear:
- Search ZocDoc for providers who list endometriosis as a condition they treat.
- Open each provider's profile. Check if they accept the right insurance. Check if they mention endometriosis meaningfully.
- Google the provider. Look for publications, hospital bios, fellowship training — anything that distinguishes a genuine specialist from someone who checked a box on ZocDoc.
- If the evidence is strong, save them. If not, move on.
This is a textbook agentic workflow: a loop of search, evaluate, research, decide. The key insight is that step 3 — the external verification — is where the real value is. Without it, you're just building a fancier version of the same directory that failed my friend in the first place.
I built the agent using LangGraph's Functional API, with five tools and a system prompt that encodes the research strategy.
Tools
The agent has five tools, each doing one thing:
search_zocdoc — searches Tavily scoped to zocdoc.com for provider profiles matching a condition and location. Supports pagination via a from_rank parameter so the agent can work through multiple pages of results.
scrape_zocdoc_profile — scrapes a ZocDoc profile via Firecrawl, then runs it through a deterministic parser that strips navigation boilerplate, extracts available appointment dates, and removes noise (base64 images, calendar grids, UI labels). The agent gets clean, token-efficient text: name, specialty, address, insurance list, about section, education, and reviews.
search_web — general web search via Tavily. The agent uses this to look up a provider's publications, hospital affiliations, and training. Good queries look like "Dr. Jane Smith" endometriosis pubmed or "Jane Smith" fellowship endometriosis.
scrape_page — scrapes any non-ZocDoc web page via Firecrawl. Used for reading PubMed abstracts, hospital faculty pages, and journal articles. Content is truncated at 15K characters to keep context manageable.
save_specialist — appends a confirmed specialist to a CSV file. Rejects duplicates, reports progress toward the target count, and requires an evidence summary that cites only external sources.
The evidence standard
This is the part I care about most. The agent's system prompt is explicit about what counts as specialist evidence and what doesn't.
What counts:
- Peer-reviewed publications on the condition (PubMed, journal articles)
- Fellowship training specifically related to the condition
- Hospital or university faculty page listing the condition as a clinical or research focus
- Conference presentations
What doesn't count:
- The condition appearing in a ZocDoc visit-reason checkbox
- The condition mentioned in a ZocDoc "About" section without external corroboration
- Generic directory listings (Healthgrades, Vitals, US News) that just echo ZocDoc
- Self-reported claims on a provider's own website without independent verification
This is deliberately strict. The whole point is to distinguish providers who have dedicated their career to a condition from providers who are willing to see patients with that condition. Both are fine doctors — but when you're facing surgery, you want the former.
The agent loop
The agent runs in a standard tool-calling loop:
User message (mission)
│
▼
┌─ call_model ◄──────────────────┐
│ │ │
│ tool_calls? │
│ yes │ no ──► return │
│ ▼ │
│ call_tool (for each call) │
│ │ │
│ compact_messages │
│ │ │
└───────┘
Each iteration: the LLM decides which tools to call, the tools execute, old messages are compacted, and the LLM is called again with the updated history. This repeats until the LLM responds with no tool calls — typically after it's hit the target specialist count or exhausted the search results.
The ZocDoc parser
ZocDoc pages have a consistent structure: navigation boilerplate, provider content, appointment calendar, and footer. Scraping them with Firecrawl returns a wall of markdown that's mostly noise.
Rather than burning LLM tokens to extract the useful parts, I wrote a deterministic parser that handles it without any AI:
- Extracts the profile section between the H1 heading and the "Back to top" footer
- Removes the appointment calendar grid (after extracting available dates into a structured list)
- Strips noise lines via compiled regex patterns — base64 image placeholders, standalone day names, UI labels like "New patient / Existing patient", sort dropdowns, etc.
- Collapses excessive blank lines
This reliably cuts the scraped content by 60–70% before the agent ever sees it.
Results
The agent works. I ran it for endometriosis in New York with Cigna insurance, targeting 5 specialists. It found providers with published research, fellowship training in minimally invasive gynecologic surgery, and faculty appointments at major academic medical centers — exactly the kind of signal that's invisible on ZocDoc or an insurance portal.
More importantly, it found them with appointment availability and confirmed insurance coverage. That's the whole package: a provider you can trust, who takes your insurance, and who you can actually see.
My friend ended up finding a specialist through this, so let's see how it goes! There's a lot of 'doomerism' about AI right now, this was a refreshing way to see how AI can be leveraged for simple, impactful, and highly positive outcomes.
Reflections
The healthcare information problem here isn't a technology problem — it's an incentive problem. Insurance portals aren't designed to help you find the best doctor. They're designed to show you who's in-network. ZocDoc is better, but their business model doesn't reward distinguishing between a true specialist and a generalist who checks the right boxes.
An agent like this works because it does something no existing product is incentivized to do: cross-reference multiple data sources to verify expertise, rather than just surfacing self-reported claims. It treats finding a specialist the way a diligent friend with medical knowledge would — searching PubMed, reading faculty pages, checking training backgrounds.
The code is on GitHub.