Feature

Semantic Search & Neural Similarity

Find medical concepts using natural language with AI-powered semantic search. Query with everyday terms like 'heart attack' to find 'Myocardial infarction'.

We've launched two AI-powered search capabilities for finding medical concepts in the OMOP vocabulary: semantic search and similarity search.

The new GET /v1/concepts/semantic-search endpoint uses neural embeddings to find concepts that are semantically similar to your query, even when exact keyword matches don't exist.

Query with everyday language:

  • "heart attack" → finds "Myocardial infarction"
  • "sugar diabetes" → finds "Type 2 diabetes mellitus"
  • "high blood pressure" → finds "Essential hypertension"
  • "belly pain" → finds "Abdominal pain"

curl "https://api.omophub.com/v1/concepts/semantic-search?query=heart%20attack&page_size=5" \
  -H "Authorization: Bearer YOUR_API_KEY"
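The same request URL can be assembled in Python. This is a minimal sketch using only the standard library; the helper name `build_semantic_search_url` is hypothetical, and only the `query` and `page_size` parameters shown in the curl example are assumed.

```python
from urllib.parse import quote, urlencode

BASE_URL = "https://api.omophub.com/v1/concepts/semantic-search"


def build_semantic_search_url(query: str, page_size: int = 5) -> str:
    """Return the GET URL for a semantic search (hypothetical helper)."""
    params = {"query": query, "page_size": page_size}
    # quote_via=quote encodes spaces as %20, matching the curl example
    # (urlencode would otherwise encode them as '+').
    return f"{BASE_URL}?{urlencode(params, quote_via=quote)}"


url = build_semantic_search_url("heart attack")
print(url)
```

Send the URL with any HTTP client, adding the `Authorization: Bearer YOUR_API_KEY` header as in the curl call.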

How It Works

  1. Your query is converted to a vector using neural embeddings
  2. The query vector is compared against pre-computed concept embeddings
  3. Results are ranked by cosine similarity score (0.0-1.0)
  4. Optional filters for vocabulary, domain, and standard concept status are applied
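The ranking steps above can be sketched with toy data. This is an illustration only, not the service's implementation: the three-dimensional vectors, the `CONCEPT_EMBEDDINGS` table, and the function names are made up for the example, and real embeddings have far more dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical pre-computed embeddings: name -> (vector, domain).
CONCEPT_EMBEDDINGS = {
    "Myocardial infarction": ([0.9, 0.1, 0.0], "Condition"),
    "Essential hypertension": ([0.5, 0.8, 0.1], "Condition"),
    "Aspirin": ([0.2, 0.1, 0.9], "Drug"),
}


def rank_concepts(query_vector, domain=None, top_k=5):
    """Score every concept, sort by similarity, then apply the optional filter."""
    scored = [
        (name, cosine_similarity(query_vector, vec), dom)
        for name, (vec, dom) in CONCEPT_EMBEDDINGS.items()
    ]
    scored.sort(key=lambda item: item[1], reverse=True)
    if domain is not None:
        scored = [item for item in scored if item[2] == domain]
    return [(name, score) for name, score, _ in scored[:top_k]]


# Stand-in for the embedding of the query "heart attack".
query_vector = [0.8, 0.2, 0.1]
for name, score in rank_concepts(query_vector, domain="Condition"):
    print(f"{name}: {score:.3f}")
```

With non-negative vectors like these, every score falls between 0.0 and 1.0, which matches the range quoted in step 3.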

The POST /v1/search/similar endpoint offers flexible similarity algorithms for different use cases:

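Algorithm        | Best For                            | Example
-----------------|-------------------------------------|-----------------------------------------
semantic         | Conceptual similarity               | "heart attack" → "Myocardial infarction"
lexical          | Fuzzy text matching, typo tolerance | "diabetis" → "diabetes"
hybrid (default) | Balanced matching                   | Combines word + character similarity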
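To give a feel for how lexical and hybrid matching differ, here is a rough sketch using Python's standard `difflib`. The endpoint's actual algorithms are not described here, so the function names, the Jaccard word overlap, and the equal weighting are all assumptions made for illustration.

```python
from difflib import SequenceMatcher


def char_similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1]; tolerant of small typos."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def word_similarity(a: str, b: str) -> float:
    """Jaccard overlap of whitespace-separated words, in [0, 1]."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    if not words_a or not words_b:
        return 0.0
    return len(words_a & words_b) / len(words_a | words_b)


def hybrid_similarity(a: str, b: str, char_weight: float = 0.5) -> float:
    """Weighted mix of character and word similarity (weights are assumed)."""
    return (char_weight * char_similarity(a, b)
            + (1 - char_weight) * word_similarity(a, b))


# A typo keeps most characters but shares no whole word with the target.
print(char_similarity("diabetis", "diabetes"))
print(word_similarity("diabetis", "diabetes"))
print(hybrid_similarity("diabetis mellitus", "diabetes mellitus"))
```

Character-level comparison still scores the misspelling highly, while whole-word overlap misses it entirely; combining the two is one way to balance typo tolerance against word-level agreement.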

curl -X POST "https://api.omophub.com/v1/search/similar" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "myocardial infarction",
    "vocabulary_ids": ["SNOMED", "ICD10CM"],
    "algorithm": "semantic",
    "similarity_threshold": 0.7,
    "include_explanations": true
  }'
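
The equivalent request can be prepared in Python with the standard library. This is a sketch under the assumption that the endpoint accepts exactly the JSON fields shown in the curl example; `build_similar_request` is a hypothetical helper, and the response is not parsed because its schema is not shown here.

```python
import json
from urllib.request import Request

SIMILAR_URL = "https://api.omophub.com/v1/search/similar"


def build_similar_request(api_key, query, vocabulary_ids, similarity_threshold,
                          algorithm="hybrid", include_explanations=False):
    """Build (but do not send) the POST request for a similarity search."""
    payload = {
        "query": query,
        "vocabulary_ids": list(vocabulary_ids),
        "algorithm": algorithm,  # "semantic", "lexical", or "hybrid"
        "similarity_threshold": similarity_threshold,
        "include_explanations": include_explanations,
    }
    return Request(
        SIMILAR_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_similar_request(
    "YOUR_API_KEY",
    query="myocardial infarction",
    vocabulary_ids=["SNOMED", "ICD10CM"],
    similarity_threshold=0.7,
    algorithm="semantic",
    include_explanations=True,
)
print(request.get_method(), request.full_url)
# To send it: urllib.request.urlopen(request)
```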

Features

  • Cross-vocabulary discovery: Find related concepts across SNOMED, ICD10CM, RxNorm, and more
  • Similarity scores: Quantified similarity from 0.0 to 1.0
  • Explanations: Optional human-readable explanations for why concepts match
  • Flexible filtering: Filter by vocabulary, domain, concept class, and standard concept status

Use Cases

Natural Language Processing

Process patient-reported symptoms and clinical notes:

# Patient says: "I've been having trouble breathing"
results = client.semantic_search(query="trouble breathing")
# Returns: Dyspnea, Shortness of breath, Respiratory distress

Clinical Decision Support

Map clinical observations to standard codes:

# Nurse notes: "pt appears confused and agitated"
results = client.semantic_search(
    query="confused and agitated",
    domain_ids="Condition"
)
# Returns: Delirium, Acute confusional state, Agitation

Code Mapping Assistance

Find mappings for non-standard terminology:

# Legacy code description: "DM2 uncontrolled"
results = client.semantic_search(
    query="DM2 uncontrolled",
    vocabulary_ids="SNOMED",
    standard_concept="S"
)
# Returns: Type 2 diabetes mellitus without complications

Cross-Vocabulary Concept Discovery

Find equivalent concepts across different coding systems:

results = client.search_similar(
    query="type 2 diabetes mellitus",
    vocabulary_ids=["SNOMED", "ICD10CM", "ICD9CM"],
    algorithm="semantic",
    include_explanations=True
)

Similarity Score Interpretation

Score Range | Interpretation
------------|-----------------------------------------
0.9 - 1.0   | Excellent match, high confidence
0.7 - 0.9   | Good match, likely relevant
0.5 - 0.7   | Moderate match, review recommended
0.3 - 0.5   | Weak match, may be tangentially related
< 0.3       | Poor match, likely not relevant
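
These bands are easy to apply on the client side when triaging results. The helper below is a hypothetical sketch; because the listed ranges share their endpoints, it assumes a boundary value such as 0.7 belongs to the higher band.

```python
# (lower bound, label) pairs, checked from the highest band down.
SCORE_BANDS = [
    (0.9, "Excellent match, high confidence"),
    (0.7, "Good match, likely relevant"),
    (0.5, "Moderate match, review recommended"),
    (0.3, "Weak match, may be tangentially related"),
]


def interpret_score(score: float) -> str:
    """Map a similarity score in [0.0, 1.0] to its interpretation label."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score must be between 0.0 and 1.0, got {score}")
    for lower_bound, label in SCORE_BANDS:
        if score >= lower_bound:
            return label
    return "Poor match, likely not relevant"


for value in (0.95, 0.72, 0.41, 0.1):
    print(value, "->", interpret_score(value))
```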
