The multi-query vector search made blocking urllib.request.urlopen calls,
which stalled the single-threaded uvicorn event loop. It now uses an async
httpx.AsyncClient with asyncio.gather to issue the requests concurrently.
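
A minimal sketch of the concurrent fan-out, not the code from this change. The
endpoint URL, the request payload shape, and the "results" response key are
assumptions made for illustration; search_one and search_all are hypothetical
helper names.

```python
import asyncio
from typing import Any


async def search_one(client: Any, url: str, query: str, top_k: int = 30) -> list:
    """Run one sub-query. `client` is expected to behave like httpx.AsyncClient."""
    # Awaiting the POST yields control to the event loop instead of blocking it.
    response = await client.post(url, json={"query": query, "top_k": top_k})
    response.raise_for_status()
    # Assumed response shape: {"results": [...]}
    return response.json().get("results", [])


async def search_all(client: Any, url: str, queries: list) -> list:
    """Issue every sub-query concurrently; batches come back in input order."""
    return await asyncio.gather(*(search_one(client, url, q) for q in queries))


async def main() -> None:
    import httpx  # third-party; imported here so the helpers above need only asyncio

    # Hypothetical endpoint, for illustration only.
    async with httpx.AsyncClient(timeout=10.0) as client:
        batches = await search_all(
            client, "http://localhost:8000/search", ["alice", "redis cache"]
        )
        print([len(batch) for batch in batches])
```

Sharing one AsyncClient across the sub-queries lets them reuse its connection
pool, and asyncio.gather keeps the batches in the same order as the queries.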
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- Generate multiple query variants (entities, topic words, combined)
- Search with top_k=30 per sub-query for wider recall
- Boost results matching multiple topic words for relevance
- Deduplicate and merge across all sub-queries
- Return top 15 results (up from 10) for richer RAG context
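
The steps above can be sketched roughly as follows. This is an illustration
under assumptions, not the code from this change: each hit is assumed to carry
"id", "score", and "content" fields, and the boost size and the "two or more
topic words" rule are made-up parameters. build_queries and merge_results are
hypothetical names.

```python
def build_queries(entities: list, topic_words: list) -> list:
    """Build the sub-queries: entities, topic words, and both combined."""
    variants = [
        " ".join(entities),
        " ".join(topic_words),
        " ".join(entities + topic_words),
    ]
    seen, queries = set(), []
    for variant in variants:
        # Skip empty variants and exact repeats.
        if variant and variant not in seen:
            seen.add(variant)
            queries.append(variant)
    return queries


def merge_results(batches: list, topic_words: list, limit: int = 15,
                  boost: float = 0.05) -> list:
    """Deduplicate hits across sub-queries, boost multi-word matches, keep the top `limit`."""
    best = {}
    for batch in batches:
        for hit in batch:
            key = hit["id"]
            # Keep the highest-scoring copy of each hit.
            if key not in best or hit["score"] > best[key]["score"]:
                best[key] = hit

    def boosted_score(hit: dict) -> float:
        text = hit.get("content", "").lower()
        matches = sum(1 for word in topic_words if word.lower() in text)
        # Only hits matching several topic words get a boost (assumed rule).
        return hit["score"] + (boost * matches if matches >= 2 else 0.0)

    return sorted(best.values(), key=boosted_score, reverse=True)[:limit]
```

With this shape, a hit returned by several sub-queries appears once in the
merged list, and a lower-scoring hit that mentions several topic words can
outrank a higher-scoring hit that mentions none.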
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Show clear guidance when a provider has no key configured instead of a
cryptic 401. Add friendly messages for 429/403 errors during streaming.
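
One way such a mapping could look, as a sketch only. The function name
friendly_error, its parameters, and the message wording are all hypothetical
and are not taken from this change.

```python
def friendly_error(status_code: int, provider: str, has_key: bool) -> str:
    """Translate a provider failure into a message a user can act on."""
    if not has_key:
        # Checked first, so a missing key never surfaces as a bare 401.
        return f"No API key is configured for {provider}. Add one and try again."
    messages = {
        401: f"{provider} rejected the API key. Check that it is valid.",
        403: f"{provider} denied access. The key may lack permission for this model.",
        429: f"{provider} is rate limiting requests. Wait a moment and retry.",
    }
    # Fall back to a generic message for any other status code.
    return messages.get(status_code, f"{provider} returned HTTP {status_code}.")
```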
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The vector-db API returns message content in a top-level "content" field
with "author" and "channel" also top-level (not nested under metadata).
The previous code read "text" and "metadata.author", which returned empty
strings and made all vector search results invisible to the LLM.
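
A small sketch of the difference, using a made-up hit. The field names
"content", "author", and "channel" come from the description above; the
function names and the output format are hypothetical.

```python
def format_hit_old(hit: dict) -> str:
    """Old lookup: wrong keys, so every field falls back to an empty string."""
    text = hit.get("text", "")
    author = hit.get("metadata", {}).get("author", "")
    return f"{author}: {text}"


def format_hit(hit: dict) -> str:
    """Fixed lookup: read the top-level fields the API returns."""
    content = hit.get("content", "")
    author = hit.get("author", "")
    channel = hit.get("channel", "")
    return f"[{channel}] {author}: {content}"


# Made-up example hit in the described response shape.
sample = {"content": "deploy is done", "author": "bob", "channel": "ops"}
```

Because dict.get falls back silently to its default, the old lookup raised no
error; it just produced empty context lines.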
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>