v2.7.0: Usage Dashboard redesign (OpenUsage-inspired), TCP_NODELAY streaming, Anthropic prompt caching

This commit is contained in:
Roman
2026-05-20 18:11:39 +04:00
Unverified
parent 8343837b3c
commit cf8ebd2064
5 changed files with 374 additions and 102 deletions

View File

@@ -1,5 +1,30 @@
# Changelog
## v2.7.0 (2026-05-20)
- **Usage Dashboard redesigned** (inspired by OpenUsage design patterns)
- Deep Space dark theme with Catppuccin-inspired color palette
- Header with animated status dots (OK/WARN/ERR provider health)
- KPI summary strip: total providers, requests, token volume, avg latency
- Provider cards with colored borders matching health status
- Status pills: OK (green), WARN (yellow), ERR (red)
- Colored section separators per metric type (Usage=yellow, Models=lavender)
- Model composition bar: stacked horizontal segments per model share
- Per-model breakdown with mini progress bars, percentage, request counts
- Per-model token breakdown (in/out) when available
- Token formatting: 1.2M, 45.3K instead of raw numbers
- Duration formatting: 1.5h, 3.2m instead of raw seconds
- Error section with warning icon
- **TCP_NODELAY streaming optimization**
- Disables Nagle's algorithm on streaming connections
- Reduces per-packet latency by up to 40ms on small SSE events
- Applied to all 4 streaming code paths (openai-compat, retry, command-code, generic)
- **Anthropic prompt caching**
- System prompts now sent as `cache_control: ephemeral` structured format
- Enables Anthropic's automatic prompt caching (saves tokens + cost on repeated prompts)
## v2.6.1 (2026-05-20)
- **Google OAuth rebuilt to emulate Gemini CLI**