v2.7.0: Usage Dashboard redesign (OpenUsage-inspired), TCP_NODELAY streaming, Anthropic prompt caching

2026-05-20 18:11:39 +04:00
parent 8343837b3c
commit cf8ebd2064
5 changed files with 374 additions and 102 deletions
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -1,5 +1,30 @@
 # Changelog

+## v2.7.0 (2026-05-20)
+
+- **Usage Dashboard redesigned** (inspired by OpenUsage design patterns)
+  - Deep Space dark theme with Catppuccin-inspired color palette
+  - Header with animated status dots (OK/WARN/ERR provider health)
+  - KPI summary strip: total providers, requests, token volume, avg latency
+  - Provider cards with colored borders matching health status
+  - Status pills: OK (green), WARN (yellow), ERR (red)
+  - Colored section separators per metric type (Usage=yellow, Models=lavender)
+  - Model composition bar: stacked horizontal segments per model share
+  - Per-model breakdown with mini progress bars, percentage, request counts
+  - Per-model token breakdown (in/out) when available
+  - Token formatting: 1.2M, 45.3K instead of raw numbers
+  - Duration formatting: 1.5h, 3.2m instead of raw seconds
+  - Error section with warning icon
+
+- **TCP_NODELAY streaming optimization**
+  - Disables Nagle's algorithm on streaming connections
+  - Reduces per-packet latency by up to 40ms on small SSE events
+  - Applied to all 4 streaming code paths (openai-compat, retry, command-code, generic)
+
+- **Anthropic prompt caching**
+  - System prompts now sent as `cache_control: ephemeral` structured format
+  - Enables Anthropic's automatic prompt caching (saves tokens + cost on repeated prompts)
+
 ## v2.6.1 (2026-05-20)

 - **Google OAuth rebuilt to emulate Gemini CLI**