v2.2.0: per-provider reasoning controls (on/off + effort level)

- Add Reasoning On/Off toggle and Effort selector in endpoint editor
- Proxy sends enable_thinking=false when reasoning is OFF
- Proxy sends reasoning_effort level when reasoning is ON
- Strip reasoning_content from output and enforce a minimum max_tokens of 64000
- Fixes token exhaustion on Crof mimo-v2.5-pro and similar reasoning models
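
For illustration only, the request-side behavior described in this message could be sketched as follows. This is not the code from the commit: the function name, the dict-shaped payload, the lowercase effort strings, and the choice to set max_tokens when it is missing are all assumptions. Only enable_thinking, reasoning_effort, the six effort levels, the Medium default, and the 64000 minimum come from the commit itself.

```python
from copy import deepcopy

# Minimum stated in the commit message; the constant name is hypothetical.
MIN_MAX_TOKENS = 64000

# Effort levels listed in the changelog; the lowercase wire format is an assumption.
EFFORT_LEVELS = {"none", "minimal", "low", "medium", "high", "max"}


def apply_reasoning_settings(payload, reasoning_enabled, effort="medium"):
    """Return a copy of an upstream request body with reasoning controls applied.

    Hypothetical helper: `payload` is assumed to be an OpenAI-compatible
    chat completion request represented as a dict.
    """
    if effort not in EFFORT_LEVELS:
        raise ValueError(f"unknown reasoning effort: {effort!r}")

    out = deepcopy(payload)  # leave the caller's payload untouched

    if reasoning_enabled:
        # Reasoning ON: forward the user-selected effort level.
        out["reasoning_effort"] = effort
    else:
        # Reasoning OFF: disable thinking and send effort "none".
        out["enable_thinking"] = False
        out["reasoning_effort"] = "none"

    # Raise max_tokens to the minimum so reasoning cannot starve the answer.
    # Treating a missing value as "below the minimum" is an assumption.
    current = out.get("max_tokens")
    if not isinstance(current, int) or current < MIN_MAX_TOKENS:
        out["max_tokens"] = MIN_MAX_TOKENS

    return out
```

A larger caller-supplied max_tokens is left unchanged in this sketch, since the commit describes 64000 as a minimum rather than a fixed value.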
Roman
2026-05-20 12:20:33 +04:00
parent 77423c5c35
commit 9532ba40f3
5 changed files with 50 additions and 2 deletions

@@ -1,5 +1,17 @@
# Changelog
## v2.2.0 (2026-05-20)
- **Added per-provider Reasoning controls in endpoint editor**
- Reasoning On/Off toggle — disable reasoning for models that exhaust output tokens (e.g., Crof mimo-v2.5-pro)
- Reasoning Effort selector: None, Minimal, Low, Medium, High, Max
- When reasoning is OFF: sends `enable_thinking=false` + `reasoning_effort=none` to upstream API
- When reasoning is ON: sends user-selected effort level (default: Medium)
- Settings stored per-endpoint, passed through proxy config to upstream requests
- Strip `reasoning_content` from proxy output: Codex doesn't use it, so dropping it avoids wasted tokens
- Enforce a minimum `max_tokens` of 64000 for openai-compat providers, leaving room for both reasoning and content
- Inspired by unsloth's reasoning control patterns for Qwen/GPT-OSS models
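
As a rough sketch of the `reasoning_content` stripping noted above (not the proxy's actual code), the helper below removes that field from an OpenAI-compatible response or streaming chunk. The function name and the assumption that the field lives under `choices[].message` or `choices[].delta` are illustrative.

```python
from copy import deepcopy


def strip_reasoning_content(response):
    """Return a copy of a chat completion response without reasoning_content.

    Hypothetical helper: assumes an OpenAI-compatible dict where each choice
    carries either a full "message" or a streaming "delta".
    """
    out = deepcopy(response)  # do not mutate the upstream response object
    for choice in out.get("choices", []):
        for key in ("message", "delta"):
            part = choice.get(key)
            if isinstance(part, dict):
                # Drop the reasoning text; regular content is left as-is.
                part.pop("reasoning_content", None)
    return out
```

Responses without a `choices` list pass through unchanged in this sketch.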
## v2.1.3 (2026-05-19)
- **Fixed Crof mimo-v2.5-pro stopping mid-response (finish_reason=length)**