v2.2.0: per-provider reasoning controls (on/off + effort level)

- Add Reasoning On/Off toggle and Effort selector in endpoint editor
- Proxy sends enable_thinking=false when reasoning is OFF
- Proxy sends reasoning_effort level when reasoning is ON
- Strip reasoning_content from output, force max_tokens=64000 minimum
- Fixes Crof mimo-v2.5-pro and similar reasoning model token exhaustion
2026-05-20 12:20:33 +04:00
parent 77423c5c35
commit 9532ba40f3
5 changed files with 50 additions and 2 deletions
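
The proxy behavior described in the commit message can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: the function names `apply_reasoning_settings` and `strip_reasoning_content`, the constant names, and the dict-based request shape are all assumptions made for the example. Only the field names (`enable_thinking`, `reasoning_effort`, `reasoning_content`, `max_tokens`), the effort levels, the Medium default, and the 64000 floor come from the changelog.

```python
from typing import Any

# Values taken from the changelog; the constant names are hypothetical.
MIN_MAX_TOKENS = 64000
EFFORT_LEVELS = ("none", "minimal", "low", "medium", "high", "max")


def apply_reasoning_settings(
    payload: dict[str, Any],
    reasoning_enabled: bool,
    effort: str = "medium",
) -> dict[str, Any]:
    """Return a copy of an upstream request body with reasoning fields applied."""
    out = dict(payload)  # shallow copy so the caller's dict is not mutated
    if reasoning_enabled:
        if effort not in EFFORT_LEVELS:
            raise ValueError(f"unknown reasoning effort: {effort!r}")
        out["reasoning_effort"] = effort
    else:
        # Reasoning OFF: disable thinking and pin the effort to "none".
        out["enable_thinking"] = False
        out["reasoning_effort"] = "none"
    # Raise max_tokens to the floor; a larger caller-supplied value is kept.
    out["max_tokens"] = max(int(out.get("max_tokens") or 0), MIN_MAX_TOKENS)
    return out


def strip_reasoning_content(message: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of a response message without its reasoning_content key."""
    return {k: v for k, v in message.items() if k != "reasoning_content"}


if __name__ == "__main__":
    request = {"model": "mimo-v2.5-pro", "max_tokens": 4096}
    print(apply_reasoning_settings(request, reasoning_enabled=False))
    print(strip_reasoning_content({"content": "hi", "reasoning_content": "..."}))
```

In this sketch the token floor is applied to every request; per the changelog, the actual change limits it to openai-compat providers.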
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -1,5 +1,17 @@
 # Changelog

+## v2.2.0 (2026-05-20)
+
+- **Added per-provider Reasoning controls in endpoint editor**
+  - Reasoning On/Off toggle — disable reasoning for models that exhaust output tokens (e.g., Crof mimo-v2.5-pro)
+  - Reasoning Effort selector: None, Minimal, Low, Medium, High, Max
+  - When reasoning is OFF: sends `enable_thinking=false` + `reasoning_effort=none` to upstream API
+  - When reasoning is ON: sends user-selected effort level (default: Medium)
+  - Settings stored per-endpoint, passed through proxy config to upstream requests
+- Strip `reasoning_content` from proxy output: Codex doesn't use it, so dropping it avoids token waste
+- Enforce a minimum of `max_tokens=64000` for openai-compat providers, leaving room for both reasoning and content
+- Inspired by unsloth's reasoning control patterns for Qwen/GPT-OSS models
+
 ## v2.1.3 (2026-05-19)

 - **Fixed Crof mimo-v2.5-pro stopping mid-response (finish_reason=length)**