Root causes:
1. The uncaughtException/unhandledRejection handlers called gracefulShutdown(),
which ended in process.exit(0), so any minor error killed the entire bot.
Both handlers now LOG ONLY (Hermes/OpenCode pattern).
2. A user-level systemd service was running alongside the system-level one,
and the two fought for port 3001. The user-level service is now permanently masked.
3. Fragile new Promise(() => {}) keepalive replaced with setInterval-based keepalive.
4. Syntax error in uncaughtException handler (literal newline in single-quoted string).
Tested: 5 rapid consecutive restarts all pass. Uptime stable.
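A minimal sketch of items 1 and 3, assuming a plain Node.js process. The names installLogOnlyHandlers and startKeepalive are hypothetical and are not taken from the bot's source. The handlers only log and never call process.exit(), and a referenced timer holds the event loop open, which a never-settling Promise does not do on its own.

```javascript
// Sketch only: function names and log format are illustrative assumptions.

// Register handlers that log the error and return, so the process keeps running.
function installLogOnlyHandlers(log = console.error) {
  const onException = (err) => {
    log('[uncaughtException]', err && err.stack ? err.stack : err);
  };
  const onRejection = (reason) => {
    log('[unhandledRejection]', reason);
  };
  process.on('uncaughtException', onException);
  process.on('unhandledRejection', onRejection);
  return { onException, onRejection };
}

// A referenced interval timer keeps the event loop alive until it is cleared.
function startKeepalive(intervalMs = 60000) {
  return setInterval(() => {}, intervalMs);
}
```

Calling clearInterval() on the returned timer during a deliberate shutdown lets the process exit normally.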
Co-Authored-By: zcode <noreply@zcode.dev>
- README: header now shows v2.0.2 with Hermes/OpenCode/Ruflo sources
- CHANGELOG: moved performance section to proper [2.0.2] version header
- Added files changed list with line counts
Co-Authored-By: zcode <noreply@zcode.dev>
The Telegram formatting improvement was split across [2.0.0] and [2.0.1].
Now all v2.0.1 changes (EADDRINUSE fix + styling) are under one section.
v2.0.0 section contains only Ruflo integration changes.
Co-Authored-By: zcode <noreply@zcode.dev>
Root cause: fuser-based EADDRINUSE handler killed the current process
due to a race condition during systemd restart cycles. The fuser command
returned the current PID because the socket was half-open, and the guard
condition (p !== process.pid) failed to filter it.
Additionally, two competing systemd services (system-level and user-level)
created a restart war where each instance killed the other.
Fix approach (inspired by Next.js, Vite, webpack-dev-server):
- Replace fuser with net.createServer port probe (no external commands)
- PID-file based stale detection + ss fallback for orphan detection
- Wait loop with 300ms polling after sending SIGTERM to the stale process
- Single-service architecture (disabled user-level unit)
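The port probe and wait loop from the list above could look roughly like this. It is a sketch under assumptions: isPortFree and waitForPort are hypothetical names, the host and the 5000ms timeout are illustrative defaults, and only the 300ms polling interval comes from this commit. The probe binds a throwaway net server and treats EADDRINUSE as "port busy", so no external command such as fuser is involved.

```javascript
const net = require('net');

// Resolves true if the port can be bound right now, false on EADDRINUSE.
// Any other bind error is passed through as a rejection.
function isPortFree(port, host = '127.0.0.1') {
  return new Promise((resolve, reject) => {
    const probe = net.createServer();
    probe.once('error', (err) => {
      if (err.code === 'EADDRINUSE') resolve(false);
      else reject(err);
    });
    probe.once('listening', () => {
      // Release the port again before reporting it as free.
      probe.close(() => resolve(true));
    });
    probe.listen(port, host);
  });
}

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Poll until the port is released or the timeout elapses.
async function waitForPort(port, { intervalMs = 300, timeoutMs = 5000 } = {}) {
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    if (await isPortFree(port)) return true;
    await sleep(intervalMs);
  }
  // One last check after the deadline.
  return isPortFree(port);
}
```

A caller would send SIGTERM to the stale process first and then await waitForPort(3001) before binding the real server.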
Tested: 5 consecutive rapid restarts, 8+ minute uptime, zero crashes.
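For the PID-file based stale detection, one possible shape is sketched below. The function name findStalePid and the PID-file layout (a single decimal PID) are assumptions, and the ss fallback is not shown. The PID is parsed to a number before it is compared with process.pid, so the current process is never reported as stale, and signal 0 is used only to test whether the recorded process still exists.

```javascript
const fs = require('fs');

// Returns the PID recorded in pidFile if it belongs to a live process
// other than this one; returns null otherwise.
function findStalePid(pidFile) {
  let raw;
  try {
    raw = fs.readFileSync(pidFile, 'utf8');
  } catch (err) {
    if (err.code === 'ENOENT') return null; // no PID file, nothing to clean up
    throw err;
  }
  const pid = Number.parseInt(raw.trim(), 10);
  if (!Number.isInteger(pid) || pid <= 0) return null; // unreadable contents
  if (pid === process.pid) return null; // never target ourselves
  try {
    process.kill(pid, 0); // signal 0 checks existence without sending a signal
    return pid;
  } catch (err) {
    // ESRCH: no such process. EPERM: it exists but belongs to another user.
    return err.code === 'EPERM' ? pid : null;
  }
}
```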
Co-Authored-By: zcode <noreply@zcode.dev>