Commit Graph

3 Commits

  • perf: 3-tier conversation context with LRU cache, keyword relevance, debounced I/O
    Upgrade from a naive JSON store to production-grade conversation memory:
    
    Tier 1 — Compressed Summary (max 600 chars):
      Incrementally built from evicted messages. Preserves conversation
      topics across 100+ messages in a tiny budget.
    
    Tier 2 — Relevant Snippets (BM25-style keyword matching):
      Scores older messages against current query, injects top 3 matches.
      Zero external deps — keyword extraction is ~0.1ms.
    
    Tier 3 — Sliding Window (last 12 exchanges verbatim):
      Recent context preserved word-for-word, fitting within token budget.
    
    Performance optimizations:
      - In-memory Map cache with lazy-load from disk (0ms reads)
      - Debounced async disk writes (3s, non-blocking, never stalls response)
      - LRU eviction for cache (max 50 chats, prevents memory leak)
      - Keywords stripped before saving (smaller JSON files)
      - Backward-compatible: loads old format without keywords, backfills on load
      - Graceful shutdown flushes all pending saves to disk
      - Token-aware budget allocation: summary 15% + relevant 15% + recent 70%
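    The LRU cache described above can be sketched with a plain `Map`, whose
    insertion order makes eviction cheap (class and method names here are
    illustrative, not taken from the repo):

    ```javascript
    // Minimal LRU sketch: a Map keeps keys in insertion order, so
    // re-inserting an entry on every access keeps recently used chats
    // at the end, and the first key is always the eviction candidate.
    class LruCache {
      constructor(maxEntries = 50) {
        this.maxEntries = maxEntries;
        this.map = new Map();
      }
      get(chatId) {
        if (!this.map.has(chatId)) return undefined;
        const value = this.map.get(chatId);
        this.map.delete(chatId);      // move to most-recent position
        this.map.set(chatId, value);
        return value;
      }
      set(chatId, value) {
        this.map.delete(chatId);
        this.map.set(chatId, value);
        if (this.map.size > this.maxEntries) {
          const oldest = this.map.keys().next().value;
          this.map.delete(oldest);    // evict least recently used chat
        }
      }
    }
    ```

    With a cap of 50 chats, memory stays bounded no matter how many
    chats the bot sees over its lifetime.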
  • feat: persistent conversation history across sessions and restarts
    - ConversationStore: per-chat JSON files in data/, survives restarts
    - 6000-token context budget per chat (fits ~20-30 exchanges)
    - Auto-trims old messages, always includes most recent
    - Wired into message handler: loads history before AI call, saves after
    - /reset command to clear a chat's history
    - History persists across sessions and models; each chat's history is isolated
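    The auto-trim can be sketched as a newest-first walk over the history
    (`estimateTokens` is a stand-in for whatever tokenizer the bot actually
    uses; ~4 characters per token is a common rough heuristic):

    ```javascript
    // Hypothetical token estimator: ~4 chars per token.
    const estimateTokens = (text) => Math.ceil(text.length / 4);

    // Keep as many recent messages as fit in the budget; the newest
    // message is always kept, even if it alone exceeds the budget.
    function trimToBudget(messages, budget = 6000) {
      const kept = [];
      let used = 0;
      for (let i = messages.length - 1; i >= 0; i--) {
        const cost = estimateTokens(messages[i].content);
        if (used + cost > budget && kept.length > 0) break;
        kept.unshift(messages[i]);
        used += cost;
      }
      return kept;
    }
    ```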
  • feat: persistent self-learning memory + curiosity engine
    - New memory.js: JSON-backed MemoryStore with 5 categories (lesson, pattern, preference, discovery, gotcha)
    - Memory injected into system prompt — bot sees past learnings every session
    - Curiosity engine: auto-detects errors/fixes, corrections, successful patterns, new tool discoveries
    - New commands: /memory (stats), /remember (save), /recall (search), /forget (delete)
    - Runs AFTER response delivery — zero latency impact
    - 500 memory cap with smart eviction (keeps gotchas/lessons, evicts old discoveries)
    - data/ directory gitignored (memory is local to each deployment)
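    The smart eviction policy can be sketched as follows (the category
    names come from the commit message; the helper itself and the
    `createdAt` field are hypothetical):

    ```javascript
    // Gotchas and lessons are protected; once the cap is exceeded,
    // the oldest entries in the remaining categories go first.
    const PROTECTED = new Set(['gotcha', 'lesson']);

    function evictToCap(memories, cap = 500) {
      if (memories.length <= cap) return memories;
      // Oldest-first list of evictable entries (assumes createdAt).
      const evictable = memories
        .filter((m) => !PROTECTED.has(m.category))
        .sort((a, b) => a.createdAt - b.createdAt);
      // If everything left is protected, the store may stay above cap.
      const toDrop = new Set(evictable.slice(0, memories.length - cap));
      return memories.filter((m) => !toDrop.has(m));
    }
    ```

    Running this after each save keeps the store bounded while biasing
    retention toward the hard-won categories.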