
beilu-always accompany

always accompany

Original Layered Memory Algorithm · Precision Attention Control · Commander-Level Prompts · Multi-AI Engine · Bot · IDE

A cross-generational multi-AI engine collaboration platform

Make AI truly remember you.

English | 中文

Discord   GitHub   Memory Presets

The entire project, from design and architecture to development, was completed independently by a university student using AI-assisted programming, drawing on algorithm design, biomimicry principles, framework architecture, and logical thinking.

Chat Interface

Chat interface with fine-tuned controls, adaptable to various beautification styles


The Fundamental Problem with Current AI

Whether it's AI coding tools (Cursor, Copilot), AI chat applications (ChatGPT, Claude), or AI roleplay platforms (SillyTavern), they all face the same underlying limitations:

| Problem | Current State | Consequence |
|---|---|---|
| Limited context window | Even 128K–1M tokens overflow in long conversations | Early messages get truncated; the AI loses critical information |
| Attention degradation | The longer the context, the less the model focuses on each segment | Even when information exists in context, the AI may "overlook" it |
| No persistent memory | Closing a conversation = forgetting everything | Every new session starts from zero |

beilu-always accompany breaks through all three limitations at the fundamental level — not by working around them, but by solving them algorithmically.


Core Technical Breakthroughs

🧠 Original Layered Memory Algorithm — Theoretically Unlimited AI Memory

Modeled on the human hippocampus's memory-formation mechanism and the Ebbinghaus forgetting curve, the system achieves theoretically unlimited AI memory with no external database or vector-storage dependency: pure JSON files plus pure prompt-driven logic.

Three-Layer Memory Architecture

🔥 Hot Memory Layer — Injected every turn
   User profile / Permanent memories (Top-100) / Pending tasks / Recent memories about the user

🌤️ Warm Memory Layer — On-demand retrieval, last 30 days
   Daily summaries / Archived temporary memories / Monthly index

❄️ Cold Memory Layer — Deep retrieval, beyond 30 days
   Monthly summaries / Historical daily summaries / Yearly index

Additionally, an L0 Memory Table Layer (10 highly customizable tables, fully injected every turn as CSV) provides structured immediate context.

Key Metrics

| Metric | Value |
|---|---|
| Hot-layer injection per turn | ~7,000–11,000 tokens (only 5–9% of a 128K window) |
| Retrieval AI context | <5,000 tokens (attention fully focused on retrieval) |
| P1 retrieval efficiency | Max 3 rounds to hit the target (BM25 pre-filtering + regex exact match) |
| Retrieval tech stack | BM25 + regex search (dual-engine collaboration, zero external dependencies) |
| Storage cost | Zero (pure JSON files, no database dependency) |
| Sustained operation per character | 12+ years (at 5,000 files) |
| Theoretical duration | 260+ years (at 100,000 files; NTFS/ext4 limits far exceed this) |

Memory Decay Formula

score = weight × (1 / (1 + days_since_triggered × 0.1))

Inspired by the Ebbinghaus forgetting curve: important and recently triggered memories are prioritized for injection, rather than simple chronological order.
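
The decay formula above can be sketched in plain JavaScript. This is an illustrative sketch: the function and field names (`decayScore`, `lastTriggered`, `weight`) are ours, not the project's actual identifiers.

```javascript
// Decay-weighted ranking for memory injection (illustrative sketch).
// Implements: score = weight * (1 / (1 + days_since_triggered * 0.1))
function decayScore(memory, now = Date.now()) {
  const msPerDay = 24 * 60 * 60 * 1000;
  const daysSinceTriggered = (now - memory.lastTriggered) / msPerDay;
  return memory.weight * (1 / (1 + daysSinceTriggered * 0.1));
}

// Pick the top-N memories by decayed score rather than by recency alone.
function selectForInjection(memories, topN = 100) {
  return [...memories]
    .sort((a, b) => decayScore(b) - decayScore(a))
    .slice(0, topN);
}
```

Note how a high-weight memory untouched for ten days scores half of an equally weighted memory triggered today, so importance and recency are traded off smoothly instead of cutting by date.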

Pure Prompt-Driven — Zero Hardcoded Limitations

The most critical design feature of the memory system: all memory injection, retrieval, archival, and summarization operations are performed by AI through prompts, not traditional hardcoded logic.

This means:

  • Table meanings and purposes can be changed anytime: Simply modify the prompt descriptions for tables, and the AI will interpret and operate them accordingly — no code changes needed
  • Archival strategies are instantly adjustable: P2-P6 behaviors are entirely defined by prompts; modifying prompts changes archival rules, summary formats, and retrieval strategies
  • Zero technical barrier for migration: Users can edit prompts themselves to adapt to different scenarios (roleplay / coding assistant / game NPC) without programming skills
  • Naturally avoids technical debt: No complex parsers or state machines to maintain — the AI itself is the most flexible "parser"

🎯 Precision Attention Control — BM25 + Regex Dual-Engine Pre-Filtering

This is the project's most critical design advantage. Traditional approaches stuff all memories into a single AI context, causing attention to degrade rapidly as context grows. We solve this completely through AI role separation + dual-engine pre-filtering:

```text
Traditional:   [All historical memory + current chat] → Single AI → attention scattered

Our approach:  [Index] → Retrieval AI (focused on finding)
                       → [Selected memory + current chat] → Reply AI (focused on quality)
```

| AI Role | Context Content | Context Length | Attention Distribution |
|---|---|---|---|
| Retrieval AI (Gemini 2.0/2.5 Flash) | User message + index files | <5K tokens | 100% focused on finding relevant memory |
| Reply AI (user's choice) | Selected memory (~10K tokens) + chat | As needed | 100% focused on reply quality |

The Reply AI only sees precisely filtered memory fragments from the Retrieval AI. The context is clean, the signal-to-noise ratio is extremely high, and attention never degrades.

BM25 Semantic Retrieval + Regex Exact Matching

  • BM25 coarse filtering: TF-IDF-based statistical algorithm that quickly filters the most relevant candidates from massive memory files. Pure JS implementation, zero external dependencies
  • Regex precise targeting: After BM25 coarse filtering, regex provides exact matching — supports pattern matching, keyword combinations, fuzzy search
  • Dual-engine collaboration effect: P1 retrieval optimized from 5 rounds down to max 3 — 40% faster, 40% cheaper on API costs
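
The two stages can be sketched as a minimal BM25 scorer followed by a regex refinement pass. This is a simplified illustration under assumed data shapes (plain strings for documents); the project's actual implementation, tokenization, and tuning are not shown here.

```javascript
// Stage 1: minimal BM25 coarse filter (pure JS, zero dependencies).
function bm25Rank(query, docs, k1 = 1.5, b = 0.75) {
  const tokenize = (s) => s.toLowerCase().match(/\w+/g) || [];
  const docTokens = docs.map(tokenize);
  const avgLen = docTokens.reduce((sum, t) => sum + t.length, 0) / docs.length;
  const scores = docTokens.map((tokens, i) => {
    let score = 0;
    for (const term of tokenize(query)) {
      const tf = tokens.filter((t) => t === term).length;
      if (tf === 0) continue;
      const df = docTokens.filter((t) => t.includes(term)).length;
      const idf = Math.log((docs.length - df + 0.5) / (df + 0.5) + 1);
      score += idf * (tf * (k1 + 1)) /
        (tf + k1 * (1 - b + (b * tokens.length) / avgLen));
    }
    return { i, score };
  });
  return scores.sort((x, y) => y.score - x.score);
}

// Stage 2: regex exact-match refinement over the BM25 candidates only.
function dualEngineSearch(query, pattern, docs, topK = 5) {
  const candidates = bm25Rank(query, docs).slice(0, topK);
  const re = new RegExp(pattern, "i");
  return candidates.filter(({ i }) => re.test(docs[i])).map(({ i }) => docs[i]);
}
```

The design point is that the regex never scans the full corpus: BM25 narrows the field statistically first, which is what keeps the retrieval rounds low.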

👑 Commander-Level Prompt System — Full Control Over AI Output Direction

The preset engine completely takes over all module outputs through the TweakPrompt three-round mechanism, commanding the absolute direction of AI replies. This isn't simple System Prompt stacking — it's a complete message orchestration engine:

5-Segment Message Structure

```text
[beforeChat]        — Preset header (system presets, defining AI core behavior)
[injectionAbove]    — @D≥1 injection (world book + memory + plugin data)
[chatHistory]       — Actual conversation history
[injectionBelow]    — @D=0 injection (P1 retrieval results + real-time data)
[afterChat]         — Preset footer (final instructions, e.g. jailbreak)
```

TweakPrompt Three-Round Mechanism

Round 1 (dl=2): Collect — Gather content from all modules, clear raw data, preset takes full control
Round 2 (dl=1): Rebuild — Construct 5-segment message structure, process world book/memory injection, macro replacement
Round 3 (dl=0): Snapshot — Capture debug snapshot for prompt viewer

Smart Preset Switching — AI Auto-Adapts to Interaction Modes

P1 Retrieval AI doesn't just retrieve memories — it analyzes conversation intent in real-time and automatically switches to the most suitable prompt preset:

  • Multi-mode adaptation: Casual chat, roleplay, coding, prompt engineering… the AI automatically switches to the optimal preset based on conversation content, with prompts and COT (Chain of Thought) changing accordingly
  • Seamless experience: No manual intervention needed — just say "help me write code" and the AI's behavior mode adjusts in real-time
  • Cooldown anti-oscillation: Built-in cooldown counter (default 5 turns) prevents rapid repeated switching
  • Fully customizable: Switching logic is guided by COT in prompts; users can define their own switching conditions and strategies

This means AI is no longer "one preset fits all" — it dynamically adapts to the optimal behavior mode based on the current context, making it a truly multi-mode intelligent agent.

13 Macro Variables

{{char}} / {{user}} / {{tableData}} / {{hotMemory}} / {{chat_history}} / {{lastUserMessage}} / {{presetList}} / {{current_date}} / {{time}} / {{date}} / {{weekday}} / {{idle_duration}} / {{lasttime}} + custom macro variables
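
Macro expansion of this kind can be sketched as a single regex pass over the prompt text. The sketch below is illustrative; the project's real macro engine and where it sources each value are not shown.

```javascript
// Replace {{macro}} placeholders with values from a context object (sketch).
// Unknown macros are left intact so custom macros can be handled downstream.
function expandMacros(template, context) {
  return template.replace(/\{\{(\w+)\}\}/g, (match, name) =>
    Object.prototype.hasOwnProperty.call(context, name) ? String(context[name]) : match
  );
}
```

Leaving unrecognized placeholders untouched is what allows user-defined custom macros to coexist with the built-in thirteen.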


📊 Highly Customizable Memory Tables — Solving the AI RP God's-Eye Problem + Universal Scenario Adaptation

10 fully customizable structured tables, injected every turn as CSV. Table meanings and purposes are entirely defined by prompts — change the prompt descriptions and the same table system serves completely different scenarios:

| Scenario | Example Table Usage |
|---|---|
| AI roleplay | Space-time settings / character status / social relations / quest progress / inventory (the AI only knows what's recorded in the tables, solving the god's-eye problem) |
| Programming | Architecture decisions / code conventions / module dependencies / bug tracking / TODO lists |
| Work management | Project progress / meeting notes / contacts / to-do items / knowledge accumulation |
| Gaming | Character attributes / equipment lists / skill trees / world state / NPC relationships |

Core value for AI RP: Tables achieve information isolation — the AI can only see information explicitly recorded in the tables. Information the character doesn't know won't appear in the tables, fundamentally eliminating the "god's-eye view" problem.

Table contents are automatically maintained by the Chat AI via `<tableEdit>` tags: the AI autonomously decides when to update which data during conversation, with no manual intervention needed.
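
Extracting such tags from an AI reply could be sketched as follows. This is a hedged illustration: the actual payload format inside the tag is not documented here, so a JSON body is assumed purely for the example.

```javascript
// Pull <tableEdit>...</tableEdit> blocks out of an AI reply and strip them
// from the visible text (illustrative sketch; JSON payload is an assumption).
function extractTableEdits(reply) {
  const edits = [];
  const visible = reply.replace(/<tableEdit>([\s\S]*?)<\/tableEdit>/g, (_, body) => {
    edits.push(body.trim());
    return "";
  });
  return { visible: visible.trim(), edits };
}
```

Splitting the reply this way lets the user see clean prose while the table engine consumes the structured edits separately.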


🌍 World Book Dynamic Injection — Table-Driven Living World

World book entries support 3 activation modes — the dynamic mode can read real-time data from memory tables to decide whether to inject a world setting:

| Mode | Trigger Condition | Use Case |
|---|---|---|
| Always-on | Unconditional activation; injected every turn | Core worldview, immutable settings |
| Regex | Keyword matching against chat history | Triggered when specific places/people are mentioned |
| Dynamic | Reads values/states from memory tables, conditional | Affection thresholds, quest progress, item effects |

Dynamic injection examples:

  → When affection > 80 in Table #2 "Social Relations", inject the "special dialogue unlocked" setting
  → When the main quest = "Chapter 3" in Table #3 "Quest Progress", inject the corresponding world description
  → When a specific item exists in Table #5 "Inventory", inject the item's usage effects

This table-driven dynamic world-building mechanism turns world settings from static text into a living system that evolves with conversation progress and character state. Easy to learn — modifying table data instantly affects AI behavior.
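
A dynamic entry's trigger could be evaluated roughly like this. The sketch assumes a condition shape of `{ table, field, op, value }`; the field names and operators are hypothetical, chosen to mirror the examples above.

```javascript
// Evaluate a table-driven world book condition (illustrative sketch).
// `tables` maps table names to row objects; `entry` is the entry's condition.
function shouldInject(entry, tables) {
  const row = tables[entry.table] || {};
  const actual = row[entry.field];
  switch (entry.op) {
    case ">":   return actual > entry.value;          // numeric threshold
    case "=":   return actual === entry.value;        // exact state match
    case "has": return Array.isArray(actual) && actual.includes(entry.value); // inventory check
    default:    return false;
  }
}
```

Each turn, only the entries whose conditions pass get injected, so the world description tracks the live table state rather than a static script.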


Platform Integration Capabilities

Built on the core technologies above, we've constructed a complete integrated platform ecosystem:

🤖 Multi-AI Collaboration Architecture — 7 AIs with Dedicated Roles, Extremely Low Cost

The system has 7 built-in AI roles, each with a dedicated responsibility. Each conversation only calls 2 AIs (Retrieval AI + Chat AI); the rest trigger on demand — no need to worry about usage:

| AI Role | Responsibility | Trigger | Usage Notes |
|---|---|---|---|
| Chat AI | Conversation with users, file operations | User sends a message | Called every conversation |
| P1 Retrieval AI | Searches relevant history from memory (up to 3 rounds) + smart preset switching | Automatic, per turn | Called every turn; can use a free AI |
| P2 Archive AI | Summarizes and archives when temporary memories exceed a threshold | ~Once per 50 conversations | Extremely infrequent |
| P3 Daily Summary AI | Generates a detailed daily summary | Manual trigger | Only when the user clicks |
| P4 Hot→Warm AI | Moves expired hot-layer memories to the warm layer | Manual trigger | Only when the user clicks |
| P5 Monthly Summary AI | Warm→cold archival; generates monthly summaries | Manual trigger | Only when the user clicks |
| P6 Repair AI | Checks and fixes memory file format issues | Manual trigger | Only when the user clicks |

💰 Cost Analysis

Key insight: The Retrieval AI (P1) only needs to "find memories" — it doesn't require a high-intelligence model. It can run entirely on free or ultra-low-cost AI (e.g., Gemini 2.0/2.5 Flash free tier).

```text
AI calls per conversation:
  ① P1 Retrieval AI — find memories + determine preset switching (can use a free AI like Gemini Flash)
  ② Chat AI         — generate the reply based on selected memories (use any model you prefer)

Infrequent AI:
  ③ P2 Archive — triggers roughly once per 50 conversations, barely noticeable
  ④ P3-P6      — all manual triggers; zero usage unless you click
```

Bottom line: If P1 uses free AI (Gemini Flash free tier is more than enough), then the actual cost per conversation = only one Chat AI call. The memory system runs at virtually zero cost.

💻 IDE-Level Workflow — VSCode-Style Three-Panel Layout

  • Left panel: Preset management / World book binding / Persona selection / Character editing
  • Center panel: Chat / File editor / Memory management — three-tab switching
  • Right panel: AI settings / Feature toggles / Memory AI operation panel

The IDE includes built-in BM25 + regex dual-engine file retrieval for quick project-wide file and content search: enter keywords or regular expressions for instant results. The AI can directly read and write user project files via the beilu-files plugin.

🌐 Cross-Platform Bot Engine — Discord Node Deployment

Deploy your AI to Discord channels and chat with your AI anytime, anywhere:

  • Full memory access: The Discord Bot shares the local memory system — your AI remembers your history even on Discord
  • Visual management panel: Bot Token, Owner, message depth and other common settings displayed as form controls — no manual JSON editing needed
  • Real-time message log: View Bot's sent/received messages in real-time on the management interface (user messages / AI replies / error logs)
  • Multi-channel support: Bot can work in multiple Discord channels simultaneously, maintaining independent context per channel

🎭 Seamless SillyTavern Ecosystem Compatible

  • Direct import of SillyTavern format character cards, presets, and world books
  • Support for Risu formats (ccv3 / charx / rpack)
  • MVU variable system + EJS template rendering (SillyTavern helper script compatibility layer)
  • 14 AI service generators (proxy / gemini / claude / claude-api / ollama / grok, etc.)

🔌 18 Feature Plugins

Preset engine / Memory system / File operations / Desktop screenshot / Browser awareness / Logger / Feature toggles / Multi-AI collaboration / Regex beautification / World book / Web search / System info / MVU variables / EJS templates / GraphRAG knowledge graph / Vector database / User-level plugin host / Plugin container

🌐 Multi-Language Support (i18n)

Management home page supports 4 languages (Chinese / English / Japanese / Traditional Chinese) via a "translation overlay" approach.

🔬 Full-Stack Diagnostic Framework

  • 12-module diagnostic logs: Enable/disable per module independently, zero performance overhead when disabled
  • Console interception: Automatically captures all console.log/warn/error/info from both frontend and backend, 500-entry ring buffer
  • One-click export: Click "📦 One-Click Pack Logs" to generate a single JSON file — just attach it when reporting issues

Comparison with Existing Tools

vs AI Chat Applications (ChatGPT / Claude / Gemini)

| Dimension | ChatGPT etc. | beilu-always accompany |
|---|---|---|
| Memory | Simple summaries / conversation history | Three-layer graded memory + BM25/regex dual-engine retrieval + multi-AI collaboration; theoretically unlimited |
| Attention | Degrades as context grows | Retrieval AI pre-filters; Reply AI attention stays focused |
| Customization | Limited System Prompt | Full preset system + 10 customizable memory tables + dynamic world book injection |
| Data ownership | Server-side storage | Local JSON files, fully self-owned |
| Cross-platform | Official clients only | Web + Discord Bot; the AI serves you on multiple platforms |

vs AI Coding Tools (Cursor / Copilot / Windsurf)

| Dimension | Cursor etc. | beilu-always accompany |
|---|---|---|
| Project memory | Based on the current file context | Cross-session persistent memory (architecture decisions, code conventions, historical discussions) |
| Multi-AI collaboration | Single model | 7 AIs with dedicated roles; retrieval, summary, and reply separated |
| Memory cost | Relies on large context windows | ~10K tokens covers the hot layer |
| File search | IDE built-in | BM25 + regex dual-engine search + IDE file tree |

vs AI Roleplay Platforms (SillyTavern)

| Dimension | SillyTavern | beilu-always accompany |
|---|---|---|
| Memory | No built-in memory system | Original three-layer memory + BM25/regex dual-engine retrieval + 6 auxiliary AIs |
| God's-eye problem | No solution | Memory-table information isolation; the AI only knows what's in the tables |
| File operations | None | Built-in IDE file management + AI file operations |
| Desktop capability | None | beilu-eye desktop screenshot → AI recognition |
| Cross-platform | Web only | Web + Discord Bot |
| Preset compatibility | Native | Fully compatible with ST presets, character cards, and world books |

Thoughts on the Future of LLMs

Even when context windows expand to 10M+ tokens, layered memory remains valuable:

  1. Attention problems won't disappear: No matter how large the window, model attention on massive text will still degrade. Pre-filtering + precise injection will always outperform "stuff everything in."
  2. Cost efficiency: Larger windows = higher costs. Replacing 100K+ tokens of full history with ~10K tokens of selected memory reduces API call costs by 10x or more.
  3. Structured > Unstructured: Tabular memory is easier for AI to accurately read and update than information scattered across conversations.

Layered memory is not a temporary workaround for limited context windows — it is a superior paradigm for information organization.


Roadmap

✅ Completed

  • Original three-layer memory algorithm (pure prompt-driven) — Permanent memory, theoretically unlimited
  • Precision attention control — BM25 + Regex Search dual-engine pre-filtering
  • Commander-level prompt system — 5-segment message structure + TweakPrompt three-round mechanism
  • Smart Preset Switching System — P1 real-time context analysis with auto preset switching, multi-mode adaptive COT
  • Highly customizable memory tables (10 tables) — Solving the god's-eye problem + universal scenario adaptation
  • World book dynamic injection — 3 activation modes (always-on / regex / dynamic), table-driven living world
  • Multi-AI collaboration engine (7 AI roles, P1 can use free AI)
  • 🆕 Smart Retrieval System — BM25 + Regex Search dual engine, P1 retrieval from 5 rounds down to max 3 (40% faster, 40% cheaper)
  • 🆕 Discord Bot — Cross-platform AI service with visual management panel + real-time message log
  • 🆕 Browser Page Awareness — Passive + on-demand architecture, userscript monitors DOM changes and reports page snapshots
  • 🆕 Claude API Full Adaptation — Native Anthropic format, prefill modes, Extended Thinking toggle
  • 🆕 User-Level Plugin Host (M1) — Scans user directory, supports Python/Node/executable child processes, receives external plugin push data via HTTP routes
  • IDE-style interface with file operations (including BM25 + Regex Search dual-engine file retrieval)
  • Desktop screenshot system (beilu-eye)
  • Rendering engine (SillyTavern helper script compatibility layer)
  • Management home page i18n (Chinese / English / Japanese / Traditional Chinese)
  • 18 feature plugins
  • Full-stack diagnostic framework with one-click log export

🔜 Near-term

  • More platform Bot integrations
  • Plugin ecosystem (Workshop-style high extensibility)
  • Live2D integration + AI-controlled models
  • AI game engine (chat interface = game interface, code-compatible, userscript-friendly)
  • TTS / Text-to-image integration
  • VSCode extension compatibility
  • Highly extensible core architecture

Getting Started

Requirements

  • Deno runtime
  • Modern browser (Chrome / Edge / Firefox)
  • At least one AI API key (Gemini API recommended — free tier available)

Installation & Launch

```bash
# Clone the project
git clone https://github.com/beilusaiying/always-accompany.git
cd always-accompany

# Launch (Windows)
run.bat

# Launch (Linux/macOS)
chmod +x run.sh
./run.sh
```

After launch, open your browser and navigate to http://localhost:1314

Basic Configuration

  1. Configure AI source: Home → System Settings → Add AI service source (proxy / gemini / claude, etc.)
  2. Import character card: Home → Usage → Import (supports SillyTavern PNG/JSON format)
  3. Configure memory presets: Home → Memory Presets → Set up API for P1-P6 (recommend P1 using Gemini Flash free tier)
  4. Start chatting: Click a character card to enter the chat interface

Using the Memory System

  • Automatic operation: Memory tables are automatically maintained by the Chat AI (via `<tableEdit>` tags); the Retrieval AI (P1) triggers automatically each turn
  • Manual operations: Chat interface right panel → Memory AI Operations → P2-P6 manual buttons
  • Daily archival: At the end of each day, click the "End Today" button to trigger the 9-step daily archival process
  • Memory browsing: Chat interface → Memory Tab → Browse/edit/import/export memory files

Discord Bot Setup

  1. Create a Bot application on Discord Developer Portal
  2. In beilu-chat interface → Bot tab at the top → Enter Bot Token and Owner username
  3. Click "Start Bot" → @your Bot in a Discord channel to start chatting

Tech Stack

| Component | Technology |
|---|---|
| Runtime | fount (based on Deno) |
| Backend | Node.js compatibility layer + Express-style routing |
| Frontend | Vanilla JavaScript (ESM modules) |
| AI integration | 14 ServiceGenerators (OpenAI / Claude / Gemini / DeepSeek / Ollama, etc.) |
| Smart retrieval | BM25 + regex dual-engine search (pure JS, zero dependencies) |
| Desktop screenshot | Python (mss + tkinter + pystray) |
| Cross-platform | discord.js v14 |
| Storage | Pure JSON file system |

🎁 Community & Resources

💬 Join the Discord Community

Discord

Discussion, resource sharing, prompt exchange, bug reports — come join us!

📦 Ready-to-Use Memory Prompt Presets

The project includes a carefully crafted P1-P6 Memory AI prompt preset, ready to use out of the box:

beilu-presets_2026-02-23.json — Complete prompt configurations for P1 Retrieval AI, P2 Archive AI, P3 Daily Summary AI, P4 Hot→Warm AI, P5 Monthly Summary AI, and P6 Repair AI

How to use: Home → Memory Presets → Click "Import" → Select this JSON file to import all presets in one click.

🤝 How to Contribute

We welcome everyone to participate in building this project! You can:

  • 🃏 Share character cards — Create and publish your character cards to enrich the community
  • 📝 Publish prompt presets — Share your tuned memory presets and chat presets to help others
  • 🌍 Contribute world books — Build world settings for other users to import
  • 🐛 Report bugs — Use the one-click log export feature and attach the diagnostic report
  • 💡 Suggest features — Feature requests, UI improvements, plugin ideas — all welcome
  • 🔧 Contribute code — Fork & PR, let's build together

The community has many more great prompts and character card resources — feel free to explore and share!


Acknowledgments

This project would not be possible without the contributions of the following open-source projects and communities:

  • fount — The foundational framework providing AI message handling, service source management, module loading, and other core infrastructure, saving significant development time on low-level implementation
  • SillyTavern — The pioneering project in AI roleplay, whose preset format, character card specification, and world book system have become community standards. This project is fully compatible with its ecosystem
  • SillyTavern Plugin Community — Thanks to all open-source plugin authors for their exploration and sharing. Their work on rendering engines, memory enhancement, and feature extensions provided valuable references and inspiration for this project's design

Screenshots

🖥️ IDE AI Editor — VSCode-inspired, easy to get started

IDE-style AI coding and file editing interface, inspired by VSCode for a familiar experience. Plugin integration and management coming soon.

If you're a beginner or unfamiliar with AI coding, use the designated sandbox space for the AI file capabilities: 📖 Read / ✏️ Write / 🗑️ Delete / 🔄 Retry / 🔌 MCP / ❓ Questions / 📋 Todo. You can disable write and delete for safety.

IDE Editor

🧠 Memory Files — View and edit memory data in real-time

Manually edit content anytime, observe memory AI operations in real-time. You can also make requests to the memory AI directly.

Memory Files

🎨 Regex Editor — Sandbox & Free modes

Manage regex rules at different levels, modify conversations, with Sandbox and Free modes. Protects against potentially malicious scripts from unknown character cards.

⚠️ We cannot guarantee effectiveness against all malicious scripts. Please review character card code for malicious content before use. We are not responsible for any damages.

Regex Editor

📋 Commander-Level Prompts — Full control over all sent content

Commander-level prompts that control all sent content, maximizing prompt effectiveness.

Preset Manager

🧠 Memory Presets P1-P6 — Fully prompt-driven, zero technical barrier

P2-P6 behaviors can all be modified through prompts — no coding required, highly adaptable.

Memory Presets

📖 System Guide — Detailed documentation for quick onboarding

Detailed system documentation to help you get started quickly.

System Guide

🔬 System Diagnostics — One-click log export for rapid troubleshooting

Comprehensive system self-diagnosis with one-click log packaging. Captures both browser console and server logs into a single JSON file — just attach it when reporting issues.

System Diagnostics


License

This project is built on the fount framework, with direct authorization from the original author.
