Alice Project | jeffliulab

This release is a core upgrade. Until now, ALICE's characters still ran on the classic 2023 recipe: memory was an append-only ledger, and dialogue was mechanical and turn-based. The result was characters that contradicted themselves, forgot things like goldfish, and talked as if they were reading off a script.

v1.0.7 has a single goal: make the characters feel more alive, so that they remember, stay consistent, act with purpose, and speak like people. We didn't start over; we borrowed ideas from three modern (2024–2026) frameworks and rewrote three layers: memory, cognition, and dialogue.

1. Before the upgrade: what a character's brain looked like

Every character runs the same pipeline loop on every step.

The loop itself is fine (even Stanford's own 2024 "1000-agent simulation" still uses it). The problems were in the implementation underneath:

Memory was a flat ledger: it could hold "Lina trusts Gus" and "Lina fears Gus" at the same time, with no way to tell which one is currently true.
Retrieval used the wrong embedding model: memories in a Chinese-language world were encoded with an English model, so what a character recalled was often irrelevant.
Dialogue rebuilt its context on every line and kept only the last 6 turns. The result was goldfish memory: four hours after a chat, the next encounter would start over with "Wait, how did you find this place?"
Emotion was injected into Chinese dialogue as an English string with a raw number, such as "Feeling anxious (valence=-0.20)", so characters sounded like they were reciting.

2. Who we borrowed from

We ran two rounds of deep research (40+ modern frameworks) and settled on three complementary ones:

Zep / Graphiti (memory): turns memory from a "diary" into a relationship graph that changes over time. Every fact is an edge with a "validity window"; when a contradicting fact arrives, the old edge is invalidated rather than deleted, so the system knows both the present and the history.
GATSim (cognitive algorithms): fixes several miscomputed algorithms from the 2023 recipe — recency decays by real elapsed time, expired memories are actually deleted, reflection happens at three scales.
SOTOPIA (dialogue): makes dialogue goal-driven — a character always strikes up a conversation carrying a private intent, and remembers what was said last time.

They happen to cover three different layers without colliding, so they compose into one system.
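To make the GATSim item concrete, here is a minimal Python sketch of recency that decays with real elapsed time and of expiry that actually deletes. The `Memory` class, the exponential half-life form, the 24-hour default, and the hour-based world clock are all assumptions made for illustration; the formula ALICE or GATSim actually uses may differ.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Memory:
    """Hypothetical memory record; field names are illustrative only."""
    text: str
    created_at: float                   # world time in hours (assumed unit)
    expires_at: Optional[float] = None  # None means the memory never expires


def recency(memory: Memory, now: float, half_life_hours: float = 24.0) -> float:
    """Decay on elapsed world time: 1.0 when fresh, 0.5 after one half-life."""
    elapsed = max(0.0, now - memory.created_at)
    return 0.5 ** (elapsed / half_life_hours)


def purge_expired(memories: List[Memory], now: float) -> List[Memory]:
    """Keep only live memories; expired ones are dropped, not just down-ranked."""
    return [m for m in memories if m.expires_at is None or m.expires_at > now]
```

The point of keying decay on elapsed time rather than on retrieval count or list position is that a memory fades at the same rate no matter how many other memories were written in between.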

3. How the four systems were upgraded

The core piece is memory, which moves from a "ledger" to a "bi-temporal relationship graph".
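The "invalidate, don't delete" idea can be sketched in a few lines of Python. This is an illustration under stated assumptions, not ALICE's or Graphiti's actual code: `FactEdge`, `FactGraph`, the `works_at` relation, and the example facts are invented, each (subject, relation) pair is assumed to hold one object at a time, and only a single validity window in world steps is tracked (a full bi-temporal model would also record when the system learned each fact).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class FactEdge:
    """Hypothetical fact edge with a validity window measured in world steps."""
    subject: str
    relation: str
    obj: str
    valid_from: int
    invalid_at: Optional[int] = None  # None means the fact is still valid


class FactGraph:
    def __init__(self) -> None:
        self.edges: List[FactEdge] = []

    def assert_fact(self, subject: str, relation: str, obj: str, step: int) -> FactEdge:
        """Add a fact; a live edge with a different object is closed, not removed."""
        for edge in self.edges:
            same_slot = edge.subject == subject and edge.relation == relation
            if same_slot and edge.invalid_at is None:
                if edge.obj == obj:
                    return edge           # already known and still valid
                edge.invalid_at = step    # contradiction: end the old window
        new_edge = FactEdge(subject, relation, obj, valid_from=step)
        self.edges.append(new_edge)
        return new_edge

    def current(self, subject: str, relation: str) -> List[str]:
        """Objects that are true right now."""
        return [e.obj for e in self.edges
                if e.subject == subject and e.relation == relation
                and e.invalid_at is None]

    def as_of(self, subject: str, relation: str, step: int) -> List[str]:
        """Objects that were true at a given past step."""
        return [e.obj for e in self.edges
                if e.subject == subject and e.relation == relation
                and e.valid_from <= step
                and (e.invalid_at is None or step < e.invalid_at)]


graph = FactGraph()
graph.assert_fact("Gus", "works_at", "bakery", step=10)
graph.assert_fact("Gus", "works_at", "tavern", step=50)
print(graph.current("Gus", "works_at"))    # ['tavern']
print(graph.as_of("Gus", "works_at", 20))  # ['bakery']
```

Because the old edge is only closed, the graph can answer both "what is true now" and "what was believed at step 20", which a flat ledger cannot distinguish.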

The four systems at a glance (the three rewritten layers plus the core loop, whose shape is unchanged):

System	Upgrade	Fixes
Memory	Chinese embedding; bi-temporal graph (contradictions invalidated, not deleted); vector + keyword + graph-traversal triple retrieval; real deletion on expiry	inaccurate retrieval, self-contradiction, unbounded store
Cognition	fixed reflection bug; reflection at three scales; the self (Ego) evolves and converges; scoring by character identity	divergent reflection, idle self
Dialogue	emotion rendered as Chinese tone; goal-driven; cross-conversation summary memory; colloquial, with action subtext	goldfish memory, aimless chatter, scripted voice
Core loop	shape unchanged, new parts hung onto existing stages	doesn't break replay or parallelism
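The Memory row's "triple retrieval" implies that three ranked lists (vector, keyword, graph traversal) have to be merged into one. The post does not say how ALICE merges them; the sketch below uses reciprocal rank fusion as a stand-in technique, and the function name, memory ids, and constant `k = 60` are assumptions for illustration.

```python
from collections import defaultdict
from typing import Dict, List


def reciprocal_rank_fusion(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Merge several ranked lists of memory ids into one list.

    Each list contributes 1 / (k + rank) per id, so ids that rank well in
    several retrievers rise to the top. Ties are broken by id for determinism.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, memory_id in enumerate(ranking, start=1):
            scores[memory_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda memory_id: (-scores[memory_id], memory_id))


# Hypothetical results from the three retrievers for one query.
vector_hits = ["m1", "m2", "m3"]
keyword_hits = ["m2", "m1"]
graph_hits = ["m2", "m4"]

print(reciprocal_rank_fusion([vector_hits, keyword_hits, graph_hits]))
# ['m2', 'm1', 'm4', 'm3']
```

Rank-based fusion avoids having to put cosine similarity, keyword scores, and graph distances on a common scale, which is why it is a common default for hybrid search.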

Engineering note: everything ships behind feature flags — all new systems are off by default and can be A/B'd with one switch, while the old version stays frozen as a control.
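A minimal sketch of what "off by default, A/B with one switch" could look like in Python. The flag names, `FeatureFlags`, and `select_memory_backend` are hypothetical; the post does not list ALICE's actual flags.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class FeatureFlags:
    """Hypothetical flags; every new system defaults to off."""
    temporal_graph_memory: bool = False
    multi_scale_reflection: bool = False
    goal_driven_dialogue: bool = False

    @classmethod
    def all_on(cls) -> "FeatureFlags":
        """The single switch: enable every new system for the treatment arm."""
        return cls(**{f.name: True for f in fields(cls)})


def select_memory_backend(flags: FeatureFlags) -> str:
    """The legacy path stays untouched so it can serve as the control."""
    return "temporal_graph" if flags.temporal_graph_memory else "legacy_ledger"


control = FeatureFlags()            # old behavior, frozen
treatment = FeatureFlags.all_on()   # all new systems enabled

print(select_memory_backend(control))    # legacy_ledger
print(select_memory_backend(treatment))  # temporal_graph
```

Making the flag object immutable means a run cannot drift between arms halfway through a simulated day.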

4. In practice: the goldfish memory is gone

We ran a full in-game day (144 steps) as a formal experiment with a real LLM. The clearest evidence is Gus's two visits to Lina.

Over that day, Lina accumulated 126 structured facts about Gus (some of them "invalidated" as the story moved on), and her dialogue gained action, tone, and subtext. In one line, for instance, she fingers the hem of her apron as she murmurs, "Viviane… you knew all along." Remembering, staying consistent, sounding alive: this release is the first time all three happen together.

5. Honestly, what's still missing

This release is the first prototype of the core upgrade, not the finish line. Known gaps we are still working on:

Smarter contradiction arbitration: for now we only handle "the object changed" fact updates safely; subtler contradictions like an "attitude reversal" still need stronger semantic judgment.
Free multi-character dialogue: only "probe-style" dialogue has been stress-tested in the current world; two self-aware characters walking up to each other and chatting hasn't been fully exercised yet.
Personality data: characters haven't been given "Big Five" traits, so personality-driven behavior like "who speaks up first" isn't active yet.

All of these are filed in the next release's backlog. For this release, though, the characters' brains have been swapped for a modern core that remembers, changes its mind, and talks like a person.