AIMM Session — July 24, 2025: Self-Evolving Memory and Context Engineering

“Let’s see where we are in 6 months — I’m going to be so much smarter.” — Kasimir Hedstrom, describing his self-evolving AI memory system

30-Second Summary

Kasimir demoed a self-evolving memory system that makes Claude sound progressively more like him. Jay got hands-on guidance for architecting a healthcare research database from scratch. Lou unpacked the most important vibe coding lesson you probably aren’t applying yet: context engineering from day one. And Bally accidentally coined the AI term of the week.

1. Your AI Gets Smarter Every Session — If You Build It Right

Kasimir’s local MCP memory project: a local MCP database that automatically stores high-impact learnings from each Claude chat session. Roughly every 20 messages, it extracts lessons and persists them. Every few sessions, it pulls the last three conversations and synthesizes meta-learnings across them.
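The extract-and-persist loop described above can be sketched roughly as follows. This is a minimal illustration, not Kasimir's implementation: the `learnings` table, the function names, and the `LESSON:` prefix convention are all assumptions made for the example, and the extraction step is a placeholder for what would really be a model call. SQLite stands in for whatever store the MCP server actually uses.

```python
import sqlite3

EXTRACT_EVERY = 20  # messages per extraction batch (the "roughly 20" from the session)
SYNTH_WINDOW = 3    # number of recent sessions to synthesize across


def open_store(path=":memory:"):
    """Open the store and create the hypothetical learnings table if needed."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS learnings ("
        " id INTEGER PRIMARY KEY, session_id INTEGER, text TEXT)"
    )
    return db


def extract_lessons(messages):
    """Placeholder extractor: a real system would ask the model to summarize.
    Here, any message that starts with 'LESSON:' counts as a learning."""
    prefix = "LESSON:"
    return [m[len(prefix):].strip() for m in messages if m.startswith(prefix)]


def record_session(db, session_id, messages):
    """Persist lessons batch by batch (up to EXTRACT_EVERY messages per batch).
    Returns the number of lessons saved."""
    saved = 0
    for start in range(0, len(messages), EXTRACT_EVERY):
        batch = messages[start:start + EXTRACT_EVERY]
        for lesson in extract_lessons(batch):
            db.execute(
                "INSERT INTO learnings (session_id, text) VALUES (?, ?)",
                (session_id, lesson),
            )
            saved += 1
    db.commit()
    return saved


def recent_lessons(db, window=SYNTH_WINDOW):
    """Return lessons from the last `window` sessions that produced any,
    newest session first. This is the input to a meta-learning pass."""
    rows = db.execute(
        "SELECT DISTINCT session_id FROM learnings"
        " ORDER BY session_id DESC LIMIT ?",
        (window,),
    ).fetchall()
    ids = [r[0] for r in rows]
    if not ids:
        return []
    marks = ",".join("?" * len(ids))
    query = (
        f"SELECT text FROM learnings WHERE session_id IN ({marks})"
        " ORDER BY session_id DESC, id"
    )
    return [r[0] for r in db.execute(query, ids)]
```

In use, `record_session` would run at the end of (or during) each chat, and the output of `recent_lessons` would be handed back to the model with a prompt asking for patterns across sessions.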

The clever piece: a voice authenticator that checks whether output sounds like Kasimir, targeting a 70–80% match score. If a draft doesn’t clear that bar, the system flags and rewrites it. Side effect: the system started identifying AI-isms he hadn’t noticed himself. AI loves to list things in threes and sixes, so the system now defaults to 2, 4, or 5 items and requires a documented reason to use 3. The result is a blacklist of AI patterns that builds up organically.
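One way to picture the gate is a small review function that combines the match score with the list-length rule. The sketch below is an assumption-heavy illustration: how the voice score is computed was not described in the session, so it is simply passed in as a number, and the function names, the bullet-detection regex, and the 0.70 cutoff (the low end of the stated range) are all choices made for the example.

```python
import re

SUSPECT_COUNTS = {3, 6}  # list lengths the session described as AI tells
PASS_THRESHOLD = 0.70    # assumed cutoff: low end of the 70-80% target

# Matches lines that start with a bullet (-, *, •) or a number like "1." or "1)"
BULLET = re.compile(r"^\s*(?:[-*•]|\d+[.)])\s+")


def list_lengths(text):
    """Return the length of each run of consecutive bullet or numbered lines."""
    lengths, run = [], 0
    for line in text.splitlines():
        if BULLET.match(line):
            run += 1
        else:
            if run:
                lengths.append(run)
            run = 0
    if run:
        lengths.append(run)
    return lengths


def review(text, voice_score, reason_for_three=None):
    """Return a list of flags; an empty list means the draft passes."""
    flags = []
    if voice_score < PASS_THRESHOLD:
        flags.append("voice score below threshold: rewrite")
    for n in list_lengths(text):
        if n == 3 and reason_for_three:
            continue  # a documented reason makes a list of three acceptable
        if n in SUSPECT_COUNTS:
            flags.append(f"list of {n} items: use 2, 4, or 5")
    return flags
```

A draft that comes back with flags would go through a rewrite pass; each newly noticed pattern could be added as another check, which is how the blacklist grows.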

Lou’s addition: Mem0 (mem0ai) does something similar and now has an MCP version. → https://github.com/mem0ai/mem0-mcp

The bigger vision: a portable identity layer that travels with you across any AI system that supports MCP. ChatGPT has supported MCP since mid-2025.

2. The Multi-LLM Workflow That’s Actually Working

Don Back’s two-model workflow: ChatGPT (deep accumulated memory = best-personalized first draft) → Claude (analytical precision = refinement) → Don with a pencil (final pass).

The insight: different models have different superpowers. ChatGPT’s persistent memory makes it unbeatable for personalization. Claude’s writing quality makes it unbeatable for refinement. Stop picking a favorite — use them as a pipeline.
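Treating the models as a pipeline means each stage takes the previous stage's text and returns improved text. A minimal sketch of that shape is below. The three stage functions are stubs invented for illustration: they make no API calls, and in practice each would be replaced by a call to the relevant model or by a human edit.

```python
from typing import Callable, List

Stage = Callable[[str], str]


def run_pipeline(text: str, stages: List[Stage]) -> str:
    """Pass the text through each stage in order and return the final result."""
    for stage in stages:
        text = stage(text)
    return text


# Hypothetical stand-ins for the three steps in Don's workflow.
def personalized_draft(brief: str) -> str:
    return f"[draft] {brief}"


def refine(draft: str) -> str:
    return draft.replace("[draft]", "[refined]")


def human_pass(text: str) -> str:
    return text.strip()


final = run_pipeline(
    "  newsletter intro about memory systems ",
    [personalized_draft, refine, human_pass],
)
```

Because each stage has the same signature, swapping one model for another or adding a fourth pass is a one-line change to the list.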

Donald Kihenja in the chat: “I gave Claude my ‘big’ story, and it can cleverly weave parts of it into many other stories or articles.”

That sparked discussion on the Hook–Story–Offer framework. Lou pointed to Frank Kern and Rob Lennon as practitioners.

3. Presentations, Data Viz, and the Slide Deck Shortlist

Gamma.app — Bally’s and Lou’s top pick. Fastest path from idea to polished deck.
Chronicle HQ (chroniclehq.com) — Less automated but excellent component library. Good for dashboards, data-heavy slides.
Genspark — Strong for structured content with AI-generated visuals.
Claude Artifacts + CSV — Underrated. Send your spreadsheet, ask for a specific visualization.
Grok 3/4 — Surprisingly good at interactive data visualization.
Hard skip: ChatGPT for presentations. “Very terse. Puts things in a weird PowerPoint format. Very disappointed.” — Don Back.

4. Building a Healthcare Research Database — A Live Architecture Session

Jay is building a troponin biomarker database to discover correlations in post-surgical cardiac outcomes. Lou’s core architectural question: will you query by semantic similarity (vector search, as in RAG) or by structured values (SQL)? For Jay’s use case (defined parameters, numeric ranges, known schema), SQL/Postgres is the right foundation.
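To make the structured-values option concrete, here is a small sketch of a schema and a numeric range query. Everything in it is an assumption for illustration: the table and column names, the ng/L unit, and the four rows are made-up placeholders with no clinical meaning, and SQLite is used only so the example runs anywhere (the same SQL shape carries over to Postgres).

```python
import sqlite3

db = sqlite3.connect(":memory:")

# Hypothetical schema: one row per troponin reading, with typed, constrained columns.
db.execute("""
    CREATE TABLE troponin_readings (
        id            INTEGER PRIMARY KEY,
        patient_id    TEXT    NOT NULL,
        hours_post_op REAL    NOT NULL CHECK (hours_post_op >= 0),
        troponin_ng_l REAL    NOT NULL CHECK (troponin_ng_l >= 0),
        had_event     INTEGER NOT NULL CHECK (had_event IN (0, 1))
    )
""")

# Placeholder rows, invented for the example only.
rows = [
    ("P001", 6, 12.0, 0),
    ("P002", 6, 85.0, 1),
    ("P003", 12, 40.0, 0),
    ("P004", 12, 130.0, 1),
]
db.executemany(
    "INSERT INTO troponin_readings"
    " (patient_id, hours_post_op, troponin_ng_l, had_event)"
    " VALUES (?, ?, ?, ?)",
    rows,
)


def event_rate_in_range(db, low, high):
    """Share of readings with low <= troponin < high that are linked to an event.
    Returns None when no readings fall in the range."""
    count, events = db.execute(
        "SELECT COUNT(*), COALESCE(SUM(had_event), 0) FROM troponin_readings"
        " WHERE troponin_ng_l >= ? AND troponin_ng_l < ?",
        (low, high),
    ).fetchone()
    return events / count if count else None
```

Exact range filters and aggregates like this are what a relational store handles well, and what a similarity search over embeddings does not.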

Recommendations:

n8n self-hosted AI starter kit → https://github.com/n8n-io/self-hosted-ai-starter-kit (watch the video first — there’s a known undocumented config issue: https://youtu.be/V_0dNE-H2gw)
Lovable.dev + Supabase — Fast front-end with direct database integration for MVP prototyping → https://lovable.dev/
Firebase Studio — Free, Google-backed → https://firebase.studio/
ChromaDB (vector database): easiest local option to spin up; great for prototyping semantic search

Alex F’s framing: this is a traditional ML problem, not a generative AI one.

5. The Vibe Coding Truth Nobody Talks About

Lou’s hard-earned lesson: when he asked Claude Code to modify Open WebUI’s ingestion pipeline, it was ready to fork core files, create maintenance debt, and couple his custom logic to the app’s internals. One better prompt — with the full documentation context loaded upfront — and Claude suggested a pipeline extension instead. No code changes. Upgrade-safe.

The framework:

Load the docs first. Give Claude links to the project’s documentation from the start. Say: “Read this before I tell you what I want.”
Ask for the architectural option, not just the code. Ask: “How do I do this in a way that’s compliant with the design and doesn’t require code changes?”
Break it into phases and features. Use GitHub-style feature requests. Scaffold incrementally.
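The three steps above can be turned into a repeatable prompt sequence. The helper below is a hypothetical sketch: the function name, the exact prompt wording beyond the phrases quoted above, and the example URL are assumptions, and the code only builds strings (it does not call any model).

```python
def build_prompts(doc_urls, goal, phases):
    """Return prompts in order: docs first, architecture question second,
    then one feature request per phase."""
    if not doc_urls:
        raise ValueError("load the docs first: at least one documentation link is required")
    prompts = [
        "Read this documentation before I tell you what I want:\n"
        + "\n".join(f"- {url}" for url in doc_urls),
        f"Goal: {goal}\n"
        "How do I do this in a way that is compliant with the design and "
        "does not require code changes? List the architectural options "
        "before writing any code.",
    ]
    total = len(phases)
    for number, phase in enumerate(phases, start=1):
        prompts.append(
            f"Feature request {number} of {total}: {phase}\n"
            "Implement only this phase."
        )
    return prompts


prompts = build_prompts(
    ["https://example.com/docs"],  # placeholder URL, not a real docs link
    "customize the ingestion pipeline",
    ["add a pre-processing step", "add tests for the new step"],
)
```

Refusing to build anything without a documentation link is the point of the design: it makes the docs-first habit hard to skip.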

Donald echoed this: he had ChatGPT write a Word macro that kept making the same mistake through 20 iterations. The moment Donald applied one original human insight and suggested a different approach, GPT agreed instantly and solved it. “You can outsource the doing. You can’t outsource the thinking.”

Hot Takes

“Hallucination Layering”: Bally Binning’s accidental coinage. Context: Kasimir referenced a new Anthropic study finding that giving models more time to think can actually produce worse results on some tasks. Lou’s hypothesis: each hallucination in a long chain of reasoning compounds the ones before it. Bally said it simply: “Hallucination layering.” Lou: “We’re going to have to tell Andrej Karpathy about that.”
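Lou's compounding hypothesis has a simple arithmetic reading. Under the simplifying assumption that every reasoning step is independently correct with the same probability, the chance that a whole chain is error-free shrinks geometrically with its length. This is a toy model of the hypothesis, not a result from the study, and the function name and the 0.98 figure are chosen only for illustration.

```python
def chain_reliability(per_step_accuracy, steps):
    """Probability that every step in a chain is correct, assuming each step
    is independently correct with probability `per_step_accuracy`."""
    if not 0.0 <= per_step_accuracy <= 1.0:
        raise ValueError("per_step_accuracy must be between 0 and 1")
    if steps < 0:
        raise ValueError("steps must be non-negative")
    return per_step_accuracy ** steps


short_chain = chain_reliability(0.98, 10)  # about 0.82
long_chain = chain_reliability(0.98, 50)   # about 0.36
```

Even with 98% per-step accuracy in this toy model, a 50-step chain is more likely than not to contain at least one error, which is the intuition behind the term.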

Jay Abraham AI Clone (Delphi.ai): Lou demoed a RAG agent built in about 5 minutes. The moat isn’t the technology anymore. It’s your content and your training data. Business model implication: AI avatar as a lead magnet tier — let people chat with “you” on demand, upsell access to the real you for live coaching.

Try This Before Next Session

Build one RAG agent this week. Pick a domain where you have existing content. Upload it to a Custom GPT, Delphi, or Raggy. Give it a clear system prompt. Share the link with one person and see what happens.
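For anyone curious what those tools do under the hood, the core of a RAG agent is: find the passages most relevant to the question, then put them in the prompt alongside the system prompt. The sketch below illustrates that shape under stated assumptions: it ranks passages by simple word-overlap cosine similarity as a stand-in for the embedding search real products use, the function names and sample passages are invented, and it stops at building the prompt rather than calling a model.

```python
import math
import re
from collections import Counter


def tokens(text):
    """Lowercase word counts for a piece of text."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(question, passages, k=2):
    """Return the k passages most similar to the question."""
    q = tokens(question)
    ranked = sorted(passages, key=lambda p: cosine(q, tokens(p)), reverse=True)
    return ranked[:k]


def build_prompt(system_prompt, question, passages, k=2):
    """Assemble system prompt, retrieved context, and question into one prompt."""
    context = "\n".join(f"- {p}" for p in retrieve(question, passages, k))
    return f"{system_prompt}\n\nContext:\n{context}\n\nQuestion: {question}"


passages = [
    "Gamma is a fast way to build slide decks.",
    "Postgres stores structured values.",
    "Memory systems persist lessons across sessions.",
]
top = retrieve("Which tool builds slide decks?", passages, k=1)
```

The quality of the answers depends far more on what goes into `passages` than on the retrieval code, which matches the point that the moat is your content.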