Mastermind recap

AIMM Session — September 11, 2025: Live RAG Debugging — Pinecone, Qdrant, and the Full Self-Hosted Stack

AIMM 2025 · 120 min

Facilitators: Lou D'Alo

30-Second Summary

Extended hands-on session (two full hours) with one goal: connect a Pinecone vector database to a ChatGPT Custom GPT via API. When Pinecone wouldn’t cooperate, Lou pivoted to Qdrant. What looked like a failure turned out to be a gold mine: a live demonstration of the debugging mindset and the limits of vibe coding. The second half became a deep roundtable on RAG strategies, vector database architecture, VPS hosting, and privacy compliance.

Topic 1: Building a Custom GPT Action Against Pinecone

Key prompt Lou shared: “I want to build a custom GPT for ChatGPT that accesses data in Pinecone. It’s a Pinecone assistant rather than Index — important distinction…”

When the schema errored, Lou went to Perplexity, found the raw Python API calls, and fed them back to ChatGPT: “Don’t abstract to SDK; give me direct API.”

Hot Take: Pinecone Assistant is beautiful for internal use but may not work well behind another interface such as a Custom GPT. For client-facing RAG, go straight to a Pinecone Index or Qdrant.
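To make "direct API, no SDK" concrete, here is a minimal sketch of a raw HTTP query against a Pinecone Index, built with only the Python standard library. It is not the code from the session. The host name, API key, and three-dimensional vector are placeholders, and the Pinecone Assistant uses different endpoints that are not shown here.

```python
import json
import urllib.request


def build_pinecone_query(index_host: str, api_key: str,
                         vector: list[float], top_k: int = 3) -> urllib.request.Request:
    """Build (but do not send) a raw query request for a Pinecone index."""
    body = {
        "vector": vector,         # query embedding; must match the index dimension
        "topK": top_k,            # number of nearest matches to return
        "includeMetadata": True,  # return stored metadata alongside each match
    }
    return urllib.request.Request(
        url=f"https://{index_host}/query",
        data=json.dumps(body).encode("utf-8"),
        headers={"Api-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


# Placeholder values for illustration only (hypothetical host and key).
request = build_pinecone_query("my-index-host.example", "YOUR_API_KEY", [0.1, 0.2, 0.3])

# To actually send it against a real index:
#   with urllib.request.urlopen(request) as response:
#       matches = json.load(response)["matches"]
```

Building the request separately from sending it makes the URL, headers, and body easy to inspect, which is usually the first thing to check when a Custom GPT Action schema disagrees with the API.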

Topic 2: Pivoting to Qdrant

When Pinecone stalled, Lou switched live to Qdrant, spun up a free Qdrant Cloud cluster (AWS us-east-1), and hit environment errors and missing dependencies.
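Environment errors and missing dependencies of this kind can often be caught before any real code runs. The sketch below is one way to do a preflight check. The package name `qdrant_client` and the environment variable names `QDRANT_URL` and `QDRANT_API_KEY` are assumptions for illustration, not details from the session.

```python
import importlib.util
import os


def preflight(packages: list[str], env_vars: list[str]) -> list[str]:
    """Return a list of human-readable problems; an empty list means ready to run."""
    problems = []
    for name in packages:
        # find_spec returns None when a top-level package is not installed.
        if importlib.util.find_spec(name) is None:
            problems.append(f"missing package: {name}")
    for var in env_vars:
        # Treat unset and empty variables the same way.
        if not os.environ.get(var):
            problems.append(f"missing environment variable: {var}")
    return problems


# Hypothetical requirements for a Qdrant client script.
issues = preflight(["qdrant_client"], ["QDRANT_URL", "QDRANT_API_KEY"])
for issue in issues:
    print(issue)
```

Running a check like this first turns a vague stack trace into a short to-do list.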

Lou called it: “This is what turns 5-minute fix into 5-hour experience. N8N puts all this under the hood.”

Full self-hosted stack Lou is building toward:

  • Qdrant (vector database)
  • Open WebUI (chat interface)
  • N8N (ingestion, automation, orchestration)
  • Docling (document loading)
  • Coolify (server management, SSL, reverse proxy)
  • VPS host (SSD Nodes or Hostinger)

Topic 3: RAG Depth Dive

“There are 20 chunking variants and 20 RAG strategies. That’s 400 combinations. Determining factor is always the use case.”

Key distinctions:

  • Naive RAG: chunk by size, retrieve by similarity
  • Semantic chunking: variable-size chunks respecting logical boundaries
  • Hybrid/fusion retrieval: keyword + vector simultaneously
  • Agentic RAG: different strategies per document type
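The first three distinctions can be sketched in a few lines of plain Python. This is an illustrative toy, not the approach used in the session: paragraph packing stands in for semantic chunking (real semantic chunkers often use embeddings or document structure), and reciprocal rank fusion is one common way to merge keyword and vector rankings. The function names and the constant `k = 60` are assumptions.

```python
def chunk_by_size(text: str, size: int, overlap: int = 0) -> list[str]:
    """Naive chunking: fixed-width character windows, optionally overlapping."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap
    return [text[i:i + size] for i in range(0, len(text), step)]


def chunk_by_paragraph(text: str, max_chars: int) -> list[str]:
    """Boundary-respecting chunking: pack whole paragraphs up to max_chars."""
    chunks, current = [], ""
    for para in (p.strip() for p in text.split("\n\n")):
        if not para:
            continue
        candidate = f"{current}\n\n{para}" if current else para
        if current and len(candidate) > max_chars:
            chunks.append(current)  # close the chunk at a paragraph boundary
            current = para
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked ID lists; each hit contributes 1 / (k + rank)."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for a stable order.
    return sorted(scores, key=lambda d: (-scores[d], d))


keyword_hits = ["d1", "d2", "d3"]  # e.g. from a keyword (BM25-style) search
vector_hits = ["d3", "d1", "d4"]   # e.g. from a vector similarity search
fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
```

Here `fused` ranks `d1` and `d3` ahead of `d2` and `d4`, because documents found by both retrievers accumulate score from each list.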

Lou’s insight: Lou tested 6-7 databases and 8-10 chunking strategies for a single legal case. That experimentation is the real cost of AI projects.
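The arithmetic behind these numbers is a simple cross product. The sketch below enumerates a hypothetical experiment grid. The labels are placeholders, since the session did not list the 20 variants or strategies, and the 48-70 range assumes every database was paired with every chunking strategy, which the recap does not confirm.

```python
from itertools import product
from math import prod

# Placeholder labels; the actual variants and strategies were not enumerated.
chunkers = [f"chunker_{i}" for i in range(1, 21)]
strategies = [f"strategy_{i}" for i in range(1, 21)]

# Every (chunking variant, RAG strategy) pairing.
grid = list(product(chunkers, strategies))
print(len(grid))  # 400


def grid_size(*option_counts: int) -> int:
    """Number of experiments when every option is crossed with every other."""
    return prod(option_counts)


# 6-7 databases crossed with 8-10 chunking strategies.
low, high = grid_size(6, 8), grid_size(7, 10)
print(low, high)  # 48 70
```

Even the smaller grid is dozens of runs, which is why the use case has to narrow the search before testing starts.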

Topic 4: VPS Setup for ~$5-15/Month

SSD Nodes (Lou’s host):

  • Buy the longest term (2-3 year pricing locks in the promo rate)
  • Recommend 4+ CPUs; Lou runs 12
  • Get IPv4
  • Skip daily snapshots to start

Hostinger: N8N VPS option at $5.99/month (single-app only)

Coolify (strong recommend): handles SSL, reverse proxy, domain routing

For the AIMM use case: ~$15/month covers N8N + Qdrant + Open WebUI + Coolify. You own the infrastructure and pay zero recurring SaaS fees.