Live 2023 - Present (Side Project)

Zeitgaist

Cross-Lingual Social Intelligence

Ask in English. Find Chinese, Russian, Arabic takes you'd never discover otherwise.

Relevance@10: 89%
Social Platforms: 6
Languages: 20+
p95 Latency: <200ms

The Problem

Decision-makers in finance, marketing, and research need to understand public opinion across Twitter/X, Reddit, Hacker News, Mastodon, Bluesky, and other platforms. Manual monitoring is time-consuming, critical insights in non-English sources get missed, and traditional search lacks temporal context and source attribution. Ask "why did real estate prices surge recently?" against Chinese sources and you get completely different answers than you would from English ones: geopolitical drivers, local policies, and cultural factors that Western media doesn't cover.

The Solution

Built a unified backend serving two complementary products: an AI chatbot for conversational queries and an analytics dashboard for trend visualization. The production Hybrid RAG system uses two-stage retrieval (dense embedding search followed by cross-encoder re-ranking), a Corrective RAG pattern that improved relevance@10 from 72% with dense-only retrieval to 89%. Initially tried single-stage retrieval, but accuracy degraded on cross-lingual queries; the two-stage approach with language-specific re-ranking solved this at acceptable latency (~200ms p95).
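The two-stage idea can be illustrated with a minimal, dependency-free sketch. This is not the production code: the `Doc` type, the `two_stage_retrieve` function, and the toy `word_overlap` scorer (standing in for a real cross-encoder) are all hypothetical names chosen for illustration, and the in-memory cosine search stands in for a pgvector query.

```python
from dataclasses import dataclass
from math import sqrt
from typing import Callable, Sequence


@dataclass
class Doc:
    doc_id: str
    text: str
    embedding: Sequence[float]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def two_stage_retrieve(
    query: str,
    query_embedding: Sequence[float],
    corpus: Sequence[Doc],
    rerank_score: Callable[[str, str], float],
    candidates: int = 50,
    top_k: int = 10,
) -> list[Doc]:
    # Stage 1: cheap dense search over the whole corpus, keep a shortlist.
    shortlist = sorted(
        corpus,
        key=lambda d: cosine(query_embedding, d.embedding),
        reverse=True,
    )[:candidates]
    # Stage 2: score each (query, text) pair with the more expensive
    # re-ranker, but only on the shortlist.
    return sorted(
        shortlist,
        key=lambda d: rerank_score(query, d.text),
        reverse=True,
    )[:top_k]


def word_overlap(query: str, text: str) -> float:
    """Toy stand-in for a cross-encoder: fraction of query words in text."""
    q = set(query.lower().split())
    t = set(text.lower().split())
    return len(q & t) / len(q) if q else 0.0
```

The design point is that the re-ranker only ever sees `candidates` documents, so its per-pair cost stays bounded no matter how large the corpus grows; in a real system `rerank_score` would wrap a cross-encoder model rather than word overlap.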

Tech Stack

Backend

Python 3.12, FastAPI, PostgreSQL, pgvector, Redis, Supabase

AI/ML

Sentence Transformers, Cross-Encoder, GPT-4 / Claude, FastText, RAG Pipeline

Frontend

SvelteKit, TypeScript, Tailwind CSS, DaisyUI, Chart.js

Infrastructure

Docker Swarm, Traefik, OpenTelemetry, WebSocket, MessagePack

My Role: Founder & Lead Developer

  • Designed and built complete Hybrid RAG architecture with pgvector
  • Implemented Corrective RAG pattern: Sentence Transformers + cross-encoder re-ranking
  • Built multi-language support for 20+ languages with automatic query translation
  • Created real-time WebSocket streaming with MessagePack binary serialization
  • Developed conversation memory with context-aware query reformulation
  • Built two SvelteKit frontends (Chat + Social dashboard)
  • Deployed production infrastructure with Docker Swarm and Traefik
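
One way the automatic query translation mentioned above could be wired together is a simple fan-out: translate the query into each target language, search each language's index, and merge the hits. The sketch below is an assumption about the shape of such a step, not the project's actual code; `cross_lingual_search`, `translate`, and `search` are hypothetical names, and the translation and search backends are injected as plain callables.

```python
from typing import Callable, Iterable

Hit = tuple[str, float]  # (post_id, relevance score)


def cross_lingual_search(
    query: str,
    target_langs: Iterable[str],
    translate: Callable[[str, str], str],
    search: Callable[[str, str], Iterable[Hit]],
    top_k: int = 10,
) -> list[Hit]:
    """Fan a query out to several languages and merge the results.

    `translate(query, lang)` and `search(text, lang)` are hypothetical
    hooks for whatever translation and retrieval backends are in use.
    """
    best: dict[str, float] = {}
    for lang in target_langs:
        translated = translate(query, lang)
        for post_id, score in search(translated, lang):
            # Keep the highest score if a post is found via several languages.
            if score > best.get(post_id, float("-inf")):
                best[post_id] = score
    ranked = sorted(best.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top_k]
```

Merging by raw score assumes scores are comparable across languages, which may not hold in practice; a shared re-ranking pass over the merged list is one way to put all candidates on the same scale.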

Key Differentiators

Multi-Platform Aggregation: Unified search across 6 diverse social networks

Hybrid RAG with Corrective Retrieval: Dense embedding + cross-encoder re-ranking for 89% relevance@10

Cross-Lingual Intelligence: Ask in English, get insights from Chinese, Arabic, Russian sources

Temporal Awareness: LLM-powered understanding of time-based queries

Full Source Attribution: Every AI response includes numbered citations with timestamps
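
As an illustration of numbered citations with timestamps, here is a minimal sketch of how a citation list might be rendered from retrieved sources. The `Source` fields, the `format_citations` helper, and the output layout are assumptions made for this example, not the format Zeitgaist actually emits.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Sequence


@dataclass
class Source:
    platform: str
    url: str
    posted_at: datetime  # expected to be timezone-aware


def format_citations(sources: Sequence[Source]) -> str:
    """Render sources as a numbered list with UTC timestamps."""
    lines = []
    for number, src in enumerate(sources, start=1):
        # Normalize every timestamp to UTC so citations are comparable.
        stamp = src.posted_at.astimezone(timezone.utc).strftime("%Y-%m-%d %H:%M UTC")
        lines.append(f"[{number}] {src.platform}, {stamp}, {src.url}")
    return "\n".join(lines)
```

The numbers match the order in which sources are passed in, so an answer that refers to "[2]" in its text can be resolved against the second entry of the same list.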
