Live 2021 - Present (Side Project)

Foretale

Real-Time NLP & Multi-Source Data Fusion Platform

Production streaming architecture for social sentiment at scale

100K+ Daily Events
<100ms NLP Latency
25+ Automation Nodes
24/7 Uptime


The Problem

Decision-makers need real-time intelligence from social media, news, and market signals, but the data is fragmented across platforms with no unified processing pipeline. Traditional approaches require coding expertise and expensive infrastructure, and they cannot keep up with the volume and velocity of modern social data streams. Most importantly, insights arrive too late to act on.

The Solution

Built a production streaming architecture with Kafka at the core, ingesting data from 7+ sources 24/7. The distributed ML backend (FastAPI/Ray/PyTorch) performs real-time sentiment analysis, emotion detection, and OCR extraction from images. A custom Node-RED fork provides visual workflow automation, allowing non-technical users to create complex data processing pipelines through drag-and-drop. Multi-tenant isolation with HashiCorp Vault ensures enterprise-grade security.

Lesson learned: Initially deployed NLP models directly in the main API process, which caused latency spikes during high-volume periods. Moving to Ray distributed workers with dedicated GPU allocation solved the contention issues and brought p99 latency from 800ms to under 100ms.
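
The lesson above comes down to keeping blocking inference work out of the process that serves requests. The following is a minimal, stdlib-only sketch of that idea, not the platform's actual code: a thread pool stands in for the Ray workers, and `classify`, `handle_request`, and `WORKERS` are hypothetical names invented for illustration.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor


def classify(text: str) -> str:
    """Hypothetical stand-in for a model call (assumption, not the real model)."""
    time.sleep(0.01)  # simulate blocking inference work
    return "positive" if "good" in text.lower() else "neutral"


# Dedicated worker pool, sized independently of the API event loop.
# In the described platform this role is played by Ray workers with GPUs.
WORKERS = ThreadPoolExecutor(max_workers=4)


async def handle_request(text: str) -> dict:
    loop = asyncio.get_running_loop()
    # Offload the blocking call so the event loop stays free for other requests.
    label = await loop.run_in_executor(WORKERS, classify, text)
    return {"text": text, "label": label}


async def handle_many(texts: list[str]) -> list[dict]:
    # gather() returns results in the same order as the inputs.
    return await asyncio.gather(*(handle_request(t) for t in texts))


def run_demo() -> list[dict]:
    return asyncio.run(handle_many(["Good launch today", "Markets flat"]))
```

Calling `run_demo()` processes both texts concurrently on the pool while the event loop remains responsive. A thread pool is used here only to keep the sketch self-contained; it does not give the GPU isolation that dedicated Ray workers provide.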

Tech Stack

Backend

Python, FastAPI, Ray, Apache Kafka, Redis, PostgreSQL, TimescaleDB

AI/ML

PyTorch, Transformers, spaCy, EasyOCR, Sentiment Analysis, NER

Frontend

Node-RED (Custom Fork), Svelte, SvelteKit, Tailwind CSS

Infrastructure

Docker Swarm, HashiCorp Vault, Traefik, GitLab CI/CD, Sentry

Data Processing

CCXT, GDELT, Monte Carlo Simulation, TimescaleDB

My Role: Co-founder & Lead Developer

  • Co-founded and architected the complete streaming platform from data ingestion to action execution
  • Built a Kafka-based sensor network processing multi-source data streams 24/7
  • Developed distributed ML backend (FastAPI/Ray/PyTorch) for real-time NLP inference
  • Created custom Node-RED fork with 25+ automation nodes and Svelte-based UI components
  • Implemented OCR pipeline for extracting text and signals from images at scale
  • Built multi-tenant orchestration layer with Docker Swarm and HashiCorp Vault
  • Designed Monte Carlo simulation engine for strategy validation
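
As an illustration of what Monte Carlo strategy validation can look like, here is a small sketch that resamples historical per-trade returns with replacement and summarizes the distribution of outcomes. It is an assumption about the general technique, not the platform's engine; `simulate_final_returns`, `summarize`, and the sample returns are hypothetical.

```python
import random


def simulate_final_returns(trade_returns, n_trades, n_runs, seed=0):
    """Bootstrap n_runs equity paths by resampling per-trade returns."""
    rng = random.Random(seed)  # seeded so runs are reproducible
    finals = []
    for _ in range(n_runs):
        equity = 1.0
        for _ in range(n_trades):
            # Compound one randomly drawn historical trade return.
            equity *= 1.0 + rng.choice(trade_returns)
        finals.append(equity - 1.0)  # total return of this simulated path
    return finals


def summarize(finals):
    """Return the upper median, 5th percentile, and share of losing paths."""
    ordered = sorted(finals)
    n = len(ordered)
    return {
        "median": ordered[n // 2],
        "p05": ordered[int(0.05 * n)],
        "prob_loss": sum(1 for r in ordered if r < 0) / n,
    }


# Hypothetical per-trade returns, for illustration only.
sample_returns = [0.02, -0.01, 0.015, -0.02]
summary = summarize(simulate_final_returns(sample_returns, n_trades=50, n_runs=1000))
```

Looking at the 5th percentile and the probability of loss, rather than a single backtest result, is what makes this a validation step: it shows how sensitive the strategy is to the order and mix of trades.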

Platform Components

Streaming Pipeline

Kafka-based real-time data pipeline ingesting signals from social media, news APIs, and market data sources 24/7 with sub-second latency.

  • Real-time Twitter/X stream processing
  • Multi-source data aggregation (7+ sources)
  • Event-driven architecture with exactly-once semantics
  • Horizontal scaling for burst traffic
Apache Kafka, FastAPI, Ray, Redis
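
One common way to get effectively-once results on the consumer side is to make processing idempotent, so a redelivered event has no additional effect. The sketch below illustrates that idea only; it is not Kafka's transactional API and not necessarily how Foretale implements it. `Event`, `IdempotentProcessor`, and the field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """Hypothetical normalized event shared by all sources (assumption)."""
    source: str
    event_id: str
    text: str


class IdempotentProcessor:
    """Applies each (source, event_id) at most once, ignoring redeliveries."""

    def __init__(self):
        self._seen = set()
        self.processed = []

    def handle(self, event: Event) -> bool:
        key = (event.source, event.event_id)
        if key in self._seen:
            return False  # duplicate delivery: skip without side effects
        self._seen.add(key)
        self.processed.append(event)
        return True


processor = IdempotentProcessor()
first = processor.handle(Event("news", "42", "Rates unchanged"))
again = processor.handle(Event("news", "42", "Rates unchanged"))  # redelivery
other = processor.handle(Event("social", "42", "Rates unchanged"))  # different source
```

The in-memory set keeps the example short. A long-running consumer would need a bounded, persistent store for the seen keys; using Redis for that is an assumption based on the stack listed above, not something the page states.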

NLP Inference Engine

Distributed ML backend performing real-time sentiment analysis, emotion detection, and text extraction from images at scale.

  • RoBERTa-based sentiment/emotion/irony classification
  • EasyOCR for image text extraction
  • Named entity recognition (NER) for signal detection
  • 100K+ daily inferences with <100ms latency
PyTorch, Transformers, Ray, spaCy
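
The Solution section quotes its latency figure as a p99, which behaves very differently from an average. As a rough sketch, assuming the nearest-rank definition of a percentile (the page does not say which definition is used), tail latency can be computed like this; `percentile` and the sample values are illustrative only.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: smallest value with at least p% of samples at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


# Hypothetical latencies in ms: mostly fast, with two slow outliers.
latencies_ms = [10] * 98 + [800] * 2
mean_ms = sum(latencies_ms) / len(latencies_ms)
p99_ms = percentile(latencies_ms, 99)
```

With these made-up samples the mean stays low while the p99 lands on a slow outlier, which is why contention spikes show up in the tail metric first.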

FlowStudio

Visual workflow builder enabling non-technical users to create complex data processing and automation pipelines through drag-and-drop.

  • 25+ custom automation nodes
  • Real-time flow execution with live data
  • Svelte-based custom UI components
  • Built-in simulation and validation
Node-RED (Custom Fork), Svelte, WebSocket
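
A drag-and-drop flow ultimately reduces to nodes that transform a message and wires that route it onward. The sketch below models that in Python for illustration; it is not Node-RED's JavaScript API or FlowStudio's code, it assumes an acyclic flow, and every node name in it is hypothetical.

```python
from collections import deque


def run_flow(nodes, wires, start, msg):
    """Push msg through an acyclic flow; return (node_id, msg) for each terminal node reached."""
    outputs = []
    queue = deque([(start, msg)])
    while queue:
        node_id, current = queue.popleft()
        result = nodes[node_id](dict(current))  # copy so nodes cannot mutate upstream state
        if result is None:
            continue  # a node returning None drops the message
        targets = wires.get(node_id, [])
        if not targets:
            outputs.append((node_id, result))
        for target in targets:
            queue.append((target, result))
    return outputs


def normalize(msg):
    return {**msg, "payload": msg["payload"].strip().lower()}


def keyword_filter(msg):
    return msg if "btc" in msg["payload"] else None


def alert(msg):
    return {**msg, "alert": True}


nodes = {"normalize": normalize, "filter": keyword_filter, "alert": alert}
wires = {"normalize": ["filter"], "filter": ["alert"]}

matched = run_flow(nodes, wires, "normalize", {"payload": "  BTC rallies "})
dropped = run_flow(nodes, wires, "normalize", {"payload": "ETH steady"})
```

Keeping nodes as plain functions over a message dictionary is what lets a visual editor rearrange them freely: the wiring table changes, the nodes do not.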

Key Differentiators

Production Streaming Architecture: Kafka-based pipeline handling 100K+ daily events with sub-second latency

Real-Time NLP Inference: Distributed sentiment, emotion, and irony detection using RoBERTa models

Visual Workflow Automation: No-code interface for complex data processing pipelines

Multi-Source Data Fusion: Unified ingestion from social media, news, and market data APIs

Enterprise Security: HashiCorp Vault integration with per-tenant secrets isolation
