Building Twiscope
How I designed and led the development of a real-time social intelligence platform that processes 5M+ data points daily — with ML-driven trend detection and alerts delivered in under two seconds.
01 · The Problem
Intelligence teams drown in signal noise.
Modern intelligence operations require tracking thousands of signals across social media simultaneously. The challenge isn't ingestion — platforms provide APIs. The challenge is making the firehose actionable.
Raw social data arrives in bursts, often millions of events per day. Without a processing layer that can keep up in real-time, analysts are always looking at stale data. And without ML models that understand context, every spike looks the same — noise and signal are indistinguishable.
The team had a working prototype, but it couldn't handle the volume. At peak loads the queue fell behind, alerts arrived hours late, and ML inference ran synchronously in the request cycle — killing API response times.
Core requirements
- Process 5M+ social data events per day without queue backlog
- Deliver alerts in under 2 seconds from event occurrence
- Run ML inference (trend + anomaly) without blocking API responses
- Scale horizontally as data volume grows
02 · Architecture
Five layers. One clear data flow.
The system was redesigned around a clear separation of concerns: ingest, queue, process, serve, visualize.
- Data Ingestion: Twitter/X · Telegram · RSS feeds · Webhooks
- Celery Workers: distributed async task processing across multiple workers
- Redis: message broker + hot-path cache for frequent queries
- Django REST API: business logic, authentication, data enrichment
- Dashboard + Real-time Alerts: WebSocket-powered live updates, <2s delivery
- ML Engine (async): trend prediction · anomaly detection · runs as Celery tasks, never in the request cycle
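To make the flow concrete, here's a minimal sketch of one event's path through the queue and workers, assuming a local Redis broker. The task names, schema fields, and keyword scorer are illustrative stand-ins, not Twiscope's actual code:

```python
from celery import Celery, chain

# Redis plays the broker role here; its cache role is configured separately.
app = Celery("twiscope", broker="redis://localhost:6379/0",
             backend="redis://localhost:6379/1")

@app.task
def normalize_event(raw):
    """Map a raw platform payload onto a common schema."""
    return {"source": raw["source"], "text": raw["text"], "ts": raw["ts"]}

@app.task
def score_event(event):
    """Stand-in for the ML stage (the real system runs trend/anomaly models)."""
    event["score"] = 1.0 if "breaking" in event["text"].lower() else 0.1
    return event

@app.task
def maybe_alert(event, threshold=0.9):
    """Fire an alert when the score crosses the threshold."""
    if event["score"] >= threshold:
        print(f"ALERT [{event['source']}] {event['text'][:80]}")
    return event["score"]

def handle_incoming(raw):
    # Each stage runs on whichever worker picks it up; none blocks the API.
    chain(normalize_event.s(raw), score_event.s(), maybe_alert.s()).delay()
```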
03 · Key Decisions
What we chose and why.
Celery over threading or asyncio
Celery tasks are retryable, distributable across multiple machines, and integrate natively with Django. Threading breaks under high concurrency; asyncio requires rewriting the entire Django stack. Celery gave us horizontal scale without changing the existing codebase.
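As a sketch of what "retryable" buys you in practice: a transient network failure re-queues the task with backoff instead of dying in a thread. The options are standard Celery; the feed-fetching task itself is hypothetical:

```python
import requests
from celery import shared_task

@shared_task(autoretry_for=(requests.RequestException,),
             max_retries=5, retry_backoff=True, acks_late=True)
def fetch_feed(url):
    """Fetch one feed; transient failures retry with exponential backoff."""
    resp = requests.get(url, timeout=10)
    resp.raise_for_status()  # HTTP errors raise and trigger the retry
    return resp.json()
```

Because the retry goes back through the broker, any worker on any machine can pick it up; a crashed thread offers no such recovery.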
Redis as both broker and cache
Rather than running a separate RabbitMQ instance for the broker and Memcached for caching, Redis handled both. This reduced infrastructure complexity and ops overhead. The 25% latency reduction came from caching frequent query patterns in Redis — no DB hit for hot paths.
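A configuration sketch of the dual role, plus the cache-aside pattern behind the hot-path win. Hostnames, database indices, and the trending query are assumptions:

```python
# settings.py: one Redis instance, two logical databases
CELERY_BROKER_URL = "redis://redis:6379/0"  # message broker

CACHES = {
    "default": {
        # Django's built-in Redis backend (Django 4.0+)
        "BACKEND": "django.core.cache.backends.redis.RedisCache",
        "LOCATION": "redis://redis:6379/1",  # hot-path cache
    }
}

# services.py: cache-aside for a frequently hit query
from django.core.cache import cache

def _trending_from_db(window):
    return []  # stand-in for the real ORM aggregation

def trending_topics(window="1h"):
    key = f"trending:{window}"
    topics = cache.get(key)
    if topics is None:                      # miss: compute once from the DB,
        topics = _trending_from_db(window)
        cache.set(key, topics, timeout=60)  # then serve from Redis for 60s
    return topics
```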
ML inference as async Celery tasks, never in-request
Running ML models synchronously in the request cycle is the classic mistake. A 300ms inference call turns every API response into a 300ms+ wait. Moving ML to async Celery tasks kept API p95 latency predictable at under 50ms, while inference ran on dedicated workers.
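Sketched out, the pattern looks like this: the view persists the event, enqueues inference, and returns immediately. The names, the simulated model latency, and the dedicated "ml" queue are illustrative:

```python
import time
from celery import shared_task
from django.http import JsonResponse

@shared_task
def run_inference(event_id):
    """Slow path: a stand-in for the ~300ms model call."""
    time.sleep(0.3)  # simulated inference latency
    return {"event_id": event_id, "anomaly": 0.12}

def ingest_view(request):
    event_id = 42  # stand-in for validate-and-persist
    # Enqueue to a queue consumed only by ML workers; the enqueue itself
    # is sub-millisecond, so the API response stays fast.
    run_inference.apply_async(args=[event_id], queue="ml")
    return JsonResponse({"status": "accepted", "event_id": event_id}, status=202)
```

Starting the inference workers with `celery -A twiscope worker -Q ml` keeps model load on dedicated machines, away from the ingestion queue.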
WebSocket for real-time dashboard, not polling
Polling the API for new alerts would mean 1–10 second delays depending on the poll interval. WebSocket connections gave us a push model — when the Django API processes a new alert, it pushes immediately. This is how we hit the <2 second delivery target.
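One way to wire this up, assuming Django Channels with a Redis channel layer; the group name and payload shape are illustrative:

```python
from asgiref.sync import async_to_sync
from channels.generic.websocket import JsonWebsocketConsumer
from channels.layers import get_channel_layer

class AlertConsumer(JsonWebsocketConsumer):
    """Each connected dashboard joins the "alerts" group."""

    def connect(self):
        async_to_sync(self.channel_layer.group_add)("alerts", self.channel_name)
        self.accept()

    def disconnect(self, code):
        async_to_sync(self.channel_layer.group_discard)("alerts", self.channel_name)

    def alert_message(self, event):
        # Invoked for group messages of type "alert.message";
        # relay the payload straight to the browser.
        self.send_json(event["payload"])

def push_alert(payload):
    """Called from the processing pipeline the moment an alert is ready."""
    layer = get_channel_layer()
    async_to_sync(layer.group_send)(
        "alerts", {"type": "alert.message", "payload": payload}
    )
```

One `group_send` from the pipeline fans the alert out to every open dashboard, which is what makes the <2 second budget realistic.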
04 · Outcomes
Numbers that moved the business.
- 5M+ data points processed daily, with no queue backlog
- 25% latency reduction via Redis hot-path caching
- Earlier trend detection vs. manual monitoring
- <2s alert delivery, end to end
05 · Leadership & Team
Engineering the team, not just the system.
Mentorship
Coached junior engineers on distributed systems, async patterns, and production debugging. Pair-reviewed complex Celery tasks until the patterns clicked.
Security-First
Enforced authenticated endpoints, encrypted data at rest, and regular dependency audits. Security was a gate, not an afterthought.
Alignment
Translated technical constraints into product trade-offs for non-technical stakeholders. Kept ML feature scope realistic given inference infrastructure cost.