When we started building Twiscope, the first prototype worked fine at a few thousand events per day. By the time we hit production load — five platforms, multiple data types, continuous ingestion — we had rebuilt the core architecture twice. Here's what the final version looks like and why.
The naive version
Version one was straightforward: a Django view received incoming data, processed it synchronously, and wrote to PostgreSQL. For a prototype, this was fine. For production, it had three fatal problems:
- Synchronous processing meant API response time was proportional to data volume — spikes in incoming data caused request timeouts
- ML inference ran in the request cycle — a 300ms model call turned every ingest endpoint into a 300ms+ wait
- No queue meant no backpressure — when a data source burst, the system either processed everything immediately or dropped it
Separating ingestion from processing
The key architectural shift was decoupling ingestion from processing completely. The ingest endpoint does exactly one thing: validates the incoming payload, writes a task to the Celery queue, and returns 202 Accepted. That's it. Processing happens asynchronously.
# Ingest endpoint — fast path only
from rest_framework.decorators import api_view
from rest_framework.response import Response

@api_view(["POST"])
def ingest_event(request):
    serializer = EventSerializer(data=request.data)
    serializer.is_valid(raise_exception=True)
    # Enqueue — never process inline
    process_event.delay(serializer.validated_data)
    return Response({"queued": True}, status=202)

# Processing happens in a Celery worker
from celery import shared_task

@shared_task(bind=True, max_retries=3)
def process_event(self, event_data):
    try:
        enriched = enrich(event_data)
        store(enriched)
        if should_trigger_alert(enriched):
            dispatch_alert.delay(enriched)
    except Exception as exc:
        # Exponential backoff: 1s, 2s, 4s between retries
        raise self.retry(exc=exc, countdown=2 ** self.request.retries)

Redis doing double duty
We use Redis for two purposes simultaneously: as the Celery broker (task queue) and as a hot-path cache for frequent queries. This was a deliberate infrastructure simplification — running a separate RabbitMQ instance for the broker added ops overhead with no meaningful benefit at our scale.
The 25% latency reduction we measured came almost entirely from Redis caching. The most expensive database queries — trending keywords, top influencers by platform — were being executed thousands of times per hour with identical results. Caching them with a 60-second TTL eliminated 90% of those hits.
Rule of thumb: if a query result is the same for any user in any session within a time window, cache it. The hard part is deciding the right TTL — too short and the cache is ineffective, too long and users see stale data.
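To make that concrete, here is a minimal sketch of the hot-path caching pattern using Django's cache framework backed by Redis. The function and key names are hypothetical (compute_trending_keywords stands in for the expensive aggregate query), but the check-cache, fall-back, set-with-TTL flow is the pattern described above.

from django.core.cache import cache

TRENDING_TTL = 60  # seconds, matching the TTL discussed above

def trending_keywords(platform):
    # Result is identical for every user within the window, so it is safe to cache
    key = f"trending:{platform}"
    result = cache.get(key)
    if result is None:
        # Expensive aggregate query runs at most once per TTL window
        result = compute_trending_keywords(platform)  # hypothetical query helper
        cache.set(key, result, timeout=TRENDING_TTL)
    return result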
ML inference: never in the request cycle
This deserves its own section because it's the most common mistake I see in systems that bolt on ML. Running model inference synchronously in an API handler feels natural — you have the data, you call the model, you return the result. But inference latency is non-deterministic. A model that averages 100ms can spike to 800ms under load. That spike becomes your API's p99 latency.
In Twiscope, all ML work (trend prediction, sentiment analysis, anomaly detection) runs in dedicated Celery workers. They consume from a separate queue, process at their own pace, and write results back to the database. The API layer reads pre-computed ML results — it never calls a model directly.
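A rough sketch of how that separation can be wired up in Celery; the module, task, and queue names here are illustrative rather than Twiscope's actual configuration. Routing ML tasks to their own queue lets dedicated workers drain it at their own pace without touching the ingest queue.

# Celery app config: route all ML tasks to a dedicated queue
task_routes = {
    "ml_tasks.*": {"queue": "ml"},
}

# ml_tasks.py
from celery import shared_task

@shared_task
def score_sentiment(event_id):
    event = load_event(event_id)  # hypothetical loader
    result = sentiment_model.predict(event.text)  # hypothetical model handle
    save_ml_result(event_id, result)  # the API reads this row, never the model

# Start workers that consume only the ML queue:
#   celery -A proj worker -Q ml --concurrency=4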
Real-time delivery via WebSocket
When an alert is generated, users need to see it within seconds — not the next time they refresh. We use WebSockets for all real-time delivery. When a Celery task detects an alert-worthy event, it publishes to a Redis channel. A Django Channels consumer forwards that to connected WebSocket clients.
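With Django Channels and its Redis channel layer, the publish side is a single group_send call from the Celery task, and the consumer forwards messages to connected clients. A minimal sketch, with the group name and message shape assumed:

# Publisher side: called from the Celery task when an alert fires
from asgiref.sync import async_to_sync
from channels.layers import get_channel_layer

def publish_alert(alert):
    layer = get_channel_layer()
    async_to_sync(layer.group_send)(
        "alerts",  # hypothetical group name
        {"type": "alert.message", "payload": alert},
    )

# Consumer side: forwards alerts to connected WebSocket clients
from channels.generic.websocket import AsyncJsonWebsocketConsumer

class AlertConsumer(AsyncJsonWebsocketConsumer):
    async def connect(self):
        await self.channel_layer.group_add("alerts", self.channel_name)
        await self.accept()

    async def disconnect(self, code):
        await self.channel_layer.group_discard("alerts", self.channel_name)

    async def alert_message(self, event):
        # Channels dispatches "alert.message" to this handler
        await self.send_json(event["payload"])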
The end-to-end latency from event ingestion to alert delivery is under 2 seconds in normal operation. That number is dominated by Celery task queue time, not by network or processing time.
What I'd do differently
- Start with the async queue from day one — retrofitting it onto a synchronous system is painful and introduces subtle bugs during the transition
- Define alert thresholds with stakeholders before building the alerting system — we changed them four times in the first month
- Invest in replay tooling earlier — the ability to replay historical events through updated processing logic is invaluable and we built it six months too late (a minimal sketch follows this list)
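A first version of that tooling can be as simple as re-enqueueing persisted raw payloads through the current process_event task. This assumes raw events are stored as received; the RawEvent model below is hypothetical.

def replay_events(start, end):
    # Re-run a historical window through the latest processing logic
    events = RawEvent.objects.filter(received_at__range=(start, end))
    for raw in events.iterator():  # stream rows instead of loading them all
        process_event.delay(raw.payload)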
This architecture powers Twiscope — an Arabic OSINT social intelligence platform processing 5M+ data points daily across Twitter, Instagram, YouTube, TikTok, and news sources. Full case study at /case-studies/twiscope