emitter clone
An Emitter-style web app with Twilio voice underneath, so users dial from the browser straight to a phone line. Live SMS layered on top — the whole thing held together by a tight WebSocket layer.
Overview
A browser-native softphone. Sign in, type a number, hit dial — the call lands on a real phone line in under a second. SMS shares the same threads, the same UI, the same WebSocket. Built at Web Stacking on a team of four; I owned the realtime spine end-to-end.
The point of the project wasn't novelty — Twilio has had voice in the browser for years. The point was *production*. Most demos work. This one had to work at 3 AM on a corporate VPN with someone's bluetooth headphones flapping in and out.
The Problem
Three problems that the demo flow hides.
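**Symmetric NAT.** WebRTC tutorials assume STUN works. Behind the symmetric NATs common on corporate networks it often doesn't, because the address STUN discovers is only valid toward the STUN server, not toward the peer. Media then has to relay through TURN, and you pay for that traffic. Budget for it or wear it.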
**Webhook chaos.** Twilio status callbacks arrive late, out of order, sometimes duplicated, sometimes never. Treat them as hints, not truth.
**Echo.** The browser does its best. The moment a user joins from a phone on speaker mode, the best isn't enough. You need data on every call so you can answer support tickets with evidence, not vibes.
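On the symmetric NAT point, one way to budget for TURN is to measure how often calls actually end up relayed. The following is a minimal sketch, assuming you log the ICE candidate line of each call's selected local candidate. The names `candidateType` and `relayShare` are illustrative and not taken from the project.

```javascript
// Extract the candidate type ("host", "srflx", "prflx", "relay") from an
// ICE candidate line, or null if the line has no "typ" field.
function candidateType(candidateLine) {
  const match = /\btyp\s+(\w+)/.exec(candidateLine);
  return match ? match[1] : null;
}

// Fraction of calls whose selected local candidate was a TURN relay.
// `selectedCandidates` is a hypothetical array: one candidate line per call.
function relayShare(selectedCandidates) {
  if (selectedCandidates.length === 0) return 0;
  const relayed = selectedCandidates.filter((c) => candidateType(c) === 'relay');
  return relayed.length / selectedCandidates.length;
}
```

Multiplying that share by average call minutes gives a rough handle on relayed traffic before the invoice does.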
The Approach
Two services, one realtime hub.
**voice-svc.** Owns the Twilio TwiML flow, the call state machine, and the TURN credential mint. The state machine is the **source of truth**, not the webhook. Webhooks merely poke the state machine to re-fetch the call from Twilio's REST API and reconcile.
**sms-svc.** Pure Node + Twilio's API for inbound/outbound. Stores every message in Postgres with the same `conversation_id` as the call thread, so voice and text live in one timeline.
**ws-hub.** A thin WebSocket fanout. Browser clients subscribe to their own conversation channel; both services publish into it. Heartbeats go out every 5s, and a connection is dropped after 15s without one. Clients reconnect with exponential backoff plus jitter so a flaky network doesn't thunder-herd the hub.
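The reconnect and heartbeat rules for the hub can be sketched as two small pure functions. This is an illustration under assumptions, not the project's code: the "full jitter" variant, the 500ms base, and the 30s cap are made-up defaults, and only the 15s timeout comes from the description above.

```javascript
// Full-jitter backoff: pick a random delay in [0, min(cap, base * 2^attempt)).
// `random` is injectable so the function can be tested deterministically.
// baseMs and capMs are assumed values, not the project's real settings.
function reconnectDelayMs(attempt, { baseMs = 500, capMs = 30000, random = Math.random } = {}) {
  const ceiling = Math.min(capMs, baseMs * 2 ** attempt);
  return Math.floor(random() * ceiling);
}

// Heartbeat check: a peer counts as dead once more than `timeoutMs`
// has passed since its last heartbeat (15s in the description above).
function isStale(lastHeartbeatAt, now, timeoutMs = 15000) {
  return now - lastHeartbeatAt > timeoutMs;
}
```

Randomizing over the whole interval, rather than adding a small offset to a fixed delay, spreads reconnects out so that clients dropped at the same moment do not all return at the same moment.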
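// the only webhook handler that matters
async function onCallStatus(req) {
  // Take only the CallSid from the webhook; its status may be stale or duplicated.
  const { CallSid } = req.body;
  // Re-fetch the call from Twilio's REST API and treat that as truth.
  const truth = await twilio.calls(CallSid).fetch();
  await stateMachine.transition(CallSid, truth.status);
}
Stack Deep-Dive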
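For the handler above to be safe against late and duplicated callbacks, `stateMachine.transition` has to be idempotent and refuse to move backward. Here is a minimal in-memory sketch of that idea. The ranking, the `createStateMachine` factory, and the `Map` store are assumptions for illustration; the write-up does not describe the real persistence layer.

```javascript
// Rank Twilio call statuses so transitions only ever move forward.
// All terminal statuses share the top rank, so the first one wins.
const STATUS_RANK = new Map([
  ['queued', 0], ['initiated', 1], ['ringing', 2], ['in-progress', 3],
  ['completed', 4], ['busy', 4], ['failed', 4], ['no-answer', 4], ['canceled', 4],
]);

// In-memory stand-in for the real store (hypothetical).
function createStateMachine() {
  const states = new Map();
  return {
    get: (callSid) => states.get(callSid),
    // Returns true if the state changed; false for duplicates,
    // stale (backward) updates, or unknown statuses.
    transition(callSid, nextStatus) {
      const nextRank = STATUS_RANK.get(nextStatus);
      if (nextRank === undefined) return false;
      const current = states.get(callSid);
      if (current !== undefined && nextRank <= STATUS_RANK.get(current)) return false;
      states.set(callSid, nextStatus);
      return true;
    },
  };
}
```

Because the method is synchronous, it still works behind the `await` in the handler; a database-backed version would do the same comparison inside a transaction.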
Critical metric we paged on: `peer_connection_failed` per minute, not CPU. Every Twilio incident showed up there 90 seconds before any human-facing error.
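A per-minute rate like this can be tracked with a sliding-window counter. The sketch below is one simple way to do it, not the project's alerting setup: `createRateCounter`, `shouldPage`, and the threshold of 5 are hypothetical, since the real paging threshold is not stated.

```javascript
// Sliding-window counter: how many events occurred in the last `windowMs`.
// Assumes events are recorded in non-decreasing timestamp order.
function createRateCounter(windowMs = 60000) {
  const timestamps = [];
  return {
    record(now) {
      timestamps.push(now);
    },
    countAt(now) {
      // Drop events that have aged out of the window.
      while (timestamps.length > 0 && timestamps[0] <= now - windowMs) {
        timestamps.shift();
      }
      return timestamps.length;
    },
  };
}

// `threshold` is a made-up number for illustration only.
function shouldPage(counter, now, threshold = 5) {
  return counter.countAt(now) >= threshold;
}
```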
Results
Concurrent call capacity rose 150% over the prior implementation simply by moving from a single-node WebSocket server to a Redis-backed fanout. Median media setup landed under 400ms. Webhook recovery (the percentage of calls whose final state matched Twilio's record without manual reconciliation) rose to 99.4% after the state-machine refactor, up from somewhere in the low 80s.
The win we didn't expect: support ticket volume on call quality dropped because the team finally had `getStats` data per call. Most "the call was bad" tickets resolve faster when you can show the customer their own jitter graph.
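As a rough illustration of what per-call `getStats` data can look like, the sketch below reduces a stats report to three numbers. It assumes the standard `RTCPeerConnection.getStats()` report shape (`inbound-rtp` and `remote-inbound-rtp` entries); `summarizeAudioStats` and the output field names are invented here, and the project's actual schema is not described.

```javascript
// Reduce a WebRTC stats report to the few numbers worth storing per call.
// `report` is anything with forEach, such as the RTCStatsReport returned
// by RTCPeerConnection.getStats(), or a plain Map in tests.
function summarizeAudioStats(report) {
  const summary = { jitterMs: null, packetLossPct: null, roundTripMs: null };
  report.forEach((stat) => {
    if (stat.type === 'inbound-rtp' && stat.kind === 'audio') {
      const total = stat.packetsReceived + stat.packetsLost;
      summary.jitterMs = stat.jitter * 1000; // the spec reports jitter in seconds
      summary.packetLossPct = total > 0 ? (stat.packetsLost / total) * 100 : 0;
    }
    if (stat.type === 'remote-inbound-rtp' && stat.kind === 'audio' &&
        stat.roundTripTime !== undefined) {
      summary.roundTripMs = stat.roundTripTime * 1000; // seconds to ms
    }
  });
  return summary;
}
```

Sampling this every few seconds during a call and storing the series is enough to draw the jitter graph mentioned above.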