
Real-time Without WebSockets

How we built a real-time system that scales to millions using HTTP

Zev Uhuru
Engineering Research
March 22, 2024

Everyone talks about WebSockets when they need real-time features. But what if I told you that some of the most successful real-time systems on the planet don't use WebSockets at all? This is the story of how we built a real-time notification system that handles millions of concurrent users using nothing but HTTP.

The conventional wisdom says WebSockets are the gold standard for real-time communication. But after building and scaling real-time systems for years, I've learned that the best solution isn't always the most obvious one.

The WebSocket Trap

WebSockets seem perfect for real-time applications. Persistent connections, low latency, bidirectional communication—what's not to love? The problems become apparent when you try to scale.

WebSocket Scaling Challenges:

  • Connection state management: Every connection consumes memory
  • Load balancing complexity: Sticky sessions required
  • Network infrastructure: Proxies and firewalls hate persistent connections
  • Debugging nightmares: Connection drops are hard to trace

The HTTP Alternative

Instead of fighting WebSocket scaling issues, we embraced HTTP's strengths: statelessness, cacheability, and universal support. Our solution combines Server-Sent Events (SSE) for real-time updates with smart polling strategies for reliability.

javascript
// Server-Sent Events for real-time updates
const Redis = require('ioredis'); // Redis client with pub/sub support

class RealTimeService {
  constructor() {
    this.connections = new Map(); // userId -> active SSE response stream
    this.redis = new Redis();
  }

  // SSE endpoint for real-time updates
  async subscribe(req, res, userId) {
    res.writeHead(200, {
      'Content-Type': 'text/event-stream',
      'Cache-Control': 'no-cache',
      'Connection': 'keep-alive',
      'Access-Control-Allow-Origin': '*'
    });

    // Store connection for this user
    this.connections.set(userId, res);

    // Send initial data
    this.sendEvent(res, 'connected', { userId, timestamp: Date.now() });

    // Listen for Redis pub/sub messages
    const subscriber = this.redis.duplicate();
    subscriber.subscribe(`user:${userId}`);
    
    subscriber.on('message', (channel, message) => {
      this.sendEvent(res, 'update', JSON.parse(message));
    });

    // Cleanup on disconnect
    req.on('close', () => {
      this.connections.delete(userId);
      subscriber.unsubscribe();
      subscriber.quit();
    });
  }

  sendEvent(res, event, data) {
    res.write(`event: ${event}\n`);
    res.write(`data: ${JSON.stringify(data)}\n\n`);
  }
}
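
On the browser side, consuming this stream needs nothing beyond the built-in EventSource API. Here's a minimal sketch, assuming the subscribe handler above is mounted at a hypothetical /events/:userId route; the userId value and renderNotification handler are app-specific placeholders.

javascript
// Minimal browser client for the SSE endpoint above
// (the /events/:userId path is an assumption, not part of the original service)
const userId = '42'; // whatever identifies the signed-in user
const source = new EventSource(`/events/${userId}`);

// Fired once when the server sends the initial 'connected' event
source.addEventListener('connected', (e) => {
  console.log('subscribed', JSON.parse(e.data));
});

// Fired for every 'update' event relayed from Redis pub/sub
source.addEventListener('update', (e) => {
  renderNotification(JSON.parse(e.data)); // renderNotification is app-specific
});

// EventSource reconnects automatically after transient drops
source.onerror = () => console.warn('SSE connection interrupted, retrying');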

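The other half of the flow is how updates reach those per-user channels. Any backend service can publish to Redis, and every SSE connection subscribed to that user picks the message up. Here's a minimal sketch; the notifyUser helper is illustrative, not from our codebase.

javascript
// Publish an update to a user's channel - every SSE subscriber
// for that user receives it via the pub/sub listener above
const Redis = require('ioredis');
const publisher = new Redis();

async function notifyUser(userId, payload) {
  // Channel name must match what subscribe() listens on: `user:${userId}`
  await publisher.publish(`user:${userId}`, JSON.stringify({
    ...payload,
    timestamp: Date.now()
  }));
}

// Example: fan out a notification after some domain event
// notifyUser('42', { type: 'comment', text: 'New reply on your post' });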

Smart Polling Strategy

For clients that can't maintain SSE connections, we implemented an adaptive polling system that adjusts frequency based on activity and connection quality.

javascript
// Adaptive polling client
class AdaptivePoller {
  constructor(endpoint, options = {}) {
    this.endpoint = endpoint;
    this.minInterval = options.minInterval || 1000;
    this.maxInterval = options.maxInterval || 30000;
    this.currentInterval = this.minInterval;
    this.backoffMultiplier = options.backoffMultiplier || 1.5;
    this.onUpdate = options.onUpdate || (() => {}); // app-supplied update handler
    this.lastUpdate = 0;     // timestamp of the newest update we've seen
    this.active = false;
  }

  async start() {
    this.active = true;
    while (this.active) {
      try {
        const response = await fetch(`${this.endpoint}?since=${this.lastUpdate}`);
        const data = await response.json();

        if (data.updates.length > 0) {
          // Got updates - increase polling frequency
          this.currentInterval = Math.max(
            this.minInterval, 
            this.currentInterval / this.backoffMultiplier
          );
          this.handleUpdates(data.updates);
        } else {
          // No updates - decrease polling frequency
          this.currentInterval = Math.min(
            this.maxInterval, 
            this.currentInterval * this.backoffMultiplier
          );
        }

        await this.sleep(this.currentInterval);
      } catch (error) {
        // Network error - back off more aggressively
        this.currentInterval = Math.min(
          this.maxInterval, 
          this.currentInterval * 2
        );
        await this.sleep(this.currentInterval);
      }
    }
  }

  stop() {
    this.active = false;
  }

  handleUpdates(updates) {
    updates.forEach(update => {
      this.onUpdate(update);
      this.lastUpdate = update.timestamp;
    });
  }

  sleep(ms) {
    return new Promise(resolve => setTimeout(resolve, ms));
  }
}
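
Wiring the poller up is a one-liner. Here's an illustrative usage sketch; the /api/notifications endpoint, the interval values, and renderNotification are assumptions for the example, not our production configuration.

javascript
// Example usage of AdaptivePoller (endpoint path and intervals are illustrative)
const poller = new AdaptivePoller('/api/notifications', {
  minInterval: 2000,   // poll at most every 2s when updates are flowing
  maxInterval: 60000,  // back off to once a minute when idle
  onUpdate: (update) => renderNotification(update) // app-specific handler
});

poller.start();

// Stop polling when the view is torn down
// poller.stop();

The `since` query parameter lets the server return only updates newer than the last one the client saw, which keeps each poll cheap and idempotent.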

The Results

  • 2M+ concurrent users
  • 25ms average latency
  • 99.9% uptime

Our HTTP-based real-time system now handles over 2 million concurrent users with sub-30ms latency. More importantly, it's reliable, debuggable, and works everywhere.

"The best real-time system is the one that works reliably for all your users, not the one that looks best in benchmarks."

When to Choose HTTP Over WebSockets

  • Massive scale: When you need to support millions of users
  • Reliability over latency: When 99.9% uptime matters more than 10ms latency
  • Diverse clients: When you can't control the network environment
  • Simple operations: When you mostly push data to clients

WebSockets aren't wrong—they're just not always right. Sometimes the best real-time solution is the one that embraces HTTP's strengths rather than fighting its constraints.