Design a Video Streaming Platform

A senior-level guide to designing a YouTube-style video streaming system with uploads, transcoding, CDN delivery, metadata, search, and recommendations.

video, cdn, transcoding, object storage, recommendations

Interview Prompt

Design a platform where users can upload videos and viewers can stream them reliably across web and mobile clients.

Separates upload, processing, metadata, and playback delivery.

Understands adaptive bitrate streaming and CDN caching.

Uses asynchronous transcoding and durable object storage.

Mentions copyright, moderation, recommendations, and observability.

Step 1

Clarify functional and non-functional requirements first.

Functional Requirements

  • Users can upload videos with title, description, thumbnail, and visibility settings.
  • The system transcodes videos into multiple resolutions and bitrates.
  • Viewers can stream videos with seeking and adaptive quality.
  • Users can like, comment, subscribe, and view engagement counts.
  • The platform supports search, recommendations, and basic creator analytics.

Non-Functional Requirements

  • Playback startup should be fast, ideally under 2 seconds for hot videos.
  • Uploads can take minutes to become fully processed.
  • The system should serve viral videos globally without overloading origin storage.
  • Video files must be durable and not lost after upload acknowledgement.
  • Processing failures should be retryable and visible to creators.

Scale Assumptions

  • 10 million daily active viewers.
  • 500,000 video uploads per day.
  • Average uploaded video is 300 MB before transcoding.
  • Average viewer watches 30 minutes per day.

Upload ingest

~150 TB/day raw

500k uploads times 300 MB before derivatives and replicas.

Playback traffic

Multiple PB/day

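10M viewers times 30 minutes is 300M viewing minutes per day. Assuming an average delivered bitrate of a few Mbps, that is several PB/day, which dominates all other traffic.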

Transcoding output

2x to 5x raw size

Multiple renditions, thumbnails, audio tracks, and manifests increase storage.

Metadata reads

High QPS

Every page load reads title, author, counters, recommendations, and permissions.
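
The estimates above can be reproduced with a short back-of-envelope script. This is a sketch under stated assumptions: the guide gives no average playback bitrate, so `ASSUMED_AVG_BITRATE_MBPS` is an illustrative value, and decimal units (1 TB = 1,000,000 MB) are used throughout.

```python
# Back-of-envelope capacity estimate from the stated scale assumptions.
DAILY_VIEWERS = 10_000_000
UPLOADS_PER_DAY = 500_000
AVG_UPLOAD_MB = 300
WATCH_MINUTES_PER_VIEWER = 30

# Assumption (not from the guide): average delivered bitrate in megabits/second.
ASSUMED_AVG_BITRATE_MBPS = 3.0

MB_PER_TB = 1_000_000
MB_PER_PB = 1_000_000_000


def ingest_tb_per_day() -> float:
    """Raw upload volume before derivatives and replicas."""
    return UPLOADS_PER_DAY * AVG_UPLOAD_MB / MB_PER_TB


def playback_pb_per_day(bitrate_mbps: float = ASSUMED_AVG_BITRATE_MBPS) -> float:
    """Bytes streamed to viewers per day at the assumed bitrate."""
    watch_seconds = DAILY_VIEWERS * WATCH_MINUTES_PER_VIEWER * 60
    megabytes = watch_seconds * bitrate_mbps / 8  # 8 bits per byte
    return megabytes / MB_PER_PB


def transcode_output_tb_per_day(multiplier: float) -> float:
    """Stored output when renditions total `multiplier` times the raw size."""
    return ingest_tb_per_day() * multiplier
```

With these inputs, ingest comes to 150 TB/day and playback to roughly 6.75 PB/day, so playback is more than an order of magnitude larger than ingest. Changing the assumed bitrate scales the playback figure linearly.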

Step 2

Identify the key entities before picking storage.

| Entity | Fields and Relationships | Interview Notes |
| --- | --- | --- |
| Video | video_id, owner_id, title, description, visibility, status, duration, created_at | Metadata row should be small and frequently cached. |
| VideoAsset | video_id, rendition, codec, bitrate, manifest_url, segment_prefix | Tracks generated playback assets and processing state. |
| EngagementCounter | video_id, views, likes, comments, updated_at | Use sharded counters or streaming aggregation for hot videos. |
| WatchEvent | user_id, video_id, watch_ms, position, device, timestamp | Feeds analytics and recommendation models. |
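
As a rough illustration, the entities could be modeled as plain records. The field names come from the table above; the types, units, and the `VideoStatus` values are assumptions added for the sketch, not something the guide specifies.

```python
from dataclasses import dataclass
from datetime import datetime
from enum import Enum


class VideoStatus(Enum):
    # Hypothetical lifecycle states; the guide only names a `status` field.
    UPLOADED = "uploaded"
    PROCESSING = "processing"
    READY = "ready"
    FAILED = "failed"


@dataclass
class Video:
    video_id: str
    owner_id: str
    title: str
    description: str
    visibility: str
    status: VideoStatus
    duration: float  # seconds (assumed unit)
    created_at: datetime


@dataclass
class VideoAsset:
    video_id: str
    rendition: str
    codec: str
    bitrate: int  # bits per second (assumed unit)
    manifest_url: str
    segment_prefix: str


@dataclass
class EngagementCounter:
    video_id: str
    views: int
    likes: int
    comments: int
    updated_at: datetime


@dataclass
class WatchEvent:
    user_id: str
    video_id: str
    watch_ms: int
    position: int  # playback position in milliseconds (assumed unit)
    device: str
    timestamp: datetime
```

Keeping `Video` separate from `VideoAsset` reflects the note that the metadata row stays small and cacheable while per-rendition state lives elsewhere.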

Step 3

Define the APIs around the user flows.

| Interface | Request / Response | Contract Notes |
| --- | --- | --- |
| POST /v1/videos/upload-session | { fileSize, contentType } -> { uploadUrl, uploadId } | Use resumable multipart upload directly to object storage. |
| POST /v1/videos/{uploadId}/complete | { title, description, visibility } -> { videoId, status } | Enqueues validation, moderation, thumbnail extraction, and transcoding. |
| GET /v1/videos/{videoId}/playback | Returns manifest URL, captions, DRM fields if needed, and tracking IDs | The manifest points clients to CDN-hosted segments. |
| GET /v1/search?q=... | Returns ranked video results | Search index is updated asynchronously after metadata and processing changes. |

Step 4

Trace the critical data flow step by step.

01

Upload session

API returns signed object-storage URLs so large video bytes do not flow through application servers.

02

Processing pipeline

A durable queue triggers validators, virus scan, copyright checks, thumbnail extraction, and transcoding workers.

03

Package for playback

Workers produce HLS or DASH manifests and small media segments at multiple bitrates.

04

Distribute globally

Segments live in object storage behind a CDN. Popular segments stay at edge locations near viewers.

05

Serve experience

Playback API returns metadata, permissions, manifest URL, recommendations, and event tracking configuration.
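
For the packaging step, the sketch below shows what a generated HLS master playlist might look like. It assumes HLS rather than DASH, and the `Rendition` type and playlist URIs are hypothetical; only the `#EXTM3U` header and the `#EXT-X-STREAM-INF` tag with its `BANDWIDTH` and `RESOLUTION` attributes come from the HLS format itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rendition:
    name: str
    width: int
    height: int
    bandwidth_bps: int  # peak bitrate in bits per second
    playlist_uri: str  # media playlist listing this rendition's segments


def build_master_playlist(renditions: list[Rendition]) -> str:
    """Render a master playlist that lists each rendition, lowest bitrate first."""
    lines = ["#EXTM3U"]
    for r in sorted(renditions, key=lambda r: r.bandwidth_bps):
        lines.append(
            f"#EXT-X-STREAM-INF:BANDWIDTH={r.bandwidth_bps},RESOLUTION={r.width}x{r.height}"
        )
        lines.append(r.playlist_uri)
    return "\n".join(lines) + "\n"
```

The playback API would return the URL of this master playlist; the player then picks a rendition and fetches that rendition's media playlist and segments from the CDN.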

Step 5

Convert the flow into a high-level design.

Final Design

Video Streaming final architecture

Serving Layer

Start with clients, routing, APIs, and the main synchronous path users depend on: creating an upload session, requesting playback, and fetching segments through the CDN.

State Layer

Anchor the design around the key entities: Video, VideoAsset, EngagementCounter, WatchEvent.

Async Layer

Move slow, high-volume, or failure-prone work (transcoding, moderation, search indexing, counter aggregation) behind queues, workers, streams, caches, or background reconciliation.

Step 6

Deep dives interviewers are likely to probe.

Adaptive bitrate streaming

  • Clients fetch a manifest and switch between renditions based on bandwidth and buffer health.
  • Short segments improve adaptation but increase request overhead.
  • Encoding ladder design depends on content type, device mix, and cost targets.
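
The rendition switching described above can be sketched as a simple throughput-and-buffer heuristic. This is an illustrative rule, not the algorithm of any particular player; the `safety_factor` and `low_buffer_seconds` parameters are assumptions.

```python
def choose_rendition(
    ladder_bps: list[int],
    throughput_bps: float,
    buffer_seconds: float,
    safety_factor: float = 0.8,
    low_buffer_seconds: float = 5.0,
) -> int:
    """Pick the bitrate (in bits/second) to request for the next segment."""
    ladder = sorted(ladder_bps)
    # A nearly empty buffer means a stall is imminent: drop to the lowest rung.
    if buffer_seconds < low_buffer_seconds:
        return ladder[0]
    # Otherwise use the highest rung that fits within a discounted throughput estimate.
    budget = throughput_bps * safety_factor
    affordable = [b for b in ladder if b <= budget]
    return affordable[-1] if affordable else ladder[0]
```

For example, with a ladder of 400 kbps, 1.2 Mbps, 3 Mbps, and 6 Mbps, a measured throughput of 4 Mbps and a healthy buffer selects the 3 Mbps rung, because 6 Mbps exceeds the discounted 3.2 Mbps budget.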

Hot video protection

  • CDN absorbs most playback traffic for popular segments.
  • Origin shield reduces duplicate cache misses across edge locations.
  • Prewarming helps for scheduled premieres or expected traffic spikes.
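
The effect of an origin shield can be shown with a toy simulation. The classes below are hypothetical in-memory stand-ins for CDN tiers and ignore eviction, TTLs, and concurrency; the point is only to count how many cold misses reach origin storage.

```python
class Origin:
    """Stand-in for object storage that counts how often it is read."""

    def __init__(self) -> None:
        self.fetches = 0

    def get(self, key: str) -> str:
        self.fetches += 1
        return f"bytes-of-{key}"


class CacheTier:
    """A cache that fills itself from an upstream tier on a miss."""

    def __init__(self, upstream) -> None:
        self.upstream = upstream
        self.store: dict[str, str] = {}

    def get(self, key: str) -> str:
        if key not in self.store:
            self.store[key] = self.upstream.get(key)
        return self.store[key]


def origin_fetches(edge_count: int, use_shield: bool, key: str = "video123/720p/seg0001.ts") -> int:
    """Request one segment twice from every edge and report origin reads."""
    origin = Origin()
    upstream = CacheTier(origin) if use_shield else origin
    edges = [CacheTier(upstream) for _ in range(edge_count)]
    for edge in edges:
        edge.get(key)  # cold miss at this edge
        edge.get(key)  # served from the edge cache
    return origin.fetches
```

With 50 edges, origin sees 50 reads for the segment without a shield and a single read with one, which is why a shield matters most when many edge locations go cold at the same time.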

Counters and analytics

  • View counts should be deduplicated and aggregated with a delay so repeated or fraudulent views can be filtered before they are published.
  • Creator analytics can be eventually consistent.
  • Playback quality metrics need client beacons for startup time, stalls, and bitrate changes.
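
For the view-count bullets, here is a minimal in-memory sketch that combines sharded counting with a per-viewer deduplication window. In a real system the shards would be separate rows or keys and the dedup state would live in a shared store; the shard count and 30-minute window are assumptions.

```python
import hashlib


class ShardedViewCounter:
    """View counter for one video, spread over several shards to avoid a hot row."""

    def __init__(self, num_shards: int = 16, dedup_window_seconds: int = 1800) -> None:
        self.shards = [0] * num_shards
        self.window = dedup_window_seconds
        self.last_counted: dict[str, float] = {}  # viewer_id -> last counted timestamp

    def _shard_for(self, viewer_id: str) -> int:
        digest = hashlib.sha256(viewer_id.encode()).digest()
        return int.from_bytes(digest[:4], "big") % len(self.shards)

    def record_view(self, viewer_id: str, timestamp: float) -> bool:
        """Count the view unless this viewer was already counted within the window."""
        last = self.last_counted.get(viewer_id)
        if last is not None and timestamp - last < self.window:
            return False
        self.last_counted[viewer_id] = timestamp
        self.shards[self._shard_for(viewer_id)] += 1
        return True

    def total(self) -> int:
        """Reads sum all shards; this can run periodically rather than per request."""
        return sum(self.shards)
```

Writes touch only one shard, so contention drops roughly in proportion to the shard count, while reads pay the cost of summing shards and can be served from a periodically refreshed total.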

Step 7

Tradeoffs to explain out loud.

Synchronous vs asynchronous processing

Use When

Use asynchronous processing because video transcode work is slow and failure-prone.

Watch Out

Users need clear processing status and retry behavior.
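
One way to make status and retries visible is to keep them on the job record itself. The sketch below retries inline for brevity; a production worker would more likely re-enqueue the job with backoff. The `TranscodeJob` fields and status names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class TranscodeJob:
    video_id: str
    max_attempts: int = 3
    attempts: int = 0
    status: str = "queued"  # queued -> processing -> ready | failed
    errors: list[str] = field(default_factory=list)


def run_job(job: TranscodeJob, transcode: Callable[[str], None]) -> TranscodeJob:
    """Run the transcode step, retrying up to max_attempts and recording each error."""
    while job.attempts < job.max_attempts:
        job.attempts += 1
        job.status = "processing"
        try:
            transcode(job.video_id)
        except Exception as exc:  # any worker failure counts as a retryable attempt
            job.errors.append(str(exc))
            job.status = "queued"
            continue
        job.status = "ready"
        return job
    job.status = "failed"
    return job
```

Because `status`, `attempts`, and `errors` are stored with the job, a creator-facing page can show "processing (attempt 2 of 3)" or a final failure reason without inspecting worker logs.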

Store original uploads forever vs discard after processing

Use When

Keep originals if future re-encoding, copyright review, or creator downloads matter.

Watch Out

Original retention increases storage cost significantly.

Exact counters vs approximate counters

Use When

Approximate counters are fine for public view counts and reduce write pressure.

Watch Out

Billing or revenue sharing may need stronger accounting controls.

Avoid

Common mistakes that weaken the answer.

  • Routing video uploads through normal API servers.
  • Trying to transcode synchronously before acknowledging upload completion.
  • Serving video segments directly from origin storage without CDN.
  • Ignoring seek behavior and only optimizing first playback.
  • Treating view counts as simple increment operations on one row.

Step 8

Follow-up questions with strong answers.

What happens when a video goes viral immediately after upload?

Prioritize processing lower-resolution renditions first, put generated segments behind CDN, use origin shield, and scale metadata/counter paths separately.

How do you support resumable uploads?

Create an upload session, accept multipart chunks into object storage, track uploaded parts, and finalize only after all required parts are present.
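
The session bookkeeping described above might look like the following sketch. It tracks only part numbers and lengths in memory; the class and method names are hypothetical, and a real implementation would also verify per-part checksums and persist the state.

```python
class UploadSession:
    """Tracks which parts of a multipart upload have arrived."""

    def __init__(self, upload_id: str, file_size: int, part_size: int) -> None:
        self.upload_id = upload_id
        self.file_size = file_size
        self.part_size = part_size
        self.expected_parts = -(-file_size // part_size)  # ceiling division
        self.received: dict[int, int] = {}  # part number -> byte length
        self.completed = False

    def put_part(self, part_number: int, length: int) -> None:
        """Record a part; re-sending the same part simply overwrites it."""
        if not 1 <= part_number <= self.expected_parts:
            raise ValueError("part number out of range")
        self.received[part_number] = length

    def missing_parts(self) -> list[int]:
        """Parts the client still has to send, used when resuming."""
        return [n for n in range(1, self.expected_parts + 1) if n not in self.received]

    def complete(self) -> bool:
        """Finalize only if every part is present and the sizes add up."""
        if self.missing_parts():
            return False
        if sum(self.received.values()) != self.file_size:
            return False
        self.completed = True
        return True
```

After a dropped connection the client asks for `missing_parts()` and uploads only those, which is what makes the upload resumable rather than restartable.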

How would you detect playback quality problems?

Collect client beacons for startup time, rebuffer count, errors, bitrate, CDN region, and correlate them with server and CDN logs.
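
As a sketch of the aggregation side, the function below groups beacons by CDN region and computes a median startup time and a rebuffer ratio. The beacon field names (`cdn_region`, `startup_ms`, `watch_ms`, `stall_ms`) are hypothetical, and a real pipeline would run this over a stream or warehouse rather than a list.

```python
from collections import defaultdict
from statistics import median


def summarize_by_region(beacons: list[dict]) -> dict[str, dict]:
    """Aggregate playback beacons into per-region quality metrics."""
    grouped: dict[str, list[dict]] = defaultdict(list)
    for beacon in beacons:
        grouped[beacon["cdn_region"]].append(beacon)

    summary = {}
    for region, items in grouped.items():
        watch_ms = sum(b["watch_ms"] for b in items)
        stall_ms = sum(b["stall_ms"] for b in items)
        summary[region] = {
            "sessions": len(items),
            "median_startup_ms": median(b["startup_ms"] for b in items),
            # Share of watch time spent stalled; 0.0 when nothing was watched.
            "rebuffer_ratio": stall_ms / watch_ms if watch_ms else 0.0,
        }
    return summary
```

A region whose rebuffer ratio or startup time diverges from the others is a natural starting point for correlating with server and CDN logs.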

Step 9

What a strong answer should signal.

Media pipeline

Explains direct upload, durable queues, transcoding, manifests, and CDN delivery.

Scalability

Recognizes playback traffic dominates and designs around CDN edge caching.

Reliability

Handles retryable processing failures and durable asset storage.

Product completeness

Covers search, recommendations, moderation, counters, and creator visibility.
