Design a URL Shortener

A practical system design interview guide for building a URL shortener with redirects, custom aliases, analytics, rate limits, and high availability.

hashing, caching, redirects, analytics, rate limiting

Interview Prompt

Design a service like bit.ly that converts long URLs into short links and redirects users with low latency.

Separates the write path for link creation from the read-heavy redirect path.

Chooses an ID generation strategy and explains collision handling clearly.

Uses cache and CDN behavior without making analytics inaccurate.

Mentions abuse controls, expiration, custom aliases, and operational visibility.

Step 1

Clarify functional and non-functional requirements first.

Functional Requirements

  • Create a short URL for a long destination URL.
  • Redirect users from the short URL to the original destination.
  • Support optional custom aliases and optional expiration times.
  • Expose basic analytics such as click counts, referrers, and coarse geography.
  • Allow users to manage or delete links they own.

Non-Functional Requirements

  • Redirect p95 latency should stay under 50 ms from the edge for hot links.
  • The system should be highly available because redirects are user-facing.
  • Short codes should be hard to guess if links are private or unlisted.
  • Creation can be slightly slower than redirects but should remain reliable.
  • Analytics should be eventually consistent and should not block redirects.

Scale Assumptions

  • 100 million new links per month.
  • 10 billion redirects per month.
  • Average long URL is 120 bytes, plus metadata and indexes.
  • Redirect traffic is roughly 100 times heavier than creation traffic.

Write QPS

~40/sec average

100 million links per month is modest, but plan for bursty marketing campaigns.

Read QPS

~3,900/sec average

10 billion monthly redirects, often with sharp peaks from viral links.

Code space

62^7 ~= 3.5T codes

At about 1.2 billion new links per year, 7-character Base62 is nowhere near exhaustion, and the space stays sparse enough that random codes rarely collide.

Storage

~50 to 100 GB/month

This estimate covers link rows with metadata and indexes (roughly 0.5 to 1 KB per link). URL rows are small; analytics events dominate if stored at click-level detail.
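The averages above can be checked with a few lines of arithmetic. This is a back-of-envelope sketch that assumes a 30-day month and a constant creation rate; the variable names are illustrative.

```python
# Back-of-envelope capacity math, assuming a 30-day month.
SECONDS_PER_MONTH = 30 * 24 * 60 * 60  # 2,592,000 seconds

new_links_per_month = 100_000_000
redirects_per_month = 10_000_000_000

write_qps = new_links_per_month / SECONDS_PER_MONTH
read_qps = redirects_per_month / SECONDS_PER_MONTH

# Total number of distinct 7-character Base62 codes.
code_space = 62 ** 7

# Years until the space is exhausted if codes are handed out at a constant rate.
years_to_exhaust = code_space / (new_links_per_month * 12)

print(f"write QPS: {write_qps:.1f}")
print(f"read QPS: {read_qps:.1f}")
print(f"code space: {code_space:,}")
print(f"years to exhaust: {years_to_exhaust:,.0f}")
```

These are averages; peak traffic from a viral link can be many times higher, so size caches and rate limits for the peak rather than the mean.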

Step 2

Identify the key entities before picking storage.

| Entity | Fields and Relationships | Interview Notes |
| --- | --- | --- |
| Link | code, long_url, owner_id, created_at, expires_at, status | Primary lookup is by code. Keep the redirect row compact. |
| AliasReservation | alias, owner_id, created_at | Useful if custom aliases need separate validation or moderation workflows. |
| ClickEvent | code, timestamp, referrer, user_agent_hash, ip_prefix | Publish asynchronously; avoid storing raw IPs longer than necessary. |
| ClickAggregate | code, bucket_start, dimension, count | Powers dashboards without scanning raw click events. |
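One way to honor the ClickEvent privacy note is to coarsen the IP and hash the user agent before the event is ever published. The sketch below is illustrative only: the /24 and /48 prefix lengths, the 16-character hash truncation, and the helper names are assumptions, not requirements from the guide.

```python
import hashlib
import ipaddress
from dataclasses import dataclass


def coarse_ip_prefix(raw_ip: str) -> str:
    """Truncate an address to a /24 (IPv4) or /48 (IPv6) network (assumed widths)."""
    addr = ipaddress.ip_address(raw_ip)
    prefix_len = 24 if addr.version == 4 else 48
    # strict=False lets ip_network zero out the host bits for us.
    return str(ipaddress.ip_network(f"{raw_ip}/{prefix_len}", strict=False))


def hash_user_agent(user_agent: str) -> str:
    """Return a short, stable fingerprint instead of the raw user agent string."""
    return hashlib.sha256(user_agent.encode("utf-8")).hexdigest()[:16]


@dataclass(frozen=True)
class ClickEvent:
    code: str
    timestamp: int  # Unix seconds
    referrer: str
    user_agent_hash: str
    ip_prefix: str


def build_click_event(code: str, timestamp: int, referrer: str,
                      user_agent: str, raw_ip: str) -> ClickEvent:
    """Build the event from request data; the raw IP never enters the event."""
    return ClickEvent(code, timestamp, referrer,
                      hash_user_agent(user_agent), coarse_ip_prefix(raw_ip))


event = build_click_event("abc1234", 1_700_000_000, "https://example.com",
                          "Mozilla/5.0", "203.0.113.77")
print(event.ip_prefix)
```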

Step 3

Define the APIs around the user flows.

| Interface | Request / Response | Contract Notes |
| --- | --- | --- |
| POST /v1/links | { longUrl, customAlias?, expiresAt? } -> { shortUrl, code } | Validate URL, enforce quotas, reserve alias atomically. |
| GET /{code} | 302 or 301 redirect to destination | The hot path should do one cache lookup or one indexed database lookup. |
| GET /v1/links/{code}/analytics | Returns aggregate clicks by time bucket, device, referrer, and country | Aggregates should be precomputed or served from an analytics store. |
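"Reserve alias atomically" usually means letting a uniqueness constraint decide the winner rather than checking and then inserting. The sketch below uses an in-memory SQLite table as a stand-in for the link store; the table layout, the `short.example` domain, and the status codes chosen are assumptions for illustration.

```python
import sqlite3
from urllib.parse import urlparse


def open_store() -> sqlite3.Connection:
    """Create an in-memory link table whose primary key enforces unique codes."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE links ("
        "code TEXT PRIMARY KEY, long_url TEXT NOT NULL, owner_id TEXT NOT NULL)"
    )
    return conn


def is_valid_destination(long_url: str) -> bool:
    """Accept only absolute http(s) URLs with a host."""
    parsed = urlparse(long_url)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def create_link(conn: sqlite3.Connection, long_url: str,
                custom_alias: str, owner_id: str):
    """Return (status, body). The INSERT itself is the atomic reservation."""
    if not is_valid_destination(long_url):
        return 400, {"error": "invalid longUrl"}
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO links (code, long_url, owner_id) VALUES (?, ?, ?)",
                (custom_alias, long_url, owner_id),
            )
    except sqlite3.IntegrityError:
        # Another request already holds this alias.
        return 409, {"error": "alias already taken"}
    return 201, {"code": custom_alias,
                 "shortUrl": f"https://short.example/{custom_alias}"}


conn = open_store()
print(create_link(conn, "https://example.com/launch", "launch", "user-1"))
print(create_link(conn, "https://example.com/other", "launch", "user-2"))
```

A separate "does this alias exist?" query followed by an insert would leave a race window between the two statements; relying on the constraint avoids it.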

Step 4

Trace the critical data flow step by step.

01

Create link

API service validates the destination, checks quota, generates or reserves a code, writes the link row, and returns the public short URL.

02

Generate code

Use a unique numeric ID converted to Base62, or generate random Base62 and retry on collision. The ID-based approach is easier to reason about in an interview because uniqueness is guaranteed by construction.
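Converting a numeric ID to Base62 is plain repeated division. The sketch below assumes the alphabet order digits, uppercase, lowercase; any fixed ordering works as long as encoder and decoder agree.

```python
# Assumed alphabet ordering: 0-9, A-Z, a-z.
ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
BASE = len(ALPHABET)  # 62


def encode_base62(number: int) -> str:
    """Encode a non-negative integer as a Base62 string."""
    if number < 0:
        raise ValueError("number must be non-negative")
    if number == 0:
        return ALPHABET[0]
    digits = []
    while number:
        number, remainder = divmod(number, BASE)
        digits.append(ALPHABET[remainder])
    # Digits were produced least-significant first.
    return "".join(reversed(digits))


def decode_base62(code: str) -> int:
    """Decode a Base62 string back to its integer."""
    number = 0
    for char in code:
        number = number * BASE + ALPHABET.index(char)
    return number


print(encode_base62(62))            # "10": one sixty-two, zero units
print(len(encode_base62(62**7 - 1)))  # 7: the largest 7-character code
```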

03

Redirect

Edge or app server looks up code in cache, falls back to the link database, checks status and expiration, then redirects.
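The redirect lookup can be sketched as a cache-first read with a database fallback. Plain dictionaries stand in for the cache and the link database here, and the `Link` shape and the `"active"` status value are assumptions.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Dict, Optional


@dataclass(frozen=True)
class Link:
    code: str
    long_url: str
    status: str = "active"
    expires_at: Optional[datetime] = None


def resolve(code: str, cache: Dict[str, Link], database: Dict[str, Link],
            now: datetime) -> Optional[str]:
    """Return the destination URL, or None if the link is missing or unusable."""
    link = cache.get(code)
    if link is None:
        link = database.get(code)  # single indexed lookup on a miss
        if link is None:
            return None
        cache[code] = link  # populate the cache for later requests
    # Status and expiration are checked on every request, cached or not.
    if link.status != "active":
        return None
    if link.expires_at is not None and link.expires_at <= now:
        return None
    return link.long_url


now = datetime(2024, 1, 1, tzinfo=timezone.utc)
database = {
    "live123": Link("live123", "https://example.com/live"),
    "old4567": Link("old4567", "https://example.com/old",
                    expires_at=now - timedelta(days=1)),
}
cache: Dict[str, Link] = {}
print(resolve("live123", cache, database, now))
print(resolve("old4567", cache, database, now))
```

Checking expiration after the cache read means an expired link stops redirecting even if its cache entry has not been evicted yet.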

04

Emit analytics

Redirect service publishes a click event to a queue and never waits for analytics writes before returning the redirect.
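The key property is that publishing never blocks the redirect. In the sketch below an in-process bounded `queue.Queue` stands in for a real message broker, and dropping the event when the buffer is full is an assumed policy; a production system might spill to local disk instead.

```python
import queue


def publish_click(event: dict, click_queue: "queue.Queue[dict]") -> bool:
    """Enqueue a click event without blocking. Returns False if it was dropped."""
    try:
        click_queue.put_nowait(event)
        return True
    except queue.Full:
        # Assumed policy: lose the event rather than delay the redirect.
        return False


def handle_redirect(code: str, destination: str,
                    click_queue: "queue.Queue[dict]"):
    """Return the redirect response; analytics is best-effort."""
    publish_click({"code": code}, click_queue)
    return 302, {"Location": destination}


click_queue: "queue.Queue[dict]" = queue.Queue(maxsize=2)
for _ in range(3):
    print(handle_redirect("abc1234", "https://example.com", click_queue))
print(click_queue.qsize())
```

The redirect succeeds all three times even though the third event does not fit in the buffer.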

05

Aggregate

Stream processors roll click events into minute or hour buckets for dashboards and anomaly detection.
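Rolling events into minute buckets amounts to truncating each timestamp and counting per key. This is a batch-style sketch of what a stream processor would do incrementally; the `(code, timestamp)` event shape is an assumption.

```python
from collections import Counter
from typing import Iterable, Tuple

BUCKET_SECONDS = 60


def bucket_start(timestamp: int, bucket_seconds: int = BUCKET_SECONDS) -> int:
    """Round a Unix timestamp down to the start of its bucket."""
    return timestamp - (timestamp % bucket_seconds)


def aggregate_clicks(events: Iterable[Tuple[str, int]]) -> Counter:
    """Count clicks per (code, bucket_start) pair."""
    counts: Counter = Counter()
    for code, timestamp in events:
        counts[(code, bucket_start(timestamp))] += 1
    return counts


events = [("abc", 120), ("abc", 179), ("abc", 180), ("xyz", 125)]
counts = aggregate_clicks(events)
for key in sorted(counts):
    print(key, counts[key])
```

Passing `bucket_seconds=3600` to `bucket_start` gives hourly buckets with the same logic.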

Step 5

Convert the flow into a high-level design.

Final Design

URL Shortener final architecture

Serving Layer

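Clients reach the redirect service through the edge, which serves hot links from cache and falls back to the link database; the API service handles link creation. This is the synchronous path users depend on.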

State Layer

Anchor the design around the key entities: Link, AliasReservation, ClickEvent, ClickAggregate.

Async Layer

Move slow, high-volume, or failure-prone work off the redirect path: click events go through a queue to stream processors that build aggregates, with caches and background reconciliation absorbing the rest.

Step 6

Deep dives interviewers are likely to probe.

ID generation

  • A database sequence is simple but can become a coordination bottleneck.
  • Snowflake-style IDs avoid central database dependency and still sort roughly by time.
  • Random codes work if the code space is large and insertion handles collision retries.
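A Snowflake-style generator packs a timestamp, a worker ID, and a per-millisecond sequence into one integer. The sketch below follows the commonly described 41/10/12 bit split; the custom epoch, the class name, and the injectable clock are assumptions for illustration.

```python
import threading
import time
from typing import Callable, Optional


class SnowflakeGenerator:
    """64-bit IDs: 41 bits of milliseconds, 10 bits of worker, 12 bits of sequence."""

    def __init__(self, worker_id: int, epoch_ms: int = 1_600_000_000_000,
                 clock: Optional[Callable[[], int]] = None):
        if not 0 <= worker_id < 1024:
            raise ValueError("worker_id must fit in 10 bits")
        self.worker_id = worker_id
        self.epoch_ms = epoch_ms  # assumed custom epoch
        self._clock = clock or (lambda: int(time.time() * 1000))
        self._lock = threading.Lock()
        self._last_ms = -1
        self._sequence = 0

    def next_id(self) -> int:
        with self._lock:
            now = self._clock()
            if now < self._last_ms:
                raise RuntimeError("clock moved backwards")
            if now == self._last_ms:
                self._sequence = (self._sequence + 1) & 0xFFF
                if self._sequence == 0:
                    # 4096 IDs used this millisecond; spin until the next one.
                    while now <= self._last_ms:
                        now = self._clock()
            else:
                self._sequence = 0
            self._last_ms = now
            return (((now - self.epoch_ms) << 22)
                    | (self.worker_id << 12)
                    | self._sequence)


generator = SnowflakeGenerator(worker_id=7)
first, second = generator.next_id(), generator.next_id()
print(first < second)
```

One consequence worth stating in the interview: a 63-bit ID encodes to as many as 11 Base62 characters, so this approach gives longer codes than the 7-character budget used in the capacity estimate.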

Caching redirects

  • Cache code to destination mappings aggressively because reads dominate.
  • Use short TTLs or explicit invalidation for deleted and expired links.
  • Do not rely only on CDN caching if you need click analytics for every redirect.
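Short TTLs and explicit invalidation can be combined in one small cache. The sketch below is an in-process stand-in for a shared cache such as a key-value store; the class name and the injectable clock are assumptions made so the behavior is easy to test.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class TTLCache:
    """Code-to-destination cache with per-entry expiry and explicit invalidation."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[str, float]] = {}

    def get(self, code: str) -> Optional[str]:
        entry = self._entries.get(code)
        if entry is None:
            return None
        destination, expires_at = entry
        if self._clock() >= expires_at:
            # Entry outlived its TTL; drop it and report a miss.
            del self._entries[code]
            return None
        return destination

    def put(self, code: str, destination: str) -> None:
        self._entries[code] = (destination, self._clock() + self._ttl)

    def invalidate(self, code: str) -> None:
        """Called when a link is deleted, disabled, or edited."""
        self._entries.pop(code, None)


cache = TTLCache(ttl_seconds=60)
cache.put("abc1234", "https://example.com")
print(cache.get("abc1234"))
cache.invalidate("abc1234")
print(cache.get("abc1234"))
```

The TTL bounds how long a stale mapping can survive if an invalidation message is lost, which is why the two mechanisms are usually used together.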

Abuse protection

  • Scan destinations for known malware and phishing domains.
  • Rate limit link creation per account and IP prefix, with limits tiered by payment status.
  • Keep a takedown workflow that can disable links quickly across caches.
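Creation rate limits are often implemented as token buckets keyed by account or IP prefix. The sketch below is a single-process illustration; the capacity and refill values, the key format, and the class names are assumptions, and a real deployment would keep bucket state in a shared store.

```python
import time
from typing import Callable, Dict


class TokenBucket:
    """Allows bursts up to `capacity`, refilled at `refill_per_second`."""

    def __init__(self, capacity: int, refill_per_second: float,
                 clock: Callable[[], float] = time.monotonic):
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self._clock = clock
        self._tokens = float(capacity)
        self._last = clock()

    def allow(self) -> bool:
        now = self._clock()
        # Add tokens earned since the last call, capped at capacity.
        earned = (now - self._last) * self.refill_per_second
        self._tokens = min(self.capacity, self._tokens + earned)
        self._last = now
        if self._tokens >= 1:
            self._tokens -= 1
            return True
        return False


class RateLimiter:
    """One bucket per key, for example 'account:42' or 'ip:203.0.113.0/24'."""

    def __init__(self, capacity: int, refill_per_second: float,
                 clock: Callable[[], float] = time.monotonic):
        self._make = lambda: TokenBucket(capacity, refill_per_second, clock)
        self._buckets: Dict[str, TokenBucket] = {}

    def allow(self, key: str) -> bool:
        bucket = self._buckets.setdefault(key, self._make())
        return bucket.allow()


limiter = RateLimiter(capacity=2, refill_per_second=1.0)
print([limiter.allow("account:42") for _ in range(3)])
```

Tiering by payment status would then be a matter of choosing a different capacity and refill rate per plan.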

Step 7

Tradeoffs to explain out loud.

301 vs 302 redirects

Use When

Use 302 by default if destinations can change or analytics behavior matters.

Watch Out

Browsers and crawlers may cache 301 redirects, making destination changes hard.
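The choice shows up directly in the response headers. This is a minimal sketch of one possible policy: the function name, the `no-store` directive on 302, and the one-day `max-age` on 301 are assumptions, not the only reasonable settings.

```python
from typing import Dict, Tuple


def redirect_response(destination: str,
                      permanent: bool = False) -> Tuple[int, Dict[str, str]]:
    """Build the status code and headers for a redirect."""
    if permanent:
        # Clients may reuse this without contacting the service again,
        # so later clicks can bypass analytics and destination edits.
        return 301, {"Location": destination,
                     "Cache-Control": "public, max-age=86400"}
    # Every click comes back to the service, so it can be counted and
    # the destination can still be changed.
    return 302, {"Location": destination, "Cache-Control": "no-store"}


print(redirect_response("https://example.com"))
print(redirect_response("https://example.com", permanent=True))
```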

Random codes vs sequential IDs

Use When

Use random codes when unguessability matters and collision retry is acceptable.

Watch Out

Sequential IDs leak growth and make private links easier to enumerate.
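Random codes with collision retry can be sketched as follows. A Python set stands in for the database's unique index, and the retry limit and function names are assumptions; `secrets` is used so codes are not predictable from earlier ones.

```python
import secrets
from typing import Callable, Set

ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"


def random_code(length: int = 7) -> str:
    """Draw each character from a cryptographically secure source."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def create_unique_code(taken: Set[str], length: int = 7, max_attempts: int = 5,
                       generate: Callable[[int], str] = random_code) -> str:
    """Retry on collision. `taken` stands in for a unique index on the code column."""
    for _ in range(max_attempts):
        code = generate(length)
        if code not in taken:
            taken.add(code)
            return code
    raise RuntimeError("could not find a free code; the space may be too full")


taken: Set[str] = set()
code = create_unique_code(taken)
print(len(code), code in taken)
```

In a real store the membership check and the insert should be one atomic operation, such as an insert that fails on a duplicate key, so two concurrent requests cannot both claim the same code.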

Raw click events vs aggregates only

Use When

Use aggregates only for cheaper storage and simpler privacy posture.

Watch Out

Aggregates alone cannot support fraud detection or retroactive analytics fixes; both need raw events.

Avoid

Common mistakes that weaken the answer.

  • Putting analytics writes in the redirect critical path.
  • Ignoring custom alias collision and moderation rules.
  • Using a short code space that runs out after a few years.
  • Forgetting delete, expiration, and cache invalidation behavior.
  • Treating all links as public when the product may need unlisted or private links.

Step 8

Follow-up questions with strong answers.

How would you handle a celebrity posting a short link that goes viral?

Keep the redirect path cache-first, prewarm popular links when detected, use edge caching carefully, and decouple analytics through a durable queue.

How do you prevent users from creating phishing links?

Add domain reputation checks, rate limits, abuse reports, automated scanning, and a fast disable path that invalidates caches.

How would you support editable destination URLs?

Keep the code stable, update destination metadata with authorization checks, and use cache invalidation or short TTLs so redirects converge quickly.

Step 9

What a strong answer should signal.

Core flow

Explains creation and redirect paths separately and optimizes for read-heavy traffic.

Data design

Chooses a compact primary lookup model and handles alias uniqueness.

Scale

Uses cache, async analytics, and realistic capacity math.

Product judgment

Covers expiration, abuse, privacy, and operational controls.
