Interview Prompt
Design a service like bit.ly that converts long URLs into short links and redirects users with low latency.
A strong answer:
- Separates the write path for link creation from the read-heavy redirect path.
- Chooses an ID generation strategy and explains collision handling clearly.
- Uses cache and CDN behavior without making analytics inaccurate.
- Mentions abuse controls, expiration, custom aliases, and operational visibility.
Step 1
Clarify functional and non-functional requirements first.
Functional Requirements
- Create a short URL for a long destination URL.
- Redirect users from the short URL to the original destination.
- Support optional custom aliases and optional expiration times.
- Expose basic analytics such as click counts, referrers, and coarse geography.
- Allow users to manage or delete links they own.
Non-Functional Requirements
- Redirect p95 latency should stay under 50 ms from the edge for hot links.
- The system should be highly available because redirects are user-facing.
- Short codes should be hard to guess if links are private or unlisted.
- Creation can be slightly slower than redirects but should remain reliable.
- Analytics should be eventually consistent and should not block redirects.
Scale Assumptions
- 100 million new links per month.
- 10 billion redirects per month.
- Average long URL is 120 bytes, plus metadata and indexes.
- Redirect traffic is roughly 100 times heavier than creation traffic.
| Metric | Estimate | Notes |
|---|---|---|
| Write QPS | ~40/sec average | 100 million links per month is modest, but plan for bursty marketing campaigns. |
| Read QPS | ~3,900/sec average | 10 billion monthly redirects, often with sharp peaks from viral links. |
| Code space | 62^7 ~= 3.5T codes | Base62 with 7 characters is enough for years of growth at this scale. |
| Storage | ~50 to 100 GB/month | URL rows are small; analytics events dominate if stored at click-level detail. |
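The averages above can be reproduced with a few lines of arithmetic. This is a minimal sketch that assumes a 30-day month; the function and variable names are illustrative only.

```python
SECONDS_PER_MONTH = 30 * 24 * 3600  # assumption: a 30-day month


def per_second(monthly_count: int) -> float:
    """Average events per second for a monthly total."""
    return monthly_count / SECONDS_PER_MONTH


write_qps = per_second(100_000_000)        # new links per month
read_qps = per_second(10_000_000_000)      # redirects per month
code_space = 62 ** 7                       # 7-character Base62 codes
# Years until every code is used if codes are handed out without gaps.
years_to_exhaust = code_space / (100_000_000 * 12)

print(f"write QPS ~ {write_qps:.1f}")
print(f"read QPS  ~ {read_qps:.0f}")
print(f"code space = {code_space:,}")
print(f"years to exhaust ~ {years_to_exhaust:,.0f}")
```

These are averages; peak traffic from a viral link can be many times higher, so size the redirect path for peaks rather than for the mean.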
Step 2
Identify the key entities before picking storage.
| Entity | Fields and Relationships | Interview Notes |
|---|---|---|
| Link | code, long_url, owner_id, created_at, expires_at, status | Primary lookup is by code. Keep the redirect row compact. |
| AliasReservation | alias, owner_id, created_at | Useful if custom aliases need separate validation or moderation workflows. |
| ClickEvent | code, timestamp, referrer, user_agent_hash, ip_prefix | Publish asynchronously; avoid storing raw IPs longer than necessary. |
| ClickAggregate | code, bucket_start, dimension, count | Powers dashboards without scanning raw click events. |
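As one possible way to make the Link entity concrete, the sketch below models it as a Python dataclass. The field types, the `status` values, and the `is_servable` helper are assumptions added for illustration; the table only fixes the field names.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class Link:
    code: str
    long_url: str
    owner_id: str
    created_at: datetime
    expires_at: Optional[datetime] = None  # None means the link never expires
    status: str = "active"                 # assumed values: "active", "disabled", "deleted"

    def is_servable(self, now: datetime) -> bool:
        """True when a redirect may be issued for this link at time `now`."""
        if self.status != "active":
            return False
        return self.expires_at is None or now < self.expires_at


now = datetime(2024, 1, 1, tzinfo=timezone.utc)
link = Link("abc1234", "https://example.com/landing", "user-1", created_at=now)
print(link.is_servable(now))
```

Keeping the row this small is what lets the redirect path serve it from a cache entry or a single indexed lookup.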
Step 3
Define the APIs around the user flows.
| Interface | Request / Response | Contract Notes |
|---|---|---|
| POST /v1/links | { longUrl, customAlias?, expiresAt? } -> { shortUrl, code } | Validate URL, enforce quotas, reserve alias atomically. |
| GET /{code} | 302 or 301 redirect to destination | The hot path should do one cache lookup or one indexed database lookup. |
| GET /v1/links/{code}/analytics | Returns aggregate clicks by time bucket, device, referrer, and country | Aggregates should be precomputed or served from an analytics store. |
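The "Validate URL" note for POST /v1/links could look roughly like the sketch below. The alias pattern, the length limit, and the function name are hypothetical; a real service would also apply quota checks and reserve the alias atomically, which this sketch does not cover.

```python
import re
from typing import List
from urllib.parse import urlsplit

ALIAS_PATTERN = re.compile(r"^[0-9A-Za-z_-]{4,32}$")  # assumed alias rules
MAX_URL_LENGTH = 2048                                  # assumed limit


def validate_create_request(body: dict) -> List[str]:
    """Return a list of validation errors; an empty list means the body is acceptable."""
    errors = []
    long_url = body.get("longUrl", "")
    parts = urlsplit(long_url)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        errors.append("longUrl must be an absolute http(s) URL")
    if len(long_url) > MAX_URL_LENGTH:
        errors.append("longUrl is too long")
    alias = body.get("customAlias")
    if alias is not None and not ALIAS_PATTERN.fullmatch(alias):
        errors.append("customAlias has invalid characters or length")
    return errors


print(validate_create_request({"longUrl": "https://example.com/a", "customAlias": "spring-sale"}))
print(validate_create_request({"longUrl": "ftp://example.com/file"}))
```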
Step 4
Trace the critical data flow step by step.
Create link
API service validates the destination, checks quota, generates or reserves a code, writes the link row, and returns the public short URL.
Generate code
Use a unique numeric ID converted to Base62, or generate a random Base62 string and retry on collision. The ID-based approach is easier to reason about at interview scale because uniqueness comes from the ID generator rather than from retries.
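A minimal sketch of the numeric-ID-to-Base62 conversion follows. The alphabet ordering (digits, then uppercase, then lowercase) is an arbitrary choice made here for illustration; any fixed 62-character alphabet works as long as encoder and decoder agree.

```python
ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
BASE = len(ALPHABET)  # 62


def encode_base62(n: int) -> str:
    """Convert a non-negative integer ID into a Base62 string."""
    if n < 0:
        raise ValueError("ID must be non-negative")
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n:
        n, remainder = divmod(n, BASE)
        digits.append(ALPHABET[remainder])
    return "".join(reversed(digits))


def decode_base62(code: str) -> int:
    """Inverse of encode_base62; raises ValueError on characters outside the alphabet."""
    n = 0
    for ch in code:
        n = n * BASE + ALPHABET.index(ch)
    return n


print(encode_base62(62))          # → 10
print(encode_base62(62**7 - 1))   # → zzzzzzz
```

Codes produced this way grow in length as IDs grow, so the 7-character limit is reached only after 62^7 IDs have been issued.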
Redirect
Edge or app server looks up code in cache, falls back to the link database, checks status and expiration, then redirects.
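The redirect lookup described above is a cache-aside read. The sketch below uses plain dictionaries as hypothetical stand-ins for the cache and the link database, and it returns 404 for unknown codes and 410 for disabled or expired links; those status codes are an assumption, not something the flow above prescribes.

```python
from datetime import datetime, timezone
from typing import Optional, Tuple

# Hypothetical in-memory stand-ins for a cache and the link database.
cache: dict = {}
database: dict = {
    "abc1234": {"long_url": "https://example.com/landing", "status": "active", "expires_at": None},
}


def resolve(code: str, now: datetime) -> Tuple[int, Optional[str]]:
    """Return (HTTP status, redirect target) for a short code."""
    row = cache.get(code)
    if row is None:
        row = database.get(code)       # fall back to the source of truth
        if row is None:
            return 404, None
        cache[code] = row              # populate the cache for later reads
    if row["status"] != "active":
        return 410, None
    expires_at = row["expires_at"]
    if expires_at is not None and now >= expires_at:
        return 410, None
    return 302, row["long_url"]


print(resolve("abc1234", datetime(2024, 1, 1, tzinfo=timezone.utc)))
```

Status and expiration are checked on every request, even on a cache hit, so a cached row for an expired link is never served as a redirect.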
Emit analytics
Redirect service publishes a click event to a queue and never waits for analytics writes before returning the redirect.
Aggregate
Stream processors roll click events into minute or hour buckets for dashboards and anomaly detection.
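The bucketing step can be sketched as a pure function over a batch of events. This assumes each event carries a timezone-aware `timestamp` and a `code`; a real stream processor would also handle late events and per-dimension counts, which are omitted here.

```python
from collections import Counter
from datetime import datetime, timezone


def bucket_start(ts: datetime, bucket_seconds: int = 60) -> datetime:
    """Round a timestamp down to the start of its bucket (UTC)."""
    epoch = int(ts.timestamp())
    return datetime.fromtimestamp(epoch - epoch % bucket_seconds, tz=timezone.utc)


def aggregate(events, bucket_seconds: int = 60) -> Counter:
    """Count clicks per (code, bucket_start) pair."""
    counts = Counter()
    for event in events:
        key = (event["code"], bucket_start(event["timestamp"], bucket_seconds))
        counts[key] += 1
    return counts


events = [
    {"code": "abc1234", "timestamp": datetime(2024, 1, 1, 12, 0, 5, tzinfo=timezone.utc)},
    {"code": "abc1234", "timestamp": datetime(2024, 1, 1, 12, 0, 59, tzinfo=timezone.utc)},
    {"code": "abc1234", "timestamp": datetime(2024, 1, 1, 12, 1, 0, tzinfo=timezone.utc)},
]
for key, count in sorted(aggregate(events).items()):
    print(key, count)
```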
Step 5
Convert the flow into a high-level design.
Final Design
[Figure: URL Shortener final architecture]
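Serving Layer
Clients reach the edge, which routes redirects (GET /{code}) to the redirect service and link creation or management to the API service. The redirect lookup is the main synchronous path users depend on.
State Layer
The link database holds Link and AliasReservation rows with the primary lookup by code, fronted by a cache for redirects. ClickEvent and ClickAggregate data live in a separate analytics store.
Async Layer
Click events flow through a queue to stream processors that build aggregates, so slow, high-volume, or failure-prone analytics work never blocks a redirect.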
Step 6
Deep dives interviewers are likely to probe.
ID generation
- A database sequence is simple but can become a coordination bottleneck.
- Snowflake-style IDs avoid central database dependency and still sort roughly by time.
- Random codes work if the code space is large and insertion handles collision retries.
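The random-code option with collision retries can be sketched as follows. The dictionary stands in for a table with a unique constraint on the code column, and the function names and retry limit are hypothetical; in a real store the "check then insert" step must be a single atomic insert-if-absent operation.

```python
import secrets

ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"


def random_code(length: int = 7) -> str:
    """Generate an unpredictable Base62 code using a cryptographic random source."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def insert_with_retry(store: dict, long_url: str, max_attempts: int = 5, generate=random_code) -> str:
    """Insert long_url under a fresh code, retrying when the code is already taken."""
    for _ in range(max_attempts):
        code = generate()
        # Stand-in for an atomic insert guarded by a unique-key constraint.
        if code not in store:
            store[code] = long_url
            return code
    raise RuntimeError("could not allocate a unique code")


store = {}
code = insert_with_retry(store, "https://example.com/landing")
print(len(code), store[code])
```

With 62^7 possible codes and a few billion links stored, most inserts succeed on the first attempt, so the retry loop mainly protects correctness rather than adding noticeable latency.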
Caching redirects
- Cache code to destination mappings aggressively because reads dominate.
- Use short TTLs or explicit invalidation for deleted and expired links.
- Do not rely only on CDN caching if you need click analytics for every redirect.
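Short TTLs and explicit invalidation can be combined in one small cache wrapper. The class below is an illustrative in-process sketch, not a specific cache product's API; the injectable `clock` parameter is an assumption added so the expiry behavior can be exercised deterministically.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class TTLCache:
    """Maps short codes to destinations, expiring entries after ttl_seconds."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, str]] = {}

    def get(self, code: str) -> Optional[str]:
        entry = self._entries.get(code)
        if entry is None:
            return None
        expires_at, long_url = entry
        if self._clock() >= expires_at:
            del self._entries[code]      # lazily evict stale entries on read
            return None
        return long_url

    def put(self, code: str, long_url: str) -> None:
        self._entries[code] = (self._clock() + self._ttl, long_url)

    def invalidate(self, code: str) -> None:
        """Explicit removal, used when a link is deleted, disabled, or edited."""
        self._entries.pop(code, None)


cache = TTLCache(ttl_seconds=60)
cache.put("abc1234", "https://example.com/landing")
print(cache.get("abc1234"))
cache.invalidate("abc1234")
print(cache.get("abc1234"))
```

The TTL bounds how long a missed invalidation can keep serving a stale destination, while `invalidate` makes deletes and takedowns take effect immediately on the nodes that receive it.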
Abuse protection
- Scan destinations for known malware and phishing domains.
- Rate limit link creation per account, IP prefix, and payment status.
- Keep a takedown workflow that can disable links quickly across caches.
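Rate limiting link creation is often implemented with a token bucket, which is one option among several. The sketch below keeps one bucket in memory and takes the current time as an argument; the class name, capacity, and refill rate are assumptions, and a multi-node deployment would need shared state.

```python
class TokenBucket:
    """Allows bursts up to `capacity` and a sustained rate of `refill_per_second`."""

    def __init__(self, capacity: float, refill_per_second: float, now: float):
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self.tokens = capacity
        self.updated_at = now

    def allow(self, now: float, cost: float = 1.0) -> bool:
        """Refill based on elapsed time, then spend `cost` tokens if available."""
        elapsed = max(0.0, now - self.updated_at)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_second)
        self.updated_at = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


# Hypothetical policy: bursts of 2 link creations, refilling 1 token per second.
bucket = TokenBucket(capacity=2, refill_per_second=1.0, now=0.0)
print([bucket.allow(now=0.0) for _ in range(3)])  # → [True, True, False]
print(bucket.allow(now=1.0))                      # → True
```

Separate buckets keyed by account and by IP prefix let the service apply different limits per dimension, for example a larger capacity for paying accounts.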
Step 7
Tradeoffs to explain out loud.
301 vs 302 redirects
Use When
Use 302 by default if destinations can change or analytics behavior matters.
Watch Out
Browsers and crawlers may cache 301 redirects, making destination changes hard.
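The choice shows up directly in the response the redirect service builds. In HTTP, a 301 may be cached by default, while a 302 is cached only when headers explicitly allow it. The sketch below is illustrative: the function name, the `permanent` flag, and the specific `Cache-Control` values are assumptions, not recommended settings.

```python
from typing import Dict, Tuple


def build_redirect(long_url: str, permanent: bool = False) -> Tuple[int, Dict[str, str]]:
    """Return (status, headers) for a redirect response."""
    if permanent:
        # Clients may reuse this without contacting the service again,
        # so later destination edits and click counts can be missed.
        return 301, {"Location": long_url, "Cache-Control": "public, max-age=86400"}
    # Ask clients not to store the response so each click reaches the service.
    return 302, {"Location": long_url, "Cache-Control": "no-store"}


print(build_redirect("https://example.com/landing"))
print(build_redirect("https://example.com/landing", permanent=True))
```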
Random codes vs sequential IDs
Use When
Use random codes when unguessability matters and collision retry is acceptable.
Watch Out
Sequential IDs leak growth and make private links easier to enumerate.
Raw click events vs aggregates only
Use When
Store aggregates only when cheaper storage and a simpler privacy posture matter more than event-level detail.
Watch Out
Raw events are useful for fraud detection and retroactive analytics fixes.
Avoid
Common mistakes that weaken the answer.
- Putting analytics writes in the redirect critical path.
- Ignoring custom alias collision and moderation rules.
- Using a short code space that runs out after a few years.
- Forgetting delete, expiration, and cache invalidation behavior.
- Treating all links as public when the product may need unlisted or private links.
Step 8
Follow-up questions with strong answers.
How would you handle a celebrity posting a short link that goes viral?
Keep the redirect path cache-first, prewarm popular links when detected, use edge caching carefully, and decouple analytics through a durable queue.
How do you prevent users from creating phishing links?
Add domain reputation checks, rate limits, abuse reports, automated scanning, and a fast disable path that invalidates caches.
How would you support editable destination URLs?
Keep the code stable, update destination metadata with authorization checks, and use cache invalidation or short TTLs so redirects converge quickly.
Step 9
What a strong answer should signal.
Core flow
Explains creation and redirect paths separately and optimizes for read-heavy traffic.
Data design
Chooses a compact primary lookup model and handles alias uniqueness.
Scale
Uses cache, async analytics, and realistic capacity math.
Product judgment
Covers expiration, abuse, privacy, and operational controls.