Design a File Storage and Sync Service

A Dropbox-style system design guide for file upload, metadata, block storage, deduplication, sharing, sync conflicts, and consistency.

Tags: object storage, sync, metadata, deduplication, sharing

Interview Prompt

Design a cloud file storage product where users can upload files, sync them across devices, and share folders with collaborators.

A strong answer:

  • Separates file metadata from blob or block storage.
  • Uses chunking and resumable uploads for large files.
  • Handles sync cursors, versioning, sharing permissions, and conflict resolution.
  • Recognizes that metadata consistency matters more than blob read consistency.

Step 1

Clarify functional and non-functional requirements first.

Functional Requirements

  • Users can upload, download, rename, move, delete, and restore files.
  • Clients can sync changes across multiple devices.
  • Users can share files or folders with permissions.
  • The system stores file versions and supports conflict copies.
  • Large file uploads can resume after network failure.

Non-Functional Requirements

  • File bytes must be durable after upload commit.
  • Metadata changes should be strongly consistent per user namespace.
  • Sync should be efficient and avoid scanning the whole tree repeatedly.
  • Large files should not overload application servers.
  • Permission changes should take effect quickly.

Scale Assumptions

  • 20 million users.
  • Average user stores 20 GB.
  • 1 million file changes per minute at peak.
  • Large files can be several GB each.

Stored data

~400 PB raw

20M users × 20 GB, before replication, compression, and deduplication.

Metadata writes

~17k/sec peak

1M changes per minute during busy windows is about 16,700 writes per second.

Blob upload path

Bandwidth heavy

Direct-to-object-storage upload avoids app server bottlenecks.

Sync state

Per namespace log

Change logs let clients fetch deltas instead of re-listing every file.
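
The storage and write-rate figures above follow from simple arithmetic. A quick sketch of the calculation, assuming decimal units (1 PB = 1,000,000 GB), which the guide does not specify:

```python
# Back-of-envelope check of the scale assumptions.
# Decimal units are an assumption; binary units shift the result slightly.
USERS = 20_000_000
AVG_GB_PER_USER = 20
CHANGES_PER_MINUTE = 1_000_000
GB_PER_PB = 1_000_000

# Raw bytes before replication, compression, and deduplication.
raw_storage_pb = USERS * AVG_GB_PER_USER / GB_PER_PB

# Peak metadata write rate.
metadata_writes_per_sec = CHANGES_PER_MINUTE / 60

print(f"raw storage: {raw_storage_pb:.0f} PB")
print(f"metadata writes: {metadata_writes_per_sec:,.0f}/sec")
```

Replication multiplies the raw figure, while compression and deduplication reduce it, so the number is only a starting point for the capacity discussion.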

Step 2

Identify the key entities before picking storage.

Entity | Fields and Relationships | Interview Notes
FileNode | node_id, namespace_id, parent_id, name, type, current_version, deleted_at | Represents the file tree and folder hierarchy.
FileVersion | node_id, version, content_hash, size, created_by, created_at | Allows restore and conflict handling.
BlobBlock | content_hash, storage_url, size, ref_count | Content-addressed blocks enable deduplication.
NamespaceChange | namespace_id, sequence, node_id, action, version | Append-only log for sync cursors.
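
The entities above can be written down as plain record types. This is a minimal sketch: the field names come from the table, while the Python types, the optional fields, and the example values in comments are assumptions.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class FileNode:
    node_id: str
    namespace_id: str
    parent_id: Optional[str]  # None for the namespace root (assumption)
    name: str
    type: str  # e.g. "file" or "folder" (assumed values)
    current_version: int
    deleted_at: Optional[datetime] = None  # soft delete keeps restore possible


@dataclass
class FileVersion:
    node_id: str
    version: int
    content_hash: str
    size: int
    created_by: str
    created_at: datetime


@dataclass
class BlobBlock:
    content_hash: str  # content address, doubles as the dedup key
    storage_url: str
    size: int
    ref_count: int = 0


@dataclass
class NamespaceChange:
    namespace_id: str
    sequence: int  # monotonically increasing within a namespace
    node_id: str
    action: str
    version: int
```

Keeping `FileVersion` separate from `FileNode` means a rename or move touches only the node, while each content change adds a new version row.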

Step 3

Define the APIs around the user flows.

Interface | Request / Response | Contract Notes
POST /v1/upload-session | { path, size, contentHash } -> { uploadId, partUrls[] } | Supports resumable multipart upload and optional dedup lookup.
POST /v1/files/commit | { uploadId, path, parentVersion } -> { fileId, version } | Commits metadata after blob parts are present.
GET /v1/sync?cursor=... | Returns ordered namespace changes and next cursor | Primary API for desktop and mobile sync clients.
POST /v1/shares | { resourceId, grantee, permission } | Permission checks must apply to both metadata and download URLs.
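
To make the upload-session contract concrete, here is a rough sketch of how a server might build the response. The 8 MiB part size, the `storage.example.com` URL format, the in-memory `SESSIONS` dict, and the rule that a known content hash returns no part URLs are all assumptions for illustration.

```python
import math
import uuid

PART_SIZE = 8 * 1024 * 1024  # assumed part size (8 MiB)
SESSIONS = {}  # hypothetical in-memory session store: uploadId -> request details


def create_upload_session(path, size, content_hash, known_hashes):
    """Build a response shaped like POST /v1/upload-session.

    known_hashes stands in for the dedup lookup; it should be scoped to
    content the caller is already allowed to read.
    """
    upload_id = uuid.uuid4().hex
    SESSIONS[upload_id] = {"path": path, "size": size, "contentHash": content_hash}

    if content_hash in known_hashes:
        # Bytes already stored: the client can skip straight to commit.
        return {"uploadId": upload_id, "partUrls": []}

    part_count = max(1, math.ceil(size / PART_SIZE))
    part_urls = [
        f"https://storage.example.com/uploads/{upload_id}/part/{number}"
        for number in range(1, part_count + 1)
    ]
    return {"uploadId": upload_id, "partUrls": part_urls}
```

A 20 MiB file yields three part URLs under these assumptions. Because each part has its own URL, a client that loses its connection can resume by re-sending only the parts that did not finish.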

Step 4

Trace the critical data flow step by step.

01

Chunk upload

Client splits large files into blocks, uploads directly to object storage, and retries failed parts independently.
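
A client-side sketch of this step, assuming fixed-size blocks and SHA-256 content hashes. `put_block` is a hypothetical callable standing in for the direct-to-object-storage upload, and the retry policy is deliberately simplistic.

```python
import hashlib

BLOCK_SIZE = 4 * 1024 * 1024  # assumed block size (4 MiB)


def split_into_blocks(data, block_size=BLOCK_SIZE):
    """Yield (content_hash, block_bytes) for each fixed-size block."""
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        yield hashlib.sha256(block).hexdigest(), block


def upload_blocks(data, put_block, block_size=BLOCK_SIZE, max_attempts=3):
    """Upload every block, retrying only the block that failed.

    Returns the ordered list of block hashes (the file's manifest).
    """
    manifest = []
    for content_hash, block in split_into_blocks(data, block_size):
        for attempt in range(1, max_attempts + 1):
            try:
                put_block(content_hash, block)
                break
            except ConnectionError:
                if attempt == max_attempts:
                    raise  # give up on this block after the last attempt
        manifest.append(content_hash)
    return manifest
```

A failure on one block never forces earlier blocks to be re-sent, which is the property that makes large uploads resumable.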

02

Commit metadata

Metadata service validates uploaded blocks, checks parent folder version, creates a new file version, and appends a namespace change.
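
The commit step can be sketched as a single check-then-write operation. The in-memory `MetadataStore` below is illustrative only; it uses the version the client based its edit on as the precondition, and a real service would run the same logic inside a database transaction.

```python
class VersionConflict(Exception):
    """Raised when the client's base version is no longer current."""


class MetadataStore:
    """In-memory stand-in for the metadata service (hypothetical)."""

    def __init__(self, stored_blocks):
        self.stored_blocks = stored_blocks  # hashes already in object storage
        self.versions = {}  # node_id -> current version
        self.changes = []  # append-only namespace change log

    def commit(self, node_id, block_hashes, parent_version):
        # 1. Every block in the manifest must already be uploaded.
        missing = [h for h in block_hashes if h not in self.stored_blocks]
        if missing:
            raise ValueError(f"blocks not uploaded: {missing}")

        # 2. Precondition: the client must have edited the current version.
        current = self.versions.get(node_id, 0)
        if parent_version != current:
            raise VersionConflict(f"expected {parent_version}, found {current}")

        # 3. Create the new version and append to the change log.
        new_version = current + 1
        self.versions[node_id] = new_version
        self.changes.append({
            "sequence": len(self.changes) + 1,
            "node_id": node_id,
            "action": "update",
            "version": new_version,
        })
        return new_version
```

Because the version bump and the log append happen together, a sync client that reads the log never sees a change whose version does not exist.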

03

Sync clients

Clients keep a cursor and poll or receive notifications, then fetch only changed nodes and missing blocks.
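
A cursor-based sync loop might look like the following sketch. The change log is a plain list ordered by sequence, and the page size, function names, and `has_more` flag are assumptions rather than part of the API described earlier.

```python
def fetch_changes(log, cursor, limit=2):
    """Server side: return one page of changes after the cursor.

    log must be ordered by "sequence". Returns (page, next_cursor, has_more).
    """
    page = [change for change in log if change["sequence"] > cursor][:limit]
    next_cursor = page[-1]["sequence"] if page else cursor
    has_more = bool(page) and next_cursor < log[-1]["sequence"]
    return page, next_cursor, has_more


def sync(log, cursor, apply_change):
    """Client side: page through changes until caught up, then return the cursor."""
    while True:
        page, cursor, has_more = fetch_changes(log, cursor)
        for change in page:
            apply_change(change)
        if not has_more:
            return cursor  # persist this so the next sync resumes here
```

The client stores only the returned cursor between runs, so a device that was offline for a week asks for "everything after sequence N" instead of re-listing the tree.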

04

Sharing

Permission service evaluates user access on metadata operations and when issuing signed download URLs.
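
One way to picture "permission check at URL issuance" is a generic HMAC-signed URL with an expiry. This is a sketch, not any particular object store's presigned URL scheme; the secret, URL format, and `can_read` callback are placeholders.

```python
import hashlib
import hmac
import time

SECRET = b"demo-signing-key"  # placeholder; a real key lives in a secret store


def _signature(block_hash, expires):
    payload = f"{block_hash}:{expires}".encode()
    return hmac.new(SECRET, payload, hashlib.sha256).hexdigest()


def sign_download_url(block_hash, user_id, can_read, ttl_seconds=300, now=None):
    """Issue a short-lived URL only if the permission check passes."""
    if not can_read(user_id, block_hash):
        raise PermissionError(f"{user_id} may not read {block_hash}")
    issued_at = int(now if now is not None else time.time())
    expires = issued_at + ttl_seconds
    sig = _signature(block_hash, expires)
    return (
        f"https://storage.example.com/blocks/{block_hash}"
        f"?expires={expires}&sig={sig}"
    )


def verify_download_url(block_hash, expires, sig, now=None):
    """Storage side: accept only an untampered, unexpired signature."""
    current = now if now is not None else time.time()
    expected = _signature(block_hash, expires)
    return hmac.compare_digest(expected, sig) and current < expires
```

The short TTL is what bounds the damage after a permission change: URLs issued before the change stop working once they expire, without the storage layer needing to consult the permission service.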

05

Garbage collection

Background jobs remove unreferenced blob blocks after retention windows and preserve versions required for restore.
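
A garbage-collection pass can be sketched as a scan over block records. The record shape (`ref_count` plus an `unreferenced_since` timestamp) and the retention parameter are assumptions; the point is that a block is deleted only when nothing references it and the retention window has passed.

```python
def collect_garbage(blocks, now, retention_seconds):
    """Delete unreferenced blocks whose retention window has expired.

    blocks maps content_hash -> {"ref_count": int,
                                 "unreferenced_since": float or None}.
    Returns the list of deleted hashes.
    """
    deleted = []
    for content_hash, meta in list(blocks.items()):
        if meta["ref_count"] > 0:
            continue  # still used by a live or restorable version
        since = meta["unreferenced_since"]
        if since is not None and now - since >= retention_seconds:
            del blocks[content_hash]
            deleted.append(content_hash)
    return deleted
```

Versions kept for restore continue to hold references in this model, so their blocks never reach a zero `ref_count` until the version itself is purged.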

Step 5

Convert the flow into a high-level design.

Final Design

File Storage Sync final architecture

Serving Layer

Clients call the metadata APIs on the synchronous path: upload session, commit, sync, and shares. File bytes bypass the application servers and go directly to object storage.

State Layer

Anchor the design around the key entities: FileNode, FileVersion, BlobBlock, NamespaceChange.

Async Layer

Move slow, high-volume, or failure-prone work off the request path: change notifications to clients, garbage collection of unreferenced blocks, and other background reconciliation run behind queues and workers.

Step 6

Deep dives interviewers are likely to probe.

Conflict handling

  • Use parent version or file version preconditions during commit.
  • If two devices edit offline, keep both versions and create a conflict copy.
  • Never silently overwrite remote changes from another device.
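
The conflict-copy rule above can be sketched as a small decision function. The naming pattern for the copy and the returned dictionary shape are assumptions; products differ in how they label conflict copies.

```python
import os


def resolve_upload(remote_version, base_version, name, device):
    """Decide how to store a locally edited file.

    base_version is the version the device last synced before editing.
    """
    if base_version == remote_version:
        # No one else changed the file: write the next version in place.
        return {"name": name, "version": remote_version + 1, "conflict": False}

    # The remote moved on: keep it untouched and save the local edit
    # as a separate file so neither change is lost.
    stem, extension = os.path.splitext(name)
    copy_name = f"{stem} ({device}'s conflicted copy){extension}"
    return {"name": copy_name, "version": 1, "conflict": True}
```

For example, a laptop that edited version 3 of `report.docx` while the server had advanced to version 4 would end up with `report (laptop's conflicted copy).docx` next to the untouched remote file.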

Efficient sync

  • An append-only change log per namespace lets clients resume from a cursor.
  • Push notifications can wake clients, but the sync API should repair missed notifications.
  • Folder moves should be represented compactly instead of emitting changes for every descendant when possible.
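
The compact folder move falls out naturally when nodes store a `parent_id` rather than a full path. In this sketch (class and method names are illustrative), moving a folder updates one row and appends one change, and descendants' paths are derived by walking up the tree.

```python
class Tree:
    """Minimal parent-pointer file tree (hypothetical; no cycle checks)."""

    def __init__(self):
        self.nodes = {"root": {"parent_id": None, "name": ""}}
        self.changes = []  # only moves are logged in this sketch

    def add(self, node_id, parent_id, name):
        self.nodes[node_id] = {"parent_id": parent_id, "name": name}

    def move(self, node_id, new_parent_id):
        # One row updated, one change emitted, regardless of subtree size.
        self.nodes[node_id]["parent_id"] = new_parent_id
        self.changes.append({
            "sequence": len(self.changes) + 1,
            "node_id": node_id,
            "action": "move",
        })

    def path(self, node_id):
        # Paths are computed on demand from parent pointers.
        parts = []
        while node_id is not None:
            node = self.nodes[node_id]
            parts.append(node["name"])
            node_id = node["parent_id"]
        return "/".join(reversed(parts)) or "/"
```

A client that receives the single move change applies the same pointer update locally, so a folder with a million descendants still syncs as one log entry.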

Deduplication and privacy

  • Content hashes can identify duplicate blocks and save storage.
  • Cross-user deduplication can leak whether another user has a file if APIs are not careful.
  • Encryption changes deduplication choices depending on key scope.
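
One way to reason about key scope is to derive block identifiers with a keyed hash instead of a plain hash. This is an illustrative technique rather than something the guide prescribes, and the key values are placeholders: a single global key allows cross-user deduplication, while a per-user key confines deduplication to one account and hides whether anyone else holds the same content.

```python
import hashlib
import hmac


def block_id(content, scope_key):
    """Derive a dedup identifier that is only comparable within one key scope."""
    return hmac.new(scope_key, content, hashlib.sha256).hexdigest()


GLOBAL_KEY = b"service-wide-key"  # placeholder: enables cross-user dedup
ALICE_KEY = b"alice-account-key"  # placeholder: per-user scope
BOB_KEY = b"bob-account-key"      # placeholder: per-user scope

content = b"the same bytes uploaded by two users"

# Global scope: identical content collapses to one stored block.
same_block = block_id(content, GLOBAL_KEY) == block_id(content, GLOBAL_KEY)

# Per-user scope: identical content looks unrelated across accounts.
leaks_across_users = block_id(content, ALICE_KEY) == block_id(content, BOB_KEY)
```

The tradeoff is storage savings against privacy: narrower key scope means fewer duplicate matches but no signal to one user about another user's files.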

Step 7

Tradeoffs to explain out loud.

Whole-file storage vs block storage

Use When

Block storage helps large files, resumability, and partial deduplication.

Watch Out

Block manifests and garbage collection add complexity.

Polling sync vs push notifications

Use When

Use push for fast wake-up and polling/cursors for correctness.

Watch Out

Push notifications are lossy and cannot be the only source of truth.

Strong folder consistency vs eventual consistency

Use When

Strong per-namespace metadata consistency avoids confusing file trees.

Watch Out

A folder shared between users can span namespaces, so keeping it consistent needs careful partitioning.

Avoid

Common mistakes that weaken the answer.

  • Uploading large files through API servers.
  • Using recursive directory scans for every sync.
  • Ignoring offline edits and conflict copies.
  • Treating permission checks as a UI-only concern.
  • Deleting blob data immediately when a file is deleted, breaking restore.

Step 8

Follow-up questions with strong answers.

How do you sync a folder with one million files?

Use namespace change logs, cursors, pagination, and subtree metadata summaries so clients fetch deltas rather than listing the whole folder repeatedly.

How do you handle two devices editing the same file offline?

Commit with version preconditions. If the base version changed, create a conflict version and surface both to the user.

How do you revoke a shared link?

Mark the share token revoked, stop issuing new signed URLs, and use short-lived download URLs so old access expires quickly.

Step 9

What a strong answer should signal.

Storage architecture

Separates metadata, versions, block manifests, and durable object storage.

Sync correctness

Uses cursors, change logs, version checks, and conflict handling.

Scale

Supports resumable uploads, deduplication, and efficient large-folder sync.

Security

Covers sharing permissions, signed URLs, revocation, and privacy tradeoffs.
