FastAPI File Uploads You Can Ship: Streaming, Validation, and S3-Compatible Storage

File uploads look simple until your first “someone uploaded a 2GB video and your server died” incident. In this hands-on guide, you’ll build a production-ready upload API in FastAPI that:

  • Validates file type and size
  • Streams to disk (so you don’t load the whole file into memory)
  • Optionally pushes to S3-compatible object storage (AWS S3, MinIO, DigitalOcean Spaces, etc.)
  • Returns metadata you can store in your DB later

This is aimed at junior/mid developers who want a practical pattern they can reuse.

What We’re Building

You’ll implement two endpoints:

  • POST /uploads: Accept a file, validate, stream it to a temporary file, then upload to object storage.
  • GET /uploads/{key}: Return a pre-signed URL so clients can download directly from storage.

Why pre-signed URLs? Your API stays fast and cheap: downloads don’t go through your app server.

Setup: Project and Dependencies

Create a folder and install dependencies:

python -m venv .venv
source .venv/bin/activate  # Windows: .venv\Scripts\activate
pip install fastapi uvicorn python-multipart boto3 pydantic-settings

python-multipart enables form uploads, and boto3 talks to S3-compatible storage.

Project layout:

app/
  main.py
  settings.py
  storage.py

Configuration with Environment Variables

We’ll use pydantic-settings to keep config clean.

# app/settings.py
from pydantic_settings import BaseSettings, SettingsConfigDict


class Settings(BaseSettings):
    model_config = SettingsConfigDict(env_file=".env", extra="ignore")

    # Upload validation
    max_upload_mb: int = 25
    allowed_mime_types: str = "image/png,image/jpeg,application/pdf"

    # S3-compatible storage
    s3_endpoint_url: str | None = None  # e.g. http://localhost:9000 for MinIO
    s3_region: str = "us-east-1"
    s3_bucket: str = "uploads"
    s3_access_key_id: str | None = None
    s3_secret_access_key: str | None = None
    s3_public_base_url: str | None = None  # optional if you serve via CDN


settings = Settings()

Example .env (optional):

MAX_UPLOAD_MB=25
ALLOWED_MIME_TYPES=image/png,image/jpeg,application/pdf
S3_ENDPOINT_URL=http://localhost:9000
S3_BUCKET=uploads
S3_ACCESS_KEY_ID=minioadmin
S3_SECRET_ACCESS_KEY=minioadmin

Storage Layer: Upload and Pre-Sign

Put S3 logic in one place. Your API handlers stay readable.

# app/storage.py
import boto3
from botocore.config import Config
from botocore.exceptions import ClientError

from .settings import settings


def _s3_client():
    return boto3.client(
        "s3",
        endpoint_url=settings.s3_endpoint_url,
        region_name=settings.s3_region,
        aws_access_key_id=settings.s3_access_key_id,
        aws_secret_access_key=settings.s3_secret_access_key,
        config=Config(signature_version="s3v4"),
    )


def put_file(local_path: str, key: str, content_type: str) -> None:
    s3 = _s3_client()
    extra = {"ContentType": content_type}
    s3.upload_file(local_path, settings.s3_bucket, key, ExtraArgs=extra)


def presign_get_url(key: str, expires_seconds: int = 300) -> str:
    # If you have a public base URL/CDN, you might return that instead.
    if settings.s3_public_base_url:
        return f"{settings.s3_public_base_url.rstrip('/')}/{key}"
    s3 = _s3_client()
    try:
        return s3.generate_presigned_url(
            "get_object",
            Params={"Bucket": settings.s3_bucket, "Key": key},
            ExpiresIn=expires_seconds,
        )
    except ClientError as e:
        raise RuntimeError(f"Failed to presign URL: {e}")

Main API: Validate + Stream + Upload

The key trick: stream the incoming file to a temporary file in chunks. Don’t call await file.read() for large uploads, because that loads the entire file into memory at once.

# app/main.py
import os
import tempfile
import uuid
from typing import Iterable

from fastapi import FastAPI, File, HTTPException, UploadFile
from fastapi.responses import JSONResponse

from .settings import settings
from .storage import presign_get_url, put_file

app = FastAPI(title="Uploads API")


def allowed_mime_types() -> set[str]:
    return {t.strip() for t in settings.allowed_mime_types.split(",") if t.strip()}


def max_bytes() -> int:
    return settings.max_upload_mb * 1024 * 1024


def _read_chunks(upload: UploadFile, chunk_size: int = 1024 * 1024) -> Iterable[bytes]:
    # UploadFile.file is a SpooledTemporaryFile-like object (sync reads).
    while True:
        chunk = upload.file.read(chunk_size)
        if not chunk:
            break
        yield chunk


@app.post("/uploads")
def upload(file: UploadFile = File(...)):
    # 1) Basic validation: MIME type
    if file.content_type not in allowed_mime_types():
        raise HTTPException(
            status_code=415,
            detail=f"Unsupported media type: {file.content_type}",
        )

    # 2) Stream to a temp file and enforce size limit
    total = 0
    suffix = os.path.splitext(file.filename or "")[1]
    key = f"{uuid.uuid4().hex}{suffix}"

    try:
        with tempfile.NamedTemporaryFile(delete=False) as tmp:
            tmp_path = tmp.name
            for chunk in _read_chunks(file):
                total += len(chunk)
                if total > max_bytes():
                    raise HTTPException(
                        status_code=413,
                        detail=f"File too large. Max {settings.max_upload_mb}MB.",
                    )
                tmp.write(chunk)

        # 3) Upload to object storage
        put_file(tmp_path, key=key, content_type=file.content_type)

        return JSONResponse(
            {
                "key": key,
                "original_filename": file.filename,
                "content_type": file.content_type,
                "size_bytes": total,
                "download_url": presign_get_url(key),
            }
        )
    finally:
        # Always close and remove temp file
        try:
            file.file.close()
        except Exception:
            pass
        try:
            if "tmp_path" in locals() and os.path.exists(tmp_path):
                os.remove(tmp_path)
        except Exception:
            pass


@app.get("/uploads/{key}")
def get_download_url(key: str):
    # In real apps, check authorization and ownership before returning a URL.
    return {"key": key, "download_url": presign_get_url(key)}

Run it:

uvicorn app.main:app --reload

Try It with curl

Upload a PDF:

curl -F "file=@./example.pdf;type=application/pdf" http://127.0.0.1:8000/uploads

You’ll get JSON back with a storage key and a download_url.
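If you prefer Python to curl, here’s a hedged client sketch using only the standard library. The URL matches the dev server above; encode_multipart is a helper written for this example, not part of the guide’s API.

```python
import io
import urllib.request
import uuid


def encode_multipart(field: str, filename: str, content_type: str, data: bytes):
    # Build a multipart/form-data body by hand (what curl -F does for you).
    boundary = uuid.uuid4().hex
    body = io.BytesIO()
    body.write(f"--{boundary}\r\n".encode())
    body.write(
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'.encode()
    )
    body.write(f"Content-Type: {content_type}\r\n\r\n".encode())
    body.write(data)
    body.write(f"\r\n--{boundary}--\r\n".encode())
    return body.getvalue(), f"multipart/form-data; boundary={boundary}"


def upload_file(url: str, filename: str, content_type: str, data: bytes) -> bytes:
    body, ctype = encode_multipart("file", filename, content_type, data)
    req = urllib.request.Request(
        url, data=body, method="POST", headers={"Content-Type": ctype}
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read()


# upload_file("http://127.0.0.1:8000/uploads", "example.pdf",
#             "application/pdf", open("example.pdf", "rb").read())
```

In a real project you’d likely reach for requests or httpx instead, but the hand-rolled version shows exactly what goes over the wire.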

Notes That Save You from Real-World Bugs

  • MIME types can lie. Browsers send content_type, but it’s not guaranteed. For higher security, inspect file signatures (magic bytes) using a library like python-magic, especially for executable-risk uploads.
  • Temp file cleanup matters. The finally block ensures you don’t leak disk space if something fails midway.
  • 413 vs 415 responses. Use 413 for too-large payloads and 415 for unsupported types. These details help clients handle errors properly.
  • Don’t proxy downloads through your API. Return pre-signed URLs (or CDN URLs) so your app doesn’t become a bandwidth bottleneck.

Optional Upgrade: Direct-to-S3 Uploads (Even Better)

The approach above uploads via your API server. That’s already safe (streaming + limits), but for large files, a common next step is:

  • Client requests a pre-signed PUT URL from your API
  • Client uploads directly to S3
  • Client notifies your API that upload finished (so you can record metadata)

This is how many production systems scale file uploads without expensive app servers.

A minimal endpoint for a pre-signed PUT URL could look like:

# Add to app/main.py
import boto3
from botocore.client import Config


@app.post("/uploads/presign-put")
def presign_put(filename: str, content_type: str):
    if content_type not in allowed_mime_types():
        raise HTTPException(status_code=415, detail="Unsupported media type")

    suffix = os.path.splitext(filename)[1]
    key = f"{uuid.uuid4().hex}{suffix}"

    s3 = boto3.client(
        "s3",
        endpoint_url=settings.s3_endpoint_url,
        region_name=settings.s3_region,
        aws_access_key_id=settings.s3_access_key_id,
        aws_secret_access_key=settings.s3_secret_access_key,
        config=Config(signature_version="s3v4"),
    )

    url = s3.generate_presigned_url(
        "put_object",
        Params={
            "Bucket": settings.s3_bucket,
            "Key": key,
            "ContentType": content_type,
        },
        ExpiresIn=300,
    )
    return {"key": key, "upload_url": url}

Now the browser can upload directly to storage. Your API just coordinates.
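One practical detail for browser direct uploads: the bucket must allow cross-origin PUT requests, or the browser will block them. A hedged example of an S3 CORS configuration (the origin is a placeholder for your frontend’s domain; apply it with aws s3api put-bucket-cors or your provider’s console):

```json
{
  "CORSRules": [
    {
      "AllowedOrigins": ["https://app.example.com"],
      "AllowedMethods": ["PUT"],
      "AllowedHeaders": ["Content-Type"],
      "MaxAgeSeconds": 300
    }
  ]
}
```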

Wrap-Up

You now have a practical FastAPI upload pattern that juniors can understand and mids can confidently ship:

  • Validate type and size
  • Stream to disk to avoid memory spikes
  • Upload to S3-compatible storage
  • Return pre-signed URLs for fast downloads

If you want to extend this, the next “real product” steps are: authentication/authorization (ownership checks), virus scanning (async job), and storing metadata in a DB table keyed by key.
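For the metadata step, here is a minimal sketch using sqlite3 from the standard library; the table and column names are illustrative, not part of this guide, and a production app would use its own database and migrations.

```python
import sqlite3

# One row per stored object, keyed by the storage key the API returns.
SCHEMA = """
CREATE TABLE IF NOT EXISTS uploads (
    key TEXT PRIMARY KEY,
    original_filename TEXT NOT NULL,
    content_type TEXT NOT NULL,
    size_bytes INTEGER NOT NULL,
    created_at TEXT NOT NULL DEFAULT (datetime('now'))
)
"""


def record_upload(
    conn: sqlite3.Connection,
    key: str,
    original_filename: str,
    content_type: str,
    size_bytes: int,
) -> None:
    # Persist the same metadata the POST /uploads response returns.
    conn.execute(
        "INSERT INTO uploads (key, original_filename, content_type, size_bytes) "
        "VALUES (?, ?, ?, ?)",
        (key, original_filename, content_type, size_bytes),
    )
    conn.commit()
```

You would call record_upload right after put_file succeeds, inside the same handler.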

