AWS For Newbies: S3, Deduplication, and Presigned URLs

Stop building toy apps. Start building production systems.

If your server is disposable, your files cannot live on its disk. You need AWS S3. S3 is object storage. It lives independently from your servers. This ensures your files survive even if your server crashes or disappears.

Here is how to build a professional file upload flow:

  • Use S3 Buckets and Keys: A bucket is your container. A key is the full path to your file. S3 does not have real folders. It stores objects in a flat structure, and key prefixes only look like folders. You can organize files by type, such as images/ or documents/, to keep things clean.

  • Implement Content Deduplication: Do not pay for the same file twice. Use the SHA-256 algorithm to create a fingerprint of every file's contents. For practical purposes the fingerprint is unique. If two users upload the exact same image, the hash will be identical. Check your database for this hash before uploading to S3. If the hash exists, reuse the existing file.

  • Stream Large Files: Never load a 200 MB video into your server's RAM just to hash it. Use Node.js streams to process files in small chunks. This keeps your server fast and prevents crashes.

  • Enforce File Size Limits: A frontend check is only for user experience. It is not security. You must enforce size limits in three layers:
      • Client-side for UX.
      • Backend validation to reject bad requests early.
      • S3-side conditions on the presigned request, such as a content-length-range condition in a presigned POST policy, to stop oversized uploads at the source.

  • Use Presigned URLs for Security: Do not make your bucket public. Keep "Block all public access" turned on. Instead, generate a presigned URL. This gives a user temporary permission to upload one specific file to one specific key. You can set an expiry time. Use short windows for small files and longer windows for large video uploads.

  • Verify the Upload: Never trust the client. After an upload, use the HeadObject command to check if the file actually exists in S3 and if the size matches your records.
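
The key, deduplication, and streaming points above can be sketched together in TypeScript on Node.js. Treat this as a minimal sketch, not a definitive implementation. The names buildObjectKey and resolveKey, the key layout, and the in-memory Map standing in for your database are all assumptions made for illustration. Only createHash, createReadStream, and Readable come from Node.js itself.

```typescript
import { createHash } from "node:crypto";
import { createReadStream } from "node:fs";
import { Readable } from "node:stream";

// Hash a readable stream chunk by chunk, so the whole file never sits in RAM.
export async function sha256OfStream(stream: Readable): Promise<string> {
  const hash = createHash("sha256");
  for await (const chunk of stream) {
    hash.update(chunk);
  }
  return hash.digest("hex");
}

// Convenience wrapper for a file on disk, for example a temporary upload path.
export function sha256OfFile(path: string): Promise<string> {
  return sha256OfStream(createReadStream(path));
}

type Prefix = "images" | "documents";

// Hypothetical key layout: type prefix, then content hash, then extension.
// The prefix only looks like a folder. S3 sees one flat key.
export function buildObjectKey(prefix: Prefix, hash: string, ext: string): string {
  return `${prefix}/${hash}.${ext}`;
}

// Assumption: a Map stands in for a database table indexed by hash.
const knownObjects = new Map<string, string>(); // hash -> S3 key

// Reuse the stored key when the hash is already known, else register a new key.
export function resolveKey(
  hash: string,
  prefix: Prefix,
  ext: string
): { key: string; duplicate: boolean } {
  const existing = knownObjects.get(hash);
  if (existing !== undefined) {
    return { key: existing, duplicate: true };
  }
  const key = buildObjectKey(prefix, hash, ext);
  knownObjects.set(hash, key);
  return { key, duplicate: false };
}
```

A caller would hash the incoming stream first, then call resolveKey and skip the S3 upload whenever duplicate is true. In a real system the lookup and insert should be one atomic database operation, so two simultaneous uploads of the same file do not both register.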

The Production Flow:

  1. Client requests an upload URL.
  2. Backend validates size, type, and checks for duplicates.
  3. Backend generates a scoped presigned URL.
  4. Client uploads the file directly to S3.
  5. Backend confirms the file exists via HeadObject.
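
Steps 2, 3, and 5 of the flow can be sketched without touching AWS. This is a hedged illustration under stated assumptions. The limits, the allowed types, the expiry values, and every function name are hypothetical. The request is assumed to carry a client-supplied hash for the duplicate check. The conditions array follows the S3 POST policy format and would be passed as Conditions to createPresignedPost from @aws-sdk/s3-presigned-post. The head parameter stands in for a call such as s3.send(new HeadObjectCommand({ Bucket, Key })), so the check can run against a fake.

```typescript
// Hypothetical limits. Adjust them to your application.
const MAX_BYTES = 200 * 1024 * 1024;
const SMALL_FILE_BYTES = 10 * 1024 * 1024;
const ALLOWED_TYPES = new Set(["image/png", "image/jpeg", "video/mp4"]);

interface UploadRequest {
  sizeBytes: number;
  contentType: string;
  sha256: string; // hex digest, assumed to be sent by the client
}

// Step 2: reject bad requests before any URL is generated.
export function validateUploadRequest(req: UploadRequest): string[] {
  const errors: string[] = [];
  if (!Number.isInteger(req.sizeBytes) || req.sizeBytes <= 0) {
    errors.push("size must be a positive integer");
  } else if (req.sizeBytes > MAX_BYTES) {
    errors.push("file too large");
  }
  if (!ALLOWED_TYPES.has(req.contentType)) {
    errors.push("unsupported content type");
  }
  if (!/^[0-9a-f]{64}$/.test(req.sha256)) {
    errors.push("sha256 must be 64 lowercase hex characters");
  }
  return errors;
}

// Step 3: POST policy conditions that S3 itself enforces on the upload.
type PolicyCondition =
  | ["content-length-range", number, number]
  | ["eq", string, string];

export function buildPostConditions(req: UploadRequest): PolicyCondition[] {
  return [
    ["content-length-range", 1, MAX_BYTES], // S3 rejects bodies outside this range
    ["eq", "$Content-Type", req.contentType], // the form must send this exact field
  ];
}

// Short window for small files, longer window for large ones (values assumed).
export function expirySeconds(sizeBytes: number): number {
  return sizeBytes <= SMALL_FILE_BYTES ? 300 : 3600;
}

// Step 5: confirm the object exists and has the recorded size.
type HeadFn = (key: string) => Promise<{ ContentLength?: number }>;

export async function verifyUpload(
  head: HeadFn,
  key: string,
  expectedBytes: number
): Promise<boolean> {
  try {
    const result = await head(key);
    return result.ContentLength === expectedBytes;
  } catch {
    // Assumption: any error, including a missing object, means "not verified".
    return false;
  }
}
```

Keeping the AWS call behind a HeadFn parameter is a design choice for this sketch. It lets you unit test the verification logic with a fake, then plug in the real client in production.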

Build systems that are secure by default.

Source: https://dev.to/surajrkhonde/aws-for-newbies-episode-2-3jg5