Amazon S3 (Simple Storage Service) has become the backbone of modern cloud storage infrastructure, supporting data-driven applications at scale. Yet, under certain conditions, developers and engineers run into elusive issues such as HTTP 404 errors for objects that were just uploaded. This inconsistency isn’t arbitrary: it is a side effect of AWS S3’s consistency model, which historically defaulted to *eventual consistency* for overwrite and delete operations. While S3 now guarantees *read-after-write consistency* for new object creates in most cases, subtle exceptions remain, especially in distributed or highly concurrent environments. These rare inconsistencies can introduce unnecessary flakiness into software systems, pipelines, and data ingestion workflows.

TL;DR

AWS S3 provides *read-after-write consistency* for new objects, but *eventual consistency* can still surface in edge cases such as asynchronous cross-region replication or heavily parallel upload workflows. This can lead to temporary 404 errors when reading freshly uploaded files. A reliable workaround is to use *object versioning* combined with *promoted reads*, so that every read targets the exact version returned by the write. This pattern eliminates the race condition, offering a stable and deterministic object access experience.

The Nature of Eventual Consistency in AWS S3

At the heart of AWS S3’s reliability lies its distributed system design. It replicates data across multiple Availability Zones for durability and uptime. However, this same distributed design historically introduced *eventual consistency* in certain operations:

  • Overwrites: Writing a new object with the same key as an existing one may not overwrite that object instantly in all replicas.
  • Deletes: An object that has been deleted might still appear available for a short while in some parts of the system.

Even though AWS S3 has provided *strong consistency* for most operations since December 2020, inconsistencies can still be observed in rare edge cases, such as writes issued in rapid succession, in-flight multi-part uploads, or cross-region replication that has not yet synchronized. For applications that read a file immediately after uploading it, the sporadic appearance of 404 Not Found errors creates significant challenges.

The Problem: 404 Errors on Newly Uploaded Files

Imagine a workflow in a reactive pipeline where an application uploads a JSON definition file to S3, waits for the operation to complete, and then issues a read request to fetch that file for validation. Occasionally, this read request will return a 404 error—informing the system that the object doesn’t exist. Yet, if the system waits for a few seconds and tries again, the object is suddenly available.

This non-deterministic behavior—successful WRITE followed by a failed READ—is symptomatic of *eventual consistency*. It occurs because the metadata about the newly uploaded object hasn’t propagated completely across the distributed storage fleet within milliseconds of the write call returning success.
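
To make the failure mode concrete, here is a minimal sketch of that fragile write-then-read sequence (assuming boto3 and a hypothetical bucket name; the read only fails intermittently, which is what makes the problem hard to reproduce):

import boto3
from botocore.exceptions import ClientError

s3 = boto3.client('s3')

# The write returns success as soon as S3 accepts the object
s3.put_object(Bucket='example-bucket', Key='data/report.json', Body=b'{}')

# An immediate read of the same key can intermittently come back as 404
try:
    s3.get_object(Bucket='example-bucket', Key='data/report.json')
except ClientError as err:
    if err.response['Error']['Code'] == 'NoSuchKey':
        print('404: the object is not yet visible to this reader')
    else:
        raise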

Why It Matters

These edge-case inconsistencies can cause serious issues in systems expecting deterministic behavior:

  • Flaky test results: Integration and unit tests that rely on S3 for setup or assertions randomly fail due to temporary 404 errors.
  • Broken automation: CI/CD workflows that rely on immediate confirmation of object presence fail silently or abort with spurious exceptions.
  • Inconsistent data ingestion: Data pipelines attempting to read freshly prepared source files may fail sporadically, compromising time-sensitive ETL operations.

Common Workarounds That Fall Short

Many teams attempt brute-force or rudimentary strategies to bypass 404 inconsistencies, such as:

  • Introducing sleep timers or fixed delays before reading the object.
  • Implementing retries with exponential backoff in their client code (sketched after this list).
  • Checking object listings via list commands before issuing a read.
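
For illustration, the retry-based workaround usually ends up looking like the sketch below (boto3 assumed; the attempt count and delays are arbitrary):

import time

import boto3
from botocore.exceptions import ClientError

s3 = boto3.client('s3')

def get_object_with_backoff(bucket, key, attempts=5, base_delay=0.5):
    """Retry GETs that return 404, doubling the delay each time.

    This masks the race rather than fixing it, and it also hides
    genuine missing-object errors until all attempts are exhausted.
    """
    for attempt in range(attempts):
        try:
            return s3.get_object(Bucket=bucket, Key=key)
        except ClientError as err:
            if err.response['Error']['Code'] != 'NoSuchKey':
                raise
            time.sleep(base_delay * (2 ** attempt))
    raise TimeoutError(f'{key} still not visible after {attempts} attempts')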

While these methods may mask the problem temporarily, they introduce non-determinism and inefficiency. They also hide legitimate failures, such as actual object non-existence, by obfuscating them with retry logic. The real solution lies in leveraging a feature native to AWS S3: object versioning.

Reliable Pattern: Versioned Write + Promoted Read

S3’s support for object versioning isn’t just useful for auditing or backups; it is also a powerful mechanism for creating consistency guarantees where none existed. When versioning is enabled on a bucket:

  • Every write operation results in the creation of a new immutable version of that object.
  • Each version is uniquely identified by a version ID associated with the same S3 key.

This feature facilitates a design where the upload and retrieval of objects are decoupled from metadata propagation delays. The key mechanism is matching each read to the exact *version ID* returned by the write.
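
Turning versioning on is a one-time, bucket-level change. A minimal sketch with boto3 (bucket name hypothetical):

import boto3

s3 = boto3.client('s3')

# Enable versioning; from this point on, every write to the bucket
# produces a new, uniquely identified version of the object.
s3.put_bucket_versioning(
    Bucket='example-bucket',
    VersioningConfiguration={'Status': 'Enabled'}
)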

How It Works

Here’s a simplified form of the pattern:

  1. The application uploads a file to a versioned S3 bucket using `PutObject`. Once the upload completes, it immediately captures the `VersionId` from the response.
  2. The system passes this `VersionId` downstream or stores it securely.
  3. All subsequent reads reference the versioned object directly using the URL or API call with the `VersionId=` parameter.

This means the read operation is no longer reliant on the propagation delay of object listings or keys across the fleet; it targets a specific immutable version that S3 guarantees to be available once the upload completes successfully.

A minimal example using boto3 (Python):

import json

import boto3

s3 = boto3.client('s3')
json_data = json.dumps({"status": "ready"})  # illustrative payload

# Upload the object to a version-enabled bucket
response = s3.put_object(
    Bucket='example-bucket',
    Key='data/report.json',
    Body=json_data
)

# Capture the version ID that identifies this exact write
version_id = response['VersionId']

# Later, read exactly the version that was written
obj = s3.get_object(
    Bucket='example-bucket',
    Key='data/report.json',
    VersionId=version_id
)
report = json.loads(obj['Body'].read())
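
The same version pin also works for URL-based access. Continuing the example above, a presigned URL can carry the captured version ID so that whoever consumes the URL receives exactly that version (a sketch; the expiry value is arbitrary):

# Generate a time-limited URL that always resolves to the captured version
url = s3.generate_presigned_url(
    'get_object',
    Params={
        'Bucket': 'example-bucket',
        'Key': 'data/report.json',
        'VersionId': version_id
    },
    ExpiresIn=3600  # one hour
)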

Advantages of Versioned Reads

By explicitly referencing a versioned object, this approach delivers several key benefits:

  • Deterministic reads: Every read request references a known-good version and retrieves exactly that version, rather than whatever the key happens to resolve to at that moment.
  • Flakiness eliminated: No more random 404s after upload. You interact with S3 objects in a predictable, testable manner.
  • Error visibility: Actual miswrites or storage issues become visible instead of being masked by retries or delays.

This pattern is especially useful when used in strongly typed systems, asynchronous job workflows, or continuous integration pipelines, where deterministic behavior is critical to avoid cascading errors.

Room for Caution

Using versioning introduces operational complexity that teams should be aware of:

  • Increased Storage Costs: Since every version is stored separately, costs can add up. Implement lifecycle policies to expire old versions (see the sketch after this list).
  • Object Indexing: Managing and indexing multiple versions programmatically can be complex without proper tooling.
  • Compatibility: Some third-party tools don’t natively support versioned reads via `VersionId`, requiring custom clients.
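
To keep version sprawl and cost in check, a lifecycle rule can expire noncurrent versions automatically. A sketch with boto3 (the 30-day retention window is illustrative):

import boto3

s3 = boto3.client('s3')

# Expire superseded (noncurrent) versions 30 days after they are replaced
s3.put_bucket_lifecycle_configuration(
    Bucket='example-bucket',
    LifecycleConfiguration={
        'Rules': [
            {
                'ID': 'expire-old-versions',
                'Status': 'Enabled',
                'Filter': {'Prefix': ''},
                'NoncurrentVersionExpiration': {'NoncurrentDays': 30}
            }
        ]
    }
)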

Still, in critical workflows where stability matters more than marginal storage cost, the trade-off favors determinism and predictable error handling.

Conclusion

Amazon S3 is a cornerstone of resilient cloud architectures, but its consistency model presents subtle challenges that can result in frustrating 404 errors—even right after a successful upload. While AWS continues improving default consistency guarantees, edge cases remain in race-prone or distributed environments. Leveraging *object versioning* with *promoted reads* using the captured `VersionId` is a robust and future-proof way to eliminate such inconsistencies reliably. As with any design trade-off, understanding the nuances helps developers engineer more deterministic, scalable, and trustworthy systems in the cloud.