A GIF of Rachel dancing from “Friends” ballooned to hundreds of gigabytes and broke Discourse’s backups

Short summary

Discourse is a popular platform for online discussions, currently hosting more than 22,000 communities.

During a recent site backup, a critical issue arose: users had copied a single 1.6 MB GIF file 246,173 times, far more than the hard link limit of the ext4 filesystem allows, and the backup ballooned to 377 GB.

Below is a detailed breakdown of the situation, causes, and solutions.

1. What happened?

| Element | Data |
| --- | --- |
| Platform | Discourse |
| Number of communities | >22,000 |
| Problem file | GIF “Rachel from Friends”, 1.6 MB |
| Copies | 246,173 (hard links) |
| ext4 limit | ~65,000 hard links per inode |
| Resulting backup size | 377 GB |

Why did this happen?
Discourse allows emojis and GIF files to be inserted into any post.

When moving a file from one context to another (e.g., from a private chat to a public post), the system creates a new copy with a random SHA‑1 hash. This means that even if the content is identical, Discourse treats it as a new object.

Thus, one GIF can appear in tens of thousands of posts and private chats, each time generating a separate file. In total there were 246,173 copies, well over the ext4 limit of about 65,000 hard links per inode. The roughly 181,000 copies beyond the limit were “lost” as hard links: the system wrote them as new files instead.
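The figures above can be checked with a little arithmetic. The sketch below assumes the limit is exactly 65,000 (the article gives it as approximate) and that each inode is filled up to the limit before a new file is started; the variable names are illustrative only.

```python
import math

COPIES = 246_173      # copies of the GIF, from the article
LINK_LIMIT = 65_000   # assumed exact; the article says "~65,000" per inode

# Copies that cannot be attached to the first inode as hard links.
overflow = COPIES - LINK_LIMIT

# Minimum number of real files needed if every inode is filled to the limit.
primaries_needed = math.ceil(COPIES / LINK_LIMIT)

print(overflow)          # 181173
print(primaries_needed)  # 4
```

Under these assumptions, four real files would be enough to hold every copy, which is the idea behind the EMLINK-based approach.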

2. First solution – hash aggregation
Discourse first tried to solve the problem by grouping uploads by SHA‑1:

1. During backup all files were grouped by identical hash.
2. Only the first copy from each group was uploaded.
3. Hard links were created for the rest.

It looked elegant, but it didn’t account for ext4’s link limit. As soon as the limit was reached, the system automatically created new files instead of links, and the backup size spiked.
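The grouping steps can be sketched roughly as follows. This is a minimal illustration, not Discourse's code: the function names `sha1_of` and `backup_with_hardlinks` are hypothetical, and the hash is computed from file content here, whereas a real system might read it from a database. The sketch does not handle the link limit at all, so on ext4 `os.link` would raise an `OSError` once an inode runs out of links.

```python
import hashlib
import os
import shutil
from collections import defaultdict
from pathlib import Path


def sha1_of(path: Path) -> str:
    """Return the hex SHA-1 of a file's content, read in 1 MiB chunks."""
    digest = hashlib.sha1()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_with_hardlinks(src_dir: Path, dst_dir: Path) -> None:
    """Copy one file per content hash; hard-link every other duplicate to it."""
    # Step 1: group all files by identical hash.
    groups = defaultdict(list)
    for path in sorted(p for p in src_dir.rglob("*") if p.is_file()):
        groups[sha1_of(path)].append(path)

    for paths in groups.values():
        primary = None
        for path in paths:
            target = dst_dir / path.relative_to(src_dir)
            target.parent.mkdir(parents=True, exist_ok=True)
            if primary is None:
                # Step 2: only the first copy of each group stores real data.
                shutil.copy2(path, target)
                primary = target
            else:
                # Step 3: the rest are extra names for the same inode.
                os.link(primary, target)
```

The sketch assumes `dst_dir` lies outside `src_dir` and on a filesystem that supports hard links.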

3. New solution – “switch” on EMLINK error
Discourse developed a more flexible strategy:

1. A hard link to the file is created as usual.
2. If the filesystem returns an EMLINK error (link limit exceeded), that copy is written as a regular file and becomes the new “primary” file.
3. From that point onward new links are again made to this new primary version.

Thus, each time the limit is exceeded a switch to a new “parent” file occurs, and the system continues to operate without errors. This solution works on any filesystem and requires no additional configuration.
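A minimal sketch of this switch, assuming a hypothetical helper `link_duplicates` that writes identical copies of one source file to a list of target paths. The `link` parameter is an assumption added here so the limit can be simulated; by default it is the standard `os.link`, which raises `OSError` with `errno.EMLINK` when the filesystem refuses another link.

```python
import errno
import os
import shutil
from pathlib import Path
from typing import Callable, Iterable


def link_duplicates(
    source: Path,
    targets: Iterable[Path],
    link: Callable = os.link,  # injectable for testing (hypothetical parameter)
) -> int:
    """Create identical copies of `source` at each target path.

    Uses hard links where possible and returns the number of real files written.
    """
    primary = None
    real_files = 0
    for target in targets:
        if primary is not None:
            try:
                link(primary, target)  # usual case: just another name
                continue
            except OSError as exc:
                if exc.errno != errno.EMLINK:
                    raise  # unrelated failure: do not hide it
        # First copy, or the current primary has run out of links:
        # write a real file and link later copies to it instead.
        shutil.copy2(source, target)
        primary = target
        real_files += 1
    return real_files
```

Because the code reacts to the error rather than hard-coding a number, it does not need to know the limit of the underlying filesystem in advance.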

4. Results and conclusions
- One popular GIF (Rachel’s dance from Friends) caused the backup size to grow to 377 GB.
- The ext4 limit of ~65,000 hard links proved to be a critical factor.
- The first hash‑aggregation solution didn’t consider filesystem limits, so about 181,000 copies were “lost” as hard links and written as full files.
- The new EMLINK‑error “switch” strategy allows proper handling of large numbers of copies while maintaining backup efficiency.

> *“Now we know that Jennifer Aniston can perform stress testing on infrastructure,”* — Discourse noted with irony in its blog.
