Backup Restore Is A Lie

I ran a system with hundreds of nodes. Standard backup tools failed. Restores left the system in a mess. Some nodes came back with old data. Others came back with new data. The mismatch caused crashes.

I tried Veritas NetBackup. It failed too. The tool missed nodes. It saved the wrong data. The system was too large for it.

I changed the approach. Do not back up your whole system at once. Back up individual parts instead. I used rsync to copy data from each node. I used etcd for state and consistency. I wrote custom scripts to automate the process.
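A minimal sketch of what a per-node backup script could look like. This is not my original script. The node names, paths, and the helper names `build_rsync_command` and `backup_nodes` are assumptions made up for illustration. It only covers the rsync part and leaves the etcd side out.

```python
import subprocess
from pathlib import Path


def build_rsync_command(node, src_path, dest_root):
    """Build the rsync command that pulls one node's data into its own directory."""
    dest = Path(dest_root) / node
    # -a preserves permissions and timestamps; --delete mirrors removals.
    return ["rsync", "-a", "--delete", f"{node}:{src_path}/", f"{dest}/"]


def backup_nodes(nodes, src_path, dest_root, runner=subprocess.run):
    """Back up each node independently and return the list of nodes that failed."""
    failed = []
    for node in nodes:
        # Each node gets its own destination directory (hypothetical layout).
        Path(dest_root, node).mkdir(parents=True, exist_ok=True)
        result = runner(build_rsync_command(node, src_path, dest_root))
        if result.returncode != 0:
            failed.append(node)  # one bad node does not abort the rest
    return failed


if __name__ == "__main__":
    # Hypothetical node names and paths; replace with your own inventory.
    bad = backup_nodes(["node-01", "node-02"], "/var/lib/app", "/backups")
    print("failed nodes:", bad)
```

Each node is backed up on its own, so a failure on one node shows up in the returned list instead of silently spoiling the whole run. The `runner` parameter exists only so the loop can be tested without a real rsync.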

The results:

You should learn from my mistakes:

Source: https://dev.to/dev-architecture-blog/backup-restore-is-a-lie-how-i-learned-to-hate-false-promises-of-data-recovery-in-large-scale-2p80