Man Accidentally Deletes His Entire Company With One Wrong Command

Every SysAdmin has had this nightmare: running the dreaded and deadly command ‘rm -rf /’ as root. How horrifying!

If you didn’t know already, / represents the root directory. Running ‘rm -rf /’ deletes the root directory and all of its contents. In the Linux file hierarchy, root contains everything, so deleting it means your system is gone for good.
No wonder this is compared to drunk driving in the Linux world.

Sh*t happens

But shit happens in the IT world. And apparently it happened to the hapless SysAdmin Marco Marsala, who runs a web hosting company serving over 1,500 customers.
According to a question posted on Server Fault a few days back, Marsala ran a Bash script that contained the command rm -rf {foo}/{bar}. Because both variables were undefined, the command turned into ‘rm -rf /’, and the inevitable happened.
In Marsala’s own words:
I run a small hosting provider with more or less 1535 customers and I use Ansible to automate some operations to be run on all servers. Last night I accidentally ran, on all servers, a Bash script with a rm -rf {foo}/{bar} with those variables undefined due to a bug in the code above this line.
All servers got deleted and the offsite backups too because the remote storage was mounted just before by the same script (that is a backup maintenance script).
How I can recover from a rm -rf / now in a timely manner?
Oh, poor guy!! What did you just do?
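To see why undefined variables are so dangerous here, consider a minimal sketch. It assumes the script's {foo}/{bar} stands for the shell variables ${foo} and ${bar} (the names are only placeholders taken from the question), and it never calls rm: it only builds the path and shows two standard guards, the ${var:?} expansion and set -u, refusing to continue when a variable is missing.

```shell
#!/usr/bin/env bash
# Harmless demo: nothing is deleted, we only inspect how the path expands.
# foo and bar are placeholder names; they are deliberately left unset.
unset foo bar

# Default shell behaviour, made explicit: unset variables expand to "".
set +u

# 1. Unguarded: with both variables empty, only the slash is left.
target="${foo}/${bar}"
echo "unguarded target: '${target}'"

# 2. ${var:?message} aborts the expansion when the variable is unset or empty.
#    It runs in a subshell so the failure does not end this demo.
if ( : "${foo:?foo is not set}" ) 2>/dev/null; then
    guarded="expanded"
else
    guarded="aborted"
fi
echo "with \${foo:?}: ${guarded}"

# 3. set -u treats any reference to an unset variable as an error.
if ( set -u; : "${foo}/${bar}" ) 2>/dev/null; then
    nounset="expanded"
else
    nounset="aborted"
fi
echo "with set -u: ${nounset}"
```

In a real maintenance script, either guard would sit in front of the destructive line, so the script stops with an error instead of handing a bare / to rm. Whether that alone would have been enough in Marsala's case is a guess, since the full script was never shown.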

What next?

What next? This is what Marsala wanted to know. Is there a way to recover from ‘rm -rf /’?
But the chances of recovering all the data after an rm -rf / are thin. No wonder the post started getting sarcastic (but honest) comments like:
If you really don’t have any backups I am sorry to say but you just nuked your entire company
Another one went:
You’re going out of business. You don’t need technical advice, you need to call your lawyer.
A few people suggested shutting everything down, not overwriting anything, and using data recovery tools to get at least some of the data back.
And it seems this worked to a large extent for Marsala, as he later mentioned, “luckily we recovered almost all data”.

Lessons to learn

Although some people are speculating that it’s a hoax, there are still a few lessons here for all of us.

  • Back up everything. If it’s a professional server, keep multiple offline backups
  • Don’t take a random tool or script from the internet and run it directly on a production machine
  • Have test machines identical to production, so you can try out new stuff without risking the production system
Anything to add to this scary incident?