I never thought I would write a blog post about this topic. But after the supposedly simple task “Delete Glacier vaults” cost me quite some time, I want to share the suffering here.
For readers who are new to the topic but want to read on anyway, a quick explanation first:
What is Amazon S3 Glacier?
Amazon Glacier lets you archive data at low cost. One gigabyte of storage currently costs 0.004 USD per month. If I used Amazon S3 instead, I would currently pay 0.023 USD per gigabyte. The disadvantage is that retrieving the data stored there can take up to 24 hours. That is why it is cheap. So it’s perfect for large backups that you only need after a system crash.
I personally used Glacier together with my NAS. Years ago, I bought a Synology with only one hard drive, so I wanted to be on the safe side and “froze” the data in Amazon Glacier. At some point I switched to a newer model, but I stopped caring about the ice blocks in the Amazon data center.
Over 42,000 archives, or 290 GB, sat around at Amazon for years and cost me about 6 € per month.
This finally got on my nerves and I wanted to delete the archives without further ado. Then I remembered why I hadn’t done that years ago:
You cannot delete an archive using the Amazon S3 Glacier (S3 Glacier) management console. To delete an archive you must use the AWS Command Line Interface (CLI) or write code to make a delete request using either the REST API directly or the AWS SDK for Java and .NET wrapper libraries. The following topics explain how to use the AWS SDK for Java and .NET wrapper libraries, the REST API, and the AWS CLI.
AWS Documentation (source)
I shied away from the effort and simply ignored the costs.
Can’t be that hard! I have the AWS CLI installed. In short: type in a command and done!
Think again:
Emptying a Glacier vault takes three steps.
- Start an inventory retrieval job
- Check the status and wait (about 4 hours for me)
- Delete the archives individually (>42,000 commands via the CLI)
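The three steps above can be sketched roughly like this in Python. This is not the script from this post, only a minimal illustration that builds the AWS CLI calls as argument lists. The function names are made up for the example, and it assumes the inventory output is JSON with an `ArchiveList` whose entries carry an `ArchiveId`.

```python
import json

# "-" tells the AWS CLI to use the account of the current credentials.
ACCOUNT = "-"


def _glacier(subcommand, vault):
    """Common prefix of every Glacier CLI call used in this sketch."""
    return ["aws", "glacier", subcommand,
            "--account-id", ACCOUNT, "--vault-name", vault]


def initiate_inventory_cmd(vault):
    """Step 1: start an inventory-retrieval job for the vault."""
    params = json.dumps({"Type": "inventory-retrieval"})
    return _glacier("initiate-job", vault) + ["--job-parameters", params]


def describe_job_cmd(vault, job_id):
    """Step 2: ask for the job status (repeat until it is completed)."""
    # "=" form, because IDs may start with "-" and look like options.
    return _glacier("describe-job", vault) + ["--job-id=" + job_id]


def get_job_output_cmd(vault, job_id, outfile):
    """Step 2b: download the finished inventory into a local file."""
    return _glacier("get-job-output", vault) + ["--job-id=" + job_id, outfile]


def archive_ids(inventory_json):
    """Extract all archive IDs from the downloaded inventory (assumed layout)."""
    inventory = json.loads(inventory_json)
    return [entry["ArchiveId"] for entry in inventory.get("ArchiveList", [])]


def delete_archive_cmd(vault, archive_id):
    """Step 3: one delete call per archive."""
    return _glacier("delete-archive", vault) + ["--archive-id=" + archive_id]
```

Each returned list could be passed to `subprocess.run(cmd, check=True)`. With more than 42,000 archives, step 3 simply means looping over `archive_ids(...)` and running one delete command per ID.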
All right! This calls for a script! Surely something ready-made exists.
I also found it quite quickly: https://superuser.com/questions/687785/how-to-delete-all-glacier-data
The first thing I tried was “glacier-vault-remove”. This tool “deleted” all my archives, which looked really cool, but in the end I could still see them in my console.
The PHP script posted in the forum then delivered the right results, but still needed some manual work, because it only handles the last step (deleting the archives one by one).
So I simplified it a bit and am sharing it here.
The prerequisite for running it is that the CLI is logged in with a user who has full access to AWS Glacier. (instructions here)
You can find the script on GitLab: https://gitlab.com/TobiSell/php-remove-glacier
Glacier reacts very slowly, so the script may not be able to delete the vault right away. Either run it again hours later (really, hours!) or delete the vault in the console after a few days. By then the number of archives should be 0.
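If you would rather check from the command line than in the console, a small sketch like the following might help. It is an illustration, not part of the script: the function names are invented, and it assumes the JSON returned by `aws glacier describe-vault` contains a `NumberOfArchives` field. As far as I know, that number only reflects the vault's last inventory, which would explain the long delay.

```python
import json


def describe_vault_cmd(vault):
    """CLI call that returns the vault metadata as JSON."""
    return ["aws", "glacier", "describe-vault",
            "--account-id", "-", "--vault-name", vault]


def vault_is_empty(describe_vault_json):
    """True only if the reported archive count is exactly 0.

    A missing field is treated as "not empty" to stay on the safe side.
    """
    return json.loads(describe_vault_json).get("NumberOfArchives") == 0


def delete_vault_cmd(vault):
    """CLI call that deletes the vault (only succeeds once it is empty)."""
    return ["aws", "glacier", "delete-vault",
            "--account-id", "-", "--vault-name", vault]
```

The idea is to run the describe command, feed its output to `vault_is_empty`, and only then run the delete command; otherwise wait and try again later.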
Cover Photo by Adam Jang on Unsplash
Hi Tobias, it’s not clear to me … does your script delete the archives in a vault AND (afterwards) delete the vault itself? Best regards. Fred
It clears the vault and tries to delete the vault at the end. (A vault can only be deleted once it’s empty.)
Hi Tobias,
Thanks for this helpful article. This solves half of my problem.
But the next question that comes to my mind is “Will I be charged for deleting my archives?” (Not deleting the vault completely, but removing some very old archives.)
As you mentioned, we first have to run an inventory retrieval job. Will I be charged for this inventory retrieval?
Please help.
Thanks & Regards,
Mohit
Really good question. I’m not quite sure. It might actually count as a retrieval. I didn’t check my bill that closely, and I’m not using Glacier anymore 🙁