Amazon doesn’t just sell books and loads of other convenient stuff; it also provides easy-to-use web services. One of the more impressive ones, I think, is the Amazon Simple Storage Service (Amazon S3). It is a web service interface to a virtually infinite disk that is secure and always available. By the way, did I mention that it is even cheaper to use than external disks/tapes/DVDs? This guy did the math and I must say that I agree with his findings.

The difficulty is that you need a program or utility to use it. I’ve found s3sync, a Ruby program that works much like rsync.

On Leopard you don’t need to upgrade the Ruby installation: Leopard ships with version 1.8.6 and s3sync requires 1.8.4 or above, so we are safe.

If you haven’t already done so, you need to sign up for Amazon S3 to be able to use the service. You will get an Access Key ID and a Secret Access Key, both of which you will need later on.

To test and verify your new account, there are several free GUI-based tools available, such as Cyberduck, that let you check your S3 account and confirm that the scripts we will be using later on are working.

Next, if you haven’t already done so, download the s3sync program from its website.

I’ve created a directory in /usr/local/share called s3sync, where I store all the Ruby scripts used by the shell script I made, which will go into /etc/periodic. You must set the executable bit on s3cmd.rb and s3sync.rb to be able to run them directly. You can do this with:

sudo chmod +x /usr/local/share/s3sync/s3cmd.rb
sudo chmod +x /usr/local/share/s3sync/s3sync.rb
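
If you haven’t created that directory yet, the following is a minimal sketch of the setup; the source path (~/Downloads/s3sync here) is just an assumption about where you unpacked the download:

sudo mkdir -p /usr/local/share/s3sync
sudo cp ~/Downloads/s3sync/*.rb /usr/local/share/s3sync/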

Now use Cyberduck or the s3cmd command to create a test bucket on S3. You can compare buckets to folders on your hard disk. To do this with s3cmd, use the following example:

First you need to create the necessary environment variables so the scripts know your S3 access keys. You can also edit the Ruby scripts and put your keys directly in them, but that is not recommended. Read the README.txt that comes with the package for more info on that!

export AWS_ACCESS_KEY_ID="youraccesskey"
export AWS_SECRET_ACCESS_KEY="yoursecretaccesskey"
/usr/local/share/s3sync/s3cmd.rb createbucket YourBucket
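
To verify that the bucket was created, you can list the buckets under your account; listbuckets is one of the s3cmd.rb commands described in the package’s README (check your copy to confirm):

/usr/local/share/s3sync/s3cmd.rb listbuckets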

Now you can store some information in that bucket using the s3sync command. Note the trailing colon on the bucket name; s3sync uses it to tell the S3 side apart from the local side.

/usr/local/share/s3sync/s3sync.rb -d -r -v testdir YourBucket:

To get the information back to your hard disk, just issue the same command with the locations in reverse order:

/usr/local/share/s3sync/s3sync.rb -d -r -v YourBucket: testdir2

Just play around with the s3sync.rb command to find out how it works.
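
For example, the README documents a --dryrun (-n) option to preview what would be transferred and a --delete option to remove destination files that no longer exist on the source. A possible combination (flag names as I remember them from the README, so double-check against your copy):

/usr/local/share/s3sync/s3sync.rb -n -r -v --delete testdir YourBucket: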

s3sync.rb sends your data in the clear over the internet, which isn’t something you would enjoy or I would advise. To be more secure, it has the option of sending the information over an SSL-encrypted connection. To get this working you’ll need to do the following. Create a directory ‘certs’ in the ‘/usr/local/share/s3sync/’ directory. Download this file, containing the certificates of known sites, into the certs directory you’ve just created. Then execute:

sh ssl.certs.shar

To use these certificates s3sync.rb needs another environment variable:

export SSL_CERT_DIR=/usr/local/share/s3sync/certs/

Now you can use the extra option -s or --ssl to encrypt all the traffic between your computer and the Amazon S3 servers.
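
Putting it all together, a complete encrypted sync could look something like this (the keys and bucket name are placeholders, as before):

export AWS_ACCESS_KEY_ID="youraccesskey"
export AWS_SECRET_ACCESS_KEY="yoursecretaccesskey"
export SSL_CERT_DIR=/usr/local/share/s3sync/certs/
/usr/local/share/s3sync/s3sync.rb -r -v -s testdir YourBucket: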

Here is my script, which I use to back up all the websites on my server. And here is a script you can use to restore what you’ve uploaded.
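
If you want to roll your own, the sketch below shows the general shape of such a backup script as it could be dropped into /etc/periodic/daily; the document root, bucket name, and keys are assumptions you will have to adapt to your own setup:

#!/bin/sh
# Minimal nightly backup sketch: adapt the keys, paths and bucket name to your setup.
export AWS_ACCESS_KEY_ID="youraccesskey"
export AWS_SECRET_ACCESS_KEY="yoursecretaccesskey"
export SSL_CERT_DIR=/usr/local/share/s3sync/certs/
# Sync the local web root (assumed to be the default Apache docroot) to the bucket over SSL;
# --delete keeps the bucket in step with what is on disk.
/usr/local/share/s3sync/s3sync.rb -r -s --delete /Library/WebServer/Documents WebsiteBackupBucket: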

Please be careful about what you store on the Amazon S3 servers. It is a secure environment, but it is located in the USA, and Amazon will turn over your data if asked to by law enforcement or a government official.