Converting a VirtualBox VM to an EC2 AMI

There are many ways to create a new AMI for use with Amazon EC2. The best solution is to simulate EC2 with a local Xen install, but this isn’t always possible. The next best thing is to create a VirtualBox-based VM, then generate an EC2 AMI from it.

Preparing the Virtual Machine

The main special requirement for the VirtualBox-based VM is that it needs to be configured to boot from a Xen-based kernel. VirtualBox can’t boot Xen-dependent kernels, so I usually keep two similarly configured kernels: one with Xen support, one without.

Compiling a Linux kernel with Xen support (at least on Gentoo) results in a kernel image named ‘vmlinuz’, whereas the normal kernel without Xen support produces a compressed image called ‘bzImage’ (“big zImage”; the name does not refer to bzip). I have a Grub config file which tries the Xen-based kernel, then falls back to the normal one if this fails. The only disadvantage is that I have to compile the kernel twice, once with Xen support and once without, but it allows me to test a kernel with my own configuration before I upload it to the cloud:

default 0
timeout 1
fallback 1

title Xen
root (hd0,0)
kernel /boot/vmlinuz spinlock=tickless root=/dev/sda1

title Local
root (hd0,0)
kernel /boot/bzImage spinlock=tickless root=/dev/sda1
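
Since the fallback only helps if both kernel images are actually installed, it can be worth checking that every image named in the config exists before shutting the VM down. The following is a minimal sketch, not part of the original setup: `check_grub_kernels` is a hypothetical helper name, and the config location (for example /boot/grub/menu.lst) is an assumption that varies between distributions.

```shell
# Hypothetical helper: report whether each kernel image named in a
# GRUB legacy config exists under the given root directory.
check_grub_kernels() (
    config=$1        # path to the GRUB config (location is an assumption)
    root=${2:-/}     # directory the kernel paths are relative to
    status=0
    # The second field of every "kernel" line is the image path,
    # e.g. /boot/vmlinuz or /boot/bzImage.
    for image in $(awk '$1 == "kernel" { print $2 }' "$config"); do
        if [ -f "$root/$image" ]; then
            echo "ok $image"
        else
            echo "missing $image"
            status=1
        fi
    done
    exit $status
)
```

Running `check_grub_kernels /boot/grub/menu.lst` inside the VM should print one line per kernel entry and return a non-zero status if any image is missing.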

Next, remove any data you don’t want to upload (all the sites I host on the image are stored in SVN, so I just ‘svn co’ them once the instance is running), then clean the filesystem by filling the empty space with zeros. Zeroed free space compresses far better, which reduces the data we have to upload:

cat /dev/zero > /tmp/zerofill
sleep 1
sync
rm /tmp/zerofill
sleep 1
sync
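
The commands above assume /tmp lives on the root filesystem; if /tmp is a separate mount (a tmpfs, for instance), the zeros never reach the disk being imaged. A small wrapper that takes the target directory as an argument avoids that. This is only a sketch: `zero_fill` is a hypothetical name, the optional size cap is my own addition for dry runs, and `bs=1M` assumes GNU or BusyBox dd.

```shell
# Hypothetical helper: fill free space under a directory with zeros,
# then delete the fill file. An optional second argument caps the
# amount written (in MiB), which is handy for a dry run.
zero_fill() (
    dir=${1:-/}
    limit_mb=$2
    fill="$dir/zerofill.$$"
    if [ -n "$limit_mb" ]; then
        dd if=/dev/zero of="$fill" bs=1M count="$limit_mb" 2>/dev/null
    else
        # dd fails once the disk is full; that is the expected outcome
        dd if=/dev/zero of="$fill" bs=1M 2>/dev/null || true
    fi
    sync
    rm -f "$fill"
    sync
)
```

For example, `zero_fill / 16` writes and removes a 16 MiB test file on the root filesystem, while `zero_fill /` fills all free space.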

Power off the VM. We can now create a disk image to build the AMI from.

Creating the Image

First, delete all the VirtualBox snapshots for the VM so that the changes get merged into the ‘root’ image. If you do not do this, you get a version of your image from before the snapshots were taken (I found this out the hard way when the VM booted with software running that I had long removed).
Convert the VirtualBox VDI to a raw image (the VDI file is found in ~/.VirtualBox/Machines/machinename/):

VBoxManage internalcommands converttoraw vboximage.vdi ec2.img

Compress the image. This reduced my image from 10GB to ~400MB:

tar cjf image.tar.bz2 ec2.img

Delete the uncompressed image:

rm ec2.img
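
Because the raw image is deleted straight after compression, a cautious variant is to remove it only once the archive can be read back and lists the file. The sketch below combines the two steps under that assumption; `compress_image` is a hypothetical helper, and it expects to be run from the directory containing the image so the name stored in the archive matches.

```shell
# Hypothetical helper: compress a raw image with tar + bzip2 and delete
# the original only if the finished archive lists it.
compress_image() (
    image=$1      # e.g. ec2.img (a plain file name, no directory part)
    archive=$2    # e.g. image.tar.bz2
    tar cjf "$archive" "$image" || exit 1
    # Reading the listing forces tar to decompress the whole archive,
    # so a truncated or corrupt file is caught before the delete.
    if tar tjf "$archive" | grep -qxF "$image"; then
        rm "$image"
    else
        echo "archive check failed, keeping $image" >&2
        exit 1
    fi
)
```

With the file names used in this article, the call would be `compress_image ec2.img image.tar.bz2`.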

Turning the Image into an AMI

Start up a new Amazon EC2 instance (any standard EC2 Linux VM will do) and upload the image using scp (or another method):

scp image.tar.bz2 user@imagingserver:

Through the AWS control panel, create a new EBS volume (ideally the same size as the VirtualBox VM’s disk) and attach it to the instance. Make a note of the volume’s device name (in my case /dev/sdi), and overwrite the EBS volume with the image we have just uploaded:

tar xf image.tar.bz2 -O | dd of=/dev/sdi bs=10M

The EBS volume now has our VM on it. You may notice that although the data has finished being written to the disk, there are no entries in /dev for the partitions. We need to tell Linux to update its partition listing: run fdisk against the volume, print the partition table, then write the (unchanged) layout:

# fdisk /dev/sdi

Command (m for help): p

Disk /dev/sdi: 10.7 GB, 10737418240 bytes
255 heads, 63 sectors/track, 1305 cylinders, total 20971520 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0xeb52e322

   Device Boot      Start         End      Blocks   Id  System
/dev/sdi1              63    20964824    10482381   83  Linux

Command (m for help): w
The partition table has been altered!

Calling ioctl() to re-read partition table.
Syncing disks.

/dev/sdi1 (in this case) now exists. Mount the volume to check that it looks sane, then unmount it:

mkdir /mnt/newimage
mount /dev/sdi1 /mnt/newimage
ls /mnt/newimage
...
umount /mnt/newimage
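
A similar sanity check can be run locally against the raw ec2.img before it is compressed, by loop-mounting the partition at its byte offset. The fdisk listing above shows the partition starting at sector 63 with 512-byte sectors, so the offset is 63 × 512 = 32256 bytes. The helper below is a small sketch of that arithmetic; `partition_offset` is a hypothetical name, and the start sector of your own image may differ.

```shell
# Hypothetical helper: convert a partition's start sector (as printed
# by fdisk) into the byte offset needed for a loop mount.
partition_offset() {
    start_sector=$1
    sector_size=${2:-512}   # fdisk reports 512-byte sectors here
    echo $((start_sector * sector_size))
}

# Example use (as root, on the machine holding the raw image):
#   mount -o loop,offset=$(partition_offset 63) ec2.img /mnt/newimage
```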

Through the AWS control panel, detach the volume, take a snapshot and make a note of the snapshot ID. You can delete the volume once the snapshot is complete. Also find the correct kernel ID for your instance from this guide, then register the image, replacing $SNAPSHOTID, $KERNELID, name and description with your values:

ec2-register -a x86_64 -n "Name" -d "Description" --root-device-name /dev/sda1 -b /dev/sda=$SNAPSHOTID:10:true --kernel $KERNELID
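
It is easy to paste the wrong kind of ID into that command, so a small wrapper that checks the prefixes and prints the finished command line can help. This is only a sketch: `register_cmd` is a hypothetical helper, the IDs in the usage note are placeholders, and the flags are copied unchanged from the command above. In the -b mapping, the three fields after the device name are the snapshot ID, the volume size in GB and the delete-on-termination flag.

```shell
# Hypothetical helper: validate the IDs and print the ec2-register
# command rather than running it, so it can be reviewed first.
register_cmd() (
    snapshot_id=$1
    kernel_id=$2
    name=$3
    description=$4
    size_gb=${5:-10}    # volume size in GB, 10 as in the example above
    case $snapshot_id in
        snap-*) ;;
        *) echo "not a snapshot id: $snapshot_id" >&2; exit 1 ;;
    esac
    case $kernel_id in
        aki-*) ;;
        *) echo "not a kernel id: $kernel_id" >&2; exit 1 ;;
    esac
    printf 'ec2-register -a x86_64 -n "%s" -d "%s" --root-device-name /dev/sda1 -b /dev/sda=%s:%s:true --kernel %s\n' \
        "$name" "$description" "$snapshot_id" "$size_gb" "$kernel_id"
)
```

For instance, `register_cmd snap-12345678 aki-12345678 "Name" "Description"` prints the command with those placeholder IDs filled in, while passing a volume ID by mistake is rejected.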

You should now be able to start the new instance and have it boot your own customized Linux EC2 AMI.