Tags: EC2 / Gentoo

This is a guide to setting up an Amazon EC2 instance running Gentoo on Elastic Block Store (EBS), in particular so that Gentoo can be used on a micro-instance. With Amazon’s new free tier, it is much more practical to experiment with having your own web server on the internet. The free tier only covers a micro-instance, which can be harder to set up because it requires EBS as the root partition; this guide aims to remove much of that difficulty. The guide will be easier to follow if you have some experience with EC2, or have at least read some of its documentation, and have installed Gentoo before. If you haven’t, I have tried to assume as little knowledge and provide as much detail as possible. If you are more experienced, you can skip straight to the code blocks. This guide also assumes you are running Gentoo on your own system, though it should work with little modification on other Linux distributions.

This guide sets up Gentoo with any programs you would normally install in Gentoo, though the kernel is one provided by Amazon. It is based on Rich0’s guide to installing Gentoo on a normal EC2 instance and dkavanagh’s guide to booting from EBS and AMI conversion.

What you need:

  • An Amazon AWS account

  • A Linux system [1] (Gentoo recommended)

  • Loop device file system support in your kernel [2]

The first step is to decide how big you want to make your install. For a basic web server, I have found 5-6GB [3] of space to be adequate [4], though the specific size is up to you. For this example, I will be making a 6GB volume. Secondly, micro-instances can run as either 32-bit or 64-bit systems. This guide assumes you will be setting up a 32-bit system.

First, you should download a stage3 tarball and portage snapshot. Instructions can be found in the Gentoo Handbook [5]. Next, the following code [6] readies and mounts our image to work on, and can be run while we wait for the downloads to complete.

$ dd if=/dev/zero of=image.fs bs=1M count=6000
# losetup /dev/loop1 image.fs
# mkfs.ext3 /dev/loop1
# mount /dev/loop1 /mnt/gentoo

The first line creates a blank file [7] that acts as the image for your new install using dd [8]. Next, losetup sets up the loop device [9]. Then mkfs sets up a file system [10] on the loop device, and finally, the loop device is mounted to work on.

Once the downloads [11] finish, and our loop device is mounted and ready, we can install the Gentoo base system.

# cd /mnt/gentoo
# tar xjpf (location-to-download)/stage3-*.tar.bz2
# tar xjf (location-to-download)/portage*.tar.bz2 -C /mnt/gentoo/usr
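Before running the tar commands above, it is worth confirming that the downloads are intact. The following is only a minimal sketch: verify_md5 is a hypothetical helper name, not part of any Gentoo tool, and the expected hash is assumed to be copied by hand from the digest file published next to the tarball.

```shell
# Sketch: compare a file's MD5 sum with an expected value.
# verify_md5 is a hypothetical helper, not a Gentoo or portage command.
verify_md5() {
    file=$1
    expected=$2
    # md5sum prints "<hash>  <filename>"; keep only the hash field
    actual=$(md5sum "$file" | awk '{ print $1 }')
    if [ "$actual" = "$expected" ]; then
        echo "OK: $file"
    else
        echo "MISMATCH: $file" >&2
        return 1
    fi
}
```

You would call it as verify_md5 followed by the tarball path and the published hash, and only unpack if it prints OK.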

Next, you want to modify your make.conf, setting the USE flags to your specific needs. Of particular importance is that the “-mno-tls-direct-seg-refs” flag is set in CFLAGS. This is needed in the Xen [12] environment of Amazon’s EC2. My make.conf is below; you can adjust yours [13] as you need, in particular the USE flags.

CFLAGS="-O2 -mno-tls-direct-seg-refs -pipe"

CXXFLAGS="${CFLAGS}"
# WARNING: Changing your CHOST is not something that should be done lightly.
# Please consult http://www.gentoo.org/doc/en/change-chost.xml before changing.
CHOST="i686-pc-linux-gnu"

MAKEOPTS="-j3"

# These are the USE flags that were used in addition to what is provided by the
# profile used for building.
USE="-X -qt -gtk -gnome -kde bash-completion -snmp
zsh-completion imap libwww sockets threads latin1 -berkdb
emacs mysql ssl spell maildir sasl tls python snmp perl"
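Since the CFLAGS setting is easy to get wrong, or to edit in the wrong file, a quick check can help. This is a rough sketch under two assumptions: check_xen_cflags is a hypothetical helper name, and CFLAGS is defined on a single line as in my make.conf above.

```shell
# Sketch: confirm that the CFLAGS line of a make.conf carries the Xen flag.
# check_xen_cflags is a hypothetical helper name.
check_xen_cflags() {
    conf=$1
    # The first grep keeps only the CFLAGS line (not CXXFLAGS);
    # "--" stops the second grep from reading the pattern's leading "-" as an option
    if grep -E '^[[:space:]]*CFLAGS=' "$conf" | grep -q -- '-mno-tls-direct-seg-refs'; then
        echo "CFLAGS ok in $conf"
    else
        echo "CFLAGS in $conf lacks -mno-tls-direct-seg-refs" >&2
        return 1
    fi
}
```

Run it against the make.conf inside the image (for example check_xen_cflags /mnt/gentoo/etc/make.conf, assuming that is where your make.conf lives), not the one on your own system.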

Next we also want to set up our fstab file. Below is mine:

#fstab for Amazon EC2 EBS 32-bit
/dev/sda1   /           ext3    defaults,noatime  1   1
tmpfs       /dev/shm    tmpfs   defaults          0   0
devpts      /dev/pts    devpts  gid=5,mode=620    0   0
sysfs       /sys        sysfs   defaults          0   0
proc        /proc       proc    defaults          0   0

Next we are going to chroot into our new environment using the following commands [14].

# cp -L /etc/resolv.conf /mnt/gentoo/etc/resolv.conf
# mount -t proc none /mnt/gentoo/proc
# mount -o bind /dev /mnt/gentoo/dev
# chroot /mnt/gentoo /bin/bash
# env-update
# source /etc/profile
# export PS1="(chroot) $PS1"

Once we are in our new environment, we are going to want to get portage ready [15], including configuring our locale. When editing locale.gen, most English speakers will un-comment the first 2 entries. See the Gentoo Localization Guide for more information on locales.

# emerge --sync
# nano /etc/locale.gen
# locale-gen

Next you will want to set the root password using the passwd command.

# passwd

Amazon EC2 is reported to have problems with newer versions of udev, so it is recommended that you mask them in portage, using the following commands:

# mkdir -p /etc/portage/package.mask
# echo ">=sys-fs/udev-125" > /etc/portage/package.mask/udev

Now that portage is ready, we should rebuild the system, because we added “-mno-tls-direct-seg-refs” to the CFLAGS [16].

# emerge -eav system

Next you will want to install the basic packages for your instance. “dhcpcd” or some other DHCP client is required, while the other programs listed are optional. The Amazon AMI and API tools are helpful for interacting with the service, and should probably be installed on your local machine as well as the instance. After you install your programs, running a world update is a good idea to ensure the entire system is up to date (in total around 30 packages will be installed). Then update any changed config files using dispatch-conf (or whichever system you prefer).

# emerge -avu dhcpcd metalog vixie-cron screen ec2-ami-tools ec2-api-tools
# emerge -avuDN world
# dispatch-conf

Now we set the programs that will run at start-up, the most important of which are net.eth0 to set up the internet connection, and sshd, so we can connect to the instance.

# rc-update add net.eth0 default
# rc-update add sshd default
# rc-update add metalog default
# rc-update add vixie-cron default

Now is a good time to install the packages for any applications that you plan on running, such as web-server programs (apache, nginx, etc.), databases, or any other programs you would like. Emerging programs is much faster here than on the micro-instance. Therefore, you should avoid having to install programs there as much as possible.

Once you have at least installed all the packages you plan on using, you can free up some space [17] by cleaning out the packages downloaded by portage.

# rm /usr/portage/distfiles/*

Since it is not recommended to log in as root, making a user account is advised, though not required. The Gentoo Handbook has a section on creating a user account, or you can follow these instructions (where “myuser” is the username):

# useradd -m -G users,wheel -s /bin/bash myuser
# passwd myuser

Now you should set up ssh access for your user account, and optionally for the root account. I recommend setting up the system using key files, and no root access, though your situation may be different. If you are not used to using ssh key files for access, then I would leave password login enabled.

From Rich0’s guide [18], there is a start-up file you can add that places your EC2 key into root’s ssh accepted keys file, and kills the nash-hotplug program, which is not needed after start-up. You can edit the code to place the EC2 key somewhere else, such as the user account’s accepted keys file [19]. This code should be placed at the end of /etc/conf.d/local.start

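# Make sure root has a home directory and a private .ssh directory
[ ! -e /root ] && cp -r /etc/skel /root
if [ ! -d /root/.ssh ] ; then
    mkdir -p /root/.ssh
    chmod 700 /root/.ssh
fi
# Fetch the public key of the EC2 key pair from the instance metadata service;
# -f makes curl return an error on HTTP failures instead of saving the error page
curl -f http://169.254.169.254/2008-02-01/meta-data/public-keys/0/openssh-key > /tmp/my-key
if [ $? -eq 0 ] ; then
    cat /tmp/my-key >> /root/.ssh/authorized_keys
    chmod 600 /root/.ssh/authorized_keys
    rm /tmp/my-key
fi
# nash-hotplug is not needed once start-up has finished
killall nash-hotplug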
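If you would rather place the key in a user account’s accepted keys file, something along these lines could replace the copy step. It is only a sketch, not part of Rich0’s guide: add_authorized_key is a hypothetical helper name, and it assumes the key has already been fetched into a file. Unlike the plain append above, it skips keys that are already present, so the file does not grow on every boot.

```shell
# Sketch: append a public key to an account's authorized_keys only once.
# add_authorized_key is a hypothetical helper, not taken from Rich0's guide.
add_authorized_key() {
    keyfile=$1      # file holding the public key
    home=$2         # home directory of the target account
    mkdir -p "$home/.ssh"
    chmod 700 "$home/.ssh"
    touch "$home/.ssh/authorized_keys"
    # -F fixed string, -x whole-line match: append only if the key is not there yet
    if ! grep -qxF -f "$keyfile" "$home/.ssh/authorized_keys"; then
        cat "$keyfile" >> "$home/.ssh/authorized_keys"
    fi
    chmod 600 "$home/.ssh/authorized_keys"
}
```

For example, add_authorized_key /tmp/my-key /home/myuser. When this runs as root for another account, the .ssh directory and its contents would also need to be handed to that user with chown.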

Next we want to install the kernel modules for the kernel that you will use to boot your instance. You can use the ec2-api-tools to find all the available kernels with this command.

# ec2-describe-images -o amazon | grep aki

I am currently using the “amazon/vmlinuz-2.6.18-xenU-ec2-v1.5-i686” kernel, with id aki-cc06f3a5. The kernel modules can be found at http://ec2-downloads.s3.amazonaws.com/ec2-modules-2.6.18-xenU-ec2-v1.5-i686.tgz. This URL can be modified to get the kernel modules for different kernels. The following steps [20] download and install the modules.

# cd /
# wget http://ec2-downloads.s3.amazonaws.com/ec2-modules-2.6.18-xenU-ec2-v1.5-i686.tgz
# tar xf ec2-modules-*.tgz
# depmod -a
# rm ec2-modules-*.tgz
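To show how the URL might be modified for another kernel, here is a small sketch that derives it from the kernel name. The naming pattern is inferred from the single example above, so treat it as an assumption: not every kernel is guaranteed to have a matching archive, and modules_url is a hypothetical helper name.

```shell
# Sketch: build the modules URL from a kernel name such as
# "amazon/vmlinuz-2.6.18-xenU-ec2-v1.5-i686".
# Assumes the archive name always follows the pattern of the example above.
modules_url() {
    kernel_name=$1
    # Strip everything up to and including "vmlinuz-" to get the version string
    version=${kernel_name##*vmlinuz-}
    echo "http://ec2-downloads.s3.amazonaws.com/ec2-modules-${version}.tgz"
}
```

The result can be passed straight to wget, for example wget "$(modules_url amazon/vmlinuz-2.6.18-xenU-ec2-v1.5-i686)".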

If there are any more steps you wish to complete before uploading this image, now is the last convenient chance before we prepare the image for upload.

We are now ready to pack up our image for upload. The first step is to exit the chroot environment.

# exit

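In order to make the compressed size of the image as small as possible, we zero out the empty space in the image by using dd to create a file of zeros that fills the unused space, then deleting it. The dd command ends with a “No space left on device” error once the image is full, which is expected.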

# cd /mnt/gentoo
# dd if=/dev/zero of=blankfile bs=1M count=6000
# rm blankfile

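Now we need to unmount everything we mounted earlier to set up our chroot environment, and detach the loop device. First change back to the directory holding image.fs, since umount fails with a “busy” error while your shell’s working directory is inside the mount point.

# cd (directory-containing-image.fs)
# umount /mnt/gentoo/proc
# umount /mnt/gentoo/dev
# umount /mnt/gentoo/
# losetup -d /dev/loop1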

With our file no longer attached to anything, we can bundle our image for upload to Amazon’s S3 using the ec2-ami-tools [21] command ec2-bundle-image [22]. The cert and private key are the two files of the X.509 Certificate found under Security Credentials in your account settings, the user is the account number found in the upper right of your account page, by your name, and “your-custom-name” is a name you choose for the image.

$ ec2-bundle-image --image image.fs --prefix your-custom-name --cert cert-XXXXXXXXXX.pem --privatekey pk-XXXXXXXXXX.pem --user XXXX-XXXX-XXXX --destination out/ --arch i386

Once the image is compressed and bundled, we can use the ec2-upload-bundle command to upload the image to S3 [23]. The bucket parameter needs to be a unique name not used by anyone else on S3 [24]. You can copy and paste the access-key and secret-key from the Access Keys section of the Security Credentials page of your account settings. The --retry argument causes the program to try the upload again if a part fails, which I found to happen a few times with an image.

$ ec2-upload-bundle --manifest out/your-custom-name.manifest.xml --bucket unique-bucket-name --access-key YOURACCESSIDKEY --secret-key YOURSECRETKEY --retry

The next steps will be to take our S3 image, and convert it into an EBS based volume that can be used by a micro-instance [25]. We will accomplish this by starting a generic Amazon micro-instance, then attaching a blank EBS volume to it. We can then download our S3 bundle to this instance, and unpack it into the blank EBS volume. Finally, we take a snapshot of that volume, and use that as the base of our Gentoo instance.

First we start up a Basic Amazon Linux AMI (ami-08728661). We can do this either by using the graphical AWS Management Console, or we can use the ec2-run-instances command.

$ ec2-run-instances ami-08728661 --instance-type t1.micro --key KEYPAIRNAME --cert cert-XXXXXXXXXX.pem --privatekey pk-XXXXXXXXXX.pem

While our instance is starting, we also want to create a new volume. It will need to be at least as large as the image you created, though it can be larger. This will be the base of our Gentoo image. This volume needs to be in the same availability zone as the running instance; the zone has a form like ‘us-east-1a’, though yours may be different. The size is specified with the -s flag, and in this example, is 6GB. Also note the ID of the volume that the command returns, in the form of ‘vol-XXXXXXXX’.

$ ec2addvol -s 6 -z AVAILABILITY-ZONE
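As a sanity check on the sizes: the image made earlier with bs=1M count=6000 is 6000 MiB (6,291,456,000 bytes), while a 6GB volume, assuming EBS counts sizes in units of 2^30 bytes, holds 6144 MiB (6,442,450,944 bytes), so the image fits with 144 MiB to spare. The sketch below does the same comparison for any image; image_fits_volume is a hypothetical helper name, and stat -c is the GNU coreutils form.

```shell
# Sketch: succeed if an image file fits on a volume of the given size.
# Assumes the volume size is given in GiB (units of 2^30 bytes).
image_fits_volume() {
    image=$1
    volume_gib=$2
    # Size of the image file in bytes (GNU stat)
    image_bytes=$(stat -c %s "$image")
    volume_bytes=$((volume_gib * 1024 * 1024 * 1024))
    [ "$image_bytes" -le "$volume_bytes" ]
}
```

For this guide that would be image_fits_volume image.fs 6, run on your local machine before creating the volume.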

Once the instance starts, we then connect using ssh. It may take a few minutes for the instance to fully start up. Its progress can be monitored in the AWS Management Console, or using the command ec2-describe-instances [26].

$ ec2-describe-instances

In the description that is returned, one item should be the word ‘running’ [27] once it starts. While it is starting, it will be the word ‘pending’. You can run the command every minute or so until the instance starts. When the instance starts, we will then want to attach the volume we made to it. The instance id is in the description, in the form of i-XXXXXXXX.

$ ec2attvol vol-XXXXXXXX -i i-XXXXXXXX -d /dev/sdh
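Rather than re-running the describe command by hand every minute, the polling can be scripted. This is a rough sketch: wait_for_state and the POLL_SECONDS variable are hypothetical names of my own, and the only thing assumed about the command’s output is that it contains the state as a separate word. There is no timeout, so interrupt it with Ctrl-C if the state never appears.

```shell
# Sketch: re-run a command until its output contains a given state word.
# wait_for_state and POLL_SECONDS are hypothetical names, not EC2 tool features.
wait_for_state() {
    state=$1
    shift
    # "$@" is the command to poll; grep -w matches the state as a whole word
    until "$@" | grep -qw -- "$state"; do
        sleep "${POLL_SECONDS:-30}"
    done
}
```

For example, wait_for_state running ec2-describe-instances i-XXXXXXXX would return once the instance is up, after which the volume can be attached. Passing the instance ID matters, since otherwise any other running instance in your account would also match.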

Once the instance starts, the description will also have the public DNS address. This will have the form of ‘ec2-XXX-XXX-XXX-XXX.compute-1.amazonaws.com’ though yours may differ slightly. With this, we have enough information to connect to our instance. You will need your private key [28] on the Amazon instance, which we can copy with scp.

$ scp -i KEYPAIRNAME.pem pk-XXXXXXXXXX.pem ec2-user@ec2-XXX-XXX-XXX-XXX.compute-1.amazonaws.com:
$ ssh -i KEYPAIRNAME.pem ec2-user@ec2-XXX-XXX-XXX-XXX.compute-1.amazonaws.com

In order to make it more clear where commands are executing, I will prefix all the commands done remotely with a *, such as ‘*$ echo hello world’. In order to make working with our files easier, we will switch to root using ‘sudo su’. Then we will download our image bundle from S3.

*$ sudo su
*# ec2-download-bundle --bucket unique-bucket-name --access-key YOURACCESSIDKEY --secret-key YOURSECRETKEY --privatekey pk-XXXXXXXXXX.pem --prefix your-custom-name --directory /mnt

Once the download is complete, we are going to unpack it to get our image file. If you don’t have enough space on this instance [29], you may need to create and attach another volume for extra space, though if you have used the same sizes, you should be fine.

*# ec2-unbundle --privatekey pk-XXXXXXXXXX.pem --manifest /mnt/your-custom-name.manifest.xml --source /mnt --destination /mnt
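Before running the unbundle command above, the free space can be checked with df. The sketch below wraps that check; has_free_mib is a hypothetical helper name. As a rough figure, /mnt needs room for the unbundled image (6000 MiB in this example) in addition to the downloaded bundle parts.

```shell
# Sketch: succeed if the filesystem holding a directory has enough free space.
# has_free_mib is a hypothetical helper name.
has_free_mib() {
    dir=$1
    needed_mib=$2
    # df -P prints one line per filesystem after the header, -k reports KiB;
    # field 4 of that line is the available space
    avail_kib=$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')
    [ "$avail_kib" -ge $((needed_mib * 1024)) ]
}
```

On the remote instance this could be used as has_free_mib /mnt 6000 || echo "attach another volume first".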

Unbundling the image might take some time. Once it is finished, we are going to use ‘dd’ to copy the image onto the volume. This step will also take some time.

*# dd if=/mnt/your-custom-name of=/dev/sdh

Once the dd is finished, you can mount the volume and look to see if everything is as you expected, or make any changes you need to before you build the AMI.

*# mkdir /mnt/test
*# mount /dev/sdh /mnt/test
* Check or change anything you need to
*# umount /mnt/test
*# exit
*$ exit

Now we are going to detach the volume from the instance, terminate the instance, and take a snapshot of the volume. This snapshot will be the base of our Gentoo system.

$ ec2-detach-volume vol-XXXXXXXX
$ ec2-terminate-instances i-XXXXXXXX
$ ec2-create-snapshot vol-XXXXXXXX --description "Write a description for the snapshot here"

The snapshot will take some time to complete. You can monitor its progress with ec2-describe-snapshots. When it is ‘completed’, we can use ec2-register to turn our snapshot into a bootable machine image. We also need to tell it to use the kernel we selected earlier [30]. That kernel also has a matching ramdisk, which you can find using the same method we used to find the kernel [31]. The ramdisk is prefixed with ‘ari-’.

$ ec2-register --snapshot snap-XXXXXXXX --name "Pick a short name for the AMI" --description "Write a longer description of this AMI" --architecture i386 --kernel aki-cc06f3a5 --ramdisk ari-f606f39f

When this step is finished, you should now have a working Gentoo install!

If you have any questions, tips, or tricks, let me know.


  1. A Linux virtual machine should also work, though I haven’t tested this  

  2. If you are unsure you probably have it  

  3. Amazon free tier allows up to 10GB total for free  

  4. The size of the volume can be changed later, and increasing the size is easier than decreasing it  

  5. Where specifically the files are downloaded to is unimportant  

  6. Most code should be run as root using su, and if it is easier, all code can be run as root.  

  7. The file is created in your current directory  

  8. You may want to cd to an adequate working directory with enough space before starting these commands  

  9. You may need to ‘modprobe loop’ to load the loop device module  

  10. If you use less than 4GB, you should consider increasing the number of inodes by using the -T small option with mkfs. For reference, a base install of Gentoo uses about 220,000 inodes in my experience.

  11. You should also check the md5sum of the downloads if you haven’t already  

  12. More information in the Gentoo Wiki article on Xen

  13. Make sure you change the file in /mnt/gentoo, not your system’s!  

  14. More information can be found in the Gentoo Handbook.  

  15. More info in the Gentoo Handbook  

  16. Thanks Will B. from the comments!  

  17. Freeing up space also makes the compressed image smaller for an easier upload  

  18. Step 21 here  

  19. This is an easy way to make sure you don’t get locked out of your system image by losing your keys. It is possible to edit this image after it is uploaded, so that if you do need to change keys, you can.

  20. This step is based on the post by Daniele Madama  

  21. If you don’t have ec2-ami-tools and ec2-api-tools installed, you should install them now. They are available in portage, with further setup instructions by Amazon here. You should also set your environment variables as per that guide.  

  22. You can use ec2-bundle-image --help to better understand what some of the arguments mean

  23. Though the image will be larger than allowed in the free tier, it will be there for only a few hours at most. Storage is charged in GB-months, so the cost will be only a small fraction of a unit.

  24. Using your actual name can help make it specific, if you need help with ideas  

  25. EBS based volumes can be used by all instance types, not just micro  

  26. You can add the instance ID (returned by the run instance command, in the form of i-XXXXXXXX) if you want to be more specific about the results returned

  27. To make it more clear when the instance has started, you can use ‘ec2-describe-instances | grep running’  

  28. From the X.509 Certificate  

  29. Use ‘df’ to check space remaining  

  30. In this example, we are using aki-cc06f3a5  

  31. For this example, we are using ari-f606f39f