Noah Meyerhans

The weblog

Using FAI to Customize and Build Your Own Cloud Images

At this past November’s Debian cloud sprint, we classified our image users into three broad buckets in order to help guide our discussions and ensure that we were covering the common use cases. Our users fit generally into one of the following groups:

  1. People who directly launch our image and treat it like a classic VPS. These users will most likely log into their instances via ssh and configure them interactively, though they may also install and use a configuration management system at some point.
  2. People who directly launch our images but configure them automatically via launch-time configuration passed to the cloud-init process on the instance. This automatic configuration may optionally serve to bootstrap the instance into a more complete configuration management system. The user may or may not ever actually log in to the system.
  3. People who will not use our images directly at all, but will instead construct their own image based on ours. They may do this by launching an instance of our image, customizing it, and snapshotting it, or they may build a custom image from scratch by reusing and modifying the tools and configuration that we use to generate our images.

This post is intended to help people in the final category get started with building their own cloud images based on our tools and configuration. As I mentioned in my previous post on the subject, we are using the FAI project with configuration from the fai-cloud-images repository. It’s probably a good idea to get familiar with FAI and our configs before proceeding, but it’s not necessary.

You’ll need to use FAI version 5.3.4 or greater. 5.3.4 is currently available in stretch and jessie-backports. Images can be generated locally on your non-cloud host, or on an existing cloud instance. You’ll likely find it more convenient to use a cloud instance so you can avoid the overhead of having to copy disk images between hosts. For the most part, I’ll assume throughout this document that you’re generating your image on a cloud instance, but I’ll highlight the steps where it actually matters. I’ll also be describing the steps to target AWS, though the general workflow should be similar if you’re targeting a different platform.

To get started, install the fai-server package on your instance and clone the fai-cloud-images git repository. (I’ll assume the repository is cloned to /srv/fai/config.) In order to generate your own disk image that generally matches what we’ve been distributing, you’ll use a command like:

sudo fai-diskimage --hostname stretch-image --size 8G \
--class DEBIAN,STRETCH,AMD64,GRUB_PC,DEVEL,CLOUD,EC2 \
/tmp/stretch-image.raw

This command will create an 8 GB raw disk image at /tmp/stretch-image.raw, create some partitions and filesystems within it, and install and configure a bunch of packages into it. Exactly what packages it installs and how it configures them will be determined by the FAI config tree and the classes provided on the command line. The package_config subdirectory of the FAI configuration contains several files, the names of which are FAI classes. Activating a given class by referencing it on the fai-diskimage command line instructs FAI to process the contents of the matching package_config file if such a file exists. The files use a simple grammar that provides you with the ability to request certain packages to be installed or removed.

Let’s say for example that you’d like to build a custom image that looks mostly identical to Debian’s images, but that also contains the Apache HTTP server. You might do that by introducing a new package_config/HTTPD file, as follows:

PACKAGES install
apache2
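The same grammar also supports removals. As an illustrative sketch (the package name here is only an example, not something our configs actually strip), a class file can request that a package be removed from the base set:

```
PACKAGES remove
nano
```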

Then, when running fai-diskimage, you’ll add HTTPD to the list of classes:

sudo fai-diskimage --hostname stretch-image --size 8G \
--class DEBIAN,STRETCH,AMD64,GRUB_PC,DEVEL,CLOUD,EC2,HTTPD \
/tmp/stretch-image.raw

Aside from custom package installation, you’re likely to also want custom configuration. FAI allows the use of pretty much any scripting language to perform modifications to your image. A common task that these scripts may want to perform is the installation of custom configuration files. FAI provides the fcopy tool to help with this. Fcopy is aware of FAI’s class list and is able to select an appropriate file from the FAI config’s files subdirectory based on classes. The scripts/EC2/10-apt script provides a basic example of using fcopy to select and install an apt sources.list file. The files/etc/apt/sources.list/ subdirectory contains both an EC2 and a GCE file. Since we’ve enabled the EC2 class on our command line, fcopy will find and install that file. You’ll notice that the sources.list subdirectory also contains a preinst file, which fcopy can use to perform additional actions prior to actually installing the specified file. postinst scripts are also supported.
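To sketch how that class-based selection is laid out on disk, here's a miniature version of the files/ tree built under a scratch directory purely for illustration (in a real config the tree lives under /srv/fai/config/files, and the mirror URLs below are placeholders):

```shell
# Build a miniature files/ tree with one variant per class. At image build
# time, fcopy installs whichever variant matches an active class.
CFG=$(mktemp -d)
mkdir -p "$CFG/files/etc/apt/sources.list"
echo "deb http://deb.debian.org/debian stretch main" \
    > "$CFG/files/etc/apt/sources.list/EC2"
echo "deb http://gce.example.invalid/debian stretch main" \
    > "$CFG/files/etc/apt/sources.list/GCE"
# With the EC2 class active, a script invoking "fcopy /etc/apt/sources.list"
# would select and install the EC2 variant.
ls "$CFG/files/etc/apt/sources.list"
```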

Beyond package and file installation, FAI also provides mechanisms to support debconf preseeding, as well as hooks that are executed at various stages of the image generation process. I recommend following the examples in the fai-cloud-images repo, as well as the FAI guide for more details. I do have one caveat regarding the documentation, however: FAI was originally written to help provision bare-metal systems, and much of its documentation is written with that use case in mind. The cloud image generation process is able to ignore a lot of the complexity of these environments (for example, you don’t need to worry about pxeboot and tftp!). However, this means that although you get to ignore probably half of the FAI Guide, it’s not immediately obvious which half it is that you get to ignore.

Once you’ve generated your raw image, you can inspect it by telling Linux about the partitions contained within, and then mount and examine the filesystems. For example:

admin@ip-10-0-0-64:~$ sudo partx --show /tmp/stretch-image.raw
NR START      END  SECTORS SIZE NAME UUID
 1  2048 16777215 16775168   8G      ed093314-01
admin@ip-10-0-0-64:~$ sudo partx -a /tmp/stretch-image.raw 
partx: /dev/loop0: error adding partition 1
admin@ip-10-0-0-64:~$ lsblk 
NAME      MAJ:MIN RM    SIZE RO TYPE MOUNTPOINT
xvda      202:0    0      8G  0 disk 
├─xvda1   202:1    0 1007.5K  0 part 
└─xvda2   202:2    0      8G  0 part /
loop0       7:0    0      8G  0 loop 
└─loop0p1 259:0    0      8G  0 loop 
admin@ip-10-0-0-64:~$ sudo mount /dev/loop0p1 /mnt/
admin@ip-10-0-0-64:~$ ls /mnt/
bin/   dev/  home/        initrd.img.old@  lib64/       media/  opt/   root/  sbin/  sys/  usr/  vmlinuz@
boot/  etc/  initrd.img@  lib/             lost+found/  mnt/    proc/  run/   srv/   tmp/  var/  vmlinuz.old@

In order to actually use your image with your cloud provider, you’ll need to register it with them. Strictly speaking, these are the only steps that are provider specific and need to be run on your provider’s cloud infrastructure. AWS documents this process in the User Guide for Linux Instances. The basic workflow is:

  1. Attach a secondary EBS volume to your EC2 instance. It must be large enough to hold the raw disk image you created.
  2. Use dd to write your image to the secondary volume, e.g. sudo dd if=/tmp/stretch-image.raw of=/dev/xvdb
  3. Use the volume-to-ami.sh script in the fai-cloud-images repo to snapshot the volume and register the resulting snapshot with AWS as a new AMI. Example: ./volume-to-ami.sh vol-04351c30c46d7dd6e

The volume-to-ami.sh script must be run with access to AWS credentials that grant access to several EC2 API calls: describe-snapshots, create-snapshot, and register-image. It recognizes a --help command-line flag and several options that modify characteristics of the AMI that it registers. When volume-to-ami.sh completes, it will print the AMI ID of your new image. You can now work with this image using standard AWS workflows.

As always, we welcome feedback and contributions via the debian-cloud mailing list or #debian-cloud on IRC.

Call for Testing: Stretch Cloud Images on AWS

Following up on Steve McIntyre’s writeup of the Debian Cloud Sprint that took place in Seattle this past November, I’m pleased to announce the availability of preliminary Debian stretch AMIs for Amazon EC2. Pre-generated images are available in all public AWS regions, or you can use FAI with the fai-cloud-images configuration tree to generate your own images. The pre-generated AMIs were created on 25 January, shortly after Linux 4.9 entered stretch, and their details follow:

ami-6d017002 ap-south-1
ami-cc5540a8 eu-west-2
ami-43401925 eu-west-1
ami-870edfe9 ap-northeast-2
ami-812266e6 ap-northeast-1
ami-932e4aff sa-east-1
ami-34ce7350 ca-central-1
ami-9f6dd8fc ap-southeast-1
ami-829295e1 ap-southeast-2
ami-42448a2d eu-central-1
ami-98c9348e us-east-1
ami-57361332 us-east-2
ami-03386563 us-west-1
ami-7a27991a us-west-2

As with the current jessie images, these use a default username of ‘admin’, with access controlled by the ssh key named in the ec2 run-instances invocation. They’re intended to provide a reasonably complete Debian environment without too much bloat. IPv6 addressing should be supported in an appropriately configured VPC environment.

These images were built using Thomas Lange’s FAI, which has been used for over 15 years for provisioning all sorts of server, workstation, and VM systems, but which was only recently adapted for generating cloud disk images. It has proven to be well suited to this task though, and image creation is straightforward and flexible. I’ll describe in a followup post the steps you can follow to create and customize your own AMIs based on our recipes. In the meantime, please do test these images! You can submit bug reports to the cloud.debian.org metapackage, and feedback is welcome via the debian-cloud mailing list or #debian-cloud on IRC.

2016 Bavarian Bike & Brew Race Report

This post could easily have been titled “Why did this hurt so much?” or even simply, “What happened!?” It’s nice to come away from a race having learned something. Unfortunately, the lessons aren’t always pleasant. The best way I can phrase the lesson from this year’s Bavarian Bike & Brew race is this: If you want to find out whether your training is working, stop doing it for a year and see what happens. You’ll learn something, I promise. That’s basically what happened here.

Last year at this time, I was well into preparations for the BC Bike Race. I had spent time doing core work in the gym, focused rides on the trainer, and endurance rides on both the road and the dirt. I was in reasonably good shape, and my race results reflected that. Once the BC Bike Race was over, though, I basically began an early and very unfocused offseason. With no major event planned for 2016, I did nearly nothing to maintain the fitness I’d built up in the first half of 2015. I barely rode my bike at all, and certainly didn’t race.

So, jumping forward to this season, I missed the first couple of races with excuses that wouldn’t have worked last year. When the time did finally come to race, I really didn’t have a clue how I’d do. I knew that the weather would be similar to last year (HOT), and I knew I did well last year. So, maybe I’d do well this year, right? Eh, not so much, as it turned out. There are a lot of numbers to point at to tell the story, but the most notable are these: my average speed for the race dropped by 0.8 mph from last year, and my time increased by more than 14 minutes. Not only that, but it hurt more and was less fun.

So at this point, what’s next? The next two weekends are race weekends, and both races are longer than Bike & Brew. I can’t expect my results to look like they did last year, that much is clear. My best chance to salvage something respectable out of this season is to effectively start from scratch. There are some worthwhile races later in the season (July, August, and September). It will be interesting to see if I’m able to rebuild any of that fitness in the next two to three months. It’s awfully discouraging, though, knowing that you’re starting from scratch in June on something you should have been working on since February.

This is going to be an interesting lesson to look back on during those dark, cold winter nights this coming winter. We’ll see if it provides sufficient motivation to get back on the trainer. Maybe committing to Singletrack Six or the BC Bike Race again will help there, too.

Echo Valley Race Report

A couple of notable things stand out from last weekend’s race.

It was a super fast course and the weather was beautiful. It had rained the day/night before, so there was no dust and the traction felt infinite (unlike a couple of years ago, when a rider 20 feet in front of you would literally vanish in a dust cloud).

I didn’t take any time to warm up before the race, and it starts with a mile of climbing right out of the gate. Bad idea. I felt ok for the first half mile, which was fire road, and entered the singletrack in 6th position. Unfortunately, the next half mile was a disaster, as my lack of warmup caught up with me and I blew the engine completely. Several riders passed me, and I dropped to probably 12th to 15th position. Once I recovered from this mess, I rode strong for the rest of the day. I never caught the lead group, but was never passed again either.

Unfortunately, the course, while extensively marked, somehow got really confusing for a lot of riders. Lots of people, myself included, ended up taking wrong turns at various points. At one point, despite following guidance from course marshals and signs, I wound up doing two laps of a mile-long section that should only have been ridden once. Other people had similar stories. While stopped at the second aid station, I had about 25 miles logged so far, and heard from another rider who had just crossed the 30 mile point! So the results of the race are largely meaningless. Oh well. Fortunately it’s not like this was my big priority for the season (That one is still coming up!)

Technically, this was my first race (first dirt ride, really) on the new XTR build on the Niner. Holy crap, it did not fail to impress! Despite only having done a short shakedown ride on pavement with this setup previously, everything worked well. I don’t think I could have said that about the previous build at any point in its lifetime.

The one technical drawback was that the fork, which was recently serviced by a nearby shop, had spewed all its oil and had thus completely lost all its rebound damping functionality. That was seriously annoying, and I expected much higher quality work out of this shop. I’ve since replaced the seals and oil in the fork myself, so I’m back in business. Took a lot less than the week-and-a-half turnaround at the shop, too…

Stottlemeyer Race Report

A whole mess of stuff has changed since the Beezley Burn race. Mostly this involved training and bike fit stuff. I had a new saddle and freshly dialed geometry, and this was also the first race in which I rode with Time’s ATAC pedals. I was interested to see how things would go. This is also the race I’ve done the most in my time in Washington, so there’s a lot of data to compare it with. The field is larger than the Beezley Burn race, especially since it’s not broken up by categories, only by age. This means that I’m going up against everyone from first-time racers to Really Fast Guys.

I’ve never been really good about starting near the front in large mass-start races, and I’ve run into trouble because of this in the past. This race was different, though, as I managed to sneak into the first line at the start. It quickly became apparent, though, that while this tactic worked at the Beezley Burn, it was a different story against these Really Fast Guys. My immediate feeling was of being left behind. I hadn’t done any warming up before the start, since the race is fairly long and I didn’t want to expend too much energy too soon. This likely impacted my explosiveness (ha!) at the start.

On the initial doubletrack climb, I found a reasonably comfortable and fast pace, though it was no match for the group up the road. My only hope there was that they had gone too hard at the beginning and would fade later. Meanwhile I had to contend with some difficult singletrack. Stottlemeyer is a fun course, with a bit of an East Coast feel. It’s really tight, wooded, and relatively flat. All it needs to be a proper East Coast trail is a whole lot of granite. I like riding in this type of terrain, but I really am not super fast at it. It really rewards good efficient bike handling skills. It’s a real challenge to keep the speed up. I tend to do better in more open trails where I can kick in to time-trial mode. So all this while, I was pretty sure the lead group of riders was getting further and further away. Turns out I was right.

I carried one water bottle with me and planned to refill it at the aid stations as needed. At the third aid station, just at the end of the first lap, I did so. Unfortunately, the process was really slow. There were volunteers there filling bottles, and they worked as fast as they could at their jobs. Unfortunately, there were just two big orange gatorade coolers with filler spigots, and they don’t fill nearly fast enough to satisfy a racer! I feel like I lost at least 3 minutes there, though in reality I bet it was only one. Either way, it was a slow process! While waiting for my bottle to fill, I grabbed a mini Clif bar and ate the whole thing in one bite. This was not a great idea. My bottle finished filling just as I put the bar in my mouth, and I took off into the singletrack with 3 other racers. Working to stay with them and eventually pass one of them didn’t leave me much time to drink from my bottle, but I really had to! I was having a hard time chewing and swallowing the Clif bar, and had to choke back gags before I finally was able to grab a quick shot of water and get everything moving properly.

In past years, lap 2 of this race has been really challenging. Particularly the second half, after passing the fourth aid station. The remaining miles have always seemed to go on longer than they should have, and my legs felt weak. This year was different, though. I felt strong throughout the second lap and was able to pass some other riders with authority. Certain sections of trail even seemed faster, probably due to familiarity. Unfortunately, here is where I ran into some mechanical issues.

A few relatively minor things went wrong mechanically this year. None was catastrophic, but all were annoying. First, likely due to a seized pivot bearing, a linkage bolt worked itself free and fell out. I should have dealt with the seized bearing sooner, as I knew it was a problem, so this is my own fault. Ultimately it didn’t stop me from finishing the race, but it certainly could have. The inconvenience of having to now replace both the bearing and the fancy custom bolt could have been far worse had the lack of support damaged the frame or swingarm. Fortunately that didn’t happen!

The second mechanical issue was that the shifter cable endcap at the rear derailleur got caught in the chain and wedged itself between two of the cogs. This screwed shifting up pretty bad in some gears.

Another mechanical issue that has probably been impacting me a bit for a while is that the front shifter housing seems to be torsioned in such a way as to cause it to twist itself against the barrel adjuster knob on the front shifter. This has the effect of essentially screwing the barrel adjuster in, decreasing the tension on the cable. This meant that my front shifting, particularly from small ring to big ring, degraded over the course of the race until I realized what seemed to be happening and made adjustments. I should be able to deal with this better in the future now that I’m aware of what’s going on. Of course, I may not be racing again with this same drivetrain because…

XTR is on the way! Assuming Shimano ever manages to ship the whole thing, anyway! I’ve got a whole pile of fancy brakes and drivetrain bits in the basement but I lack a crankset and front shifter lever! More on that later!

Race results are at webscorer.

Beezley Burn Race Report

Rode hard out of the starting gate and put a lot of hurt into my competitors (not to mention my own legs!). The first 2 km were on a wide open straight fire road with a headwind. I didn’t really want to be leading the charge into the wind, but I didn’t want to be following anybody else’s pace either. Shortly before we entered the singletrack, I let Peter Super and Steven Moe pass, and the three of us remained together for most of the first lap. Steven ran out of gas late in the lap, leaving me and Peter together, with Peter leading. Staying on his wheel was tough, but manageable.

After some time, we were caught by Matthew Faunt in the 19-34 category. He didn’t stick around long, and to my eye looked easily ready to upgrade to Cat 1. He passed us at an intersection at the top of a short climb where the trail marker had fallen over and it wasn’t clear which way to go. Peter and I both paused, but Matthew appeared to have pre-ridden the course and took off in the right direction without hesitation. I never really saw him again. Peter gained several seconds on me at this point, accelerating faster than me following our moment of indecision. I chased hard to catch Peter again, and wasn’t sure I’d manage to do so. After maybe 4 km of chasing, I caught him again just past the feed zone entering the second lap.

I led into the singletrack off the long fireroad straightaway, with Peter right on my wheel, then we exchanged positions on the next stretch of road. We rode together for the first half of this lap. When the trail opened up into some wider doubletrack around the midpoint of the lap, I decided to test Peter’s legs with an attack. He didn’t appear to respond, which was a good sign, but there was still a fair bit of racing to go and he didn’t necessarily need to respond immediately. If he had the energy, he could take his time and reel me in slowly. However, when I looked back after a leg-burning climb and saw him struggling to get up some of the punchy steep sections, I knew I had a good chance to hold him off over the last few kilometers. In the end, I managed to do exactly this. The only disappointment was being caught within a kilometer of the finish by the second and third place riders from the 19-34 age group; I didn’t really have the gas to stay with them as they finished with a strong head-to-head sprint.

Some technical details, mostly for my own reference:

Shock setup:

Air pressure:    110 psi
Rebound damping: 2 clicks
Notes:           Ran wide-open (descend mode). Generally felt good; didn’t
                 lose efficiency. Need more rebound damping (already
                 increased 2 clicks, post-race).

Fork setup:

Air pressure:    75 psi
Rebound damping: ?
Notes:           Felt soft, so I ran in Trail Mode, which helped. Traction
                 was good. Want Descend Mode to feel a little closer to this
                 setup.

Tire pressure was 30 psi, which is higher than I often run. Worked out well though.

Other notes:

  • Rear derailleur worked well; shifting felt good throughout.
  • Front derailleur = sadness. Chain kept falling off the big ring onto the small ring. Chain stretch or limit adjustment issues.
  • Brake pads were brand new and hadn’t been fully bedded in. Felt really grabby early on but fine later on.
  • Short enough race that there was no need/time to eat.
  • Not a lot of time to drink. I brought 2 bottles, each 2/3 full. Drank most of the first bottle during and just before the first lap; very little of the second bottle during the second lap.

Building OpenWRT With Docker

I’ve run OpenWRT on my home router for a long time, and these days I maintain a couple of packages for the project. In order to make most efficient use of the hardware resources on my router, I run a custom build of the OpenWRT firmware with some default features removed and others added. For example, I install bind and ipsec-tools, while I disable the web UI in order to save space.

There are quite a few packages required for the OpenWRT build process. I don’t necessarily want all of these packages installed on my main machine, nor do I want to maintain a VM for the build environment. So I investigated using Docker for this.

Starting from a base jessie image, which I created using the Docker debootstrap wrapper, the first step was to construct a Dockerfile containing instructions on how to set up the build environment and create a non-root user to perform the build:

FROM jessie:latest
MAINTAINER Noah Meyerhans <frodo@morgul.net>

RUN DEBIAN_FRONTEND=noninteractive apt-get update && apt-get -y install \
asciidoc bash bc binutils bzip2 fastjar flex git-core g++ gcc \
util-linux gawk libgtk2.0-dev intltool jikespg zlib1g-dev make \
genisoimage libncurses5-dev libssl-dev patch perl-modules \
python2.7-dev rsync ruby sdcc unzip wget gettext xsltproc \
libboost1.55-dev libxml-parser-perl libusb-dev bin86 bcc sharutils \
subversion

RUN adduser --disabled-password --uid 1000 --gecos "Docker Builder,,," builder

And we generate a docker image based on this Dockerfile per the docker build documentation, e.g. by running docker build -t jessie/openwrt . from the directory containing the Dockerfile. At this point, we’ve got a basic image that does what we want. To initialize the build environment (download package sources, etc), I might run:

docker run -v ~/src/openwrt:/src/openwrt -u builder -t -i jessie/openwrt sh -c "cd /src/openwrt/openwrt && scripts/feeds update -a"

Or configure the system:

docker run -v ~/src/openwrt:/src/openwrt -u builder -t -i jessie/openwrt make -C /src/openwrt/openwrt menuconfig

And finally, build the OpenWRT image itself:

docker run -v ~/src/openwrt:/src/openwrt -u builder -t -i jessie/openwrt make -C /src/openwrt/openwrt -j3

The -v ~/src/openwrt:/src/openwrt flags tell docker to bind mount my ~/src/openwrt directory (which I’d previously cloned using git) to /src/openwrt inside the running container. Without this, one might be tempted to clone the git repo directly into the container at runtime, but the changes to non-bind-mount filesystems are lost when the container terminates. This could be suitable for an autobuild environment, in which the sources are cloned at the start of the build and any generated artifacts are archived externally at the end, but it isn’t suitable for a dev environment where I might be making and testing small changes at a relatively high frequency.

The -u builder flags tell docker to run the given commands as the builder user inside the container. Recall that builder was created with UID 1000 in the Dockerfile. Since I’m storing the source and artifacts in a bind-mounted directory, all saved files will be created with this UID. Since UID 1000 happens to be my UID on my laptop, this is fine. Any files created by builder inside the container will be owned by me outside the container. However, this container should not have to rely on a user with a given UID running it! I’m not sure what the right way to approach this problem is within Docker. It may be that someone using my image should create their own derivative image that creates a user with the appropriate UID (creation of this derivative image is a cheap operation in Docker). Alternatively, whatever Docker init system is used could start as root, add a new user with a specific UID, and execute the build commands as that new user. Neither of these seems as clean as it could be, though.
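For completeness, here is a sketch of the derivative-image idea (untested; the user-replacement commands and image name are assumptions, not something I actually run): generate a small Dockerfile that recreates the builder user with the invoking user’s UID, then build a local image from it.

```shell
# Write a derivative Dockerfile that recreates "builder" with the host
# user's UID. (UID is read-only in some shells, hence HOST_UID.)
HOST_UID=$(id -u)
cat > /tmp/Dockerfile.local <<EOF
FROM jessie/openwrt
RUN deluser builder && adduser --disabled-password --uid $HOST_UID --gecos "Builder,,," builder
EOF
# The local image would then be built with something like:
#   docker build -t jessie/openwrt-local -f /tmp/Dockerfile.local .
grep "uid $HOST_UID" /tmp/Dockerfile.local
```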

In general, Docker seems quite useful for such a build environment. It’s easy to set up, and it makes it very easy to generate and share a common collection of packages and configuration. Because images are self-contained, I can reclaim a bunch of disk space by simply executing “docker rmi”.

Spamassassin Updates

If you’re running Spamassassin on Debian or Ubuntu, have you enabled automatic rule updates? If not, why not? If possible, you should enable this feature. It should be as simple as setting "CRON=1" in /etc/default/spamassassin. If you choose not to enable this feature, I’d really like to hear why. In particular, I’m thinking about changing the default behavior of the Spamassassin packages such that automatic rule updates are enabled, and I’d like to know if (and why) anybody opposes this.
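The change itself is a one-liner. Sketched here against a scratch copy of the file (on a real system you’d edit /etc/default/spamassassin directly, with root privileges):

```shell
# Work on a temporary copy for illustration; the real file is
# /etc/default/spamassassin.
F=$(mktemp)
printf 'ENABLED=1\nCRON=0\n' > "$F"
# Flip CRON=0 to CRON=1 to enable the nightly rule-update cron job.
sed -i 's/^CRON=0/CRON=1/' "$F"
grep ^CRON "$F"
```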

Spamassassin hasn’t been providing rules as part of the upstream package for some time. In Debian, we include a snapshot of the ruleset from an essentially arbitrary point in time in our packages. We do this so Spamassassin will work “out of the box” on Debian systems. People who install spamassassin from source must download rules using spamassassin’s updates channel. The typical way to use this service is to use cron or something similar to periodically check for rule changes via this service. This allows the anti-spam community to adapt quickly to changes in spammer tactics, and allows you to actually benefit from their work by taking advantage of their newer, presumably more accurate, rules. It also allows for quick reaction to issues such as the ones described in bugs 738872 and 774768.

If we do change the default, there are a couple of possible approaches we could take. The simplest would be to change the default value of the CRON variable in /etc/default/spamassassin. Perhaps a cleaner approach would be to provide a “spamassassin-autoupdates” package that would simply provide the cron job and a small wrapper program to perform the updates. The Spamassassin package would then specify a Recommends relationship with this package, thus providing the default enabled behavior while still providing a clear and simple mechanism to disable it.