Here’s the thing about running your own cloud infrastructure: once you make the decision to rely on it, then it had better work. The whole thing. Every part of it. Under heavy load. All the time.
Obvious, right? But it bears repeating. When you decide to make the move to doing things The Cloud Way, you are placing a gigantic bet on your infrastructure layer — and that bet is placed not only on the Cloud As A Whole, but on every individual component that comprises that cloud. In the open source world, these are frequently components that you didn’t write and do not control. I can assure you that customers don’t care in the least.
At Eucalyptus, we have smart and demanding customers, with extremely high expectations. They are not content with assurances that things will be production-ready at some magical release point in the future. They don’t care whether the bugs are in the cloud controller code, or the node controller code, or in libvirt, or in the kernel. They are using Eucalyptus at extreme scale, right now, to solve extreme business problems, right now. Which means that when their cloud breaks, they expect fixes right now — and if that means libvirt patches or kernel patches, that’s what it means. That’s why they give us all that nice money. That’s why customers pay us for free software.
Our customers try to squeeze every ounce of performance out of their machines; that’s part of the point of having a cloud, after all. And when the virtualization technologies we depend upon experience heavy load over a long period of time, we see some crazy things. Like segfaults in libvirtd, for instance. Or libvirt handlers that suddenly and inexplicably lose their mind. Or other weird occurrences that might lead one to believe that libvirt isn’t quite as thread safe as advertised. These failures may only occur at times of very high load, and they may not happen often — but they do happen. And when they happen, we have to handle them. The 3.1.2 release is the result of many hours of hard work by our engineers to find and fix these issues.
It’s a challenge and a privilege to serve customers like this. At times it can put incredible stress on the entire organization — and it’s at precisely these times when we are at our very best. Watching great engineers solve critical problems under pressure is a lot like watching great athletes at the end of a big game — and when they win, it’s just as exhilarating. These engineers are at the heart of what we do. Compared to them, I’m just selling tickets and fetching Gatorade.
It’s not that hard to put together a bunch of components and call it a cloud. But making a cloud bulletproof? That’s hard. And that, friends, is where we are the best in the world.
…is that any time a user runs into problems figuring out how to do something with Eucalyptus, it’s quite likely that the corresponding AWS procedure, as documented by the AWS community, will “just work”.
Example: growing an EBS volume. The commands listed here:
…basically work by switching ec2-tools and euca2ools. It’s nice to have that kind of knowledge base to fall back on, even as a starting point that may need to change subtly for euca-specific cases.
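As a rough illustration of that switch, here is a tiny sketch that rewrites the command name in an AWS-style command line to its euca2ools counterpart. The helper name to_euca is made up for this example, and the volume ID is a placeholder. It assumes only the ec2- prefix changes, which will not hold everywhere: flags and output can differ, so treat the result as a starting point.

```shell
#!/bin/sh
# to_euca: hypothetical helper, not part of euca2ools.
# Rewrites a leading "ec2-" command prefix to "euca-" and leaves the
# arguments untouched. Assumes the rest of the command line carries over as-is.
to_euca() {
    printf '%s\n' "$1" | sed 's/^ec2-/euca-/'
}

# Example: translate a command copied from AWS documentation.
# The volume ID is a placeholder.
to_euca "ec2-describe-volumes vol-12345678"
# prints: euca-describe-volumes vol-12345678
```

Anything that does not start with ec2- passes through unchanged, so the helper is safe to run over a whole list of pasted commands.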
“I never see what has been done; I only see what remains to be done.” –Marie Curie
Automated installers are great. When they work, they work really well — but when they don’t, not only do they not work, but they bring great sadness to the hopeful user who trusted your automated installer. Tragic! Heartbreaking.
So why don’t automated installers work, when they don’t work? In almost every case, it’s because there’s a condition your installer assumes that isn’t met. And in this day and age, you don’t have just one installer for your software; you’ve got multiple potential install+config tools: multiple package managers, multiple configuration tools, multiple permutations of hardware, multiple permutations of hypervisor, multiple permutations of network topology. Which means that you’d better do a *great* job of figuring out your environment before you try to lay down and configure your bits.
Enter Nurse Euca. Nurse Euca will run before any install and take everyone’s temperature, offer an aspirin or a splint where needed — or will let you know if one of your requirements is Dead On Arrival. (“I’m sorry, Doctor, but em1 appears to be in septic shock. I recommend against resuscitation.”) (That’s totally gonna be an error message, btw.)
Awesome, right? Well, it will be when we write it. We’ve got bits and pieces of these kinds of checks in various places. On Friday we will be having a hackfest to pull these threads together and get Nurse Euca jumpstarted.
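Since Nurse Euca doesn't exist yet, what follows is only a sketch of the kind of check described above, not the project's code. Every function name here is hypothetical, and the interface check assumes a Linux host with /sys mounted.

```shell
#!/bin/sh
# Hypothetical preflight checks in the spirit of Nurse Euca; none of these
# function names come from the actual project.

# check_command NAME: succeeds if NAME is found on the PATH.
check_command() {
    command -v "$1" >/dev/null 2>&1
}

# check_interface NAME: succeeds if the kernel exposes a network interface
# called NAME. Assumes a Linux host with /sys mounted.
check_interface() {
    [ -e "/sys/class/net/$1" ]
}

# report LABEL CHECK [ARGS...]: runs a check and prints a one-line diagnosis.
# Returns nonzero when the check fails so callers can stop the install.
report() {
    label=$1
    shift
    if "$@"; then
        printf 'OK    %s\n' "$label"
    else
        printf 'DOA   %s\n' "$label"
        return 1
    fi
}

# Example round: take the temperature of a couple of assumed requirements.
report "shell available" check_command sh
report "loopback interface lo" check_interface lo || true
```

The point of the shape is that each requirement is a small check with a plain pass/fail result, so an installer can run the whole round first and refuse to lay down any bits if something comes back DOA.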
Hey, here’s a mostly-empty Github repo! By the end of Friday, it won’t be.
We’ll be on freenode, #eucalyptus-devel, at 7am Eastern US time. Yes, it’s early; we’ve got some friends in the Old World who will be hugely helpful, so we’ve chosen the time to accommodate them. See you then.
Our next hackfest is this Friday, August 10th, from 11am to 2pm Pacific Time, in #eucalyptus-devel on freenode. (If you show up at #eucalyptus or #eucalyptus-meeting, we will kindly direct you the right way.) We will be working on integrating with AppScale.
You don’t know about AppScale, you say? Well, you should. AppScale is an open source platform for Google App Engine apps. The idea is that many applications designed to run on Google App Engine should “just work” with AppScale and your own cloud infrastructure.
There’s an Ubuntu-based AppScale image already built and ready to go for Eucalyptus; in the next couple of days, we’re going to get that running on our Eucalyptus Partner Cloud, and then we will see if we can get some of this App Engine code running on top of it. By the end of the hackfest, we hope to have a few of these sample apps up and running, and filed away as Eucalyptus recipes.
If you want to join in the fun, just drop a line to the Eucalyptus community list and we’ll be happy to set you up with all the tools you’ll need. See you on IRC.
Well, that was fun. :)
Some lessons learned from this week’s inaugural Eucalyptus hackfest:
1. Make sure we’ve got the right image prepped. We could have sworn that we needed F17 for OpenShift Origin — turns out we needed F16. We were halfway through our allotted time before we had a suitable F16 image.
2. OpenShift Origin is *big*. There are a *lot* of packages. There are the packages you need to install to get rake working, and then there are the packages that the rake script installs… and *then* there are the packages that rake *builds* (which is why it installs mock on your instance; we were wondering about that, and then we found out). My large image couldn’t keep up; Andy finally had some success with an x-large image.
3. I like cloud-init in F17 way better than I like it in F16, because it gives me better log files.
4. Two hours isn’t enough time to finish a hackfest, but it’s definitely enough time to get a good start, and to get excited about what you’re working on. Next up: tackling configuration issues.
Thanks to all the folks who showed up. Looking forward to next week’s hackfest, whatever that may be.
We’re going to be starting up our weekly IRC hackfests on #eucalyptus-devel next week.
There’s a lot of cool integration work of various kinds that we want to do with Eucalyptus, and it’s the kind of work that’s best done with many hands. A lot of it is just “getting X to run on Eucalyptus,” and we want to fill in as many possible values of X as we can. Thus, hackfests.
The goal is to have at least a couple of hours of uninterrupted hacking time every week, and we’re going to aim for end of week, either Thursday or Friday afternoon. Figuring out timing is always an issue, so for now we’re just going to pick a time and see how it works out. The first hackfest will be noon-2pm Pacific time on Thursday, August 2nd on #eucalyptus-devel. This will overlap somewhat with the standing recipes meeting, but since we’ll likely be working on recipes much of the time, I think we can swing it. We expect to have a few core people present at these hackfests every single week, but of course, the more the merrier. It’s also perfectly fine for people to drop in and drop out as they may be available.
Our first target will be OpenShift Origin integration — so we’ll be all over the #openshift channel on freenode, and dragging as many of you as we can to #eucalyptus-devel in the process. :)
(update: what we’re working on is actually integration of “OpenShift Origin” — the bits that are used to make the OpenShift service, which is trademarked by Red Hat, etc., etc. Must respect the brand. Post updated accordingly.)
One of the projects I’m enjoying working on right now is the Eucalyptus Recipes project, which you can find on Github. I actually hacked together some code, and even checked it in! Needless to say, patches welcome. And if “patches” means “complete replacement with better code,” that’s fine also.
The goal is to build a collection of recipes (small right now, but growing) that any Eucalyptus user can inject into the boot process of an instance at start time, using cloud-init or a similar mechanism. Simple predefined Euca image + Euca recipe of your choice = fully configured software appliance. Because all Eucalyptus users have access to a standardized set of pre-built images, we can be relatively sure that any recipe that builds atop a particular image will build properly anywhere that image runs.
This is in contrast to an image-based approach, to which AWS users have become accustomed. There are thousands of pre-built AMIs out there from which AWS users can pick and choose. That’s good, because there are images for almost every imaginable need — but it’s also problematic in a lot of ways. These AMIs are basically opaque. You don’t know what’s in them, you don’t know who built them, you don’t know how they were built, and until you actually run one, you don’t know what they actually do. The new improved AWS image catalogue will help this some, but it’s a problem inherent to the image model.
At Eucalyptus, we’re working on an images project as well, but I believe that the recipes approach holds more promise in the near term. Here’s why:
1. Storage. Eucalyptus provides a mechanism for users to fetch a set of predefined Eucalyptus machine images (EMIs). One day, we may provide a huge catalog of pre-built EMIs, but in the short term, we’re not really set up to host such a thing. With the recipe approach, we can concentrate on providing a small set of minimal EMIs for the major distros, and we can test them thoroughly so that they make a strong base for building from.
2. Ease of customization. In a pre-baked image, the configuration is fixed. If you want to change how the image works, it means hacking the image in place and rebundling it. That’s a pain, especially for, say, changing the MySQL root password for your spiffy WordPress install. Following the recipe approach, you just fork the recipe, replace passwords and other sensitive options in the forked recipe itself, and then build with the forked recipe.
3. Education! Read the recipe, and you can see how the application is actually built and configured. This is important to me personally; I distrust black boxes, and when I was a heavy AWS user, it was one of the things that made me nervous. Four Kitchens made a great Drupal+Varnish AMI available, and it “just worked”, which was pretty sweet and saved me a bunch of time — but I lived in a low-grade fear that if something went wrong, I wouldn’t understand how it was configured. My hope is that we end up with some very well-documented and interesting recipes that also teach people a little bit about how things work along the way.
4. Community development. If an AMI or an EMI is broken, patching it basically means creating an entirely new image that has no evident relationship to the old one. There’s really no clear concept of “upstream” with an image, and no simple way to collaboratively improve upon it. Defining an appliance as a script in Github, on the other hand, makes collaborative development and improvement of that appliance comparatively straightforward; it works just like any other open source project.
5. Integration with complementary tools. I wrote my first recipe in bash, because when it comes to coding I’m a bit simple, really, and nothing to be done. And it’s not as though this recipe notion is a new one; Puppet and Chef both have emerging forges with recipe collections of their own, and two of the first recipes we wrote were for Chef and Puppet bootstrappers. I’m not quite sure how it will work, but it’s pretty clear that many of the recipes will be “hey, make sure Puppet is running, and then go get that Puppet recipe from over there and run it.” One of the recipes I checked in recently sets up nginx based on the Puppet forge recipe.
6. Amazon compatibility. There’s no reason in the world that these recipes shouldn’t work on AWS as well. It’s my hope to add “tested with these AMI IDs” as part of every recipe’s documentation.
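To make the customization point (item 2) concrete, here is a minimal sketch of forking a recipe and swapping in your own setting. It assumes the recipe keeps its settings as plain NAME=value lines, which is a convention I'm inventing for the example rather than something the recipes repo promises. The helper name customize_recipe, the variable MYSQL_ROOT_PASSWORD, and the throwaway recipe contents are all hypothetical.

```shell
#!/bin/sh
# customize_recipe SRC DST NAME VALUE: hypothetical helper that copies the
# recipe SRC to DST, replacing the line that assigns NAME with NAME='VALUE'.
# Assumes settings are plain NAME=value lines and that VALUE contains no
# single quotes, "|", "&", or backslashes.
customize_recipe() {
    sed "s|^$3=.*|$3='$4'|" "$1" > "$2"
}

# Demo on a throwaway recipe; the file contents are invented for illustration.
tmpdir=$(mktemp -d)
cat > "$tmpdir/recipe.sh" <<'EOF'
#!/bin/bash
MYSQL_ROOT_PASSWORD=changeme
yum -y install mysql-server
EOF

customize_recipe "$tmpdir/recipe.sh" "$tmpdir/forked.sh" MYSQL_ROOT_PASSWORD 's3cret'
cat "$tmpdir/forked.sh"
rm -rf "$tmpdir"
```

The forked file is still an ordinary script, so it can be read, diffed against its upstream, and passed to an instance at launch like any other recipe. Compare that with hacking an image in place and rebundling it.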
To be clear, there are also a couple of downsides to the recipe approach:
1. Time to instantiation. The image versus recipe dispute is age-old, and one reason people have traditionally chosen to run from images is that they are “ready” so much more quickly. Going from image to fully functioning instance in Eucalyptus takes seconds; going from image, to recipe, to fully functioning instance can take minutes. When that difference matters, images are still the way to go, although I still think the right approach is to use a recipe to create an instance, and then to snapshot that instance and store it as the deployable image.
2. Proprietary applications. There will doubtless be organizations that will want to deliver proprietary software appliances to Eucalyptus users. This mechanism may not be suitable for those providers, since it’s fairly incompatible with secret sauces.
As it turns out, recipe building is also a perfect use case for our Eucalyptus Community Cloud. The ECC is intended to give potential Eucalyptus users a sense of how Eucalyptus works — but because the ECC is small and resource-constrained, we kill instances every six hours or so. When writing recipes, though, iteration is the name of the game, so it’s perfect. I wrote a Drupal 6 recipe over a weekend using the ECC.
Want to check out a recipe on the ECC? Simple stuff:
* Install euca2ools on your local system. It’s yum/apt-get installable from most repos at this point.
* Get your account on the ECC.
* Download and source your credentials for the ECC. Be sure to set up your ssh keys as well.
* Get the recipes repo:
git clone https://github.com/eucalyptus/recipes.git
* Get a list of images available by running euca-describe-images. Pick the base image you want to start from. The ID of the vanilla CentOS 6.2 image is emi-D482103E.
* Start your instance with the recipe, for instance:
euca-run-instances -k yourkey emi-D482103E -t m1.large -f centos6_nginx.sh
* ssh into your instance and watch the show. (For me, this was mostly tailing the yum log.)
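If you end up launching recipes a lot, the last couple of steps are easy to wrap. The sketch below only prints the euca-run-instances command line rather than running it, and first checks that the recipe file starts with "#!", since cloud-init runs user data as a script when it begins that way. The function names is_script_recipe and launch_cmd are hypothetical, and nothing here talks to the ECC.

```shell
#!/bin/sh
# is_script_recipe FILE: succeeds if FILE is readable and its first line
# starts with "#!", which is how cloud-init recognizes a user-data script.
is_script_recipe() {
    [ -r "$1" ] || return 1
    IFS= read -r first_line < "$1" || [ -n "$first_line" ] || return 1
    case $first_line in
        '#!'*) return 0 ;;
        *) return 1 ;;
    esac
}

# launch_cmd KEY EMI TYPE RECIPE: hypothetical helper that prints the
# euca-run-instances command for a recipe launch without executing it.
launch_cmd() {
    if ! is_script_recipe "$4"; then
        printf 'refusing: %s does not look like a "#!" script\n' "$4" >&2
        return 1
    fi
    printf 'euca-run-instances -k %s %s -t %s -f %s\n' "$1" "$2" "$3" "$4"
}

# Example, using the values from the steps above. Only runs if the recipe
# file is present in the current directory.
if [ -f centos6_nginx.sh ]; then
    launch_cmd yourkey emi-D482103E m1.large centos6_nginx.sh
fi
```

Printing the command instead of executing it keeps the sketch harmless; once the output looks right, paste it into a shell where your ECC credentials are sourced.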
So, it’s the beginning of a thing. Like all beginnings of all things, its future is uncertain — but it feels useful to me, and I hope that we can build some value with it in the coming weeks and months.
Oh, also: see you at OSCON.
Eucalyptus 3.1 is open for business.
No more artificial separation between Enterprise and Community. No more frenzied checkins to the “enterprise edition” while the separate-but-equal “community version” atrophies. No more working on new features behind closed doors for months on end. No more wondering about what’s on the roadmap. No more going weeks without any publicly visible check-ins. No more.
Today is the day that we release Eucalyptus 3.1, and reassert our position as the world’s leading open source cloud software company. With the emphasis on open source. We’ve been working to get to this day for months, and now, the day has come.
For those who want to get started with the new bits immediately, the Faststart installer can be found here. With two virt-capable laptops installed with CentOS 6.2 minimal, you can have a private cloud running in 15 minutes if you follow the directions, and a few hours if you don’t. :)
Package repositories for the various distributions can be found here.
A list of all currently known bugs in 3.1 can be found here.
The list of features we’re currently scoping for 3.2 can be found here.
We have lots of other projects moving forward on Github as well. Projects like Eutester for automated testing of Eucalyptus (and Amazon) instances, Recipes for automated deployments of Eucalyptus (and Amazon) instances, our nextgen installer Silvereye, and many others.
All of these projects are open to community participation and transparently managed. We hold weekly meetings on IRC. You can find the weekly meeting schedule here. Minutes for all meetings for the past six months can be found here.
We’re also hiring.
“Build together. Run together. Manage together.” That’s been the mantra for this release, and it speaks directly to the culture of our company. If I learned anything at Red Hat, it’s that company culture matters. It literally makes or breaks the company. Especially in open source: either you’re an open source company, or you’re not. We are deeply committed to the open source model, because we believe that it creates the best software, and we’re going to prove it.
The most exciting thing about today’s release, to me, is that we’re only getting started. It’s been a long climb to get to this plateau. We’ve still got a lot of mountain yet to climb, though, and we’re looking forward to the challenge — but that can wait for another day. Maybe two. Today is about appreciating where we’ve been, and enjoying the view.
Well done, Eucalyptians. Well done.
Note: beta still means beta. We’re aiming for release candidates for Eucalyptus 3.1 within the next month or so. Still, these packages are pretty stable for us so far, pass the majority of our ridiculous battery of QA tests, and are altogether suitable for a quick install to see what the fuss is all about. And it’s a whole lot simpler than building from source.
It’s taken a while, but the move is complete. The source code for Eucalyptus 3.1 Beta is open and publicly available in Github. It’s actually been there for a while now, but we’ve done enough housekeeping and we’re ready to open the doors.
Build instructions can be found in the INSTALL file, but they are still in flux; comments and patches are welcome. Don’t hesitate to join us on #eucalyptus on freenode or on our community mailing list if you have questions.
Packages for the beta will be available for various distros in the coming days. Special props go to Debian Partner company Credativ for their impressive work on the Google Web Toolkit libraries.
We’re also working on our new bug tracker; we’re in private beta to work through various auth and workflow kinks. If you’re interested, ask for access on IRC or the mailing list, and we will set you up. After this beta period is concluded, we will open the new bugtracker to anyone and everyone — but we’re happy to give early access to anyone who asks.
This is another critical step in our evolution as an open source company. But we’re not done yet. Stay tuned.