Eucalyptus 3.3 Show and Tell. Y’all come!

After every sprint, the Eucalyptus engineers hold a Show and Tell session, where they share what they’ve been working on with everyone else in the company.  Technically, these Show and Tell sessions are for “product management validation acceptance” or some such blah blah blah — but I like it because it lets all of the engineers strut their stuff, so they can be stars inside the company.

Well, bleep that.  I say let ’em be stars outside of the company too!

Thus: on Wednesday March 13th, at noon Pacific time, we will be opening our Show and Tell to the entire world.  It’ll probably run a couple of hours or so.  Some of the goodies we’ll be showing for 3.3 Sprint 4:

* nearly full implementations of ELB, Cloudwatch, and Autoscaling

* mechanisms for node evacuation and instance migration

* user console improvements and features

* next generation storage adapters

We’ll run a GoToMeeting session for the audio and screen sharing, and we will be on #eucalyptus-meeting on freenode for those who want to ask questions of the Eucalyptus engineering team.  Time allowing, of course, since we’re going to have a lot to show off. 🙂

Sign up now and we’ll see you online on Wednesday.

Extending the Eucalyptus lead in AWS compatibility

Recently, we released our latest milestone build for Eucalyptus 3.3.  Go take it for a spin. This is a big one, since it incorporates, for the first time, functional versions of the “Big Three” AWS services we’ll be releasing later this spring: Autoscaling, Elastic Load Balancing, and Cloudwatch.  It also presents a good opportunity to step back and look again at our AWS compatibility story.

It’s no secret that Eucalyptus believes in the power of the Amazon Web Services API. Amazon continues to be the dominant public cloud, and they continue to widen the gap between themselves and the other public cloud providers.  This is thanks, in large part, to an API that is well considered, well documented, and has therefore spawned a powerful and growing ecosystem.  The rule of thumb for those who are developing code for the public cloud has thus become “make it work for Amazon first” — to such a degree that even VMware is running scared.

Such is the strength of the de facto standard.  The abstractions that are now being developed by every other cloud provider are either strongly influenced by, or derived directly from, the abstractions pioneered by AWS.  Every single cloud provider has their own alternatives to EC2 and S3 — but AWS, having mastered these services long ago, is now moving quickly through a set of higher level abstractions.  It is possible — perhaps even likely — that the AWS API, as the continual representative of the newest and best cloud abstractions, will emerge as a dominant standard for cloud applications in much the same way that the LAMP stack emerged as the dominant standard for web applications.  It will simply be assumed that if you want to provision applications rapidly at scale, you will either write an application that uses the AWS API directly, or you will depend upon a PaaS that works first and best with the AWS API.

To us, it has always made perfect sense to follow Amazon’s strong lead.  To build the strongest private cloud, build the best possible complement to the strongest public cloud.  And to ensure that users always have options, make sure that the private cloud in question is open source.

The next, obvious question: how do we do that, exactly?

We’ve been refining our approach for more than half a decade.  Which is about a thousand cloud years.  Our approach can now be boiled down to two fundamentals:

1. Implement services as closely as possible to the way Amazon implements them.  Which means paying close attention to every detail of every API interaction. It means closely tracking all compatibility issues we find, and treating them as critical.  It means writing tons of in-depth tests that can run against both Eucalyptus and AWS. And it means using the AWS WSDL to construct service stubs, which dramatically improves the speed with which we can produce new features, as we’re doing with this release.

2. Leverage the AWS ecosystem to define and test the limits of compatibility. Which means relentlessly testing third-party tools and libraries against Eucalyptus, and being satisfied only when those tools work as well against Eucalyptus as they would work against AWS. Compatibility is a journey, and we will consider ourselves “compatible” when our users can treat Eucalyptus as their own personal region of AWS.  Think of it as a hybrid cloud Turing test.
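One way to picture "tests that can run against both Eucalyptus and AWS" is a check that compares the structure of the two replies to the same request. The sketch below is an illustration only, not our actual test suite: the function names and the sample reply fields are made up, and a real test would fetch the replies from live endpoints.

```python
def shape(value):
    """Reduce a reply to its structure: keys and value types, not values."""
    if isinstance(value, dict):
        return {key: shape(val) for key, val in value.items()}
    if isinstance(value, list):
        # Describe a list by the shape of its first element, if it has one.
        return [shape(value[0])] if value else []
    return type(value).__name__


def _diff(ref, cand, path):
    """Walk two shapes in parallel and collect the differences."""
    if isinstance(ref, dict) and isinstance(cand, dict):
        problems = []
        for key in sorted(ref.keys() | cand.keys()):
            where = f"{path}.{key}" if path else key
            if key not in cand:
                problems.append(f"missing: {where}")
            elif key not in ref:
                problems.append(f"extra: {where}")
            else:
                problems.extend(_diff(ref[key], cand[key], where))
        return problems
    if isinstance(ref, list) and isinstance(cand, list):
        # Empty lists carry no structure to compare.
        return _diff(ref[0], cand[0], path + "[]") if ref and cand else []
    if ref != cand:
        return [f"type mismatch at {path}: {ref} vs {cand}"]
    return []


def shape_diff(reference, candidate):
    """List the places where candidate's structure departs from reference's."""
    return _diff(shape(reference), shape(candidate), "")


# Hypothetical replies to the same "describe volume" request.
aws_reply = {"volumeId": "vol-aaaa", "size": 10, "status": "available",
             "attachments": [{"device": "/dev/sdf"}]}
euca_reply = {"volumeId": "vol-bbbb", "size": "10", "status": "available"}

for problem in shape_diff(aws_reply, euca_reply):
    print(problem)
```

Values are allowed to differ (the volume IDs always will); only missing fields, extra fields, and changed types count as compatibility issues here.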

All that said, we strongly encourage a diversity of tools that create high-level compatibility between AWS and other public cloud services, and we see some value in tools like AWSome and Deltacloud and CloudBridge and the like.  Over time, as other cloud vendors continue to refine their own versions of the AWS abstractions, it will be easier to make common high-level tools that encompass actions across all the various public and private clouds, and that may make it a bit easier for developers to write code that is truly portable across many different clouds.  (Then again, maybe not.)  In the meantime, though, we will let others focus on Broad Compatibility, while we continue our laser focus on Deep Compatibility, and the advantages that we will provide to our users as a result.

Anyway. Go install the latest milestone build for yourself.  We’ve also got some great demo scripts that will allow you to try out the latest and greatest functionality. Don’t hesitate to drop by on IRC (#eucalyptus on freenode) and let us know what you think.

AMI-to-EMI: ACHIEVEMENT UNLOCKED!

The ami2emi project has been moving along. As in, it actually works for a number of cases now. Configure cloud parameters for AWS and Euca, run a script, and boom: your chosen AMIs are brought to life on your Euca cloud. When it works, it Just Works. It’s cool. 🙂

Note that these cases do *not* yet cover configuration of the applications themselves — just the images.  The bits might still need to be twiddled to get the apps working properly, but all of the actual bits are successfully transferred into the new image, and the new image successfully spins up instances, and you can ssh to them and everything.

The AMIs that can currently be auto-slurped into Eucalyptus successfully share certain characteristics:

1. They carry their own kernels inside the image. On the AWS side, that means they’re linked against the stock pv-grub kernels, and we link them similarly to the kexec-loader kernels on the Euca side.

2. They’re instance-store images, rather than EBS-based images. At least so far.

You can find the list of currently tested images here. That list will expand rapidly as we have time to run more test cases.

Note the large number of Bitnami instances in this list. That must be because they’re Crazy Awesome.

So, give it a whirl. Patches exceptionally welcome, since it’s heinous bash scripting and I can use all the help I can get. At least I document my code, sorta. 🙂

Converting AMIs to EMIs

One of the most common questions I’ve heard from Eucalyptus users is this one: “how easy is it to convert an AMI to an EMI?” And the answer is: not as easy as it should be.

We’ve got some process guidelines on our wiki, thanks to Tim Gerla — but we should have the ability to do much of this automagically.  So that’s the tool I’ve started work on.  In the spirit of “release early release often”, I’ve uploaded a very early iteration of this tool.  Find the brand new Github repo here.

It’s quite a ways yet from being prime-time, but it’s allowed me to do quite a bit of testing.  Some assumptions I’ve started with:

* First, I pulled a list of all public AMIs on us-east-1.  There were about 20,000 public images available there as of mid-November.

* Then I selected the subset of AMIs that were built with a PV-GRUB kernel, and I’m importing them to an instance of Eucalyptus running the kexec-loader kernel.  In both cases, the AKI/EKI is just a bootstrapping mechanism that then hands control over to the image’s own kernel, so we shouldn’t get caught up by kernel incompatibilities.  Using only the subset of AMIs with PV-GRUB AKIs leaves us with about 7,000, and picking one particular AKI (aki-825ea7eb) gives us about 1,700.

* From there, we’re working with individual distros, and there will be idiosyncrasies between the various distros out there, so it makes sense to pick one and go with it.  There’s a lot of Ubuntu out there on AWS, so I just grepped for AMIs with “ubuntu” in the name.  That’s 1,067 images in my dataset.

* Now that we’ve got a reasonable set of images to examine, here’s the process that the scripts walk through.  For each AMI, we start an AWS instance, ssh to that instance, install euca2ools if they aren’t already installed, scp Euca credentials to the instance, and bundle the instance to the specified Euca cloud.  Then we fire up the resultant EMI on the Eucalyptus side and see if we can ssh in.  And we bail with appropriate error messages at various places along the line.
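The narrowing-down in the list above amounts to three filters applied in a row. Here is a rough sketch of that funnel, not the actual scripts (those are bash). The record layout (`id`, `kernel_id`, `name`) and every ID except aki-825ea7eb are placeholders invented for the example.

```python
def funnel(images, pvgrub_akis, chosen_aki, name_substring):
    """Return the images that survive each filtering step, in order."""
    pvgrub = [img for img in images if img["kernel_id"] in pvgrub_akis]
    one_aki = [img for img in pvgrub if img["kernel_id"] == chosen_aki]
    named = [img for img in one_aki if name_substring in img["name"].lower()]
    return {
        "public": images,
        "pv-grub": pvgrub,
        "single-aki": one_aki,
        "name-match": named,
    }


# Placeholder records; a real run would start from the full public AMI list.
images = [
    {"id": "ami-0001", "kernel_id": "aki-825ea7eb", "name": "ubuntu-12.04-server"},
    {"id": "ami-0002", "kernel_id": "aki-825ea7eb", "name": "Fedora-17"},
    {"id": "ami-0003", "kernel_id": "aki-EXAMPLE2", "name": "Ubuntu-10.04"},
    {"id": "ami-0004", "kernel_id": "aki-EXAMPLE3", "name": "ubuntu-old-kernel"},
    {"id": "ami-0005", "kernel_id": None, "name": "no-separate-kernel"},
]
# Assumed set of PV-GRUB kernel IDs; only aki-825ea7eb comes from the post.
pvgrub_akis = {"aki-825ea7eb", "aki-EXAMPLE2"}

steps = funnel(images, pvgrub_akis, "aki-825ea7eb", "ubuntu")
for label, survivors in steps.items():
    print(f"{label}: {len(survivors)}")
```

Keeping the survivors of every step, rather than only the final list, makes it easy to report how much each filter throws away.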

I’ve run through a few hundred images at this point.  Not one of them has been completely successful from start to finish.  About 10% so far have yielded a bundled EMI that boots and yields a Eucalyptus instance in a running state.  Can’t ssh into any of them yet, though.
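Counting how far each image gets is easier when the walk-through is modeled as an ordered list of stages that stops at the first failure. The sketch below is a stand-in for the real shell scripts: the stage names paraphrase the steps described above, and the stub actions fake the failures instead of touching AWS or Eucalyptus.

```python
from collections import Counter

# Stage names paraphrase the process; they are not the scripts' real names.
STAGES = [
    "launch_aws_instance",
    "ssh_to_aws_instance",
    "install_euca2ools",
    "copy_credentials",
    "bundle_to_euca",
    "boot_emi",
    "ssh_to_euca_instance",
]


def convert(ami_id, actions):
    """Run each stage in order; bail at the first failure and name it."""
    for stage in STAGES:
        try:
            actions[stage](ami_id)
        except Exception as err:
            return stage, str(err)
    return "success", ""


def tally(ami_ids, actions):
    """Count how many images stopped at each stage (or succeeded)."""
    return Counter(convert(ami_id, actions)[0] for ami_id in ami_ids)


def ok(ami_id):
    """Stub action that always succeeds."""


def fail_for(bad_ids, reason):
    """Build a stub action that fails only for the given image IDs."""
    def action(ami_id):
        if ami_id in bad_ids:
            raise RuntimeError(reason)
    return action


amis = ["ami-0001", "ami-0002", "ami-0003"]
actions = {stage: ok for stage in STAGES}
actions["bundle_to_euca"] = fail_for({"ami-0002"}, "bundle upload failed")
actions["ssh_to_euca_instance"] = fail_for(set(amis), "permission denied")

print(tally(amis, actions))
```

With these stubs, one image dies at bundling and the other two boot but refuse ssh, which is roughly the situation described above.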

The good news is that the failures are all quite specific and predictable, and the next steps are clear.  Do a better job of guessing login IDs.  Figure out why fstabs fail.  Look for rogue kernel modules.  Make sure we’re doing key injection properly on the Euca end.  Comb through the results of euca-get-console-output and look for patterns.  The big win is having tools that allow us to do that work in minutes, rather than in hours or days. Every step gets us closer to the goal of fully automated conversions on the fly.
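Combing through console output for patterns can start as a handful of regular expressions and a counter. This is a sketch under assumptions: the patterns below are my guesses at typical Linux boot failure messages, not a list taken from real runs, and the input is assumed to be the text saved from euca-get-console-output for each instance.

```python
import re
from collections import Counter

# Guessed failure signatures; extend these as real console logs come in.
PATTERNS = {
    "kernel panic": re.compile(r"kernel panic", re.IGNORECASE),
    "root filesystem not mounted": re.compile(
        r"unable to mount root|cannot open root device", re.IGNORECASE),
    "fstab entry failed": re.compile(
        r"mount: .*(does not exist|failed)", re.IGNORECASE),
    "missing kernel module": re.compile(
        r"modprobe: .*(not found|could not)", re.IGNORECASE),
}


def classify(console_text):
    """Return the label of every pattern found, or 'unclassified'."""
    labels = [label for label, pattern in PATTERNS.items()
              if pattern.search(console_text)]
    return labels or ["unclassified"]


def summarize(console_texts):
    """Count how often each label shows up across many console logs."""
    counts = Counter()
    for text in console_texts:
        counts.update(classify(text))
    return counts


# Made-up console snippets for illustration.
logs = [
    'VFS: Cannot open root device "sda1"\n'
    "Kernel panic - not syncing: VFS: Unable to mount root fs",
    "mount: special device /dev/sdb does not exist",
    "Ubuntu 12.04 LTS ttyS0\nlogin:",
]
print(summarize(logs))
```

Anything that lands in "unclassified" is the interesting pile: either the instance booted fine and the problem is elsewhere (keys, login IDs), or there is a new pattern to add.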

Oh, and an apology: a lot of this should probably have been written using Eutester, instead of as a bunch of shell scripting.  The fact is, I’m a terrible hack.  But I’m just leading the charge temporarily; when I’ve figured out the basics, the real coders can swoop in and do things the right way.  In the meantime… patches welcome.

Shoes for the Cobbler’s kids

We’re big fans of the Cobbler project here at Eucalyptus. We think it’s the best tool in the open source world for bare metal provisioning.  We’ve invested in a gigantic QA environment for continual integration testing, and Cobbler is one of the linchpins of that environment. It’s the kind of tool that’s best appreciated by sysadmins who deal with *a lot* of systems.

I’m sort of attached to Cobbler personally, since I watched it grow out of Red Hat’s Emerging Technologies team several years ago. Now it’s grown past its Red Hat roots to become a truly independent project — and independent projects need support from time to time.

The Cobbler folks have set up an Indiegogo campaign to raise some funds for some much-needed infrastructure, and as proud Cobbler users, we are proud to help them out. Their goal is to raise $4000, and Eucalyptus will match every donation, dollar for dollar, until they reach their goal.

If you’ve used Cobbler and it’s helped you do your job, pitch in. The campaign is running for two more weeks; let’s help put them over the top.

New FastStart for Eucalyptus is officially live.

Today we officially launch the next generation of FastStart, the quick deployment solution for Eucalyptus.  We think it’s a pretty dramatic improvement over our previous version, and it’s certainly the easiest way to stand up your own AWS-compatible private cloud.

So go try it out.

And while I have you, I’d like to shout out to the guy who made most of this happen: a guy named Bill Teachenor.  When you use FastStart today and discover that it’s totally awesome, come by #eucalyptus and say thanks to bteachenor for all his hard work on the Silvereye project, the codebase upon which the new FastStart is based.  There were plenty of other folks who helped — but Bill was the one who took the ball.

Open source is powerful because you don’t need anyone’s permission to make it better.  You just need time, belief, determination, and a bit of skill in the right places.  Bill looked at FastStart with the eyes of an experienced sysadmin, picked out a whole bunch of places where we could do stuff better, and led the way.  When you write good code that does useful stuff, people will follow.  Rough consensus and working code: it’s what drives the open source world.

So here’s to Bill, and all the folks who say “I can make this better” and then commit code at 2am to prove it.

Step two: put your cloud in that box!

(I’m sure you all know that step one is “cut a hole in the box”.)

We’ve been continually working to improve the install process of Eucalyptus over the past few months.   In particular, we’ve been working on a project that we call Silvereye.  Our most recent goal: make it trivial to install a fully-running Eucalyptus cloud on a single machine.

A cloud on one machine?  Why bother?  Well, lots of reasons, actually.  The biggest: the developer workstation.  If you’re hacking on Eucalyptus, it’s pretty awesome to have Eucalyptus on a single system that you can tear down and rebuild in 15 minutes.

Anyway: mission accomplished.  Go to our Silvereye downloads directory and get the latest build (right now it’s silvereye_centos6_20121004.iso).  Burn it to DVD, boot your target system, and choose the “Cloud-in-a-box” option from the CentOS-based installer.  Answer some simple questions.  Boom, in 15 minutes you’ve got a cloud-in-a-box!

(Note #1: a helpful README can be found in the Github repo for Silvereye: github.com/eucalyptus/silvereye.)

(Note #2: in the cloud-in-a-box config, when you log in as root for the post-install config, it’ll say “hey, do you want to install the frontend now?”  Answer yes.  It automatically installs the node controller for you.)

(Note #3: Silvereye is not supported. At all. If you use it, there are ABSOLUTELY NO GUARANTEES that it won’t burn down your house, steal your pickup truck, or throw your mother into a wood-chipper.)

Silvereye is mostly the work of sysadmin-par-excellence Bill Teachenor, based on the original Faststart installer written by David Kavanagh — but various folks are now working on it; Andy Grimm, Graziano Obertelli, and Andrew Hamilton have all been pushing the cloud-in-a-box on various distros, and Scott Moser of Canonical did some great proof-of-concept work on the UEC code. So thanks to all of them, and everyone else who’s played with it.

Give it a spin; it really is dead-easy.  We still need to round off a few corners before we can call it the official installer of record, but we’re quite close now.

Want that AWS-compatible cloud on your laptop?  Of course you do.  Now go get it.

Bulletproofing the cloud

Here’s the thing about running your own cloud infrastructure: once you make the decision to rely on it, then it had better work.  The whole thing.  Every part of it.  Under heavy load.  All the time.

Obvious, right?  But it bears repeating.  When you decide to make the move to doing things The Cloud Way, you are placing a gigantic bet on your infrastructure layer — and that bet is placed not only on the Cloud As A Whole, but on every individual component that comprises that cloud.  In the open source world, these are frequently components that you didn’t write and do not control.  I can assure you that customers don’t care in the least.

At Eucalyptus, we have smart and demanding customers, with extremely high expectations.  They are not content with assurances that things will be production-ready at some magical release point in the future. They don’t care whether the bugs are in the cloud controller code, or the node controller code, or in libvirt, or in the kernel.  They are using Eucalyptus at extreme scale, right now, to solve extreme business problems, right now.  Which means that when their cloud breaks, they expect fixes right now — and if that means libvirt patches or kernel patches, that’s what it means.  That’s why they give us all that nice money.  That’s why customers pay us for free software.

Our customers try to squeeze every ounce of performance out of their machines; that’s part of the point of having a cloud, after all. And when the virtualization technologies we depend upon experience heavy load over a long period of time, we see some crazy things.  Like segfaults in libvirtd, for instance.  Or libvirt handlers that suddenly and inexplicably lose their mind.  Or other weird occurrences that might lead one to believe that libvirt isn’t quite as thread safe as advertised.  These failures may only occur at times of very high load, and they may not happen often — but they do happen.  And when they happen, we have to handle them.  The 3.1.2 release is the result of many hours of hard work by our engineers to find and fix these issues.

It’s a challenge and a privilege to serve customers like this.  At times it can put incredible stress on the entire organization — and it’s at precisely these times when we are at our very best.  Watching great engineers solve critical problems under pressure is a lot like watching great athletes at the end of a big game — and when they win, it’s just as exhilarating.  These engineers are at the heart of what we do. Compared to them, I’m just selling tickets and fetching Gatorade.

It’s not that hard to put together a bunch of components and call it a cloud.  But making a cloud bulletproof?  That’s hard.  And that, friends, is where we are the best in the world.

A big advantage to following AWS…

…is that any time a user runs into problems figuring out how to do something with Eucalyptus, it’s quite likely that the corresponding AWS procedure, as documented by the AWS community, will “just work”.

Example: growing an EBS volume.  The commands listed here:

http://blog.edoceo.com/2009/02/amazon-ebs-how-to-grow-storage.htm

…basically work by switching ec2-tools and euca2ools.  It’s nice to have that kind of knowledge base to fall back on, even as a starting point that may need to change subtly for euca-specific cases.
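To make the "switch ec2-tools for euca2ools" idea concrete, here is a tiny sketch that rewrites an `ec2-*` command line into its `euca-*` counterpart. It assumes the only change needed is the tool name prefix, which holds for common volume commands such as `ec2-create-volume` and `euca-create-volume` but is not guaranteed for every tool or option. The helper name `to_euca` and the example arguments are made up, and the commands from the linked post are not reproduced here.

```python
import shlex


def to_euca(command):
    """Rewrite the leading ec2-* tool name to its euca-* counterpart."""
    words = shlex.split(command)
    if not words or not words[0].startswith("ec2-"):
        raise ValueError(f"not an ec2-* command: {command!r}")
    # Only the tool name changes; arguments are passed through untouched.
    words[0] = "euca-" + words[0][len("ec2-"):]
    return shlex.join(words)


print(to_euca("ec2-create-volume --size 20 -z us-east-1a"))
```

The arguments still need a human eye: the availability zone on a Eucalyptus cloud will have a different name than an AWS one, for example.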

Nurse Euca is here to help you.

“I never see what has been done; I only see what remains to be done.”  –Marie Curie

Automated installers are great. When they work, they work really well — but when they don’t, not only do they not work, but they bring great sadness to the hopeful user who trusted your automated installer.  Tragic!  Heartbreaking.

So why don’t automated installers work, when they don’t work?  In almost every case, it’s because there’s a condition your installer assumes that isn’t met.  And in this day and age, you don’t have just one installer for your software, you’ve got multiple potential install+config tools: multiple package managers, multiple configuration tools, multiple permutations of hardware, multiple permutations of hypervisor, multiple permutations of network topology.  Which means that you’d better do a *great* job of figuring out your environment before you try to lay down and configure your bits.

Enter Nurse Euca.  Nurse Euca will run before any install and take everyone’s temperature, offer an aspirin or a splint where needed — or will let you know if one of your requirements is Dead On Arrival.  (“I’m sorry, Doctor, but em1 appears to be in septic shock. I recommend against resuscitation.”)  (That’s totally gonna be an error message, btw.)
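Since Nurse Euca isn't written yet, here is only a guess at the general shape such a preflight tool could take: a list of small checks, each returning a finding, all of them run before anything is installed. Every check, threshold, and name below is hypothetical (including the idea that `ntpd` is a requirement); only the interface name em1 comes from the joke above.

```python
import shutil
import socket
from dataclasses import dataclass


@dataclass
class Finding:
    check: str
    ok: bool
    detail: str


def check_interface(name):
    """Is the named network interface present on this host?"""
    present = name in [ifname for _, ifname in socket.if_nameindex()]
    return Finding(f"interface {name}", present,
                   "present" if present else "not found")


def check_free_space(path, min_gb):
    """Does the filesystem holding path have at least min_gb free?"""
    free_gb = shutil.disk_usage(path).free / 1e9
    return Finding(f"free space on {path}", free_gb >= min_gb,
                   f"{free_gb:.1f} GB free, {min_gb} GB wanted")


def check_command(name):
    """Is the named executable on the PATH?"""
    location = shutil.which(name)
    return Finding(f"command {name}", location is not None,
                   location or "not on PATH")


def examine(checks):
    """Run every check, even after a failure, so the full chart is shown."""
    findings = [check() for check in checks]
    return findings, all(finding.ok for finding in findings)


if __name__ == "__main__":
    checks = [
        lambda: check_interface("em1"),
        lambda: check_free_space("/", 20),
        lambda: check_command("ntpd"),
    ]
    findings, healthy = examine(checks)
    for finding in findings:
        print("OK  " if finding.ok else "FAIL", finding.check, "-", finding.detail)
    print("ready to install" if healthy else "not ready to install")
```

Running all the checks instead of stopping at the first failure is deliberate: a user who has to fix three things would rather hear about all three at once.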

Awesome, right?  Well, it will be when we write it.  We’ve got bits and pieces of these kinds of checks in various places.  On Friday we will be having a hackfest to pull these threads together and get Nurse Euca jumpstarted.

Hey, here’s a mostly-empty Github repo!  By the end of Friday, it won’t be.

We’ll be on freenode, #eucalyptus-devel, at 7am Eastern US time.  Yes, it’s early; we’ve got some friends in the Old World who will be hugely helpful, so we’ve chosen the time to accommodate them.  See you then.
