Greg DeKoenigsberg Speaks

Why the Fedora ISV SIG never caught fire

Posted in Uncategorized by Greg DeKoenigsberg on January 6, 2012

Here’s a list of popular open source products that cannot currently be found in Fedora repos:

  • Zimbra
  • JasperSoft
  • SugarCRM
  • Alfresco
  • Magento
  • Eucalyptus
  • JBoss :)

Once upon a time, it was part of my job to help these kinds of companies to work more closely with Fedora. We created the ISV SIG for this purpose. Karsten and I would go to trade shows and meet with various open source vendors, and we’d talk with them at length about the great benefit of leveraging the Fedora install base, and the power of “yum install YourCoolProduct”, and the general usefulness of building an ISV packaging community, and they’d nod and smile, and then we’d have a follow-up meeting or two to discuss the ins and outs of being in a distro. And then… well, nothing much would happen.

Now, as it turns out, I’m in a position to appreciate, and articulate, these issues from the ISV’s perspective.

What do the applications listed above have in common? A couple of key points.

Point One: they are all sponsored by companies, who use the open source projects as a base from which to build proprietary products.

Point Two: they all tend to be the primary application running on their machine — in other words, they are appliance stacks — and they need to limit variance in those stacks to help guarantee a good experience for their users.

It’s easy to claim, and many do, that these projects aren’t in Fedora (or Ubuntu, for that matter) because of Point One. In truth, Point Two is *way* more important.

There’s a great page on the Fedora Wiki that does a good job of discussing the potential gains and losses of putting your ISV application into Fedora. I’m going to go through those gains and losses, and share my opinions of them, now that I’m on the other side of the fenceline.

[GAIN] Reduced maintenance burden for all dependencies that are already packaged in Fedora: no need to ship security updates for those components.

This is a good potential gain, but note that it does not require the ISV to be *in* the distro to get this gain. It’s entirely possible to package *on top of* the distro, track the distro closely, and get all of these maintenance gains, without incurring the high cost of pushing packages into the distro and maintaining them. I suspect that this is precisely what many companies choose to do.

[GAIN] Code auditability: the Fedora packaging processes ensure that all code is described by metadata (i.e., spec files). The packaging tools allow this data to be queried in informative ways. ISVs don’t necessarily track this data otherwise.

Also true, but again, note that it’s possible to build RPMs and get the same advantages without putting those RPMs into the distro. There are two separate costs here: there’s the cost of building an RPM, which is comparatively low if you’ve got the source and an experienced packager at your disposal — but then there’s the cost of pushing the RPM into the distro and following the distro’s rigorous rules around versioning and namespacing and supportability, which is a *much* higher cost for the ISV. The gain from that additional cost must therefore be demonstrably compelling.

[GAIN] Availability of package-specific expertise: ISVs can consult other packagers about the upstreams of their dependencies. Each Fedora package maintainer acts as a known point of contact for their package’s upstream project.

This is very much a potential gain, if it’s true. But what happens when most of the packages aren’t yet in Fedora? This is especially problematic in the Java world, where there are tons and tons and tons of jar files that are not “packaged” as such in Fedora, but are still perfectly useful to the Java community in jar form. If the distro packaging expertise for a particular jar doesn’t exist yet, then the company that pushes the packages into the distro must take on the initial cost of becoming that expert. It’s definitely true that this expertise can be shared over time, and also true that such shared expertise is a long-term win, but the upfront cost is high, especially for a small company that has lots of competing priorities.

[GAIN] The trust of Fedora users: ISV products packaged in the Fedora way will be more warmly-received by Fedora users than standalone GNU/Linux binaries.

Citation needed. :) I mean, yes, I believe this too, but it’s a gain that’s difficult to quantify. The real benefit we’re trying to claim here is that “yum install foo” is a simpler and awesomer experience — and it is. But the difference between “yum install foo” and “wget foo-installer | sh”, which adds the ISV’s yum repo and gpg key and then kicks off “yum install foo”, is not really that great.
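To make that comparison concrete, here is a rough sketch of the repo definition such an installer might drop into /etc/yum.repos.d/ before kicking off “yum install foo”. The repo id, the example.com URLs, and the function name are hypothetical placeholders; only the option names (name, baseurl, enabled, gpgcheck, gpgkey) are standard yum repo settings.

```python
def render_repo_file(repo_id, description, baseurl, gpgkey_url):
    """Return the text of a yum .repo stanza for a third-party repository."""
    lines = [
        f"[{repo_id}]",             # section header: the repo id yum refers to
        f"name={description}",      # human-readable label
        f"baseurl={baseurl}",       # where the ISV hosts its RPMs
        "enabled=1",                # repo is active as soon as the file exists
        "gpgcheck=1",               # refuse packages that fail signature checks
        f"gpgkey={gpgkey_url}",     # the ISV's public signing key
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    print(render_repo_file(
        "foo",
        "Foo ISV repository",
        "https://repo.example.com/fedora/",
        "https://repo.example.com/RPM-GPG-KEY-foo",
    ))
```

Everything the installer adds beyond a plain “yum install foo” fits in those six lines, which is roughly why the difference in user experience is small.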

[GAIN] Stability on Fedora: standalone binaries break frequently because Fedora is such a fast-moving target. Built-from-source packages have proven much more stable, since incompatibilities are caught during mass rebuilds.

This is a bit of a tautology. It’s essentially arguing that your ISV packages will build better with Fedora because you’re working to make them build better with Fedora. Which is true, but again, can be true by building *on top of* Fedora and not *in* Fedora.  And it also only addresses build time failures, which, for an application, are failures that you’re likely to find immediately anyway if you’re doing proper build/test integration internally.

[GAIN] Bug triaging: Fedora users report bugs to Red Hat Bugzilla first; the package maintainer decides if it’s a packaging bug or an upstream bug. If it’s an upstream bug the packager will ideally create a minimal test case and send it to the upstream maintainers.

This is a strong *potential* gain, if the package maintainer is a trusted and responsible member of the community. But what if the package maintainer is an employee of the company, as is usually the case? It’s not a gain at all.  And what if the package maintainer also maintains 20 other packages, and isn’t particularly responsive?  Then it’s a net loss.

[LOSS] Binary dependency predictability: dependency updates may mean that the deployed set of components is not the same set of binaries the ISV tested during their release process.

Bingo!  No more calls, please — we have a winner.

Here’s the thing: an ISV does not have the luxury of dealing with variance. We’re dealing with tons of bugs, every day, because we’re young companies, pushing as hard and as fast as we can to make our software experience better. When we’re trying to kill a crazy bug for users/customers, the first order of business is to reduce the uncertainties, and the easiest way to do that is to be *very* specific about configurations. This is especially true as the software increases in complexity.

We can assume high competence and good faith on the part of community maintainers, and still be relatively certain that those good actors will make changes, for good reasons, that will damage the ISV’s application stack in unpredictable and important ways. Software is mean-spirited like that.

This could, in theory, be mitigated by keeping multiple versions of things, and having better mechanisms for tracking those versions. This is something that Red Hat Network customers wanted for years, and finally got — the ability to install a very specific package manifest that is not “all latest packages”, but “these specific package versions”.  But Linux distros don’t work that way, for good or ill.
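As a minimal sketch of what “these specific package versions” means in practice, the check below compares a pinned manifest against whatever is actually installed and reports any drift. It is an illustration only, not how Red Hat Network implements it; the function name, package names, and versions are made up.

```python
def find_drift(manifest, installed):
    """Compare a pinned manifest against what is actually installed.

    Both arguments map package name -> "version-release" string.
    Returns {name: (expected, actual)} for every mismatch; actual is
    None when a pinned package is missing altogether.
    """
    drift = {}
    for name, expected in manifest.items():
        actual = installed.get(name)
        if actual != expected:
            drift[name] = (expected, actual)
    return drift


if __name__ == "__main__":
    # Hypothetical data. On a real system the installed map could be built
    # from: rpm -qa --queryformat '%{NAME} %{VERSION}-%{RELEASE}\n'
    tested = {"libfoo": "1.2-3", "bar-server": "4.0-1"}
    deployed = {"libfoo": "1.3-1", "bar-server": "4.0-1"}
    for pkg, (want, have) in find_drift(tested, deployed).items():
        print(f"{pkg}: tested with {want}, found {have}")
```

An empty result means the deployed stack is the one the ISV tested; anything else is exactly the variance that makes a bug hard to pin down.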

In theory, everyone should always be running the latest version of things. In practice, that can be very difficult — and it can be *especially* difficult for the ISV when multiple distros have different notions of what the latest version is, and *exceptionally* difficult when those package manifests can change without warning, and outside of your control.

Maintaining a functioning product in multiple cutting-edge distros, with different release cycles and different dependencies, requires a serious, serious commitment to continuous integration and testing. I believe that Eucalyptus has a better process for this than most — and still it will be a tremendous challenge for us to keep up with two different fast-moving distros in Fedora and Ubuntu.

[LOSS] Unity with Windows release process: someone on the ISV’s team will need to be a Fedora contributor or they will need to recruit an external packager.

You can replace “Unity with Windows release process” with “Unity with Ubuntu release process” and the problem is the same. There are huge differences, of course, between a Windows release process and a Linux release process — but even staying in the Linux world, there’s a considerable difference between the Ubuntu release process and the Fedora release process, and expertise in the one in no way guarantees success in the other.

[LOSS] Ability to customize dependencies arbitrarily: there are rare cases where Fedora ships different versions of the same component for compatibility but in general this is strongly discouraged; custom patches should be sent upstream or eliminated by patching the product’s code to not require them.

Absolutely.

[LOSS] Download counting/tracking: if an ISV provides a tar-based distribution from their website, they can track counts and/or emails. This may be important for their marketing department.

Ayup. :)

* * * * *

It looks pretty grim in the end, doesn’t it? Well, it’s not as dark as all that. There are legitimate ways for the committed ISV to bridge the gaps over time:

1. Commit to building RPMs (and debs), from source, the right way, for the ISV product, and making those source packages available to whoever wants them. There are legitimate reasons for an open source company to do this, and it’s a necessary precondition to being in the distros anyway.

2. Release their Linux versions as add-on yum/apt repos.  Of course, this also means being able to supersede/obsolete distro packages with foo packages, but this is easily done by maintaining separate namespaces.

3. Continue to work with other ISV vendors on packaging best practices at every opportunity, even if those packages don’t immediately end up in the distro.

4. Explore development builds that depend on the latest packages, available from wherever. One of the great advantages of Fedora, and other fast-moving distros, is that they do a great job of managing the future. We don’t want to live in the future, but we certainly want to have our eye on it, and that’s a great reason to continue to *try* to be in Fedora. But we also need to make it clear to potential users that the future and the present don’t always see eye-to-eye, and that can be difficult messaging to convey.

The truth of the matter is that not every user understands the intricacies of the open source development model, and most ISVs in a competitive market get one shot to connect with their potential customers. One. Which means that the ISVs are going to do everything they possibly can to make sure that they’ve got control over how that experience goes, at the lowest possible development cost.

Fedora can afford to live right on the bleeding edge because they’ve got CentOS/RHEL to fall back on. Not everyone has that luxury.

(p.s. looking forward to talking more about this at FUDCon.  Also: the drinking.)

22 Responses

  1. Tarus said, on January 6, 2012 at 3:26 pm

    I was one of the people who met with you, and our big issue was that our product is written in Java. We depend on a ton of Java libraries, yet Fedora wanted us to use the “Fedora” version of a particular library. A small percentage of our clients use Fedora as the O/S, and it was going to be a royal pain to change our rpm build process just for Fedora.

    It was way easier for us to just create our own Fedora repo, so all our users have to do is:

    rpm -Uvh [URL of repo]
    yum install opennms
    accept GPG key

    It’s two extra steps, but if our users can’t figure that out perhaps it is good that our user base is self selecting. (grin)

    • Greg DeKoenigsberg said, on January 6, 2012 at 4:11 pm

      So you were. And I understood your point then, too — but not as, ah, viscerally as I do now. :)

      • Benjamin Reed said, on January 7, 2012 at 9:56 am

        =)

        Yeah, one of the biggest problems that exists when trying to package Java software in a distro-like way is that a lot of Java projects are using Maven or something similar, to request “known good” versions of software, not necessarily just the latest version. There is not as much discipline about backwards-compatibility in java software in general (Apache Commons notwithstanding), so you stick with what you know.

        I know in OpenNMS, at the time we were talking with you guys about this, we depended on a very specific combination of versions of 3rd-party tools because they were interrelated. You couldn’t just let Fedora update to “the latest version of hibernate” and have it work because we depended on Hibernate *and* something else that used the ‘asm’ project, and they didn’t use the same version of asm, and asm generates and manipulates bytecode at runtime. When they don’t match, things explode…

        Add to that the problem that we’re trying to support Fedora, RHEL3 through RHEL6, and any other RPM-based distro that somewhat resembles them, so supporting individual dependencies in our RPM packages for Fedora means more, not less, work. It’s now a special case that requires different packaging than the RPM that we already make, which can basically be installed anywhere a JVM exists.

        Ultimately it’s a problem of how most open-source unixy folks think, compared to how Java culture has evolved. Java users tend to think of 3rd-party jars in the same way open-source developers think of static libraries. They’re a thing that you “freeze” along with a particular version of the software, and should never change out from under them because they’re essentially a part of the makeup of that software release.

  2. Brian Thomason said, on January 6, 2012 at 4:08 pm

    I had the same problems when working in a very similar role at Canonical. Even when we opened the Partner repository there loosening restrictions, the main inhibitor was always:

    [LOSS] Binary dependency predictability

    …and the added QA it entailed.

    Very good read, keep-em coming Greg!

    -Brian

    • daengbo said, on January 6, 2012 at 8:55 pm

      I think Greg’s point about applications being an appliance stack was a good one. I really liked when Ubuntu (the reason I’m responding to you) Server had the stack install for LAMP and some other things. It was a few years ago, maybe ’06, so I can’t remember exactly. e-Box / Zentyal is doing that right now, encouraging free users to download the full ISO (i.e. appliance).

  3. David Nalley said, on January 6, 2012 at 5:24 pm

    So, while I am in the same shoes that you are now, I do think there is a substantial benefit that you missed out on – and that’s substantially greater adoption. When I was working on the ISV SIG, ISVs understood the value proposition of being in distros like Fedora and Ubuntu as preparing for the next RHEL or LTS release. However, one of those projects/companies asked for help justifying it – and what we found was interesting from a stats perspective. Obviously there were no comparisons available for that specific project – but their chief competitor certainly was present – so mmcgrath and warthog9 pulled a sampling of logs from yum mirrors and ran counts on a specific package. The result? The number of installations in a month on Fedora was 3x the total downloads for all distributions from the competitor’s project site for the same month. Admittedly this is a one-package, anecdotal statistic, and YMMV (it almost certainly will), but that’s still a compelling statistic. Don’t get me wrong, I think that there are massive hurdles to getting almost any ISV’s package into $distro – but since the goal of most F/LOSS software companies is ubiquity for their software, it’s still quite attractive.

    • Greg DeKoenigsberg said, on January 6, 2012 at 5:57 pm

      Is aggregate per-package download data available to everyone?

      • David Nalley said, on January 6, 2012 at 6:12 pm

        No, at least not at the moment. It might be something worth looking into making a request for though – the problem is that essentially they used wc/grep to parse the httpd logs iirc, so you can’t make the raw logs available, and most of the repos are not controlled by Fedora – but perhaps this is something we could discuss with Matt Domsch next week.

      • mdomsch said, on January 7, 2012 at 12:59 am

        Someone was working on a log scrubber that would let us publish the logs modulo any PII data, for exactly this purpose. Not sure where that stands though. Trick is – it can’t be from Fedora’s own http servers – those serve very very few packages compared to the rest of the mirror network – and we don’t have http logs from the rest of the world. Smolt data is more interesting, for those who upload to smolt, which unfortunately is a horribly small subset of the install base.

  4. Adam Williamson said, on January 6, 2012 at 5:27 pm

    There appears to be a large elephant in the room: aside from the points you mention, all but two of the products you list are written in Java. Your points are interesting and largely valid, but I’m rather of the opinion that a much _bigger_ issue is simply that the Java Way and the Linux Distribution Way have historically been rather incompatible; I’ve talked about this with various JBoss people and everyone tends to agree. To simplify and over-generalize: Java people like to embed everything. Find some jars that you need to build your product, and embed them all with your product.

    This is not something Linux distributions like. They want system-wide jars shared between all Java apps that use those jars. Java devs don’t do this and, often, can’t see why we’d want to. As a secondary issue, Java Land doesn’t really have the kind of robust mechanisms that more traditional *nix shared libraries have for being shared system-wide – .so versioning and packaging conventions and so on – because they just _don’t share those things system-wide_.

    It seems that Java Land is gradually developing a sense of why it’s a good thing to actually share libraries properly, though, and coming up with mechanisms for doing it. I rather suspect it’d make it a lot easier to integrate big Java products into distributions once this comes a bit further.

    • David Nalley said, on January 6, 2012 at 5:40 pm

      It’s not just Java that does this (although they are likely the worst offenders). Virtually every web app (esp PHP-based apps) does the same type of bundling, which distributions prohibit.

    • Greg DeKoenigsberg said, on January 6, 2012 at 5:43 pm

      Fair point. As I said, a lot of this has to do with The Java Way — but not all. It’s just more acute with the Java way.

  5. Michael DeHaan said, on January 6, 2012 at 5:35 pm

    Yep, yep.

    The more and more I’m out in the field, the more I realize app deployment models do not look *remotely* like distribution deployment models. It’s not even a Java vs not Java issue.

    Even in writing simple github projects, I am very interested in vendoring simple dependencies so my code is not broken by a fast-moving other project that doesn’t care about *my* project.

    I think that’s just the reality of software.

    That being said, open source projects that have their own yum repos for EL-5 and EL-6? That’s pretty good to get adopters.

    Am I going to deploy something on Fedora in production? Well, I know people that do … usually large grids … but it’s rare, and I don’t understand why they wouldn’t target a stable distro anyway. Most do.

    The other thing to take into account is many apps are much more cross platform, so investing to get in with the latest in Fedora is inconvenient when Ubuntu/Debian may be moving at different speeds. There’s a big West Coast / East Coast split in these where an application really needs to support both — and I can see why they wouldn’t want to invest heavily in following all of them and maintaining packages, especially in an era where Fedora is rapidly changing init systems and packaging standards — it’s a huge time investment for not a lot of gain.

    If folks want to make themselves available for users to try out — rather than playing in the distro sandbox, I’d try standing up a yum repo. It might be nice if distros made a list of these yum repos available in graphical tools so they could be easily enabled or something, but I would totally expect appliances to drop a lot of content of their own in /srv to avoid dependencies changing that aren’t under their control.

  6. ben said, on January 6, 2012 at 5:51 pm

    As for different versions of packages per-distro; couldn’t you get a common git server with all the package sources in it (in separate git trees, of course) which pulls directly from upstream, have a master branch (which is vanilla upstream) and a stable branch (which has patches that the various distributions add to improve stability; hopefully they migrate upstream in time), and some separate place to store distro specific changes.

    That way distributions can use more cutting-edge sources while still having very stable packages, and vendors will only have to target one version set (assuming the distribution specific changes don’t change the package functionality or stability too much). It obviously might not be reasonable to do this for all packages and all at once, but I think if we gradually switched toward that model it would be a big win for everyone.

    And, of course, individuals can still have their own private trees too. ;-)

    (please substitute repo for tree where you find appropriate; I’m not real familiar with the terms)

  7. jadudm said, on January 6, 2012 at 6:30 pm

    Frankly, anymore I’m happy if I can:

    1. Download a VM image.
    2. Boot it in (insert VM Manager of Choice)
    3. Try the product.

    Now, I’m in the academic space, where I want to try products to see if 1) they will have value to me as an educator, and 2) whether I want to maintain it myself or 3) fight the good fight to get my department/campus to support the product. (Department is possible, campus is always another game.) If I can skip all build/test steps, and just download “the app” in an environment that “just works,” then I’m game to try it. A 15 minute download is something I can do other work during; fighting an install for more than 15 minutes is not worth my time for a speculative exploration.

    Granted, that doesn’t get you into the broad install category that being in the distro repository might, but it does make it easy for someone to test things quickly without having to discover what about their system breaks your build/install.

    No doubt, this has many drawbacks…

    • Greg DeKoenigsberg said, on January 6, 2012 at 10:44 pm

      Tough for A Cloud Appliance: by definition, it’s the bottom level of a virtualization stack, so you can’t really run it meaningfully inside a virtualized environment.

      But otherwise, I think you hit the nail on the head. It must be dead easy to install, period. To the degree that being in the distro helps us, that’s good. If there are other ways to accomplish that ease of install without being in the distro, that’s good too.

  8. dagny87 said, on January 6, 2012 at 8:07 pm

    I’m going to invent the “Communi-tini” just for this discussion.

  9. Michael said, on January 8, 2012 at 7:26 pm

    I fully agree with your reasoning. The success of Windows is tied to the fact that Microsoft courts ISVs to provide the needed ecosystem, in the form of commercial support around its OS, which it can then sell to its customers (OEMs, not end users).

    Linux and most free software, on the other hand, were created to have free code (that’s the whole point of the FSF and the GNU Manifesto, after all), and the whole system evolved around this assumption of collaboration. The people who designed distributions and packaging systems did it with a sysadmin’s engineering mindset, trying to reduce duplication and doing some kind of damage control around an ever-evolving world, without any kind of coordination. (Note that I think the free software world is fine as it is: the Windows ecosystem is basically built on the assumption that people don’t collaborate, and that’s effectively what happens, a kind of self-fulfilling prophecy, and I think the Linux way is much healthier in the long term.)

    So it’s not really news that ISVs have a hard time distributing their products on generic Linux distributions, and that’s also where Red Hat has a card to play, offering a stable binary platform.

    As I understand it, that’s also where EPEL could help, with Fedora being just a gateway to it (i.e., push into Fedora, then into EPEL).

  10. pilhuhn said, on January 12, 2012 at 11:29 am

    [GAIN] Stability on Fedora: standalone binaries break frequently …

    Compile-time breakage is one thing. Subtle runtime breakage, because the latest version of a dependency has a subtle change in semantics, is a different one.

  11. Karsten 'quaid' Wade said, on January 13, 2012 at 6:21 am

    OK, sure, that all makes sense, and very well articulated – but what you are saying isn’t really surprising, right? We surmised a good portion of it from those circular chat-a-rounds with ISVs. Now, I reckon you are living out the results of decisions made way-back-when by people before you. How often is that the case? Almost always – the original start-up developers were running fast-and-hard and for various reasons have a development environment that suited that and not the future.

    There are some nuances that I’m curious about.

    First, to David Nalley’s point: there has not been much investigation of the userbase potential of being in even a single distro. There is a strong possibility that being in a distro drives more installations than if someone has to seek out a website, find a repo package, download/install it/run an install shell script, etc. We agree that having minimal barriers is important, and not being in the distros is a barrier.

    Of course, that’s a chicken-and-egg problem. In order to find out how _your_ package does, you have to get it in a distro. Instead, ISVs distribute the pain to customers and potential customers. It’s a small pain to the customers, they probably don’t notice it … well, unless they do.

    Second, if you can do anything appliance-like, why not just abandon all other distros and build on top of only one? You tie your dependencies to the distro, insert personnel where it matters, etc. Oh, did I just almost-describe RHEL? Imagine what can be done building from scratch …

    Third, what if you were starting from scratch? Would you follow a similar pathway to get you where you are now? Or would you include what you needed from the start to run natively on Debian, Fedora, etc.?

    Fourth, if you already know that your customers and sales pipeline are full of people deploying on one of the major distro platforms, is there value in making that platform even better for your product – and customer experience – by being more closely aligned with that distro?

    Companies down to developer teams rip-and-replace tools, hardware, and processes. If there were a compelling business case made from senior management, I have no doubt that the challenges outlined in the post and comments could be overcome by any ISV. The items that were lost or harder to gather, such as ways to directly track downloads, would be replaced by equally or more valuable marketing methods.

    Personally, I blame a lack of open source project experience in ISV founders and senior management. :) If you don’t know that the model really works, you fall back on what you know worked in your last company. And the mistakes of vendors perpetuate …

    I’ll finish with a plug for the metrics working group on theopensourceway.org. In the process of caring about community health, we can also be tracking back to business decisions, needs, and results. If we could demonstrate a ROI on being an ISV the open source way that was equivalent to the UNIX to Linux savings, then we’d have something great to talk about, huh?

    http://theopensourceway.org/wiki/Metrics_working_group

    • Greg DeKoenigsberg said, on January 17, 2012 at 12:54 pm

      “Second, if you can do anything appliance-like, why not just abandon all other distros and build on top of only one?”

      That’s a very prescient question, Mr. Wade. One that people are considering more and more carefully in cloud-land.

