A Tour of the Puppet Dashboard

Rein just announced the 1.0 release of our new Puppet Dashboard, with screen shots:

We’re going to take a tour of the newly released Puppet Dashboard web front-end.  Puppet Dashboard is (or will be) a web front end that keeps you informed and in control of everything going on in your Puppet ecosystem. It currently functions as a reporting dashboard and an external node repository and will soon do much more, including having better marketing copy.

Fundamentally, Dashboard lets you do two things: configure nodes using parameters, classes and groups for use as an external nodes tool; and monitor the status of nodes through real-time reporting and versioned change tracking. The main dashboard page shows you the status of recent Puppet runs, displays important information like pass/failure statistics, and alerts you of important failures, errors and unexpected events. Let’s take a look at how it does this.

Go check it out.
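For anyone who hasn't used an external nodes tool: the Puppet master can run an executable with a node name as its argument and read back YAML listing that node's classes and parameters. Here's a minimal sketch of such a script. The hostname, class names, and parameter are all invented for illustration, and this is not how Dashboard itself is implemented.

```shell
#!/bin/sh
# Hypothetical external node classifier: prints YAML for one node.
# Hostname, classes, and parameter are invented for illustration.
classify_node() {
  case "$1" in
    web01.example.com)
      printf 'classes:\n  - apache\n  - ntp\nparameters:\n  datacenter: pdx\n'
      ;;
    *)
      # Unknown nodes get an empty class list in this sketch.
      printf 'classes: []\n'
      ;;
  esac
}

node_yaml=$(classify_node web01.example.com)
printf '%s\n' "$node_yaml"
```

A tool like Dashboard would presumably generate this output from the nodes, classes and groups you configure in the web interface, rather than from a hard-coded case statement.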

Ubuntu Developer Summit

It looks like I’ll be starting to blog on our main company site before too long, which will hopefully include someone standing behind me with a sharp stick, poking me and making me write more.

In the meantime, here’s my quarterly blog post.

I was finally able to attend an Ubuntu Developer Summit last month, this time in Dallas (not quite as nice a location as the last one – Barcelona).  My reason for being there was that they had 3 or 4 sessions devoted to Puppet-related concepts, and they wanted upstream (that would be me) to provide some insight and get involved in getting the work done.

It was a really interesting and different conference.  I’ve actually participated in other UDSes in the past via phone and IRC, and now that I’ve been to the conference, I realize how structural that remote participation is, and I’d love to copy it as we hold more Puppet conferences (e.g., the PuppetCamp Europe we’re planning for next year).  Every room had a session-specific IRC channel, plus live streaming video and audio, so you really could participate from anywhere in the world.  The last UDS, I could hear an audio stream and would type on IRC my responses to questions.  It was a bit awkward, but it meant I could actually be involved.

I think the focus on coming away with action plans is another thing that sets UDS apart.  So many other conferences are the destination – get your talk written, your presentation presented, and then drink and talk all week.  While there’s lots of drinking and talking at UDS – there was a really lively bar scene, I think partially because there’s so little hallway track — the sessions are really focused on “we’re all in the same city for a week, let’s knock this out”.

That’d be another great takeaway, but it’s hard to know how to copy it without having thousands of motivated, active, involved community members, like Ubuntu does.  I might decide to try for something like its blueprints in a one-day pre-PuppetCamp session or something; I like the idea of a “get shit done” focus for at least part of any conference.

In terms of how it affects Puppet, it’s a couple of different ways.  Since Puppet is now in Ubuntu Main, they are concerned about the long-term support of whatever we release, because they have to support it for five years.  They’re also looking at integrating it into various aspects of the system, and in particular, etckeeper and the installer.  So far, Ubuntu is the only OS out there that I’ve seen really look for ways to integrate a CM system into the heart of the OS.  As they should be, they’re taking baby steps, but those steps are clearly harbingers for what’s to come.

The etckeeper integration is not something you’d use in a normal environment, but I could see how it could be useful for some cases.  It’d kind of be a filebucket on steroids.

The goal of the installer integration would be, at the least, to make sure that you could have a complete functional machine once out of the installer, so Puppet would need to be able to run chroot’d by the installer.  We also talked about converting the catalog into a format that the installer can understand — essentially a preseed file — which would allow the installer to take care of the majority of the package installs, which would be much faster on the first big run.

Hopefully both of these integrations will yield great results, and either way, it was a great conference and I hope that we and Canonical/Ubuntu continue finding interesting things to work on together.

PuppetCamp 2009

It’s a great week to do a bit of blog resurrection – it’s PuppetCamp in San Francisco.  The conference itself is nearly done – we’re wrapping up the actual presentations and will be moving on to the self-organized sessions.

We’ve had talks by Ohad Levy on The Foreman, Brice Figureau on StoreConfigs, Paul Nasrat on Facter, and James Turnbull on developing Puppet.  After the talks yesterday, we split up into self-organized sessions, which are always my favorite — everyone’s far more involved, and you know just about everyone is getting something from every session.

The conference has been successful enough that I’m confident we’ll have more of them.  Next time, though, we’ll skip the catering and spend the extra money on getting a location closer to the city center.

Summer Operations Conferences

I’ll be at three conferences in quick succession in June, speaking at two of them.

First I’m on a configuration management panel run by James Turnbull at the Open Source Bridge in Portland, OR.  I haven’t been on a multiple-tool panel in a while, so this should be interesting.  I’m also excited to be back in Portland – I absolutely love the town, so much that I’m moving back there this summer.  And for those non-Americans who think we don’t make great beer:  Get thyself to Portland, and see how wrong you are.  I should also be running a Puppet BoF there.

From there I fly down to San Jose to give a workshop on Puppet at the O’Reilly Velocity conference.  This is its second year, and while it claims to have a web operations/performance focus, I think it stands to quickly become the best general operations conference around.  LISA is too academic and just moves too slowly, and there isn’t a whole lot of other competition.

Last year’s was great, and this year’s is a day longer and looks to be even better.

Finally, the day after Velocity, I attend Om Malik’s Structure.  This conference is more focused on operations executives, which is a very different focus for me, but I met a lot of interesting and knowledgeable people there last year, including some from completely outside the technology space.  I normally diss executive events, but this one seems to stand out.

Hopefully you’ll have a chance to track me down at one of these events.

RESTful Puppet and You

No, I’m actually not dead, as much as it might appear from my lack of updates.  Fortunately there are other means to verify my vitality.

Anyway, the goal here is to help you understand how the imminent release of Puppet 0.25 affects you.  And by “you” here I am, of course, making some assumptions – if you don’t use or care about Puppet, um, not so much.  But if you use Puppet, or, even better, if you care about how well Puppet solves your problems, then this post is for you.  I hope.

I’ve no short-term memory and don’t feel like planning this blog post out, so I’m going to just do what I can to ad-lib the reasons that Puppet’s new-found RESTfulness kicks ass.  And stuff.

Smaller and Faster

First and foremost, 0.25 does not have many of what you would call features.  It’s mostly a refactoring release.  This refactoring in general results in a faster system that takes less memory.  Experiments have borne this out:  People are seeing 10-40x speedups in fileserving, at least 30% less RAM consumed on client and server, and much faster run times on the client.

So, look not for features but for benefits.

What is REST?

REST is more of a principle than a standard, but the things that matter in it for Puppet are:

  • All data transferred is treated as a stand-alone ‘resource’ (and here ‘resource’ means something different than a normal Puppet resource).  For instance, Puppet 0.25 treats catalogs, file content, file metadata, certificates, certificate requests, fact collections, and probably a bit more as resources.
  • All resources have a unique URI.  E.g., a given host’s catalog can be found at ‘http://puppet/$environment/catalog/$hostname’.
  • Standard HTTP infrastructure is used as the ‘api’ – HTTP ‘put’, ‘get’, and ‘delete’ are the primary operations, and HTTP content-type handling is used for serialization negotiation.
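As a rough sketch of what that looks like from the outside, here is the catalog URI from the list above filled in with made-up values. The environment name, node name, and the Accept header value are assumptions for illustration; the script only prints a request rather than sending one.

```shell
#!/bin/sh
# Assumed values, for illustration only.
environment=production
node=web01.example.com

# URI pattern from the list above: http://puppet/$environment/catalog/$hostname
catalog_uri="http://puppet/${environment}/catalog/${node}"

# Print (rather than run) a plain HTTP GET. The Accept header stands in for
# serialization negotiation; its value here is an assumption.
echo "curl -H 'Accept: yaml' ${catalog_uri}"
```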

Of course, none of these are really helpful to you, whoever you are.  What’s helpful is how we’ve made these basic ideas useful in Puppet.  So here are some of the things you get:

Faster and Smaller

The big problem with our current network protocol (XMLRPC, which is essentially XML over HTTPS) is that everything has to be considered XML, which means that anything that isn’t actually XML (which is, um, everything in Puppet) has to be encoded and escaped so it can act like XML.  For file serving, this means that every file has a minimum of three copies in memory – plaintext, encoded, and escaped.  This is time- and memory-expensive.
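To get a feel for the cost of just the encoding step, here is a rough sketch that assumes the encoding is Base64 (an assumption for the sake of the example, not a statement about Puppet's exact wire format) and compares sizes for a 3000-byte stand-in payload.

```shell
#!/bin/sh
set -eu
# 3000 bytes of stand-in file content.
plain_bytes=$(head -c 3000 /dev/zero | wc -c)
# The same content after Base64 encoding, with line wrapping removed.
encoded_bytes=$(head -c 3000 /dev/zero | base64 | tr -d '\n' | wc -c)

# Arithmetic expansion normalizes any padding wc may add.
plain_bytes=$((plain_bytes))
encoded_bytes=$((encoded_bytes))
echo "plaintext: ${plain_bytes} bytes, encoded: ${encoded_bytes} bytes"
```

Base64 turns every 3 bytes into 4 characters, so the encoded copy alone is a third larger than the original, before any escaping and before counting the fact that both copies sit in memory at once.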

Oops I Did It Again

Sorry, couldn’t resist.  One of the benefits of switching to REST is that we’re breaking compatibility, so we can fix some design problems we (i.e., I) made early on.  The biggest benefits have been around fileserving and the catalog.

In pre-0.25 releases, recursive file-serving involves a minimum of one call per directory and one call per file.  This is obviously slow and bad.  0.25 fixes this so it’s one call for an entire directory structure, with possibly one call for each file that actually needs to be transferred.  The combination of this plus the lack of encoding and escaping has resulted in most people seeing a 10x speedup in recursive fileserving.
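The arithmetic is easy to sketch. This builds a small throwaway directory tree and counts the calls under each scheme, using the minimums described above. The tree layout and the assumption that only one file needs transferring are made up for the example.

```shell
#!/bin/sh
set -eu
# Throwaway tree: 4 directories (including the root) and 4 files.
tree=$(mktemp -d)
mkdir -p "$tree/a/b" "$tree/c"
touch "$tree/one" "$tree/a/two" "$tree/a/b/three" "$tree/c/four"

dirs=$(($(find "$tree" -type d | wc -l)))
files=$(($(find "$tree" -type f | wc -l)))
changed=1   # assumption: only one file actually needs transferring

# Pre-0.25: at least one call per directory plus one per file.
old_calls=$((dirs + files))
# 0.25: one call for the whole tree, plus one per file transferred.
new_calls=$((1 + changed))

echo "pre-0.25: at least ${old_calls} calls; 0.25: ${new_calls} calls"
rm -rf "$tree"
```

The gap widens with the size of the tree, since the old count grows with every directory and file while the new one grows only with the files that actually changed.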

For catalogs, well, pre-0.25 releases don’t really talk about them.  They transfer this serialized recursive tree structure, rather than the whole catalog (which is really mostly a graph).  This matters a bit in 0.25, because it’s allowed us to remove some extraneous stupidity that resulted from the limitations of the tree structure (relationships no longer propagate to all contained resources – see the ticket for more information), but it’s going to be a bigger deal going forward as we can start relying on the complete graph being passed around.

Both of these refactors could have been done with XMLRPC, but they would have required breaking compatibility, which is always a difficult step to take, so doing it all at once makes the most sense.

Easier Integration

The big deal in our use of REST is really an internal system that enables REST – this is the all-powerful Indirector.  It’s essentially a plugin system that makes it easy to add new sources or destinations for data we care about.  For example, when we wanted to queue the storage of catalogs to a database (rather than forcing the client to wait until it’s done), we had to develop the queueing infrastructure, of course, but integrating it with Puppet was less than 100 lines of code that knew how to accept catalogs and send them to the queue.

This ease of integration opens up all kinds of possibilities, like moving our CA to be database-backed, but the main thing is that if you come up with some cool integration, it’s similarly easy for you — you’re not dependent on us.

We’re still missing one piece – a user-configurable routing system to help you configure which plugin is used – but we’ve got that mostly done and it’ll be in the release after 0.25.

Note that this integration was often possible in earlier releases, but it was much harder because you had to know a lot more about every individual class you were integrating with.  Because we use the same methods and the same interface and the same location pattern for everything, you don’t have to know as much.

Enforced Good Citizenship

With the Indirector at the heart of everything, and an assumption that other people are going to be integrating with our code, we are forced to build more decoupled, portable components.  This isn’t much, but it does always enforce a kind of good citizenship, which is something.

Using ‘git rebase’ to clean development histories

In general, development in the Puppet world is a series of essentially disconnected batches of commits.  We do a pretty good job of applying related commits all at once, so it’s obvious when a set of commits is related, but otherwise, we don’t have to worry.

Sometimes, though, multiple series of commits are related to each other, which can easily get lost.  Even worse, multiple series of commits developed in tandem can cause downright painful development histories.

For example, we’re currently working on refactoring Puppet’s ActiveRecord integration while at the same time we’re adding the ability to queue the database store operations.  We’ve had two development teams (of 1 to 3 people each), one on each feature, constantly publishing and rebasing against each other’s work.  You could certainly argue that this isn’t the right model (we should have worked in serial rather than parallel, probably), but time constraints didn’t allow this.

On Friday, we got to the point where we’re nearly done, and I started sending my code to the -dev list for review.  That was straightforward enough, because I’d done my development in a separate branch and I still had those branches.  The other team, though, had been pushing code around like mad, tuning and modifying their code over time, so sending their commits out for review was harder.  To top it off, our final branch was already a mixture of the different efforts.

So, I decided to see if I could clean up the development history. We had 80 commits spread all around, and I needed to reorder and squash them so they made easy sense during code review.  (This is a pseudo-process without actual code, because the reality was too messy to reproduce here.)  First I needed a list of the commits in the branch; git rebase is our tool through this process, and using it interactively (with ‘-i’) is the key.  So, I made a new branch and started my rebasing:

git checkout dev

git checkout -b clean_dev

git rebase -i 0.24.8

This opens up my editor with an ordered list of all of the commits in my branch that aren’t in 0.24.8.  There are three things you can do with commits in this list:  Leave them alone, combine them with another commit, and delete them (which deletes them from the branch).  Because I knew this would be a long complicated process, I saved the whole list to a separate file to start.
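If you want to try those three operations without risking a real branch, here's a small self-contained sketch. It builds a throwaway repository, then scripts the editor session instead of opening one: the first commit is left alone, the second is squashed into it, and the third is deleted. The repository contents, the base_release tag (standing in for 0.24.8), and the commit messages are all invented, and the sed call assumes GNU sed.

```shell
#!/bin/sh
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example Dev"
git config user.email "dev@example.com"

# Base commit, tagged to play the role of 0.24.8.
echo base > notes.txt
git add notes.txt
git commit -q -m "Base release"
git tag base_release

# Three commits on a working branch.
git checkout -q -b clean_dev
echo feature >> notes.txt; git commit -q -am "Add feature"
echo fix >> notes.txt;     git commit -q -am "Fix misspelling in test"
echo debug >> notes.txt;   git commit -q -am "Temporary debugging"

# Scripted stand-in for the interactive editor: leave line 1 alone,
# squash line 2 into it, delete line 3 (dropping that commit).
GIT_SEQUENCE_EDITOR="sed -i -e '2s/^pick/squash/' -e '3d'" \
GIT_EDITOR=true \
git rebase -q -i base_release

commit_count=$(git rev-list --count base_release..HEAD)
line_count=$(($(wc -l < notes.txt)))
echo "commits after rebase: ${commit_count}, lines in notes.txt: ${line_count}"
```

In real use you'd make the same changes by hand in the editor; the environment variables are only there so the example runs unattended.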

Some of the commits in our branch were backports of fixes we needed from the ‘master’ branch, so these were the only commits I left in the commit list in my first rebase.  This added about six commits, and was pretty easy to merge.  All of the other commits were just deleted from my clean branch.

The next step was to add my indirected ActiveRecord code.  This was only four or five commits, but should have been collapsed into fewer than that (e.g.,  one of the commits fixed a misspelling in a test).  There are multiple ways you could do this, including cherry-picking, but rebasing is definitely the most powerful.  I created a new temporary branch in which to do my rebase:

git checkout -b clean_indirected_activerecord

git rebase -i clean_dev

This actually results in a noop, because I’m rebasing against a branch that’s a complete duplicate of my current branch.  However, because I’m in interactive mode, I can do whatever I want from here, so I pasted in my four or five commits, and s/pick/squash/ where appropriate.  Once I dealt with any merge conflicts, I then merged back into my clean branch:

git checkout clean_dev

git merge --no-ff clean_indirected_activerecord

I decided to force the merge commit to exist because, um, actually I don’t have a good reason.  It seemed like a good idea to have a clear milepost saying that a given branch is merged, like that first email in a code review describing a patch series.
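Here's a small sketch of what forcing the merge commit does, again in a throwaway repository with invented contents. Because clean_dev doesn't move while the topic branch is being worked on, a plain merge would just fast-forward; with --no-ff you get a commit with two parents instead.

```shell
#!/bin/sh
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo base > notes.txt
git add notes.txt
git commit -q -m "Base"
git checkout -q -b clean_dev

# Topic branch with one commit; clean_dev itself stays put.
git checkout -q -b clean_indirected_activerecord
echo topic >> notes.txt
git commit -q -am "Topic work"

git checkout -q clean_dev
git merge -q --no-ff -m "Merge clean_indirected_activerecord" \
  clean_indirected_activerecord

# rev-list --parents prints the commit followed by each of its parents.
parent_count=$(( $(git rev-list --parents -n 1 HEAD | wc -w) - 1 ))
echo "parents of HEAD: ${parent_count}"
```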

So now I’ve got a branch that has a series of patches that prepare the branch for us (with backported fixes, mostly), then a series of patches providing the first set of development.  Now I just need to repeat this process for the other three development stages, one done by me and two done by others.

I had no problem with my other code; it was 8 or so patches, but I wrote them all, so I could easily handle merge conflicts.  I also had no problem with one of the other chunks of code, because it was only three commits, so simple cherry-picking would have sufficed.

The last bit is where rebasing eventually broke down.  In the end, I had 35 commits that I thought contributed something to the code (we had some duplicate commits in there, somehow, and some other commits that got cancelled out by later commits), but it looked like they should have been reduced to as few as four or five commits, because that’s about how many components were added in the code.  However, I don’t know this code as well as the people who wrote it, so I decided to punt here and told them they needed to clean up their development path and send me some patches that had no duplication and no code that gets deleted in a later patch.  My expectation is that they’ll create entirely new commits from the current state of the files, because the current commit history reflects a process rather than the desired state.

This whole process made me think of a discussion we had a while back on the -dev list (and a related thread started by Brice Figureau).  Apparently Linux development maintains every commit series separately, and only merges them when it’s time for release.  Or rather, the release involves a final merge.  They maintain multiple ongoing development branches; one of them has all of the proposed patch series, but it’s never merged directly into the release branch.  Instead, when a given patch series is accepted, it’s merged separately into the main branch.

This rebasing I did above made me realize the benefit of that approach – if the four development chunks had each remained separate branches, rather than merging early and merging often, it would have been *much* easier to keep them clean, and the developer responsible for a given chunk could always easily rebase just his or her own commits without affecting anyone else.  Then, when it was time for release, I could just merge them all in the appropriate order and release.

This is pretty easy for four patch series, but obviously gets more complicated as we have tens of sets.  I think for now, it’s too much work to maintain the patch sets in appropriate merge order without actually merging, but I think at some point, it really will make sense.  At the very least, this process has taught me the value of rebasing early and rebasing often.

Puppet makes it into MacPorts

Nigel Kersten just let me know that Puppet and Facter are now in MacPorts.  That’s one more distribution (or in this case, pseudo-distribution) that Puppet’s a part of.  Thanks Nigel.

RailsMachine Releases Puppet Rails Tool

RailsMachine has announced their project Moonshine, which provides a pure Ruby interface to Puppet and is essentially custom-built to simplify Rails deployment and management:

One of the things that separates Moonshine from other solutions like Chef and Sprinkle is that out of the box, Moonshine comes with recipes for the same Ubuntu/Ruby Enterprise Edition/Apache/Passenger/MySQL stack that’s in production use at Rails Machine.

We’re pretty excited about this, for multiple reasons.  It’s another Ruby company developing in and around Puppet, and it’s a great, simple way for Rails developers to take advantage of both Puppet and the RailsMachine stack.

The next step is to get the ShadowPuppet pure Ruby interface imported into Puppet.  It will complement the existing language rather than replacing it, but we haven’t really settled all of the details yet.

Puppet wins Fukuoka Ruby Award

Puppet was one of the winners of the Ruby Award handed out by the Fukuoka Prefecture in Japan.  The Climate Information Toolkit won the top prize, and Puppet was one of three to win the second tier of prize.

Unfortunately, we could not travel to Japan to receive the award in person, but I was able to give a talk via Skype the night of the award ceremony.  It was around 2am my time on a day I’d traveled with my wife and twins, so it was a long day, but it was worth it.  I only hope my talk was worth anything.

Andrew did all of the hard work around getting the submission in and organizing the talk itself, so I definitely have him to thank for that.

Now I just need to figure out how to get my bank to accept those postal money orders from Japan. :)

The Most Free(tm) Way to Make Money from Open Source

Tarus Balog is on a one-man campaign against open-core licensing, or really, any company that produces both open source and closed source software:

Of course, in the open core model there must be “commercially-available extensions” in order to get companies to sign a “commercial license”. Why is this? Because the open core product has been intentionally hobbled to force companies to buy the closed software product in order to get it to do the things that customers need it to do, and thus to generate revenue for the software company.

I find this article interesting, because he seems to have taken the opposite tack from me in terms of deciding what causes the most dilution of an open source product.

I’ve always figured that requiring copyright assignment is a greater sin than providing commercial add-ons, with the strong assumption that the product completely stands on its own without those add-ons.

As far as I see it, requiring copyright assignment restricts the developer community, while providing commercial add-ons doesn’t restrict anyone anywhere; it just says that some of my code isn’t free.

Sure, I can see if I produce a crippled, useless free product that *requires* the commercial add-ons that it’d be evil, but if the OSS portions actually do kick ass on their own, then what’s the issue?

To me, community is the big differentiator.  If a given OSS project doesn’t actually care enough about a community, then open core is basically a sales tool.  But if it *really* cares about community, then open core always has to worry that the community can thrive without the commercial portions.  I basically think of this as requiring that the open core company always open source anything that’s required by the project or that’s essentially commoditized, but it leaves plenty of room for using a commercial license on those projects that not everyone needs, and especially on those areas that are clearly not commoditized and not many people need.

I think Tarus has a special perspective on OSS success because he’s in what amounts to an entirely commoditized space – there are so many monitoring tools with such a similar feature set that it’s crazy to think anyone would choose anything but an entirely open source solution.  In less commoditized spaces, it likely doesn’t make quite as much sense to stick to entirely open licenses.

This bit in particular sticks out:

If you like your users why not provide all of the features as open source? Red Hat does it, JBoss does it, OpenNMS does it. The answer will always be that their business model can’t survive unless they sell closed source software. In that aspect, I can see little difference between open core companies and Microsoft.

Well, I can tell you that the reason that Reductive Labs is considering producing commercial software is that we can’t afford to produce much more software unless someone pays for the development, and at this point, we have a thriving, healthy community that largely gets huge benefit from Puppet without ever needing help from us.  So, our options are to grow so slowly that all of the interesting opportunities pass us by, or to start producing software that allows us to, at the least, recoup our development costs.

I agree with Tarus’s basic sentiment: A lot of these open core companies aren’t actually producing open source software that you could reasonably use on its own.  But that doesn’t invalidate the model, in my opinion.

His model of providing supported binaries, as Red Hat and others do, only works for those who use compiled languages, which means it’s right out for us.  Maybe we should stick some C in there, just to make it easier to charge for support?

So what do you think:  Is it a greater sin to only accept patches to your product if the contributor is willing to assign copyright to your commercial company, or to produce some closed-source code?