Configuration Management Experiences with ZenPacks

January 26 2012 | By | in Unified Monitoring

Originally Posted on the Zenoss Google+ Page. Please Circle Us to Join the Discussion.

In my last blog post I talked about building out an OpenStack cluster and briefly mentioned that one way we’re using it is as a continuous integration system to build and test ZenPacks. In this post I’ll elaborate on some of what went into building this system. There are a lot of choices in server deployment and configuration tools, and I didn’t set out to test them all, but I did learn quite a bit in the process.

For the unfamiliar, ZenPacks are basically plugins for Zenoss. They are most commonly used to support monitoring of specific types of devices or technologies, but they can be used much in the same way as Firefox add-ons to extend or modify absolutely any part of Zenoss. Many teams inside Zenoss, Inc. write ZenPacks, from the developers to the services team and others. Zenoss community users and commercial customers also write ZenPacks.

As of this writing, I’m aware of over 400 ZenPacks in total from the various sources that have decided to make their ZenPacks either public or available to Zenoss. A big challenge in building an automated system to build and test all of these ZenPacks is the number of permutations that need to be tested. Specifically, we need to test each ZenPack against all supported Zenoss configurations. Once you factor in server architecture, Linux distribution, Zenoss version and Zenoss flavor (platform, core and commercial), you end up with up to 54 testing permutations for each ZenPack. Multiply that out and you have over 20,000 different configurations to build and test, and the number of ZenPacks continues to grow.

Due to this expansion of parameters I decided early that it wouldn’t be feasible to have a build box for each of the 54 possible permutations per ZenPack. I had to reduce it to one build server per combination of server architecture and Linux distribution. This reduced the number of unique build servers to only 6. I then needed a way to automate the building and configuration of these 6 distinct types of servers and to deploy as many of them as I needed to handle the volume. Furthermore, I needed to make it as easy as possible to add new Zenoss versions into the equation.
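To make that arithmetic concrete, here is a minimal Python sketch of the test matrix. The post only gives the totals (54 permutations, 6 build servers) and the three flavor names, so the split into 2 architectures, 3 distributions and 3 versions, and every placeholder name other than the flavors, is an assumption for illustration.

```python
from itertools import product

# Hypothetical axis values. Only the flavor names and the totals
# (6 build servers, 54 permutations) come from the post itself.
ARCHITECTURES = ["arch-a", "arch-b"]
DISTRIBUTIONS = ["distro-a", "distro-b", "distro-c"]
ZENOSS_VERSIONS = ["version-1", "version-2", "version-3"]
ZENOSS_FLAVORS = ["platform", "core", "commercial"]


def build_servers():
    """One build server type per (architecture, distribution) pair."""
    return list(product(ARCHITECTURES, DISTRIBUTIONS))


def zenoss_configurations():
    """Every (version, flavor) pair a single build server must cover."""
    return list(product(ZENOSS_VERSIONS, ZENOSS_FLAVORS))


def build_matrix():
    """Map each build server type to its Zenoss configurations."""
    return {server: zenoss_configurations() for server in build_servers()}


if __name__ == "__main__":
    matrix = build_matrix()
    per_zenpack = sum(len(configs) for configs in matrix.values())
    print("build server types: %d" % len(matrix))
    print("permutations per ZenPack: %d" % per_zenpack)
    print("builds for 400 ZenPacks: %d" % (per_zenpack * 400))
```

Adding a new Zenoss version under this layout means appending one entry to `ZENOSS_VERSIONS`; the number of build server types stays the same.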

Here’s where we get to the configuration management. There are a lot of tools out there to choose from: Puppet, Chef, CFEngine, Bcfg2, Fabric and many others. I initially started down the road of using Chef for no particular reason other than that it and Puppet seem to have the largest communities right now, and that I previously worked at Zenoss with Matt Ray, who’s now over at Opscode (the company behind Chef) and for whom I have a lot of respect.

To be quite honest, my initial foray into Chef wasn’t encouraging. I went into the exercise hoping to leverage the existing Chef resources and cookbooks and get away without writing a lot of code myself. It turned out that this wasn’t going to work, through some combination of me not knowing what I was doing and my use case being a bit strange from Chef’s perspective, so I went looking for another solution.

Considering that my strongest language is Python, not Ruby, I went to Fabric next. Fabric has far fewer features than Chef and really targets a different use case. All it does is provide a nice framework for remotely executing commands on lots of systems over SSH. As I wrote more and more custom code in Fabric, I eventually came to the realization that I was rewriting a configuration management system, and that if I was going to be writing code anyway, I might as well be writing it in software built for the purpose. Fabric was good, but not what I needed.

With my tail between my legs, I returned to Chef with a new perspective and expectation. I was going to have to write some “code.” With this expectation my work in Chef progressed rapidly and I’m glad I switched back. I was able to easily deploy a Jenkins server and fully configure it with all of the plugins I needed to use, and deploy my ZenPack build servers each with 9 switchable Zenoss configurations installed in different logical volumes.

If I had only been trying to do typical activities like installing packages and writing configuration files, my Chef experience would have been a lot easier. However, the weird part of my requirement was that I needed to install different versions of the same package. Ultimately I was able to solve this using Chef by segmenting the installation directories of the packages into logical volumes, and forcibly removing the packages’ records from the RPM database after installing them. I’d say this is a good example of Chef being used to make easy things easier and hard things possible.
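As a rough illustration of that trick outside of Chef, the Python sketch below builds the shell commands for one side-by-side install. It is not taken from the cookbook: the device path, mount point, file name and package name are all hypothetical, and the use of `rpm --erase --justdb --noscripts` to drop the database record while leaving the files in place is my assumption about one way to do it.

```python
import subprocess


def side_by_side_install(device, mount_point, rpm_path, package_name):
    """Return the commands for one install into its own logical volume.

    All four arguments are hypothetical examples supplied by the caller.
    """
    return [
        # Mount this configuration's logical volume at the install path.
        ["mount", device, mount_point],
        # Install the package normally; its files land on the mounted volume.
        ["rpm", "--install", rpm_path],
        # Remove only the RPM database record, leaving files untouched and
        # skipping uninstall scriptlets, so another version can be installed.
        ["rpm", "--erase", "--justdb", "--noscripts", package_name],
        # Unmount so the next configuration can use the same install path.
        ["umount", mount_point],
    ]


def run_all(commands, dry_run=True):
    """Print the commands, or execute them when dry_run is False."""
    for command in commands:
        if dry_run:
            print(" ".join(command))
        else:
            subprocess.check_call(command)


if __name__ == "__main__":
    run_all(side_by_side_install(
        "/dev/example-vg/example-lv",  # hypothetical logical volume
        "/opt/example",                # hypothetical install directory
        "example-package.rpm",         # hypothetical RPM file
        "example-package",             # hypothetical package name
    ))
```

The default is a dry run that only prints the commands, since actually running them requires root and changes the system's package database.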

I wholeheartedly recommend Jenkins if you need to do any automated building, testing, deploying or other continuous integration work. The default capabilities are outstanding, reliable and easy to use. The third-party plugins are also amazing and cover just about anything else you’d want to do. Jenkins seems to have a focus on Java projects using Maven, but I found it worked very well for ad-hoc Python projects as well. I may do a more detailed post on how we’re using Jenkins in the future.

If you’re interested in the Chef cookbook I’m using to build out this environment, you can find it on GitHub at the following URL. Look in the chef-repo/cookbooks/zenosslabs/ directory.

https://github.com/zenoss/zenosslabs


  • Organon

    Did you consider looking at Crowbar at all? In particular if you’re looking to incorporate any OpenStack elements. https://github.com/dellcloudedge/crowbar

    • Chet Luther

      We are running this on top of an OpenStack Nova cluster. It’s funny that you mention Crowbar. While we didn’t use it to build our OpenStack cluster, we have been working with it lately. Primarily in an effort to create a Zenoss barclamp. Look for more from us in the very near future on this.

      • Organon

        The Crowbar boys (zehicle and friends) will certainly enjoy hearing that, re: building a Zenoss barclamp.  I wouldn’t stop at a “simple” Zenoss barclamp, however.  Part of the default barclamps deploy with auto-integration with both Nagios and Ganglia running on the Crowbar admin node.  Sounds like a Zenoss barclamp would/could/should do just the same when deploying other workloads.  Just a thought…

        • Chet Luther

          They already know about it. Still under development, but auto-provisioning monitoring of new nodes is already in.

          http://vimeo.com/35642761

          • Organon

            I love it when a plan comes together…  Well done Keith.  I look forward to seeing app-level barclamps being co-deployed with either a Zenoss client, or an auto-hook into Zenoss to help system monitoring happen at app deploy time.
