Jan Vansteenkiste
Why not to use Puppet::Parser::Functions.autoloader.loadall
Recently (about 5 minutes ago), I was writing a custom puppet-function to offload some puppet magic. In short: I’m writing a wrapper around create_resources so I can keep syntax for the end-users of my module crispy clean. This means I need the create_resources function to be available in my custom function. This can be done by using Puppet::Parser::Functions.autoloader.loadall as suggested on the puppetlabs custom modules guide. Unfortunately, when using #loadall, all functions will be loaded.
Why unfortunately? In my case: A function defined in puppet-foreman depends on the rest-client gem and I do not have this installed. Some people might say: Just install the gem and be done with it! This is hardly a proper solution. The way to go would be to load only the function I really need, being create_resources.
And here is how:
    Puppet::Parser::Functions.autoloader.load(:create_resources) unless Puppet::Parser::Functions.autoloader.loaded?(:create_resources)

This will basically load the create_resources function after checking that it has not been loaded before. This (the function already being loaded) could be the case if you properly depend on puppetlabs-create_resources in your manifests. Side note: I added a small dummy class so my modules can depend on this function being available.
This has resolved my issues with #loadall, but if I ever needed to include another function that DOES use #loadall, I’ll be screwed all over again. So (pretty) pls, don’t use #loadall.
Puppet Module Patterns
I’ve used puppet quite intensively for a couple of months now (about 4, I would guess). Before that, I played with it, changing something here and there, but not nearly as much as now. I’ve used several puppet modules from wherever google leads me, roamed github, inherited a few from colleagues and created several from scratch. While doing so, I saw a lot of stuff I disliked and learned a lot about how I can (ab)use puppet to do what I want it to do. Over those last months, I have grown my set of ideas on how a puppet module should look. So, before every statement I make, you should probably add ‘IMHO’.
WHO THE F.
Why the hell would this guy (me) have anything to say about puppet modules? Let’s situate first.
I’m now an Open Source Consultant. I’ve been (in order) a Java programmer, sysadmin, Drupal developer and now a sysadmin again (doing devopsy things). For the last 3 positions I worked for (and still work for) Inuits, an Open Source company in Belgium. Currently, I’m positioned at UnifiedPost (about 100 people but thinking big!). I help out with daily maintenance (and there is plenty) and am starting to adopt puppet as much as possible. Puppet was already in use at UP (UnifiedPost), but knowledge was rather thin when I came in. They did however manage some hosts with it (about 300-400). I dove into the puppet code rather fast and stumbled upon several patterns that increased pressure on my mouse heavily. Even modules I grabbed from the net (whatever the source is) made my grip firmer.
PROBLEMS
Before trying to fix the problem, we should find out exactly what bothers me about all these modules I lay my eyes on. I’ll try to keep it organized.
- Modules are not classes!
- Too hard to use by non-developers
- Poor interaction with third-party modules
- Not versioned
- Not pretty at all to downright f_ugly
1. Modules are not classes!
Although a module consists of several classes, it should not behave like one.
An example to clarify what I mean: The accounts module. (I’m sure this is the case for many organizations that have an accounts module).
I can think of several valid reasons why you would have one. (Keep reading nevertheless!) What does our accounts module contain? A definition to ease the use and do some customization (set defaults, create some files, …). We also find a list of users, passwords and authorized_ssh keys. This (specific) user information does not belong in a module. It should either be in a class (below the manifests folder) or stored externally. In my point of view: Nodes use classes. They register the kind of machine and define what should be installed. Classes include modules and change settings, possibly parameterized so we can keep it node specific. All module parameter values reside in the node or the class(es) it includes.
2. Too hard to use by non-developers
This brings us to our next point. Can I just grab your module and start using it? Or do I need to weed out hardcoded strings, change host names or edit templates? Do I need to understand the complete way it works ‘under the hood’? I have written a short post recently, expressing my feelings about this. Bottom line: If I need to edit any file within your module to get it working the way I want it to, there is something wrong with it. Sure, features and/or support might be missing, but if my $::operatingsystem is supported, I should get it working without touching any of the module code.
3. Poor interaction with third-party modules
I have reused (or attempted to reuse) several modules found on github and always had the same problem: they do not play well with our current puppet-tree. The best example for this is probably the apache (or httpd) module. Almost any puppet module that depends on apache being installed comes with its own implementation and/or dependency. Most companies already have an apache module and change the new module to work properly together with theirs. There goes upstream support. I have run into this issue with puppet-foreman recently and this will probably be my first big test case for my coding pattern.
4. Not versioned
Most modules you will find only live on one branch, master. Some may have a develop branch, but most of the time, there is no telling what version you are using, unless you elevate commit hashes to version numbers. (Using git submodules does this to some extent.) But updating a submodule is always a dangerous thing to do: there is no way to tell what will break.
Besides the ‘version’ of a module, we also have to take the puppet version into account. I tend to be a cutting-edge user for all my software, but I can easily understand if you don’t for whatever reason. So keeping puppet-modules backwards compatible is a must (is it?).
5. Not pretty at all to downright f_ugly
Why is properly formatted code important? For anybody else who ever has to change or even use it. This could be a colleague or someone (anyone) else who found your module and wants to improve it (this is when YOU win time, if somebody else does the job for you) or change it. Even if you did not have time to write up documentation, most people will have to stroll through your code. Having properly formatted code is always a nice-to-have feature then.
REQUIREMENTS
So now that we know what I dislike about puppet modules, let’s try to define something more positive. What is a good baseline for a puppet-module?
- VCS (This one is pretty obvious, I will not elaborate on this any further.)
- Follows style guidelines
- Use of centralized parameters / settings
- Fully(!) parameterized
- Easy and centralized handling of compatibility ($::operatingsystem-ish stuff)
- Documented
- Releases
- Puppet compatibility
- Integration: is uniquely identifiable
- Easy to extend
1. Follows style guidelines
Why? Some valid reasons are that you make fewer mistakes and are more aware of what you are doing. Do you really need double quotes? Using single quotes for static string values will prevent you from forgetting to escape ‘$’ (commands anyone?). Always using a default case will make you more aware that more than one distro exists in the world. At least fail when your module is not fit for a certain operating system to prevent unexpected behavior. If you read through the style guideline, you will see that many of these items are easy to do if you just remember to do so when you are writing the actual code.
2. Centralized parameters / settings
Does your module support a given distro? Just check the params.pp file. Everything is there. It does not get much easier than this to add support for a new distro.
3. Fully Parameterized
This might seem like I’m making the same point twice, but we should differentiate between our general ‘settings’ that configure the working of our module, and specific definitions that have parameters. You define where all your vhosts are, but each vhost definition you create also takes parameters. The same rule applies to a definition as to a module: we should be able to use it without having to change anything in the code. Often, this means making more stuff dynamic (parameterized) than you need at the time of writing your module. You can hard-code a ‘Listen 80’ or template it using a $ports parameter.
4. Compatibility handling
As an example, my colleague asked me: Does your module work for Debian? I was happy to answer: You just need to add support to the params.pp file. That’s all that’s needed (and maybe add some templates).
5. Documentation
Thinking as a developer when writing a module, we know we need to offer easy documentation for the end-user. This is no different when writing a puppet module. It’s always a good thing to keep people out of your code as much as possible. Proper documentation is the first step. I try to write as much as possible, the main reason being that when a colleague asks how to use it, I can point him to the documentation instead of going over the code myself to remember what exactly is going on. On a side note: ATM, I’m having some troubles with bug #11384. Votes (and a patch even more) welcome ;)
Beside top-level documentation, inline (code) documentation should also be written. Not for the obvious stuff, but when you do something more advanced, explain to a fellow coder (or yourself some weeks later) why and what you are doing.
6. Releases
Puppet modules should also have releases. This would be an easier way of drawing attention when we change the API (or definition parameters) or when we fix a bug. It is also a great signal when our module is no longer backward compatible with old code (breaking API). I try to support old code as much as possible. But at some point, we will have to weed out the old so it does not clutter the new. Keeping stuff simple/stupid (although I have passed that bridge a looooong time ago) is still a good principle.
7. Puppet Compatibility
We need to know with what version of puppet your module is compatible. Some features of the puppet language you use might or might not be available in older releases. You can check the Puppet Language Guide for what is introduced in what version, but there are a lot of other differences that are not so well documented. I’ve been using the create_resources function quite a lot but it’s only in core puppet for versions 2.7 and up. Luckily, there is a backport for 2.6 on the puppetlabs’ github.
8. Integration: is uniquely identifiable
To improve compatibility, we must first be able to tell what module we are integrating with. I personally started to use a $modulename and $moduleversion param in the main class of my module. Modulefiles, like the ones puppetlabs requires for the puppet forge, are cool, but we cannot use them in our code. We could write a fact for this so we don’t need to duplicate code. I won’t add this to my to-do list as I already have a way-too-long backlog, but feel free to add it to yours.
With this information, we could do different things based on the module and version we are working with.
9. Easy to extend
In this part, the developer in me is taking the upper hand and it will depend on personal preference a lot more than any of the previous points. A quick example: I wanted to use a conf.d/* configuration style. Even more, for certain configuration files, order is important so we need to prefix files with 00_, 01_, … I could have easily done this for each type of configuration file I want to store here. Instead, I wrote a confd wrapper definition/class that does this for me. It’s a 2 step process: You initialize/setup a conf.d folder and then define your resources within it. I’m realizing now that this should have been a separate module. I have added it to my to-do list.
The main advantage is that I can now easily re-implement conf.d style folders without worrying about the logic behind it.
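The two steps can be illustrated with a minimal sketch in plain shell. This is not the puppet definition described above: the helper names confd_setup and confd_add are made up for illustration, and a real wrapper would manage these files as puppet resources instead of writing them directly.

```shell
#!/bin/sh
# Hypothetical helpers, only meant to show the conf.d naming logic.

# Step 1: initialize/setup a conf.d style folder.
confd_setup() {
    mkdir -p "$1"
}

# Step 2: define a snippet inside that folder. The file is stored as
# <dir>/<NN>_<name>, so a plain directory listing gives the load order.
# The order is a plain number without leading zeros (0, 1, 10, ...);
# the snippet content is read from stdin.
confd_add() {
    dir=$1 order=$2 name=$3
    file=$(printf '%s/%02d_%s' "$dir" "$order" "$name")
    cat > "$file"
}
```

With these helpers, confd_setup ./conf.d followed by echo 'Listen 80' | confd_add ./conf.d 0 ports would write the snippet to ./conf.d/00_ports. The point of wrapping it is that callers only pass an order and a name, and the prefix logic lives in one place.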
SOLUTIONS
Quick wins! These go without saying, start using them now.
Check your code for formatting and style.
For this, we have puppet-lint. This tool will deal with most common problems and errors/warnings against the style guide. This tool takes one puppet manifest as argument and displays the errors/warnings it finds. You can easily integrate it with jenkins since the log-format argument has been added.
Documentation.
I suppose most people will have issues with this. Good documentation is essential and not much hard work if you do it right. I prefer to START with documenting what a class will do and implement afterwards. This is a lot like writing tests first and then use them to see if you are writing proper / working code. The danger is of course that you change the internal working but forget to update your documentation. After each feature I add, I tend to go over the documentation and see that everything is still up to date. Once documentation has been written, you can generate it using puppet doc. To work around certain puppet doc’s ugliness, I wrote a small wrapper script for my Jenkins jobs that does some post processing on them. See previous post for that.
Releases
I’ll be quick about this one: Use git-flow.
General / Initial module structure.
For creating your initial puppet module structure, there is always the puppet-module tool. Install it by installing the gem. I have tried using it, but I’m relying on my own bash magic for creating classes.
This is my basic structure I re-use over and over.
- ./manifests/init.pp
- ./manifests/params.pp
- ./manifests/packages.pp
- ./manifests/setup.pp
One note on these filenames: always try to avoid confusion! I have seen a lot of config.pp classes and params.pp classes where the config.pp actually does configuration of the package on the system while params.pp is for configuring the behavior of the puppet-module. I like setup.pp better than config.pp, since it’s easier to figure out what the class does: It sets up the system! Another good option would be install.pp.
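As a rough idea of what such bash magic could look like, here is a minimal sketch that generates the four manifests listed above. It is not the author's actual script: the function name new_puppet_module, the generated class bodies, the list of supported distros and the assumption that the package is named after the module are all placeholders.

```shell
#!/bin/sh
# Hypothetical skeleton generator: new_puppet_module <name> [target dir]
new_puppet_module() {
    name=$1
    manifests=${2:-.}/$name/manifests
    mkdir -p "$manifests" || return 1

    # init.pp: the main class, pulling in the other three.
    cat > "$manifests/init.pp" <<EOF
# Class: $name
class $name {
  include ${name}::params
  include ${name}::packages
  include ${name}::setup
}
EOF

    # params.pp: every distro-specific value in one place, with a
    # default case that fails on unsupported systems.
    # Assumption: the package is simply named after the module.
    cat > "$manifests/params.pp" <<EOF
# Class: ${name}::params
class ${name}::params {
  case \$::operatingsystem {
    'CentOS', 'RedHat': {
      \$package = '$name'
    }
    default: {
      fail("${name} does not support \${::operatingsystem}")
    }
  }
}
EOF

    # packages.pp and setup.pp start out as empty stubs.
    for class in packages setup; do
        cat > "$manifests/$class.pp" <<EOF
# Class: ${name}::${class}
class ${name}::${class} {
}
EOF
    done
}
```

Running new_puppet_module apache in an empty directory would create apache/manifests/ with init.pp, params.pp, packages.pp and setup.pp. Adding support for another distro then means adding one more case to the generated params.pp.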
OUTRO
I realize these solutions are nowhere near finished, but since FOSDEM 2012 is coming up and I’m running low on time, I wanted to publish this post so anybody can start giving their opinion on the matter before coming to a final out-of-the-box solution most people can relate to. So, actually, this is a big fat TO BE CONTINUED.
Matters we need to discuss:
- Compatibility handling (both to other modules and puppet)
- Making modules easy to integrate and/or extend.
Reducing vagrant box size
Here are some tricks I use to make my vagrant boxes as small as possible:
Tips:
Booting in single user mode:
I boot in single user mode since it will prevent running services that could output logs. I do this because I zero out all my logs before packaging the box.
Updating:
After updating any package, run yum clean all (or the apt equivalent, apt-get clean).
When booted in single user mode, don’t forget to start-up your network before updating.
When updating kernels, install the kernel packages, reboot and remove the old kernel packages that are no longer in use. Remember to re-install the VirtualBox add-ons too after a kernel update.
Cleanup:
After doing whatever I need to do with the box, I do some rather nasty stuff to make sure the box uses as little space as possible. If you are using a RAW hard disk, this might be a bad idea (stuff gets BIG).
- Zero out all remaining unused disk space
- Zero out the swap
- Clear out all log files (I just make them empty, I do NOT delete them)
(You can find this script, or an older version of it, in /root/tools/cleanup_diskspace.sh on my newer vagrant boxes.)

    cat << EOWARNING
    WARNING: This script will fill up your left over disk space.

    DO NOT RUN THIS WHEN YOUR VIRTUAL HD IS RAW!!!!!

    You should NOT do this on a running system.
    This is purely for making vagrant boxes damn small.

    Press Ctrl+C within the next 10 seconds if you want to abort!!

    EOWARNING
    sleep 10;

    # Truncate the log files instead of deleting them.
    echo 'Cleanup log files';
    find /var/log -type f | while read -r f; do : > "$f"; done;

    # Fill the free space (in 1K blocks) with zeros, then remove the file again.
    echo 'Whiteout root';
    count=`df --sync -kP / | tail -n1 | awk -F ' ' '{print $4}'`;
    dd if=/dev/zero of=/tmp/whitespace bs=1024 count="$count";
    rm /tmp/whitespace;

    echo 'Whiteout /boot';
    count=`df --sync -kP /boot | tail -n1 | awk -F ' ' '{print $4}'`;
    dd if=/dev/zero of=/boot/whitespace bs=1024 count="$count";
    rm /boot/whitespace;

    ### Repeat the above for other partitions you have.

    # Zero out the swap partition and re-create it.
    swappart=`cat /proc/swaps | tail -n1 | awk -F ' ' '{print $1}'`
    swapoff "$swappart";
    dd if=/dev/zero of="$swappart";
    mkswap "$swappart";
    swapon "$swappart";

Furthermore, about this script: USE IT AT YOUR OWN RISK
Puppet modules in Jenkins.
- You will need a recent enough version of puppet-lint that supports the --log-format flag. Install the gem so that Jenkins can use it.
- On Jenkins, you will need the Warnings Plugin and the HTML Publisher Plugin.
- Make sure that when checking out the module from your VCS, it ends up in WORKSPACE/modules/module_name.
Go to the Configure System page and find the Compiler Warnings settings. Add a new console log parser and call it puppet-lint. I use the following configuration for parsing puppet-lint warnings and errors.
The warnings plugin has been updated and now has puppet-lint support out of the box! So configuring puppet-lint manually is kind of useless now.
Name:
    puppet-lint
Regular Expression:
    ^\s*([^:]+):([0-9]+):([^:]+):([^:]+):\s*(.*)$
Mapping Script:
    import hudson.plugins.warnings.parser.Warning

    // map regular expression to strings
    // (group order follows the log format: path, line number, check, kind, message)
    String fileName = matcher.group(1);
    String lineNumber = matcher.group(2);
    String check = matcher.group(3);
    String kind = matcher.group(4);
    String message = matcher.group(5);

    // return a Warning.
    return new Warning(fileName, Integer.parseInt(lineNumber), kind, check, message);
Example Log Message:
    ./manifests/params.pp:25:autoloader_layout:error:apache::params not in autoload module layout

Jenkins job configuration
We will add several build steps that will run certain actions on our puppet modules.
- Check syntax
- Check style
- Generate documentation
1. For the syntax check, I use the following shell script (add a build step):

    for file in $(find . -iname '*.pp'); do
      puppet parser validate --color false --render-as s --modulepath=modules $file || exit 1;
    done;

2. For the style check, we use puppet-lint (add another build step):

    find . -iname '*.pp' -exec puppet-lint --log-format "%{path}:%{linenumber}:%{check}:%{KIND}:%{message}" {} \;

3. And for generating documentation:

    ## Cleanup old docs.
    [ -d doc/ ] && rm -rf doc/
    ## Dummy manifests folder.
    ! [ -d manifests/ ] && mkdir manifests/
    ## Generate docs
    puppet doc --mode rdoc --manifestdir manifests/ --modulepath ./modules/ --outputdir doc

    ## Fix docs to how I want them, I don't like that the complete workspace is included in all file paths.
    if [ -d "${WORKSPACE}/doc/files/${WORKSPACE}/modules" ]; then
      mv -v "${WORKSPACE}/doc/files/${WORKSPACE}/modules" "${WORKSPACE}/doc/files/modules"
    fi;
    grep -l -R "${WORKSPACE}" * | while read -r fname; do sed -i "s@${WORKSPACE}/@/@g" "$fname"; done;

In your post build section:
- Enable Scan for compiler warnings and select puppet-lint.
- Enable publish HTML reports (use ‘doc’, ‘index.html’ and ‘Puppet Docs’ as values). This will add a link to the Job page linking your generated puppet docs.
That’s about it! Any suggestions / improvements on this are always welcome!
Notes:
- I have some examples/tests set up on my Jenkins instance for testing at http://jenkins.vstone.eu. Since I use this for testing, it might be offline / broken / buggy at times.
- The scripts I use may also require some changes if you are using an older version of puppet. I’m currently using 2.7.x for testing my modules.
Puppet modules and using dot graphs (both are unrelated but related to each other)
Puppet modules… How I feel about them in a dot file:
    digraph PuppetModules {
      node [
        fontname = "Bitstream Vera Sans"
        fontsize = 10
        shape = "record"
      ]
      edge [
        fontname = "Bitstream Vera Sans"
        fontsize = 10
      ]
      question [label="Do I need to edit a file in your module for changing settings?", shape="oval"]
      ok [label="Great.", shape="oval"]
      bah [label="You are doing it WRONG!", shape="oval"]
      question -> ok [label="No"]
      question -> bah [label="Yes"]
    }

Read on if you want a rendered version.

