September 3, 2020

Ansible module documentation reliability and compliance

WORDS BY   Tadej Borovšak


Module documentation is the entry point for Ansible playbook authors, so it is vitally important that we keep it in sync with the module implementation. Otherwise, we risk getting angry bug reports. But here lies the problem.

Maintaining the synchronization between documentation and implementation is not as straightforward as it could be. Why? Because each Ansible module contains two copies of parameter-related information. Module maintainers must first document each module parameter and then copy a slightly transformed description into the parameter validator.

Currently, there is no way to remove the information duplication from the Ansible modules. So we decided to do the next best thing: automate the heck out of the copy-to-the-validator part. Laziness for the win ;)

A quick note on terminology

Arguments, options, and parameters in this post all have the same meaning. They represent the data that the Ansible playbook author passes to the Ansible module.

So why use different words to describe the same thing? Because the rendered Ansible documentation uses the name parameters, the documentation block in Ansible modules uses options, and validation-related code calls them arguments. Because why not ;)

Where does it go wrong

Each Ansible module is composed of different sections, but today we are only interested in two of them: the documentation block and the AnsibleModule instantiation.

Documentation block

The documentation section is an inlined YAML document that, among other things, contains an options key where all module parameter descriptions live. For example, this is how we would describe options for a simple module:

DOCUMENTATION = """
module: some.awesome.thing
short_description: Manage resources
options:
  name:
    description:
      - Resource name.
    type: str
    required: true
  state:
    description:
      - Resource's desired state.
    type: str
    choices: [ present, absent ]
    default: present
"""

If we were to render the previous parameter description into HTML, we would get back a table row similar to this one:

The parameter documentation screengrab.

And people who prefer the terminal over web browsers (hello brothers and sisters) can use the ansible-doc utility to print this information to the console.

In a perfect world, Ansible playbook authors would first read the API documentation and then write a task without making any mistakes. But unfortunately, we live in a world where mistakes do happen, and most people only read the documentation when something goes wrong. And it is hard to blame them for that because let us face it: most of the documentation that we, developers, write is crap ;)

AnsibleModule instantiation

Because we want to catch at least some mistakes in Ansible playbooks, Ansible modules validate their parameters. For our sample options from the documentation example above, we would instantiate AnsibleModule like this:

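# AnsibleModule lives in the module utilities that ship with Ansible.
from ansible.module_utils.basic import AnsibleModule


def main():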
    argument_spec = dict(
        name=dict(
            type="str", required=True,
        ),
        state=dict(
            type="str", default="present", choices=["present", "absent"],
        ),
    )

    module = AnsibleModule(
        argument_spec=argument_spec,
    )

And just like that, we duplicated some of the information, broke the DRY principle, and set ourselves on a path of documentation and validation desynchronization.

Dealing with duplication - the Ansible way

So, how is Ansible currently dealing with the problem of information duplication? In short, it does not. What Ansible does offer is a way of detecting the desynchronization between the two copies of information.

The validate-modules sanity test reports the discrepancies between the API documentation and the argument specification, but we still need to resolve them manually. For example, if we changed the default value for the state parameter to absent in the argument specification, Ansible’s sanity test would report this error:

$ ansible-test sanity --requirements --test validate-modules
ERROR: plugins/modules/thing.py:0:0: doc-default-does-not-match-spec:
    Argument 'state' in argument_spec defines default as ('absent') but
    documentation defines default as ('present')
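
To make the idea behind such a check more concrete, here is a minimal sketch of a documentation-versus-specification comparison. It is not the validate-modules implementation: the find_mismatches helper and the list of compared keys are our own assumptions, and the documented options are written out as an already-parsed dictionary.

```python
# Hypothetical helper, not part of ansible-test: compare the documented
# options with an argument spec and report the attributes that differ.
COMPARED_KEYS = ("type", "required", "default", "choices")


def find_mismatches(doc_options, argument_spec):
    problems = []
    for name in sorted(set(doc_options) | set(argument_spec)):
        if name not in argument_spec:
            problems.append(f"{name}: documented but missing from argument_spec")
            continue
        if name not in doc_options:
            problems.append(f"{name}: in argument_spec but not documented")
            continue
        for key in COMPARED_KEYS:
            documented = doc_options[name].get(key)
            validated = argument_spec[name].get(key)
            if documented != validated:
                problems.append(
                    f"{name}: {key} is {validated!r} in argument_spec "
                    f"but {documented!r} in documentation"
                )
    return problems


# The options as they would look after parsing the DOCUMENTATION block.
doc_options = {
    "name": {"description": ["Resource name."], "type": "str", "required": True},
    "state": {
        "description": ["Resource's desired state."],
        "type": "str",
        "choices": ["present", "absent"],
        "default": "present",
    },
}

# The argument spec with the default for state changed to absent.
argument_spec = dict(
    name=dict(type="str", required=True),
    state=dict(type="str", default="absent", choices=["present", "absent"]),
)

for problem in find_mismatches(doc_options, argument_spec):
    print(problem)
```

Running the sketch reports a single problem for the state parameter, which mirrors the error from the sanity test above.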

This detect-and-fix approach works pretty well if we are working with existing Ansible modules. Parameter changes are usually small in such scenarios, which keeps the error message count low. But things start to go downhill if we are writing new Ansible modules. In cases where the Ansible module contains a non-trivial number of parameters, we can quickly end up with hundreds of errors.

Introducing argument specification generator

Since we are adding new modules to existing Ansible collections quite often, we spend a considerable fraction of development time dealing with the initial desynchronization of information.

That prompted us to develop the ansible-argspec-gen tool, which generates the argument specification directly from the module documentation. And yes, developers should not be allowed to name things ;)

Here is how the ansible-argspec-gen tool works:

  1. it starts by extracting the Ansible module’s documentation,
  2. then, it generates the argument specification from the extracted documentation,
  3. and finally, it updates the module’s source code between the markers.
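
The first two steps can be sketched in a few lines of Python. This is only an illustration under our own assumptions, not the tool's source code: the function names and the list of copied keys are hypothetical, and we skip the YAML parsing (which would need a library such as PyYAML) by starting from already-parsed options.

```python
import ast

# Attributes copied from the documentation into the argument spec.
# This selection is an assumption made for the sketch.
SPEC_KEYS = ("type", "required", "default", "choices", "elements")


def extract_documentation(source):
    """Step 1: return the text assigned to DOCUMENTATION, or None."""
    for node in ast.parse(source).body:
        if not isinstance(node, ast.Assign):
            continue
        names = [t.id for t in node.targets if isinstance(t, ast.Name)]
        if "DOCUMENTATION" in names:
            return ast.literal_eval(node.value)
    return None


def generate_argument_spec(options):
    """Step 2: build an argument spec from parsed documentation options."""
    spec = {}
    for name, option in options.items():
        entry = {key: option[key] for key in SPEC_KEYS if key in option}
        # Documentation nests parameters under suboptions, while the
        # argument spec nests them under options.
        if "suboptions" in option:
            entry["options"] = generate_argument_spec(option["suboptions"])
        spec[name] = entry
    return spec


module_source = '''
DOCUMENTATION = """
module: some.awesome.thing
short_description: Manage resources
"""
'''
print(extract_documentation(module_source))

# The options from our sample module, as a YAML parser would return them.
options = {
    "name": {"description": ["Resource name."], "type": "str", "required": True},
    "state": {
        "description": ["Resource's desired state."],
        "type": "str",
        "choices": ["present", "absent"],
        "default": "present",
    },
}
print(generate_argument_spec(options))
```

Reading the documentation through the ast module means the sketch never has to import or execute the Ansible module itself.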

Let us see how this would work on our sample module. Once we add markers to the module’s source code, we will end up with something like this:

def main():
    # AUTOMATIC MODULE ARGUMENTS
    argument_spec = dict(
        name=dict(
            type="str", required=True,
        ),
        state=dict(
            type="str", default="present", choices=["present", "absent"],
        ),
    )
    # AUTOMATIC MODULE ARGUMENTS

    module = AnsibleModule(
        argument_spec=argument_spec,
    )
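
The third step, updating the source code between the markers, could look something like the sketch below. Everything in it is an assumption on our part: the replace_between_markers function is hypothetical, and we render the specification with pprint from the standard library, so the formatting differs from what the tool itself produces.

```python
import pprint

DEFAULT_MARKER = "# AUTOMATIC MODULE ARGUMENTS"


def replace_between_markers(source, argument_spec, marker=DEFAULT_MARKER):
    """Replace the lines between two marker lines with a fresh spec."""
    lines = source.splitlines()
    positions = [i for i, line in enumerate(lines) if line.strip() == marker]
    if len(positions) != 2:
        raise ValueError("expected exactly two marker lines")
    start, end = positions

    # Reuse the indentation of the first marker for the generated code.
    indent = lines[start][: len(lines[start]) - len(lines[start].lstrip())]
    rendered = "argument_spec = " + pprint.pformat(argument_spec)
    generated = [indent + line for line in rendered.splitlines()]

    # Keep both marker lines and swap only what sits between them.
    return "\n".join(lines[: start + 1] + generated + lines[end:]) + "\n"


module_source = '''def main():
    # AUTOMATIC MODULE ARGUMENTS
    argument_spec = dict()
    # AUTOMATIC MODULE ARGUMENTS

    module = AnsibleModule(argument_spec=argument_spec)
'''
spec = {"name": {"type": "str", "required": True}}
print(replace_between_markers(module_source, spec))
```

Because both markers stay in place, running the replacement a second time simply regenerates the same block.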

Note that we are using the default marker text in our example, but you can customize it via the --marker argument. Now we are ready to run the generator:

$ ansible-argspec-gen sample.py 

Once the previous command terminates, our module will look like this:

def main():
    # AUTOMATIC MODULE ARGUMENTS
    argument_spec = {
        "name": {"required": True, "type": "str"},
        "state": {
            "choices": ["present", "absent"],
            "default": "present",
            "type": "str",
        },
    }
    # AUTOMATIC MODULE ARGUMENTS

    module = AnsibleModule(
        argument_spec=argument_spec,
    )

Magic ;) But there is more. If we supply the --diff switch to the tool, it will also print the module changes to the console:

$ ansible-argspec-gen --diff sample.py 
--- sample.py.old
+++ sample.py.new
@@ -20,14 +20,14 @@

 def main():
     # AUTOMATIC MODULE ARGUMENTS
-    argument_spec = dict(
-        name=dict(
-            type="str", required=True,
-        ),
-        state=dict(
-            type="str", default="present", choices=["present", "absent"],
-        ),
-    )
+    argument_spec = {
+        "name": {"required": True, "type": "str"},
+        "state": {
+            "choices": ["present", "absent"],
+            "default": "present",
+            "type": "str",
+        },
+    }
     # AUTOMATIC MODULE ARGUMENTS

     module = AnsibleModule(
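
The diff looks bigger than it really is, because it only swaps the dict() calls for dictionary literals. A quick check, using nothing but the two versions of the specification from the diff, shows that both build the same data:

```python
# The hand-written specification from before the run.
before = dict(
    name=dict(type="str", required=True),
    state=dict(type="str", default="present", choices=["present", "absent"]),
)

# The generated specification from after the run.
after = {
    "name": {"required": True, "type": "str"},
    "state": {
        "choices": ["present", "absent"],
        "default": "present",
        "type": "str",
    },
}

# Key order and notation differ, the content does not.
assert before == after
```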

The return code of the program indicates what happened during the run:

  1. 0 means that nothing changed.
  2. 1 means that the tool updated at least one module.
  3. 2 means that an error occurred during the execution.

And what do we get if we combine the status code with the --dry-run switch? A check for our continuous integration pipeline that makes sure developers do not forget to run the tool. You are welcome ;)
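
As a rough sketch, such a pipeline check could be wrapped in a small Python script. The helper names below are hypothetical, and we assume that a dry run returns the same codes as a regular run (1 when a module would have been updated). Only the --dry-run switch and the return codes come from the description above.

```python
import subprocess
import sys

# Meanings of the return codes as described above.
MEANINGS = {
    0: "nothing changed",
    1: "at least one module is out of date",
    2: "an error occurred",
}


def build_command(paths):
    """Build the dry-run invocation for the given module paths."""
    return ["ansible-argspec-gen", "--dry-run", *paths]


def run_check(command):
    """Run the command and return its return code with a short message."""
    completed = subprocess.run(command, capture_output=True, text=True)
    message = MEANINGS.get(completed.returncode, "unexpected return code")
    return completed.returncode, message


def main(paths):
    code, message = run_check(build_command(paths))
    print(message)
    return code


# In a CI script we would finish with: sys.exit(main(sys.argv[1:]))
```

Passing the tool's return code on unchanged means any non-zero result fails the pipeline job.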

The tool can do a few other tricks, like extracting various constraints from the documentation. But we will leave this information for another post. In the meantime, you can play with a sample module that we put into a GitHub gist.

Where can you start

The proper way of solving the issue at hand would be to remove the duplication from the code. But this is not entirely trivial to implement since the information from documentation fragments is not available to modules at runtime.

Thus, no matter how ugly the argument specification generator may look at first sight, it is currently the best weapon against information desynchronization.

Does all this sound complicated? Avoid getting your hands dirty and reach out. Get a high-quality Ansible integration in a fraction of the time with the help of our team. We are ready to do the heavy lifting for you.

Cheers!