In the current DevOps industry we strive to meet certain best practices that not only make our work easier, but also make our systems and applications more reliable. Two of those practices are to automate as much as possible and to document what we do. And sometimes we can mix them and automate the generation of documentation itself. Here's an example that made our lives better by simplifying how we generate documentation.

Documentation

Documentation is one of the key resources in a systems engineering or development team. There are two aspects that pretty much everyone agrees on: it's very important to have good and useful documentation, and it's very difficult to keep it updated and sane. Many variables influence this, and they open the door to many alternative approaches.

At CAPSiDE, we're one of those teams that has tried several methods. We have used, and still use, several tools, because not all documentation is the same and technology keeps evolving. Here's the story of how we automated the generation of HTML documentation from reStructuredText with git, GitLab CI/CD and Docker.


The situation

Clouddeploy is an open-source tool developed at CAPSiDE that helps us create and maintain infrastructure in AWS through code. For this tool's documentation we started with a wiki, but as the documentation grew, so did the need for a better system. While looking at alternatives, we came upon Sphinx, which turns reStructuredText (RST) into several formats, such as HTML or epub.
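For readers who haven't used Sphinx, this is roughly what bootstrapping and building a project looks like from the command line (a minimal sketch; the docs directory name is just an example):

pip install sphinx sphinx_rtd_theme

# Answer a few interactive questions; this generates conf.py, index.rst
# and a Makefile in the docs directory
sphinx-quickstart docs

# Compile the RST sources into HTML; the output lands in _build/html
cd docs
make html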


Everything in CAPSiDE is under version control, so the RST files had to be managed by git. We also have a GitLab instance where our repos live. So, with the first version of this documentation, a team member had to do roughly the following to contribute:

  • Pull the documentation repo from the company GitLab server
  • Set up a local environment with Sphinx and its dependencies
  • Edit the RST files and compile them into HTML with make html
  • Check the generated HTML locally
  • Copy the approved HTML to the documentation web server by hand

This early process was cumbersome compared to just editing a wiki, so it needed some refinement. And since automation is at the core of systems engineering, we looked at ways to do just that.


CI to the rescue

GitLab is a very complete repository hosting platform that also provides CI/CD tooling. This is very powerful and easy to set up, and GitLab's documentation on it is clear and complete. It's as simple as this:

If you add a .gitlab-ci.yml file to the root directory of your repository, and configure your GitLab project to use a Runner, then each commit or push triggers your CI pipeline.

In the previous workflow there were several opportunities to reduce work for the team:

  • Avoid making everyone set up the environment
  • Compile the code into HTML and make sure it looks good
  • Publish the generated HTML to the web server

All this could be done with two jobs in GitLab:

  • One to generate the HTML and provide it to the user to check
  • One to publish

GitLab provides several ways to run CI/CD jobs. For a simple job like this one, where all we needed was a reproducible environment, we decided to use Docker.

GitLab Runners and Docker

The Docker image for this runner only needed to provide the environment to compile the code. A simple Dockerfile like this was enough for the job:

# Start from the official Python 2.7 image
FROM python:2.7

# Install Sphinx and the Read the Docs theme
RUN pip install sphinx sphinx_rtd_theme

With this, we get the official Python 2.7 image and use pip to install Sphinx and the RTD theme. The next step is to build the image that the runner will use on the GitLab server:

# docker build -t capside/python-sphinx .
# docker images
REPOSITORY              TAG      IMAGE ID       CREATED      SIZE
capside/python-sphinx   latest   13bff878fafc   1 hour ago   744MB
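Before wiring the image into GitLab, it can be handy to verify locally that it builds the documentation. One quick way (paths here are illustrative) is to mount the repository into a throwaway container and run the same command the pipeline will run:

docker run --rm -v "$(pwd)":/docs -w /docs capside/python-sphinx make html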

We can now create the GitLab runner. To do this, you need sudo privileges on the GitLab server and admin rights in GitLab to obtain the registration token:

sudo gitlab-runner register

Then answer the questions it prompts (more information in the GitLab Runner registration documentation). Some of the configuration set by these answers can be changed later. The last question is about choosing the executor type:

Please enter the executor: ssh, docker+machine, docker-ssh+machine, kubernetes, docker, parallels, virtualbox, docker-ssh, shell:
docker

In this case, we choose docker, which brings us to the last question, where we will use the Docker image we created earlier:

Please enter the Docker image (eg. ruby:2.1):
capside/python-sphinx
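For reference, the answers given to gitlab-runner register end up in the runner's configuration file, typically /etc/gitlab-runner/config.toml, in an entry roughly like this (the name and URL are illustrative):

[[runners]]
  name = "sphinx-docs-runner"
  url = "https://gitlab.example.com/"
  token = "RUNNER-TOKEN"
  executor = "docker"
  [runners.docker]
    image = "capside/python-sphinx"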

We now have the runner set up. The next step is to configure the pipeline so this runner is used when we push to the repository.

Creating the pipeline

We now need to create the .gitlab-ci.yml file in our repository, which will set everything in motion. The pipeline will have two jobs:

  • Job 1: create the HTML
  • Job 2: publish

Here’s the .gitlab-ci.yml we generated:

image: capside/python-sphinx

stages:
  - build
  - deploy

build:
  stage: build
  script:
    - make html
  artifacts:
    paths:
      - _build/html

publish:
  stage: deploy
  only:
    - master
  before_script:
    # Create the SSH directory
    - mkdir -p ~/.ssh
    # Write the SSH key stored in the SSH_PRIVATE_KEY variable to id_rsa
    - echo "$SSH_PRIVATE_KEY" | tr -d '\r' > ~/.ssh/id_rsa
    # Restrict permissions so SSH accepts the key
    - chmod -R 700 ~/.ssh
  script:
    # Copy generated HTML files to the documentation web server
    - scp -r -o StrictHostKeyChecking=no _build/html [email protected]:/home/publish/doc
    # Trigger the deployment script on the remote server
    - ssh -o StrictHostKeyChecking=no [email protected] sudo /opt/docs/deploy_doc.sh

Here’s an overview of what this all means:

  • We indicate which Docker image we will use for this pipeline (the one we created before and configured a runner with)
  • We indicate the stages of the pipeline. In this case, we have two stages: build and deploy
  • We define the jobs: build and publish
    • In the build job we indicate what we want to execute (make html) and which artifacts we want to keep from this execution. The artifact will be available after the job is done, and anyone will be able to download it from the GitLab page and look at the generated files to make sure they look as expected.
    • In the publish job we copy the generated artifact to the remote documentation server. Some details about this job:
      • The only: master line indicates that this job will only be executed if the push was done to the master branch. Therefore, everyone can create branches with new documentation, and when they push, the pipeline will only execute the build job, allowing them to generate the HTML and review it without deploying it to production.
      • The before_script section sets up the environment needed to copy the artifact to the remote server through SSH from the container. Note the $SSH_PRIVATE_KEY variable: it is set in the GitLab interface of the project under Settings -> CI/CD -> Secret variables, where we give the variable a name and a value and decide whether it is protected (that is, whether it is available to jobs triggered from all branches or only from protected branches).
      • Finally, the script section copies the artifact and triggers a script that deploys the code on the remote server accordingly. The deployment script itself is not shown here; a hypothetical sketch follows this list.
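As an idea of what /opt/docs/deploy_doc.sh could do, here is a minimal sketch; the real script is not part of this post, and the paths below are assumptions for illustration only:

#!/bin/bash
# Hypothetical sketch of /opt/docs/deploy_doc.sh -- the real script is
# not shown in this post; paths below are illustrative assumptions.
set -euo pipefail

STAGING_DIR=/home/publish/doc/html   # where the pipeline copied the build
WEB_ROOT=/var/www/docs               # directory served by the web server

# Replace the currently served docs with the freshly built HTML
rsync -a --delete "$STAGING_DIR"/ "$WEB_ROOT"/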


The final situation

We talked at the beginning about the different steps that someone in the team needed to follow to update the documentation. The situation has now changed, and it looks like this:

  • Pull the documentation repo from the company GitLab server
  • Create a branch and start working on the documentation
  • Once content with the work done, push the branch to GitLab. This will trigger the pipeline and execute the build job, which will provide an artifact: the HTML in a zip file that the team member can download to make sure everything looks as expected. The status of the job, its output and the artifacts can be seen from the project page in GitLab, in the CI/CD section (or fetched from the command line, as sketched after this list)
  • If satisfied, open a merge request to the master branch
  • When the merge request is accepted, the pipeline will be triggered again; it will generate the HTML and copy it to the web server
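As a side note, the artifact doesn't have to be downloaded through the web interface: GitLab's v4 API can fetch the latest artifact of a branch directly. A sketch, assuming a valid personal access token and placeholder host and project ID:

curl --header "PRIVATE-TOKEN: <your-token>" \
     --output html.zip \
     "https://gitlab.example.com/api/v4/projects/<project-id>/jobs/artifacts/master/download?job=build"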


We have thus removed the need for every team member to set up the environment, compile the code and copy it manually to the server once it is approved. This makes updating the documentation much easier than before, which is key to keeping it up to date.

TAGS: automation, ci pipeline, docker, documentation, documentation generation, gitlab
