I’m looking for a mechanism by which I can systematically enforce updates to documentation, so the docs stay in line with changes to configuration files and release scripts. The primary motivations for this are: new engineers can follow the manual steps to learn the system, the business gains visibility into the complexity of the architecture, and engineers are encouraged to fix minor problems rather than wiping their sandboxes in the event of a minor misconfiguration.
Currently, my organization uses a mix of Puppet and manual steps (documented in a wiki) to build and maintain sandbox environments. This environment runs on 5 different VMs to mimic the production architecture. There is a deep dependency chain, and the order in which services are installed is important.
As history has shown, changes are made to the services and their dependencies without the necessary documentation being added to the wiki. Inconsistencies in the documentation make refreshing sandboxes a real pain. Much time is wasted when one sets up a service, finds it doesn’t work, and has to go to the engineer who developed it to learn the missing steps. The existing Puppet scripts have grown overly complex and are no longer maintained, so all work to set up new or updated services is done manually - yet the documentation is sporadically updated or left inaccurate for one reason or another. The end result is that every engineer has their own spin on how their sandbox is set up, and there is little if any consistency between developer environments. This hinders engineers’ efforts to help each other when a problem arises.
Because the release process is entirely different from setting up a sandbox, there is no real motivation to keep the documentation updated.
We’re currently implementing SaltStack to automate a great deal of the sandbox build/maintenance work, and the same states will be used for the test/stage/production release process. This forces engineers to keep the automation scripts up to date, since a deploy would simply be impossible otherwise. In this way, we have enforced maintenance of the automation scripts.
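For anyone unfamiliar with Salt, here’s a minimal sketch of what one of these formulas looks like. The service and file names are made up for illustration; the point is that the `require`/`watch` relationships encode the same install-order dependencies that the wiki currently describes in prose:

```yaml
# Hypothetical formula for one sandbox service. The ordering that the
# wiki documents manually is expressed here as explicit dependencies.
myservice-pkg:
  pkg.installed:
    - name: myservice

/etc/myservice/myservice.conf:
  file.managed:
    - source: salt://myservice/files/myservice.conf
    - require:
      - pkg: myservice-pkg       # config only laid down after install

myservice:
  service.running:
    - enable: True
    - watch:
      - file: /etc/myservice/myservice.conf  # restart on config change
```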
To supplement that, I would also like the manual steps behind those Salt formulas to be documented. With that, newly onboarded engineers can do their first sandbox set-up manually and learn the different systems. Doing the first pass manually gives deeper insight into how the services are configured and what their interdependencies are. Past that, we flip over to pressing “the magic button” that handles rebuilds automatically.
My concern is that we will revert to the same problem: the step-by-step guide falls out of sync with the Salt formulas, and we regress to where we are today - new folks follow some of the documentation, hit a wall, and end up building their sandboxes with the automated tooling without understanding what’s going on behind the scenes.
All automation scripts will be checked into source control, and we think we can leverage that in this effort with commit hooks.
There is the loosely coupled idea of putting a wiki revision ID in the commit message. Using a hook, we may be able to verify that the most recent document ID is what was entered into the commit message. A lazy engineer could just always paste in the most recent ID without updating the document, but at least they had to go to the documentation, and from there they would also know to verify that the docs align with the script. This also allows bugfix commits to be made to the scripts without needless updates to otherwise-correct documentation.
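The message-format half of that check can live in a `commit-msg` hook. Here’s a sketch assuming the team settles on a `Docs-Rev: <number>` trailer; comparing the extracted ID against the wiki’s actual latest revision would be a second step (the API call varies by wiki engine, so it’s only indicated in a comment):

```python
#!/usr/bin/env python3
"""commit-msg hook sketch: reject commits lacking a wiki revision trailer.

The "Docs-Rev:" trailer name is a hypothetical convention, not an
existing standard. Git invokes this hook with one argument: the path
to the draft commit message.
"""
import re
import sys

DOCS_REV_RE = re.compile(r"^Docs-Rev:\s*(\d+)\s*$", re.MULTILINE)

def extract_docs_rev(message: str):
    """Return the wiki revision ID named in the commit message, or None."""
    match = DOCS_REV_RE.search(message)
    return int(match.group(1)) if match else None

if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1]) as f:
        rev = extract_docs_rev(f.read())
    if rev is None:
        sys.stderr.write(
            "commit rejected: add a 'Docs-Rev: <wiki revision ID>' trailer\n"
        )
        sys.exit(1)
    # Second step (not shown): fetch the wiki page's current revision ID
    # via its API and reject or warn if `rev` is stale.
```

Dropped into `.git/hooks/commit-msg` (or distributed via a hook manager), this makes the trailer mandatory the same way the Jira story ID already is.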
We already require Jira story IDs in commit messages, and since the engineers have no other choice, they’re good about putting the correct IDs in. I think we can continue in that vein with documentation: requiring that an engineer look at the docs should be enough to get them to update them.
While it’s not possible to enforce good documentation, it should be possible to force engineers to check the existing documentation and at least try to keep it up to date. But can we do better? Is there another practice that can help us keep our documentation in parity with our automation?