The art of developer documentation
Writing technical documentation is probably one of the least favorite things most developers do, but it is just as important as writing code. Let’s see why…
- someone in your team developed and maintained a component in your stack, but that person has now left the company and you have to take over. You open the repository…and you see a blank readme file. You can’t believe that person didn’t document the component!
- and oh, remember that complex software system you wrote a year ago and have to add a new feature to now? You probably forgot how it worked…and worse, you didn’t document its design. This won’t be fun to figure out…and will probably take twice as long to implement.
- your company is rapidly expanding and hiring new people. With every new hire you have to explain the same things again and again, how tiresome!
Good documentation solves all of the above problems. Just like a good set of unit/integration tests, it saves you and the team time in the long run.
Let’s start with a cornerstone of good technical documentation: diagrams!
A diagram is worth a thousand words, but only if it’s a good one…
However, many diagrams are just a set of boxes that don’t say anything about how the system works. You see these typically in high-level system “diagrams”.
Don’t make these your go-to diagram type for technical documentation! Don’t get me wrong, they have their purpose when you just want to list the features or components of a platform, but they generally make for poor technical documentation.
So, what’s a good technique to document a software system? That’s where the C4 model comes in.
The C4 Model - Documenting software systems
The C4 model is a powerful way to visualize software systems. It consists of 4 (zoom) levels that allow you to create meaningful diagrams which for each level clearly indicate the systems/containers/components/actors involved and their relationship to each other. I highly recommend reading up on this on the excellent c4model.com site, made by Simon Brown, the original author of the C4 model.
Let’s go over each of the four levels by way of example.
Level 1 - System Context diagram
This is the highest level, ideal for showing the structure of the system and its actors.
Let’s assume we need to document the system through which developers can manage cloud resources (servers, volumes, applications, …) on cloud platform X. This diagram only shows the high-level systems and actors that are at play in this scenario. The Management Console, the primary system we want to document, is displayed centrally in the diagram.
Things to take note of:
- I gave the diagram a proper name and legend, so that the purpose and components of the diagram are clear.
- no technologies are mentioned
- each block on the diagram has an indication of its type (person, software system) and a short description
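If you prefer to keep diagrams next to the code, the points above can also be captured as data. What follows is only a minimal sketch in Python that renders a System Context diagram as Graphviz DOT text. The `Element` and `Relationship` classes, the `to_dot` function, the descriptions and the relationship labels are all my own assumptions for illustration; only the Management Console, the developers and cloud platform X come from the scenario above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    key: str          # identifier used inside the DOT output
    name: str         # label shown in the box
    kind: str         # "Person" or "Software System"
    description: str  # one short sentence, no technologies at this level


@dataclass(frozen=True)
class Relationship:
    source: str  # key of the element the arrow starts from
    target: str  # key of the element the arrow points to
    label: str   # what the source does with the target


def to_dot(title, elements, relationships):
    """Render a System Context diagram as Graphviz DOT text."""
    lines = [
        "digraph context {",
        f'  label="{title}";',  # the diagram gets a proper name
        "  labelloc=t;",
        "  node [shape=box];",
    ]
    for e in elements:
        # Each box carries its name, its type and a short description.
        # Labels must not contain double quotes in this simple sketch.
        label = f"{e.name}\\n[{e.kind}]\\n{e.description}"
        lines.append(f'  {e.key} [label="{label}"];')
    for r in relationships:
        lines.append(f'  {r.source} -> {r.target} [label="{r.label}"];')
    lines.append("}")
    return "\n".join(lines)


# Hypothetical content: descriptions and relationship labels are assumptions.
ELEMENTS = [
    Element("developer", "Developer", "Person", "Manages cloud resources"),
    Element("console", "Management Console", "Software System",
            "Lets developers manage servers, volumes and applications"),
    Element("platform", "Cloud Platform X", "Software System",
            "Hosts the cloud resources"),
]
RELATIONSHIPS = [
    Relationship("developer", "console", "Manages resources using"),
    Relationship("console", "platform", "Provisions resources on"),
]

if __name__ == "__main__":
    print(to_dot("System Context diagram for the Management Console",
                 ELEMENTS, RELATIONSHIPS))
```

The printed text can be pasted into any tool that understands DOT. Forcing every element to have a type and a description is the part I find most useful, since it makes it hard to end up with anonymous boxes.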
Level 2 - Container diagram
Zooms in on one of the systems by showing its containers (apps, databases, data streams, …) and their interactions. Here we zoom in on the Management Console system.
Things to take note of:
- we start to introduce technologies into the diagram, both for the containers and their interactions
- related systems and actors are still mentioned, but they take a backseat
Level 3 - Component diagram
Takes one container and shows the components within that container. These components are typically easy to identify in the codebase of the container, as they form its high-level structures. In this diagram we zoom in on the Notifications API container.
Level 4 - Code diagram
Typically a UML diagram. I rarely use these and would only recommend them if you zoom in on a part of your code that might benefit from having its structure displayed; otherwise a UML diagram gets messy very quickly.
Flow Chart diagram
While the C4 model is great, you will also need some other diagrams to augment it. One of these is the flow chart diagram, which is great for visualizing complex logic with multiple branching paths and decisions.
Take this example of a connection lifecycle flowchart:
Swimlane Flow diagram
A variation on the standard Flow Chart diagram that lets you make a clear distinction in which component or actor is responsible for which action. I typically use these when I want to visualize a flow between multiple components.
Take this example swimlane flow diagram, which visualizes a rudimentary authentication and authorization system of a user interface.
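The idea behind swimlanes also works in plain text: every step is tagged with the lane that owns it. As a rough sketch only, the Python below renders such a flow as a text grid with one column per lane. The lanes, the steps and the `render_swimlanes` function are hypothetical and chosen for illustration; they are not a transcription of any real system.

```python
# Hypothetical lanes and steps for a rudimentary login flow.
LANES = ["User", "User Interface", "Auth Service"]
STEPS = [
    ("User", "Submit credentials"),
    ("User Interface", "Forward login request"),
    ("Auth Service", "Verify credentials"),
    ("Auth Service", "Check permissions"),
    ("User Interface", "Show page or error"),
]


def render_swimlanes(lanes, steps, width=24):
    """Return a text grid: one column per lane, one row per step, in order."""
    header = "|".join(lane.ljust(width) for lane in lanes)
    rows = [header, "-" * len(header)]
    for lane, action in steps:
        if lane not in lanes:
            # Every action must belong to a known lane, that is the whole point.
            raise ValueError(f"unknown lane: {lane}")
        cells = [(action if name == lane else "").ljust(width) for name in lanes]
        rows.append("|".join(cells))
    return "\n".join(rows)


if __name__ == "__main__":
    print(render_swimlanes(LANES, STEPS))
```

Reading down a single column shows everything one component is responsible for, which is exactly what a plain flow chart tends to hide.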
Diagram tools I typically use are the desktop versions of diagrams.net (aka draw.io) and yEd. Both are free to use, and while not focused on creating software diagrams, do the job just fine in my experience.
Technical Feature documentation
When working in a team on a big project, you need a proper way to document the design decisions made while implementing a feature. This not only gets everyone on the same page during the implementation phase, it also serves as a reference after the feature has been implemented. It avoids the common “Why did we implement it like this? I remember there was a good reason, but I can’t remember what it was, so better not touch it!”.
In my experience it’s important to create a design document first, which explains in detail what you’re going to build. It makes you think up front and allows for an easier review process, and it also makes for excellent documentation on how the various features in your system work. And yes, things might still change during implementation, but that then means just a minor change to the design document. If you don’t do this up front and make it part of the process, chances are it will never get documented.
In open source projects you can find similar efforts, such as Apache Kafka KIPs (Kafka Improvement Proposals) and Apache Cassandra CEPs (Cassandra Enhancement Proposals).
Software Component Documentation
A good readme
Every repository of a component should have a good readme file, so that new developers can get up to speed quickly. It should:
- have a short description at the top of the document, describing the component’s function within the system
- have a more in-depth feature list, describing in more detail the functionality provided by the component, e.g. supported authentication mechanisms, APIs that it exposes, data that it transforms, …
- include a C4 Component diagram, which shows the major sub-components, their interactions and the place of the software component in the wider system
- document the technologies and why they were chosen (framework, language, build tools, libraries), as well as conventions and other architectural decisions
- mention related repositories: the other containers it interacts with, or important libraries that are being used
- have a getting started guide for developers:
  - requirements: what to have installed on your system
  - configuration: how to configure the component for development
  - build instructions: how to compile, run tests and package the component
  - release & deployment instructions: how to make a release of the component and how to deploy it
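A checklist like this is easy to enforce automatically, for example in a CI step. The Python below is a minimal sketch of that idea, assuming the readme is written in Markdown with `#`-style headings. The section titles in `REQUIRED_SECTIONS` and the `missing_sections` function are hypothetical names I derived from the list above, so rename them to match your own conventions.

```python
import re

# Hypothetical section titles derived from the checklist above.
REQUIRED_SECTIONS = [
    "Features",
    "Architecture",
    "Technologies",
    "Related repositories",
    "Getting started",
    "Release and deployment",
]

# Matches Markdown headings such as "## Getting started" and captures the title.
HEADING = re.compile(r"^#{1,6}[ \t]+(.+?)[ \t]*$", re.MULTILINE)


def missing_sections(readme_text, required=REQUIRED_SECTIONS):
    """Return the required titles that have no matching heading (case-insensitive)."""
    found = {title.lower() for title in HEADING.findall(readme_text)}
    return [title for title in required if title.lower() not in found]


# A deliberately incomplete readme for the Notifications API container.
SAMPLE = (
    "# Notifications API\n\n"
    "Short description of the component.\n\n"
    "## Features\n\n"
    "## Getting started\n"
)

if __name__ == "__main__":
    print(missing_sections(SAMPLE))
    # ['Architecture', 'Technologies', 'Related repositories', 'Release and deployment']
```

Note that this only checks that the headings exist, not that the sections say anything useful, and it would also count `#` lines inside code fences. It is a nudge, not a substitute for review.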
Next to a good readme, you can have other component documentation that goes more in-depth into the implementation details. This is where you can make use of flow/swimlane diagrams to visualize complex logic, discuss internal systems and, dare I say, include UML diagrams where necessary.
Operating the component
When your component is going to run in production, you also have to describe how to properly operate it. This document will have to answer at least the following key questions:
- start/stop/restart/scale actions: what to take into account? Can the component be restarted without much issue, or scaled to any number of instances, or are there certain limitations?
- metrics: what are the key metrics exposed by this service and what’s their meaning?
- incident handling: in case something goes wrong, how to troubleshoot and where to look for more information (logs, metric dashboards, …)?
- configuration: what are the important configuration variables?