Let’s look at how a mental model can evolve during the problem-solving process. Let’s say you have two separate applications, deployed onto two separate servers. These applications don’t interact with each other, so your initial model might be something like this:
This model could be useful for some things. You might reasonably expect that you could upgrade each application independently, or restart application 1 without worrying about the users of application 2.
But let’s say that the users of the two applications start to complain about performance problems, and in fact, the complaints seem to come at the same time. You start to monitor the two applications, and you see something surprising – the response times from the two servers are correlated!
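One way to check whether the response times really move together is to compute a correlation coefficient over samples taken at the same moments. The sketch below is only an illustration: the sample values are made up, and the `pearson` helper is a hand-written Pearson correlation, not something taken from a particular monitoring tool.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length, at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of co-deviations and of squared deviations from each mean.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Hypothetical response times in milliseconds, sampled at the same instants.
app1_ms = [120, 125, 118, 410, 430, 122, 119, 395]
app2_ms = [80, 84, 79, 300, 320, 82, 81, 290]

r = pearson(app1_ms, app2_ms)
print(f"correlation between app1 and app2 response times: {r:.3f}")
```

A value close to 1 means the two series rise and fall together, which is exactly the kind of observation that two fully independent servers should not produce. Correlation alone does not say where the coupling is; it only says the model needs another look.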
Fortunately, we now know that all models are wrong. And our original model is no longer useful because it can’t explain the correlation in response times. It is time to evolve the model to account for what we’ve observed.
We now know that these two applications must be coupled together in some way, and we might start by looking for that coupling in the hardware environment. After asking a few questions, we discover that these two applications are running on virtual servers and that, in fact, the two virtual servers are running on the same hypervisor. We can now create a new model:
This model tells us that the two supposedly independent applications may actually compete with each other for the computing resources of the host hypervisor. In fact, we have no real reason to expect that our two applications will have the hypervisor all to themselves, so we should really draw the model like this:
Now we’re getting somewhere. This model is more useful because it gives us some things we can investigate. We can look at the resource utilization on the hypervisor, for example, to see if the usage patterns also correlate with the performance slowdowns. We can look for other applications which might be interfering with the performance of our two applications. And we can look for problems in the shared storage system.
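A model like this can also be written down as a small dependency graph, which makes the question "what do these two applications have in common?" something you can compute. The following is a minimal sketch. Every component name is hypothetical, and the placement of the shared storage beneath the hypervisor is an assumption made for illustration.

```python
# Each component maps to the components it runs on or depends on.
# All names are hypothetical stand-ins for the boxes in the model.
MODEL = {
    "app1": ["vm1"],
    "app2": ["vm2"],
    "vm1": ["hypervisor"],
    "vm2": ["hypervisor"],
    "other_vm": ["hypervisor"],          # a neighbor we did not know about
    "hypervisor": ["shared_storage"],    # assumed placement of the storage
    "shared_storage": [],
}


def dependencies(model, component):
    """Return every component reachable from `component`, excluding itself."""
    seen = set()
    stack = list(model.get(component, []))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(model.get(node, []))
    return seen


def shared_dependencies(model, a, b):
    """Components that both `a` and `b` ultimately depend on."""
    return dependencies(model, a) & dependencies(model, b)


def co_tenants(model, component):
    """Components that depend directly on `component`."""
    return sorted(name for name, deps in model.items() if component in deps)


shared = shared_dependencies(MODEL, "app1", "app2")
print(sorted(shared))                   # ['hypervisor', 'shared_storage']
print(co_tenants(MODEL, "hypervisor"))  # ['other_vm', 'vm1', 'vm2']
```

Each shared component is a place to investigate, and each co-tenant is a possible source of interference. Running the same query against the original two-server model would return an empty set, which is why that model could not explain the correlation.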
We could continue to make the model more sophisticated. Perhaps the two applications share a database server, or share a part of the network infrastructure like a firewall or a proxy server. We could keep evolving the model as we find more places where our applications are coupled together. Each evolution allows us to ask more questions and investigate more potential root causes.
Still, remember that each evolution of the model is wrong because there is always some element of the real deployment that was left out. Models can grow in complexity almost indefinitely, like this one:
This level of detail can be extremely useful when trying to figure out an elusive problem. The model suggests many places to look and many questions to ask. But that can be a disadvantage, too. The best models hide unnecessary detail, because too much detail can become a distraction.
The main point of this discussion? If you are not making progress in figuring out what’s wrong, it is time to reevaluate your mental model and find one that is more useful. And never become too attached to your mental models, because they are always wrong!