There is the metaphor of the boiling frog (https://en.wikipedia.org/wiki/Boiling_frog). It is fascinating when you read about it as a distant mental model. In truth, the idea that a frog will not react to gradual change and will ignore the signs is a misconception, proven wrong by experiments. But it is still a good mental model to warn us about the dangers of gradual failure and people's lack of reaction to slowly creeping issues.
I like to think of myself as rational, someone who would not fall for this inability to spot slow trends. Well, that is not true: I fully played the boiling frog in the context of my dev machine:
- When it was new, everything was blazingly fast; naturally, it had a good amount of RAM, a fresh processor, bells and whistles.
- Then I started to notice it working slower. Maybe I had installed too many programs, maybe I should remove the virtual machines.
- Action 1: Uninstall Docker and the VMs and disable virtualization in the BIOS; who knows, maybe a laptop is not meant for virtualization. The system seemed to work slightly better. In hindsight, this was me suffering from a variation of confirmation bias (https://en.wikipedia.org/wiki/Selective_perception).
- Action 2: Disabling virtualization helped somewhat, but performance (measured subjectively) was still not right, so the next step was playing with the system performance settings. I observed that the system showed increased clock speeds whenever performance was degraded, so the fix was to disable turbo boost.
- Action 3: Disable the external graphics card; for sure it was not helping the system.
- Action 4: Disabling turbo boost helped, but the system still struggled, and I now observed that it suffered from sudden CPU frequency drops to 0.8 GHz, well below the 2.3 GHz nominal speed. The next step was to set the cooler to extreme performance mode. My poor machine sounded like a plane ready to take off, but the performance was stable.
- Well, this summer I started to see performance degradation and thermal throttling again, even at maximum fan speed. This is when I became intimate friends with tools that monitor the temperature and frequency of the CPU. It looked like my CPU routinely reached 80 degrees C and stayed there without much stress. Maybe it was my dev setup and having a SQL Server running.
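For what it is worth, the readings from such monitoring tools can be turned into a simple yes/no answer. Below is a minimal sketch in Python, not what I actually ran: the function name, the sample readings and both thresholds (`hot_c`, `slow_ratio`) are assumptions made for illustration; only the 2.3 GHz nominal speed, the 0.8 GHz drops and the 80 degrees C come from the story above.

```python
# Nominal CPU speed from the story above, in GHz.
NOMINAL_GHZ = 2.3


def throttled_samples(samples, hot_c=80.0, slow_ratio=0.5):
    """Return the (temperature_c, frequency_ghz) samples that look throttled.

    A sample counts as throttled when the CPU is hot (>= hot_c) and runs
    well below nominal speed (< NOMINAL_GHZ * slow_ratio).
    Both thresholds are illustrative assumptions, not vendor values.
    """
    limit_ghz = NOMINAL_GHZ * slow_ratio
    return [(t, f) for t, f in samples if t >= hot_c and f < limit_ghz]


# Made-up readings: (temperature in degrees C, frequency in GHz).
readings = [(62.0, 2.3), (81.0, 0.8), (79.0, 2.1), (85.0, 0.8)]

for temp_c, freq_ghz in throttled_samples(readings):
    print(f"throttling suspected: {temp_c} C at {freq_ghz} GHz")
```

Requiring both conditions at once matters: a hot CPU at full speed is just busy, and a slow, cool CPU is just idle.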
So, enough is enough! My performance-minded engineering side woke up and came up with a hypothesis, a rather natural one: what if my thermal paste was failing? I knew my system should originally have used the fancy “Thermal Grizzly” thermal paste. A good thought, and after researching it, actually a real thing: this paste does degrade and dry out over time, exactly matching my experience.
- So I equipped myself with an alternative long-lasting, slightly less conductive thermal paste
- I opened my laptop
- I removed loads of stuff from my fans, which by itself was enough to make it perform poorly
- I was ready to face the ultimate evil, the dried-out thermal paste, when, looking at the heat sink, I found it was dry, completely dry and clean. Never touched by paste. The CPU, the GPU and the heat sink were all clean as a whistle.
Oh, how many emotions I felt for my beloved Dell! What the heck! Of course the poor computer would suffer, and the gradual degradation was just the fans getting clogged with detritus and stuff.
In the end, with a little paste, the computer now stays at 60-70 degrees C even when I stress it with online CPU tests (https://cpux.net/cpu-benchmark-online). Next steps: turn turbo boost, virtualization and the external GPU back on, and reduce the fan speed so I can concentrate while using the box.
Learnings:
- When you see something odd, try to get data and correlate it with something clear
- The boiling frog can happen to everybody, even to those who know the boiling frog metaphor. As people, we rely too much on heuristics, and we always have to doubt
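To make the first learning concrete, here is a minimal sketch of how a slow trend could be caught with data instead of gut feeling: log the idle temperature from time to time and fit a straight line through it. Everything in it is hypothetical (the function names, the made-up log and the 1 degree per month threshold); it only illustrates the idea, using a plain least-squares slope.

```python
from statistics import mean


def slope_per_day(samples):
    """Least-squares slope of temperature (degrees C) against time (days).

    `samples` is a list of (day, temperature_c) pairs.
    """
    days = [d for d, _ in samples]
    temps = [t for _, t in samples]
    d_mean, t_mean = mean(days), mean(temps)
    numerator = sum((d - d_mean) * (t - t_mean) for d, t in samples)
    denominator = sum((d - d_mean) ** 2 for d in days)
    if denominator == 0:
        raise ValueError("need samples from at least two different days")
    return numerator / denominator


def is_creeping(samples, max_drift_per_month=1.0):
    """True when the fitted drift exceeds the (assumed) monthly threshold."""
    return slope_per_day(samples) * 30 > max_drift_per_month


# Made-up idle temperature log: (day, degrees C).
warming_log = [(0, 50.0), (30, 52.0), (60, 54.0), (90, 56.0)]
stable_log = [(0, 50.0), (30, 50.2), (60, 49.9), (90, 50.1)]

print(is_creeping(warming_log))
print(is_creeping(stable_log))
```

No single step from 50 to 52 degrees looks alarming, which is exactly the boiling frog problem; the fitted slope makes the drift visible.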
Ending notes: I asked my old friend ChatGPT to write the same story.
Prompt:
write an essay about a software engineer that feels like a boiling frog when he fails to recognize his continuous degrading machine performance, first turning on turbo bust, then removing the external graphics , then increase the cooler speed. instead realize that the compute cooling was faulty
Response:
As a software engineer, I often found myself working long hours in front of my computer, pouring over lines of code and debugging complex problems. But as time went on, I began to notice a disturbing trend: my machine’s performance was slowly degrading.
At first, I tried to fix the problem by turning on turbo boost and increasing the speed of my cooler. But these measures only provided temporary relief and the performance of my machine continued to decline.
As a seasoned engineer, I knew that a machine’s performance could be affected by a variety of factors, including its hardware, operating system, and even the ambient temperature of the room. But despite my best efforts, I couldn’t figure out what was causing my machine’s performance to deteriorate.
It wasn’t until I removed my external graphics card that I finally realized the problem: my machine’s cooling system was faulty. The heat generated by my processor was not being dissipated properly, causing it to overheat and resulting in a decline in performance.
I immediately ordered a new cooling system and installed it, and to my relief, my machine’s performance returned to normal. But the experience left me feeling like a boiling frog - I had failed to recognize the problem until it was almost too late.
As a software engineer, it’s important to pay attention to the performance of our machines and to not overlook the potential causes of degradation. In my case, it was a faulty cooling system, but in others, it could be a hardware malfunction or even a problem with the operating system.
Regardless of the cause, it’s essential that we stay vigilant and take action to address any issues that arise, before they have a chance to impact our work and productivity. In the end, taking the time to properly maintain and optimize our machines is worth it, as it allows us to continue to do our best work and stay at the top of our game.
Learnings: Hmmm, my AI chat friend is significantly better with words than me, for sure shorter and sweeter.