More and more, I find that I have to state out loud, and often forcefully, common-sense ideas that 10 years ago I didn’t need to state, because the others in IT already thought that way, or at least knew where I was headed as soon as I started to comment on something really stupid.
For example: if a system is deemed critical, it should be set up as High Availability and have solid Business Recovery processes. Calling a system critical does not make it any less likely to fail. In fact, Murphy’s law would tend to indicate the opposite. There is no combination of hardware and OS that isn’t susceptible to eventual failure. That is why we came up with all of these cool high availability options: redundancy, failover, and so on.
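To put rough numbers on why redundancy matters, here is a minimal sketch in Python. It assumes node failures are independent and that any single surviving node can carry the full load, which real systems only approximate. The function names and the 99% figure are illustrative assumptions, not measurements from any real system.

```python
HOURS_PER_YEAR = 24 * 365  # ignores leap years; fine for a rough estimate


def combined_availability(node_availability: float, nodes: int) -> float:
    """Availability of a group of redundant nodes.

    Assumption: failures are independent and one healthy node is enough,
    so the group is down only when every node is down at the same time.
    """
    if not 0.0 <= node_availability <= 1.0:
        raise ValueError("node_availability must be between 0 and 1")
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    return 1.0 - (1.0 - node_availability) ** nodes


def expected_downtime_hours(availability: float) -> float:
    """Expected hours of downtime per year at the given availability."""
    return (1.0 - availability) * HOURS_PER_YEAR


if __name__ == "__main__":
    # Hypothetical example: each node is up 99% of the time.
    for count in (1, 2, 3):
        availability = combined_availability(0.99, count)
        downtime = expected_downtime_hours(availability)
        print(f"{count} node(s): {availability:.6f} available, "
              f"about {downtime:.2f} hours down per year")
```

Under those assumptions, a single 99% node is down roughly 87.6 hours a year, while two of them together are down less than an hour. The label "critical" changes neither number; only the second node does.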
I don’t know which problem we have. It may be that people are labeling everything as critical to artificially promote urgency in recovery. Or it may be that everything truly is critical, but no one is thinking about failure when costing, solutioning, designing, and engineering, and they just assume it will always work (perhaps because it is important, and important things shouldn’t break). Either way, when the system stops working, there is a lack of process, resources, and recoverability. Someone somewhere takes an “unexpected” shot to the pocketbook, and overall customer service and system availability suffer.