As described in one of my previous posts, being a system administrator in the IT business world is not like supervising an assembly line.
A software development factory or a data center offering IT services is a very complex system, with many unpredictable events to manage and control.
In my personal experience, life in general is quite indifferent to our plans or wishes. Things tend to go their own way, without much respect for our needs. There must be some kind of moral in this story, but I’m too simple a person to tackle such arguments, so I prefer to skip it.
So, when you lead an SA group, you should consider yourself a kind of ship captain. You and your crew are in charge of setting sail from harbor A and reaching the safe harbor B. You aren’t driving a train on tracks, and facing storms and rough seas is part of your daily duty.
A complex system like a data center hides an unpredictable amount of entropy: hardware failures, electrical power fluctuations, air-conditioning outages, software bugs, malware attacks, new software releases, the unexpected behavior of intranet services, a badly configured automated backup, or simply mice eating your network cables (it happened to me, I swear). A good captain knows that he can spread probes over the network, monitor server behavior, insert or replace intranet services carefully, write down checklists (beware of people who don’t write checklists), and prepare plan B (and sometimes plan C). But, very importantly, he also knows that it is impossible to control entropy with rigid, inflexible procedures.
When you start to forget what you learned at college, maybe you will start to understand that you cannot control the chaos, but you can work with it, trying to steer events along the route you are in charge of following.
It means that your blueprint will be really good only if it isn’t carved in stone, and if you have the experience and mental flexibility to face unexpected events with the required discipline and knowledge.
Too many times I have seen people put their trust in new monitoring software or amazing intranet servers. In my opinion, every object (software or hardware) carries its own internal entropy rate EO, and introducing it into your system can be considered an improvement only if the final entropy of the whole system is reduced. Otherwise you are only adding a new chance of failure.
So, trust your experience more and “solutions” less.
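To make the idea a little more concrete, here is a minimal sketch in Python. It is only a toy model built on my own assumptions, not a real measure of entropy: each object is reduced to a hypothetical failure probability, failures are assumed independent, and the system is considered broken as soon as any object fails. The names `system_failure_probability`, `is_improvement` and the `mitigation` factor are invented for this illustration.

```python
from math import prod


def system_failure_probability(failure_probs):
    """Probability that at least one object fails.

    Assumption: failures are independent and any single failure
    counts as a failure of the whole system.
    """
    return 1.0 - prod(1.0 - p for p in failure_probs)


def is_improvement(existing, new_object, mitigation=1.0):
    """Check whether adding a new object lowers the overall failure probability.

    existing   -- hypothetical failure probabilities of the current objects
    new_object -- hypothetical failure probability of the object being added
    mitigation -- assumed factor (0..1) by which the new object scales down
                  the failure probabilities of the existing objects;
                  1.0 means it does not help them at all
    Returns (improved, before, after).
    """
    before = system_failure_probability(existing)
    mitigated = [p * mitigation for p in existing]
    after = system_failure_probability(mitigated + [new_object])
    return after < before, before, after


if __name__ == "__main__":
    current = [0.05, 0.02, 0.01]  # made-up numbers, for illustration only

    # A shiny new tool that does not really reduce existing risks
    print(is_improvement(current, new_object=0.03, mitigation=1.0))

    # The same tool, assuming it halves the existing risks
    print(is_improvement(current, new_object=0.03, mitigation=0.5))
```

With these made-up numbers, the first case makes the whole system more likely to fail, while the second one makes it less likely. The new object pays for itself only when what it removes from the rest of the system is larger than what it brings in.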