“Houston, we have a problem…” This phrase makes me cringe about as much as the split infinitive in “… to boldly go where no man has gone before…” makes grammarians cringe. No, the Apollo astronauts didn’t have a problem – they had an incident.
An incident is a time-sensitive issue that needs to be addressed within a window set by its criticality and priority. A problem is the underlying cause of an issue, and it may or may not be addressed, depending on factors such as how likely it is to cause an issue, the cost to fix it, and its priority. When you are in space and running out of air, you don’t have a problem – you have a critical incident.
How about some examples to make that clearer?
Your computer won’t boot up because of corrupt data. Incident or problem? The answer is – incident. You need to repair the corruption somehow to get your computer working again. The underlying problem, however, may be that your hard drive has physical defects that cause data corruption.
You get a notice that your car has been recalled due to braking issues. Incident or problem? The answer is – problem. There isn’t an immediate issue with the brakes, but it sounds fairly likely that there could be. In this case, it’s a critical problem that should be taken care of right away.
For technology companies, this is also a critical difference because too many build processes to support incidents (known as incident management) and not problems (problem management). There are some big differences, and having good processes in place for both can pay big dividends.
Obviously, incident management is critical because major issues need to be addressed quickly or you can have large, negative business impacts. Even small things – like the CEO’s Blackberry going down – should be handled maturely and efficiently to help the IT shop appear effective and responsive. Having a major system like email out can seriously disrupt an organization.
Problem management can help identify longer-range issues, and fixing those helps avoid the firefighting and negative consequences that come with incident management. Problem management should include two prongs: 1) a follow-on to incident management, where a root cause analysis and a set of recommendations are supplied, and 2) an ongoing effort to review potential risks and known issues encountered during normal operations or reported from outside the company.
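For teams that track this in a ticketing tool, the distinction can be made concrete by keeping incidents and problems as separate record types. The following is only a minimal sketch in Python, not any particular tool's data model; the class names, fields, and the helpers `close_incident` and `log_risk` are hypothetical and exist purely for illustration. It reuses the boot failure and car recall examples from above to show both prongs.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Problem:
    """An underlying cause; it may or may not be scheduled for a fix."""
    summary: str
    source: str  # hypothetical labels: "post-incident review" or "proactive review"
    recommendations: List[str] = field(default_factory=list)


@dataclass
class Incident:
    """A time-sensitive issue that has to be resolved."""
    summary: str
    resolved: bool = False
    root_cause: Optional[Problem] = None


def close_incident(incident: Incident, cause: str,
                   recommendations: List[str]) -> Problem:
    """Prong 1: resolve the incident, then record its root cause as a problem."""
    incident.resolved = True
    problem = Problem(cause, "post-incident review", list(recommendations))
    incident.root_cause = problem
    return problem


def log_risk(summary: str) -> Problem:
    """Prong 2: record a known risk before it has caused any incident."""
    return Problem(summary, "proactive review")


# The boot failure is an incident; the failing drive behind it is the problem.
incident = Incident("Computer won't boot: corrupt data")
disk = close_incident(incident, "Physical defects on hard drive",
                      ["Replace the drive"])

# The recall is a problem with no incident attached (yet).
recall = log_risk("Vehicle recall: braking issue")
```

The point of the separation is that a problem can exist without any incident attached to it, as with the recall, while a closed incident always points back at the problem that caused it.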
If you are in a CIO or senior leadership position, make sure to ask your operations team about incident vs. problem management and whether they handle them differently. And, if they don’t give you a good answer, you should know… “Houston, we have a problem…”