Even companies without a formal incident process will have some way of addressing users’ break/fix issues. The process may be ad hoc (Maturity Level 1) or something that is repeatable but not well documented or able to be reported against (Maturity Level 2). Regardless, most organizations don’t throw away computers when they stop working.
First, I would interview the principals involved in the existing incident processes, including business leaders of the end user community, the team that takes initial requests from users (Service Desk, desktop team, etc.), the desktop support team and their management, the infrastructure/network teams, the leadership of the app dev teams, and other IT vertical teams (DBAs, Security, Acquisition/Asset, etc.). This interview process may need to be repeated for each of the company’s divisions.
From these interviews, I would then:
- Determine the pleasure/pain points of the processes
- Document the processes (process flows and RACI matrices)
- Determine the current maturity level
- Create a table showing the deltas of the different incident processes across different teams/divisions (if deltas exist)
- Determine which teams have opted out of the current processes
- Identify any integration points that already exist (change/release, problem, asset, etc.)
- Determine where integration points need to be developed
- Determine whether there are any significant single points of failure (both technological and personnel)
- Determine what monitoring is currently implemented
- Review old incident records where monitoring should have caught an issue but didn’t
- Determine the accuracy of existing incident data
- Establish baseline reporting against the incident data (length to resolve by priority/team, volume by priority/team, volume by affected service/technology, length to resolve by service/technology, etc.)
- Review the existing incident communications
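The baseline reporting item in the list above can be illustrated with a small script. This is a minimal sketch, assuming the incident data can be exported as records with a priority, an owning team, and open/resolve timestamps; the field names and the sample records are hypothetical and do not come from any particular ticketing tool.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

# Hypothetical incident records; the field names are assumptions,
# not the export schema of any real ticketing tool.
INCIDENTS = [
    {"priority": "P1", "team": "Network", "opened": "2024-01-01T08:00", "resolved": "2024-01-01T10:00"},
    {"priority": "P1", "team": "Network", "opened": "2024-01-02T09:00", "resolved": "2024-01-02T13:00"},
    {"priority": "P3", "team": "Desktop", "opened": "2024-01-03T09:00", "resolved": "2024-01-04T09:00"},
]


def baseline(incidents, keys=("priority", "team")):
    """Return {group: (volume, mean hours to resolve)} for the given grouping keys."""
    hours = defaultdict(list)
    for rec in incidents:
        opened = datetime.fromisoformat(rec["opened"])
        resolved = datetime.fromisoformat(rec["resolved"])
        group = tuple(rec[k] for k in keys)
        hours[group].append((resolved - opened).total_seconds() / 3600)
    return {group: (len(h), mean(h)) for group, h in hours.items()}


report = baseline(INCIDENTS)
for group, (volume, avg_hours) in sorted(report.items()):
    print(group, volume, round(avg_hours, 1))
```

Passing a different `keys` tuple (for example, a hypothetical `("service",)` field) would give the same volume and time-to-resolve figures by affected service or technology. Records with missing or inconsistent timestamps would need to be handled first, which is one reason the accuracy of the existing data is checked before any baseline is trusted.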
Also, for incident management, there are three very different processes that need to be assessed:
- Normal break/fix
- Major incident response
- Security incident response
Often, a fourth process (the service request process) is closely associated with the incident process, and they commonly share the same forms and engagement models. In reviewing the three incident processes, I might also be reviewing the service request process.
After performing an assessment of the different processes, I would then create a roadmap sorted by where the largest gains can be made with the least amount of effort. I would also highlight where large amounts of effort might be required to achieve management goals.
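One way to picture the roadmap ordering is a simple gain-per-effort ranking. The sketch below is illustrative only: the 1 to 5 scoring scale, the effort threshold, and the initiative names are assumptions made for the example, and in practice the scores would come out of the assessment and stakeholder discussion.

```python
from dataclasses import dataclass


@dataclass
class Initiative:
    name: str
    gain: int    # estimated benefit, 1 (low) to 5 (high); the scale is an assumption
    effort: int  # estimated effort, 1 (low) to 5 (high); the scale is an assumption


def prioritize(items):
    """Order by gain per unit of effort, highest first; ties go to the lower effort."""
    return sorted(items, key=lambda i: (-i.gain / i.effort, i.effort))


def heavy_lifts(items, threshold=4):
    """Items whose effort meets the (assumed) threshold, to be highlighted separately."""
    return [i for i in items if i.effort >= threshold]


# Hypothetical initiatives used only to demonstrate the ordering.
candidates = [
    Initiative("Replace ticketing tool", gain=4, effort=5),
    Initiative("Standardize priority definitions", gain=4, effort=1),
    Initiative("Integrate incident and change records", gain=5, effort=4),
    Initiative("Publish major incident communication template", gain=3, effort=1),
]

roadmap = prioritize(candidates)
flagged = heavy_lifts(candidates)
for item in roadmap:
    print(item.name, item.gain / item.effort)
```

A single ratio is a deliberately crude measure. It is enough to surface quick wins, while the separate `heavy_lifts` list keeps high-effort items that management goals may still require from disappearing at the bottom of the ranking.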
After obtaining buy-in from the associated stakeholders for the roadmap, I would develop an implementation plan that includes process, technology, and cultural change.
Making changes to a process is very disruptive to an organization, and making changes to the incident process is especially disruptive. This process affects not only every IT support team, but potentially the end users as well. Care must be taken at the outset to ensure that there are huge improvements for both IT teams and end users in order for the change to be accepted. Once some big wins have been achieved, attention can turn to ensuring the process is effective, efficient, and meets management requirements.