Friday, July 08, 2011

SOP Friday: Troubleshooting and Repair Logs

In the last SOP Friday post - New PC Checklists - I mentioned the TSR Log or Troubleshooting and Repair log.

The TSR Log is an extremely valuable tool for tracking issues, working with tech support from vendors, and documenting your work. We use a TSR Log whenever we build a server, when we call any vendor, and when a tech has worked on any issue for more than 30 minutes without making progress.

For newer technicians, we might require a TSR Log for any issue that takes more than 15 minutes of work without progress.

In addition to being a GREAT documentation tool, the TSR Log is a great way to learn troubleshooting. It forces you to think rigorously and to document your work in such a way that you can effectively seek assistance from your co-workers or "tech support" on the other end of the phone.

- Overview -

A few years ago, I posted a commentary about one of our key philosophies for success: Know What You Know. One of the important tools you have to help in this endeavor is the TSR Log.

With a TSR Log, you can state very clearly what you've tried and what the results were. You can make a change and then undo it with confidence because you have a map of where you've been. This is perfect for working with a manager, another technician, or a vendor.

If you own the Network Migration Workbook, you'll find a sample TSR Log in each of the checklists. We use a TSR Log every time we build a new server. It's great documentation . . . and more.

If anything goes wrong, you’ll be able to document exactly what happened and where it happened in the process. This is very handy if you find yourself rebuilding that server from scratch some day. You’re going to hit the same snag, and you'll want quick access to the solution.

A TSR Log helps you keep very accurate information about how long it actually takes to build a server. This number will change over time as you gain experience and Microsoft releases updates. But even though this is a bit of a moving target, the more accurate your information, the more profitable you can make your next migration! (This is true because your time estimates will be more accurate.)

From that book, here's part of our description of the TSR Log:
    "First, we need to make sure that we’re not continually performing the same 'fixes' again and again. If you keep track of what you’ve tried, in a systematic manner, you can eliminate causes for whatever problem you’re troubleshooting. Second, when someone comes to help you (a team member, an outside consultant, or a vendor), you can relay exactly what you have and have not tried. Sometimes vendors insist on going over the same ground, but you can stop them from going over the same ground more than once! Excellent records about what you’ve done can also help you get a problem escalated more quickly (sometimes). Third, when you need to go over a problem with a client, you will have excellent records about what you did, what you didn’t do, who was involved, and how long it took. This is all good information."

- Implementation Notes -

Implementation of this SOP is easy to initiate, but it can be difficult to get everyone on the team to go along with it. Over time, you need to support one another by asking "Did you have a TSR Log?" For us, this is important enough to impact quarterly reviews. If the service manager asks to see a TSR Log and there isn't one, that's a potential career-ending incident!

First, you'll need a form (see next section). We post ours in .pdf format on our SharePoint site so technicians can access it easily. We also require technicians to carry one printed out and ready to go at all times. We require them to use a TSR Log whenever they have been “stuck” on a problem for any amount of time.

Second, to use the TSR Log, you simply fill out some key data and then proceed to take notes. There are two "triggers" for taking notes. One is whenever something significant happens: for example, when the server is rebooted, when a change is made, or when an error occurs.

The second reason you enter something in the log is simply when you pass a fifteen-minute mark. Never let more than 15 minutes pass without an entry. It might simply be "Setup continued to unpack files." That way you know you didn't simply forget the log. But, more importantly, it will really help you pinpoint when things "go wrong" during an installation, configuration, troubleshooting, etc.
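If you keep your notes electronically, the fifteen-minute rule is easy to check after the fact. Here is a minimal sketch in Python, assuming each log entry is stored as a (timestamp, note) pair. The `find_gaps` name and the entry layout are made up for illustration; they are not part of our form.

```python
from datetime import datetime, timedelta

# Assumption: each entry is a (timestamp, note) pair. The paper form only
# asks for a time stamp and a line of notes.
MAX_GAP = timedelta(minutes=15)


def find_gaps(entries, max_gap=MAX_GAP):
    """Return (earlier, later) timestamp pairs that are more than max_gap apart."""
    times = sorted(ts for ts, _note in entries)
    return [(a, b) for a, b in zip(times, times[1:]) if b - a > max_gap]


entries = [
    (datetime(2011, 7, 8, 9, 0), "Started server setup."),
    (datetime(2011, 7, 8, 9, 15), "Setup continued to unpack files."),
    (datetime(2011, 7, 8, 9, 50), "Server rebooted."),
]

# Only the 9:15 to 9:50 stretch is flagged; exactly 15 minutes is still OK.
gaps = find_gaps(entries)
```

Any pair that comes back marks a stretch where the log went quiet, which is a good place to ask the technician what was happening.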

Once you have TSR Logs that have actually been used by technicians to solve problems, you'll need to deal with them properly. That means keeping all related notes together with the TSR Log. If you worked with a vendor to solve a problem, request a copy of their notes by email. This is true of Microsoft, Trend, HP, or anyone else you deal with.

Over time you'll see that your notes are MUCH better than theirs! Attach a copy of those notes to this document.

When the issue is resolved, three-hole punch this document and place it in the Tech Notes section of the Network Documentation Binder (see The Network Documentation Workbook for a description of the Network Documentation Binder).

No, having a PSA system does NOT eliminate the need for a Network Documentation Binder (NDB). In the PSA, annotate any related Service Tickets with a brief description of the problem and final resolution. Then simply refer to this TSR Log by log number for full details on the issue.

For migration projects and server builds, you should probably keep a photocopy of the TSR Log in a file cabinet at your office. You can file by client/date, or simply keep all TSR logs together in one file drawer. Just make sure you can find it if you need it later.

- Form -

The TSR Log has three sections. At the top are sections for the client and the vendor (if relevant). After that, you simply need a series of lines, each with a place for a time stamp and a line for notes.

Section One: Client
- Client
- Date
- Contact
- Technician
- Phone
- Log #
(The Log Number should be created as follows: # e.g. 2011.07.08.01)
- Description of Issue
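
If you generate log numbers in a script or spreadsheet export, something like the following would do it. This is a sketch that assumes the number is the date followed by a two-digit counter for that day; that reading is inferred from the example 2011.07.08.01, and `make_log_number` is a hypothetical helper name.

```python
from datetime import date


def make_log_number(day, sequence):
    """Build a log number such as 2011.07.08.01 from a date and a daily counter.

    Assumption: the last two digits count the logs opened on that date.
    """
    if not 1 <= sequence <= 99:
        raise ValueError("sequence must be between 1 and 99")
    return f"{day:%Y.%m.%d}.{sequence:02d}"


log_number = make_log_number(date(2011, 7, 8), 1)
```

Because the year, month, and day are zero-padded and come first, log numbers built this way sort in date order, which helps when you file them.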

Section Two: Vendor
- Support Service
- Required Numbers or Codes
- SR(X)
- Phone Number
- Service Contract
- Date and Time Initiated

Section Three: Notes

____:____ ________________________________________
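
For anyone who wants an electronic version of the form, the three sections map naturally onto a small data structure. The sketch below is one possible layout in Python, not the form we actually use. The class and field names are invented for illustration, and the `sr` field simply stands in for the "SR(X)" line on the form.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional, Tuple


@dataclass
class ClientSection:
    client: str
    date: str
    contact: str
    technician: str
    phone: str
    log_number: str
    description: str


@dataclass
class VendorSection:
    support_service: str
    required_codes: str
    sr: str  # the "SR(X)" line on the form
    phone_number: str
    service_contract: str
    initiated: str


@dataclass
class TSRLog:
    client: ClientSection
    vendor: Optional[VendorSection] = None  # only filled in if a vendor is involved
    notes: List[Tuple[datetime, str]] = field(default_factory=list)

    def add_note(self, when, text):
        """Record one time-stamped note."""
        self.notes.append((when, text))

    def note_lines(self):
        """Render notes as 'HH:MM text' lines, like the paper form."""
        return [f"{when:%H:%M} {text}" for when, text in self.notes]


log = TSRLog(
    client=ClientSection(
        client="Example Client", date="2011-07-08", contact="Office Manager",
        technician="Tech One", phone="555-0100", log_number="2011.07.08.01",
        description="Server build",
    )
)
log.add_note(datetime(2011, 7, 8, 9, 0), "Server rebooted.")
lines = log.note_lines()
```

All of the client details above are placeholder values. The vendor section is optional here because the form only needs it when a vendor is on the call.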

- Final Notes -

If you're not used to TSR Logs, or rigorous note-taking, this one might be difficult to execute. But stick with it and everyone on the team will get better at some of the most important things you do.

Remember: Most of your LOST labor comes from re-work and disorganized troubleshooting. TSR Logs can help you address both of those issues.

We all know that computers don't act randomly. They can't. So when someone says that errors occur "randomly," they can't be correct. There's a pattern or a cause. We just can't see it.

With TSR Logs, we have a good chance of finding the pattern - and solving the problem - a lot faster!

Your Comments Welcome.

