Incremental Migration: First Things First

Is a total rewrite always a bad idea? I don’t think so. I often build a pilot system in new projects, and that pilot is designed to be thrown away. What I get out of it is a proof of concept and knowledge of how the interlocking pieces of the system work. I plan from the start to rewrite the pilot.

Joel Spolsky has this to say about rewriting code:

It’s important to remember that when you start from scratch there is absolutely no reason to believe that you are going to do a better job than you did the first time. First of all, you probably don’t even have the same programming team that worked on version one, so you don’t actually have “more experience”. You’re just going to make most of the old mistakes again, and introduce some new problems that weren’t in the original version.

In his example he refers to Microsoft rewriting Word from scratch. If they had been working on a Word pilot, rewriting it would have been expected, but they were so far into the development lifecycle that it didn’t make sense. The rewrite traded issues they knew for issues they didn’t. When Saint Peter’s calling I prefer to tackle the devil I know.

Back in 2004 I worked for a radiology clinic where patient records went missing, insurance wasn’t billed, and diagnostic mixups were common. My team was tasked with designing software to improve the workflow in this office. I sat with people at the front desk doing order entry, with people in the back office doing coding and billing, with technicians operating rad machines, and with the office manager (doctors are always busy). I learned their jobs by watching them, and I identified a few problems:

  1. Scheduled patients not visible to anyone but front-desk (lack of visibility)
  2. Manual data entry into billing system (typos!)
  3. After patient checks in, printed notes are carried to technician’s desk (docs get misplaced)
  4. Technician manually enters patient name into modality (moar typos!)

The result of these systemic problems was that the office manager had to manually compare the schedule to billing to the technician’s logs to the doctor’s dictation notes. At the time this office saw around 80 patients per week and averaged 4 to 6 mistakes in that same week. The office manager fixed each mistake by using 3 admin tools to go in and update 3 databases. Then they had to check whether the patient was scheduled again and notify the front desk and the technician that the person they were about to see had records that had recently changed.

What we came up with was an incremental improvement plan. Here are some of the new features:

  1. New browser-based scheduling UI works on every computer in the office instead of just front-desk
  2. Changes to schedule sync with billing system automatically
  3. Check-in app prints barcode to identify patient documents and updates worklist database
  4. Technician hits “refresh” on worklist UI in modality to pull patient name in
  5. Every document gets a copy of barcode which can be scanned into modality or any web-based work station to pull up patient record with no typing
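The check-in and barcode flow in items 3 to 5 can be sketched roughly as follows. This is a minimal illustration, not the code we shipped: the `Worklist` class, the `PT000001` barcode format, and the in-memory dictionary standing in for the worklist database are all assumptions made for the example.

```python
import itertools
from dataclasses import dataclass


@dataclass
class WorklistEntry:
    barcode: str
    patient_name: str
    procedure: str


class Worklist:
    """In-memory stand-in for the worklist database (hypothetical)."""

    def __init__(self):
        self._entries = {}
        self._counter = itertools.count(1)

    def check_in(self, patient_name, procedure):
        # Assign a barcode at check-in; the same value goes on every document.
        barcode = f"PT{next(self._counter):06d}"
        entry = WorklistEntry(barcode, patient_name, procedure)
        self._entries[barcode] = entry
        return entry

    def lookup(self, scanned):
        # A scan replaces typing: the name comes from the record, not the keyboard.
        return self._entries.get(scanned.strip())

    def refresh(self):
        # What the technician's "refresh" would pull: every checked-in patient.
        return list(self._entries.values())
```

The point of the design is that the patient name is typed once, at scheduling, and every later step reads it back by barcode instead of re-entering it.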

The result of our implementation was that typos became a thing of the past. For the corrections that still needed to be made, we built an admin tool that seamlessly combined the patient records from every database in the office regardless of format (MS SQL, MySQL, DICOM PACS). Because we built each component as its own pilot, we were able to test and rewrite as needed while working with the real-time needs of clinic staff. Our schedule of releasing new code once a week meant they were testing while we were fixing bugs and writing new features. We could deploy patches in the afternoon and run through a smoke test before the schedule started again at 6am.
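To give a feel for what “combining patient records from every database” means, here is a small sketch under stated assumptions. The function name `combine_records`, the field names, and the idea of wrapping each database behind a simple fetch function are hypothetical; the real tool talked to MS SQL, MySQL, and a PACS, which plain dictionaries stand in for here.

```python
def combine_records(patient_id, sources):
    """Merge one patient's fields from several sources and flag disagreements.

    `sources` maps a source name to a function that takes a patient ID and
    returns a dict of fields, or None if that source has no record.
    """
    combined = {}  # field -> {source name: value}
    for name, fetch in sources.items():
        record = fetch(patient_id)
        if record is None:
            continue
        for field, value in record.items():
            combined.setdefault(field, {})[name] = value

    # A field conflicts when the sources report more than one distinct value.
    conflicts = {
        field: values
        for field, values in combined.items()
        if len(set(values.values())) > 1
    }
    return combined, conflicts


# Hypothetical stand-ins for the scheduling and billing databases.
scheduling = {"p1": {"name": "Jane Doe", "dob": "1970-01-01"}}
billing = {"p1": {"name": "Jane Deo", "dob": "1970-01-01"}}

combined, conflicts = combine_records(
    "p1", {"scheduling": scheduling.get, "billing": billing.get}
)
```

In this example `conflicts` holds only the `name` field, with the value each source reported, so one screen shows a mismatch that used to require opening three admin tools.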

Our process was born out of necessity: neither I nor any of my teammates wanted to be on call at 6am. We had to be sure our work was stable even during development.

When we needed to rewrite something, the rewrite had to be ready to go by the end of the week. That’s how we determined whether a rewrite was safe. If it couldn’t be done in a week, the work had to be broken into chunks that the staff could test during the following week.