[Code Review] 1. The Case for Code Review

Just wanted to pause and write about that super-exciting topic: Code reviews.

Let's begin by defining what code review is and establishing the case for making it part of your development process. We'll also take a trip down memory lane and acknowledge that doing code reviews was really hard for years, and that now it's laughably easy.

What is Code Review?

Simply, code review is when someone else looks at your code to critique it for defects.

Metaphorically, it's like when you hand your rough draft of a piece of writing to someone else and ask: "Does this make sense to you?" The writing process and programming process are close kin, but that's another post unto itself.

So what is the reviewer looking for?

1) Does the (changed) code do what you set out to do?
2) Have you considered various types of good and bad input?
3) Do you handle errors appropriately?
4) Does the work communicate intent clearly and concisely to the reader?

There are more, but #4 interests me, because it's the same thing you'd say about a piece of prose. Again, a topic for another day, but it's quite interesting that it's relatively easy to write code that compiles reliably and accomplishes a task. It's harder to write code a human can understand days, weeks, even years later. See Software Craftsmanship.

Why Review Code

Code review promises many things:

Fewer bugs in your production code.
Cross-training people, lessening the business risk of turnover.
Better "team" ownership of a codebase.

Let's unpack that a bit.

Bugs happen. Thankfully our tools handle things like syntax errors, module integration issues, and style-guide problems automatically, most of the time. However, logic and design problems still occur, and other sets of eyes really help find them before your customers do.

Cross-training and collective ownership don't happen organically. People--especially software folks--like to specialize and master things. They like doing what's comfortable and easy. In the absence of other energy or intent, you'll quickly find that a codebase turns into enclaves like "Carl's Code" or "Larry's Code" with everything but a

/* STAY OUT UNLESS YOU TALK TO ME (NATHAN) */

comment on it. Actually, I've seen those comments. They're funny!

In any case, a session where other folks have to review code forces them to read and understand it. At minimum, they gain some awareness of how another piece of code works, and the session de-personalizes the code and makes it more the property of the entire team.

That being said....

Why Doesn't it Happen?

I submit: The value of required code review isn't self-evident. It takes time. It can be frustrating, both for the reviewer and the code author(s).

I've been programming professionally for 17 years, but during that time I've only been in groups with effective, systemic code review for ~5 years, and even that off-and-on. Why is that?

It's useful to look at the reasons people give for not doing it:

ROI: It takes too long / We don't have the time.
Operational feasibility: It's too hard.
ROI: I have no way of verifying the code under review actually got fixed.
ROI: We have tools that enforce most of this stuff anyway.

Thus, the resistance factors are: (1) Is this worth my time? (2) Is it necessary? (3) Is it practical, day-to-day? To answer, we need to contrast programming before ~2009 and after.

The Before (Tools) Time

Scenario

You've done 3 months' worth of work on a new feature for your software, largely by yourself. You committed the code 2 weeks ago and people have changed it since. There's a ticky mark on your Project Manager's spreadsheet that she copy-pastes under everyone's quarterly features: Code Review Complete. So, you call a meeting of the whole team and email them links to your code in CVS.

If you're lucky, everyone brings their 7lb+ laptops with PCMCIA wireless cards to the meeting. If you're not, people bring page after page of print-outs of the .h/.cpp files with scribbles. The pedant(s) will bring a set of style sins you've committed and hand you those on a sheet (with a superior smile).

Then the meeting begins. If it's a formal review, the author (you) says nothing. This is formal. This is by the book, dammit. The facilitator reads your code, line-by-line, and people raise questions and defects. Another participant--the scribe--writes down those defects along with line numbers.

Lord willing, you make it to the end of your changeset by the end of the hour(s). If it's anything non-trivial, you probably won't. You're handed/emailed the defects, and you're expected to fix them and call another verification meeting where we'll do this all again.

Yes, that's what code review looked like in 2002 or so.

Problems

Let's acknowledge that the above can work. It's a very formal process, developed and codified. I've done it, probably 3 times in 17 years. One of those 3 times it was effective.

Centrally: Who has time to do that?

The problems I observed with the above:

It's a "super meeting": Everyone has to synchronize the prep-time, the meeting, and the follow-up. A meeting like that is man-weeks of impact to your schedule.
No one does the prep work. Well, the pedant(s) on your team will. They'll love eschewing real work to critique others (yet another topic for another day). In reality, of 8 team members who walk in that door as reviewers, maybe 2 will have read any of your code before that day.
It's painful to collate, parse, and take action on the review comments. They're out of context, and use absolute references to code:

Doofus.cpp, line 1337: Consider 'switch' instead of repeated 'if/else' blocks.

Thus, you have to take action on the review comments from bottom to top, or every fix you make shifts the line numbers that the remaining comments refer to. Tedious, at best.

No one wants to attend the follow-up meeting to verify your changes were correct. People will acquire strange illnesses and schedule vacations to garden spots like Detroit to miss that meeting, believe me.
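
As an aside, the bottom-to-top rule from the list above is easy to demonstrate. Here's a minimal sketch in Python (my choice for illustration; it doesn't come from any real review tool). The function names `apply_fixes` and `apply_fixes_naively`, and the sample `fixes`, are all hypothetical:

```python
# Hypothetical review fixes keyed by absolute (1-indexed) line numbers in the
# ORIGINAL file: (first_line, last_line, replacement_lines).

def apply_fixes(lines, fixes):
    """Apply each fix against original line numbers, working bottom to top."""
    result = list(lines)
    # Highest line number first, so an edit never shifts the lines above it
    # that still have to be edited.
    for first, last, replacement in sorted(fixes, key=lambda f: f[0], reverse=True):
        result[first - 1:last] = replacement
    return result


def apply_fixes_naively(lines, fixes):
    """Apply fixes top to bottom WITHOUT adjusting line numbers (the bug)."""
    result = list(lines)
    for first, last, replacement in sorted(fixes, key=lambda f: f[0]):
        result[first - 1:last] = replacement
    return result


original = ["a", "b", "c", "d", "e"]
fixes = [
    (2, 2, ["b1", "b2", "b3"]),  # comment on line 2: expand into three lines
    (4, 4, ["D"]),               # comment on line 4: replace "d"
]

print(apply_fixes(original, fixes))
# ['a', 'b1', 'b2', 'b3', 'c', 'D', 'e']
print(apply_fixes_naively(original, fixes))
# ['a', 'b1', 'b2', 'D', 'c', 'd', 'e']  <- clobbered 'b3', never touched 'd'
```

In the naive version, the first fix adds two lines, so "line 4" no longer means what the reviewer meant. Now imagine doing that by hand, from a printout.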

Like most business problems, the above have very little to do with technical issues. Mostly it's people. If it's too hard, people find ways to not do it. Also, code reviews were the second thing that got cut to meet schedule, right after that nebulous 'Unit Tests' line item on the Gantt chart.

Code Review in the GitHub Era (post 2009)

Scenario

You have a user story you took off the Kanban board this morning. You code up a unit test and write the code for it before lunch. You check in your changes to a git branch and push that branch to 'origin'. Once the ad-hoc build for your branch completes (automatically), you open a pull request for your team.

Wherever they might be over the next 2 hours, your team members review your changes--it's only ~40-50 lines of Ruby/Groovy, anyway. Their screen shows them the current code and your suggested changes in a two-pane view. When they have comments, they click on the line number on the screen and type in their comments.

At around 3pm, you see Sue has given some feedback you need to integrate; she found a critical corner case you didn't account for. You update your unit test and your code, then push again. The code review updates automatically with the change.

You receive 3 '+1' comments from your teammates and you merge the changes by 4pm.

How is this better?

It's asynchronous. Your teammates may review your code any time the review is open.
It's location-independent. Your reviewers could be anywhere in the world.
Zero prep work is required. Given reasonable familiarity with the codebase, the 2-pane diff view shows you the changes in context.
The system handles the comments, rework, and verification for you. You have unit tests, integration tests, an automated build, and a code review system. There's no way to lose the context of the review.
Verifying your rework is usually trivial for the reviewers.

Summary

Going back to our original resistance factors:

1) Is it worth my time? Unless you're doing 50 pull requests per day, yes. Reviews can be less than 10% of your time every day.

2) Is it necessary? One can produce working code without code review. But if one wants a codebase that can sustain 100% developer turnover in 3 years, then I'd consider it essential.

3) Is it practical? Formal reviews were never practical, unless you worked at NASA or on flight control software. The ROI wasn't there. A more GitHub-style flow is entirely practical.

* *

I'd argue Code Review was an ad-hoc, spotty practice before the tooling improved. Like diet and exercise, it was "What we should do," but precious few of us were diligent about it. The ROI was barely there before tooling caught up to how people work in the Git/GitHub era.

Now, given excellent tools like GitHub and ReviewBoard, an engineering group can integrate code review into their workflow easily. Again like diet and exercise, if one is diligent every day, it's easy to make it a lifestyle.

Harry's Ruminations