Monday, August 30, 2010

Exceptions as part of "regular" control flow

I've heard this rule a lot: "Never use exceptions for control flow." It's an interesting statement to parse. We know what an exception is, but the definition of control flow is a little fuzzy, so let's clarify things a bit.

In computer science, control flow (or alternatively, flow of control) refers to the order in which the individual statements, instructions, or function calls of an imperative or a declarative program are executed or evaluated.

http://en.wikipedia.org/wiki/Control_flow

Immediately I'm confused. Is it possible to use an exception and not affect control flow? Nope. Exceptions are a means of controlling flow. Sometimes people making this point specify that they mean "normal" or "regular" control flow. "Normal" and "regular" are delightfully relative and unhelpful. Normal? Compared to what? What they're trying to get at is that they don't believe exceptions should be used unless there is an application error.

This seems odd considering that business layer classes aren't necessarily made for only one code path or even one application and therefore lack context to determine the exceptionality of the condition. I touch on this a bunch in my Exceptions vs Null article.

There are lots of "crazy" things you can do with exceptions and if you're interested in seeing more check out this article on c2.com. Here I will only be discussing three usages I found particularly interesting.

Exceptions as return values

This example comes from c2.com. I think even people who have never thought deeply about exception usage would never dream up this code. Still, I believe this might be the perfect example of what people are talking about when they say not to use exceptions for control flow.

void search( TreeNode node, Object data ) throws ResultException {
    if (node.data.equals( data ))
        throw new ResultException( node );
    else {
        search( node.leftChild, data );
        search( node.rightChild, data );
    }
}

Get it now? As a starting point I think we can all agree that this is batty. It's a search algorithm that throws an exception upon success. Really? The article correctly points out that this is a violation of the Principle of Least Astonishment. I know this code makes me feel violated.
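For contrast, here is a minimal sketch of the same search handing back its result as a return value and saving the exception for the case where nothing matches. It's Python rather than Java, and TreeNode, NodeNotFound, and the field names are all made up for illustration; nothing here comes from c2.com.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TreeNode:
    # Hypothetical node type; the field names are assumptions.
    data: object
    left: Optional["TreeNode"] = None
    right: Optional["TreeNode"] = None


class NodeNotFound(Exception):
    """Raised when no node in the tree holds the requested data."""


def search(root, data):
    """Return the first node holding `data`; raise NodeNotFound otherwise."""
    stack = [root]
    while stack:
        node = stack.pop()
        if node is None:
            continue  # an empty child, nothing to inspect
        if node.data == data:
            return node  # success is a plain return value
        # Push right first so the left subtree is visited first.
        stack.append(node.right)
        stack.append(node.left)
    raise NodeNotFound(data)
```

Success comes back the boring way and only the "couldn't do it" case throws, which is about as unastonishing as a search can get.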

Exceptions as commands to the caller

} catch(MyService_Exception_CouldNotBeReached $e) {
    throw new MyOtherService_Exception_Retry("Couldn't reach my service, retry!");
}

This exception seems to be commanding the caller to retry something. Like the previous example, this also breaks the Principle of Least Astonishment. In order to benefit from the full functionality of the method you need to be set up to catch the exception commands that it throws and carry out some other action to continue. This is an application-agnostic service making application-specific decisions (whether or not to retry). No thank you.
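A rough sketch of the alternative, in Python rather than PHP: the service only reports what went wrong, and the retry policy lives with the caller. CouldNotBeReached, call_with_retries, and the flaky stub are hypothetical names I'm using for illustration.

```python
class CouldNotBeReached(Exception):
    """The service says what happened, not what the caller should do next."""


def call_with_retries(operation, attempts=3):
    """Application-level policy: try `operation` up to `attempts` times."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for _ in range(attempts):
        try:
            return operation()
        except CouldNotBeReached as error:
            last_error = error  # remember the failure and try again
    raise last_error


def make_flaky_service(failures):
    """Hypothetical stub that fails `failures` times before succeeding."""
    remaining = [failures]

    def fetch():
        if remaining[0] > 0:
            remaining[0] -= 1
            raise CouldNotBeReached("my service is unreachable")
        return "payload"

    return fetch
```

An application that wants retries wraps the call in call_with_retries; one that doesn't just lets CouldNotBeReached propagate. Either way the service never had to guess.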

Exceptions as loop termination conditions

c2.com offers us another gem, and I don't mean that negatively.

try {
    for (int i = 0; /*wot no test?*/ ; i++)
        array[i]++;
} catch (ArrayIndexOutOfBoundsException e) {}

The first thing I thought when I saw this was "What about things like Python's StopIteration?" One line later it's mentioned. Yay! This article may be reading my mind. I do wonder if the fact that StopIteration now exists in Python, Ruby, and JavaScript might start to carve away at the exception/null failure/absence debate. I'd like to know more about the decision making that went into the design of this feature but so far have failed to find any discussions on the matter.
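For anyone who hasn't looked under the hood, this is roughly what a Python for loop does with StopIteration, written out by hand. The increment_all function is just my own illustration, echoing the array example above.

```python
def increment_all(values):
    """Increment every item, ending the loop the way a for loop does internally."""
    result = []
    iterator = iter(values)
    while True:
        try:
            item = next(iterator)
        except StopIteration:
            break  # the iterator signals exhaustion by raising
        result.append(item + 1)
    return result
```

The difference from the Java snippet is that the exception is part of the iterator's published contract, so nobody reading the loop is astonished by it.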

All that said, I'm not sure how I feel about the name - "stop" iteration. Sounds like the service commanding its caller. If I were to use an exception-terminated loop I'd prefer to explicitly specify the exception type rather than using a language-generic exception with a name that is a command. Maybe something like this (which I would not be surprised to find already exists somewhere):

until(RecordNotFoundException $e) {
    print $recordSet->getNext()->getLabel() . "\n";
}
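The until construct above is imaginary syntax, but the idea can be approximated in a language that has generators. Here is a minimal Python sketch, assuming a hypothetical RecordSet whose get_next() raises RecordNotFound when it runs out; none of these names come from a real library.

```python
class RecordNotFound(Exception):
    """Hypothetical exception raised when the record set is exhausted."""


class RecordSet:
    """Hypothetical record set whose get_next() raises instead of returning None."""

    def __init__(self, labels):
        self._labels = list(labels)
        self._position = 0

    def get_next(self):
        if self._position >= len(self._labels):
            raise RecordNotFound("no more records")
        label = self._labels[self._position]
        self._position += 1
        return label


def until(exception_type, producer):
    """Yield producer() results until it raises the exception type the caller named."""
    while True:
        try:
            value = producer()
        except exception_type:
            return  # only the named exception ends the loop
        yield value
```

Usage would look like `for label in until(RecordNotFound, record_set.get_next): print(label)`. The caller names the terminating exception, and anything else still propagates as a real error.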

Tuesday, August 24, 2010

Categorizing exceptions: Subtypes vs Error codes

Today's problematic pattern: Catching a single, usually per-package, exception and then using an associated error code in a conditional statement to determine what else to do. I've seen several colleagues do it and they never seemed to think about doing it any other way. It's an oddly unconscious decision.

Consider the following code:

try {
    $http->get('/~bob');
    // do something
} catch (Http_Exception $e) {
    if($e->getCode() == 404) {
        // do something else
    }
}

When you use things like well-established HTTP status codes this almost seems reasonable. It gets a little weirder when you're talking about some internal code defined in some package no one knows anything about. Take this bank account withdrawal example:

try {
    $customerBankAccount->withdraw(100);
} catch (CustomerBankAccount_Exception $e) {
    if($e->getCode() == 123) {
        // 123 means they overcharged their account, SHIT!
    }
}

My problem with this is that we're using two tools for a single purpose. What is the purpose of an Exception subtype, Http_Exception or CustomerBankAccount_Exception, in these examples? Exception type categorization. What is the purpose of an exception code in these examples? Exception type categorization. Why are we categorizing the same exception using two different systems? Additionally, why would caller code be left to interpret magic numbers like 404 or some other constant value a programmer in your office came up with?

It can get worse. People create their own custom exception error codes with pre-specified ranges: codes 100 to 200 are set aside for account balance errors and 200 to 300 are for "the bank spent all your money"-related errors. Once again, can't exception subtypes be used for this? The ranges are essentially more API for a developer to remember.

Better example time - GO!

This is more descriptive...

} catch (Http_Exception_NotFound $e) {

Sure, we all know what a 404 is anyway, but what about a 402 or a 203 or a 912 (Glenn Beck Logic Not Found)? If you're writing this code you're gonna need to figure out what you want to catch. The next guy reading through it probably doesn't want to figure it out and he'd appreciate it if he was just looking at some plain English exception class names.

If you want to catch all those 4xx-class exceptions why not...

} catch (Http_Exception_ClientError $e) {

(The official category of 4xx errors is "Client Error")
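Here's a small sketch of what that hierarchy could look like, in Python rather than PHP. The class names are my own stand-ins for Http_Exception, Http_Exception_ClientError, and Http_Exception_NotFound, and describe() exists only to show which catch-block wins.

```python
class HttpError(Exception):
    """Base class for the hypothetical HTTP client's errors."""


class ClientError(HttpError):
    """Any 4xx response."""


class NotFound(ClientError):
    """404."""


class PaymentRequired(ClientError):
    """402."""


class ServerError(HttpError):
    """Any 5xx response."""


def describe(error):
    """Handle an error purely by type; no status code is inspected anywhere."""
    try:
        raise error
    except NotFound:
        return "not found"
    except ClientError:
        return "client error"
    except HttpError:
        return "http error"
```

Each exception class is a couple of lines with no real body, and the caller picks its level of detail just by choosing which class to catch.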

I bet a lot of developers see this kind of problem and say "OH SHIT OH SHIT OH SHIT!!! I need to write a one-line class with no body at all for like 20 exceptions, that's like 20 lines... OH SHIT OH SHIT OH SHIT!!!" I never quite get that. They would rather spend the time writing catch-blocks with if-blocks inside them that each duplicate some evaluation on a magic error code. It won't take very long for that try/catch/if/else boilerplate nonsense to make those 20 lines of exception definition code look mighty appealing.

Look what I found!

During my research for this post I stumbled upon discussions in many programming language communities about this idea that error codes can be used for exception message translations. "That sounds interesting," I thought. I've never considered that use before but as I read through a plan proposed to the Zend Framework I quickly began to dismiss the notion.

Thoroughly enjoy this proposal.

Currently, exceptions can only be handled on a per-class basis, or possibly by string comparison against the message.

Instead, we propose that exception codes be used throughout the framework, using a 4-byte hexadecimal format.

This has at least four obvious benefits.

First, it's now possible to distinguish exceptions based on the type of error instead of only by class. This allows users to handle them intelligently.

Second, codes let us do some interesting things with exception handling. Users would now be able to call a method such as:

if ($e->stringTooShort()) { 
    ... 
} 

thereby making exception special-casing easy.

Third, this gives us the ability to translate error messages using Zend_Translate (loaded only when an exception occurs) and using separate translation files (for example, .mo format). Not all developers speak English; those that do generally prefer to see exception messages in their native language.

http://framework.zend.com/wiki/pages/viewpage.action?pageId=22134

Who cares about point 4, I'm already gagging. Even the language irks me a bit. Just being able to do something doesn't make it "beneficial". A magical genie could come and grant me my one wish: the ability to shit dead baby seagulls. It's certainly something I couldn't do before, but is it beneficial? Maybe. I really hate baby seagulls.

I don't think I have to re-explain my feelings on Point 1 and 2, but what's the deal with lucky number 3? Oh, number 3. If you checked out the article you probably saw that each package in the whole framework has its own error code range preallocated. It's like preemptive namespacing for codes. No, it IS namespacing for codes. Yes, there's already a way to namespace exceptions; the same way we namespace any other class.

If the goal is to be able to translate any exception then we can't really do that with this approach. Other packages using the same error-code-to-string mapping scheme could overlap in code ranges, so we're at least going to have to catch each different base exception type for each package that has its own defined error code ranges. Then we're going to have to somehow map those to different translation files.

I'm not saying it's necessarily easier to just map exception class names to translated messages, but I can't imagine it being harder. This new approach doesn't seem to be solving any problems and it's creating a few new ones.
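To make the class-name alternative concrete, here's a rough Python sketch of a presentation-layer lookup keyed on exception class instead of error code. The exception classes, the TRANSLATIONS table, and translate() are all hypothetical; this is not how Zend_Translate works, just an illustration that no code ranges are needed.

```python
class AccountError(Exception):
    """Hypothetical base exception for an account package."""


class InsufficientFunds(AccountError):
    """Hypothetical subtype for an overdrawn account."""


# Translation table owned by the presentation layer, keyed by exception class.
TRANSLATIONS = {
    "fr": {
        AccountError: "Erreur de compte",
        InsufficientFunds: "Fonds insuffisants",
    },
}


def translate(error, locale, translations=TRANSLATIONS):
    """Walk the exception's class hierarchy until a translated message is found."""
    messages = translations.get(locale, {})
    for cls in type(error).__mro__:
        if cls in messages:
            return messages[cls]
    return str(error)  # fall back to the untranslated message
```

Classes are already namespaced, so two packages can't collide the way two code ranges can, and a subtype without its own entry falls back to its parent's message.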

The proposal has tons of user comments. Some people pushed back. One user asked why PHP's Exception class took an error code in its constructor at all (a very good question). One of the proposal's authors answered:

The error code parameter is intended to be used exactly how we're using it: differentiating exceptions within exception "namespaces" (classes).

Sigh.

Finally, translations are for presentation. Are these presentation layer exceptions? No sir, sadly they are not. We're prepping our business layer exceptions with some data that is meaningless everywhere except the presentation layer, and that's assuming the presentation layer knows how to interpret the codes in the first place.

I don't know how to end this. I'd like to be shown something that turns my conclusions on their head, but from my reading thus far, I'm not seeing it. Feedback welcome.

Exceptions vs Null

Plenty of developers agree that returning mixed-type results is not a good practice. It leads to conditional statements wherever the method returning the result is used. Everybody agrees that's bad, and then Null walks in, and we're not sure anymore.

Null is supposed to be a magical value which can be passed as any type of object and represent "no object". Usually methods return Null when the thing we want to get is absent, and that absence is considered normal for our particular application. Maybe the caller asked for a row that doesn't exist in a table or the next line of a file when you've already reached the end.

If the method you are calling may return Null then you must (almost?) always check for Null. Isn't this why we dislike mixed-type return values? This is only the beginning of the problem with Null.

Personally, I've chosen this rule: "In OOP never return Null and instead always use exceptions".

Absence vs Failure

Those whom I disagree with seem to draw the line at "absence versus failure". They interpret the absence of something as being unexceptional and return Null. The failure to be able to do something within a method call is seen as being exceptional so they throw an exception (or maybe they return some other magic value because they think they're mother-fuckin' David Blaine, and by that I mean talentless and insignificant).

This excitingly-named article - When To Use Exceptions - asks one question that seems rather compelling and butts heads with the "absence versus failure" rule:

"Who exactly are [library programmers] to decide that not finding [something] is a non-exceptional event for my application?"

I like everything about that statement including the touch of anger. Business layer classes, or libraries, shouldn't be making application layer decisions. It's like a business layer class triggering a fatal error rather than throwing an exception. I determine what's a fatal error, not the tools I'm using to build with. I always separate my application layer from my business layer, but until recently never considered determining exceptional conditions as part of that division.

Suppose we had a web app for managing blog entries. The app chooses a search-engine-friendly /english-words-dashed-together/ URL for the blog post based on the blog post title you entered. This gives us two obvious times where we want to look up URLs:

  • We want to use the URL in a new blog post but need to make sure it's not already in use (absence is not exceptional)
  • A blog post URL has been requested and we need to find the blog post and serve it (absence is exceptional)

Both of these scenarios could call upon the same method - $blogPostGateway->getPostByUrl($url). The exceptionality of this scenario depends on the context within the application, yet, as a business class, this gateway knows (or should know) nothing about the application which it lives in.

This does not mean that a method call lacks any context. There are still post-conditions for the method even absent an application context. A method call's post-condition is that it returns a value of a pre-specified type. If an object of the proper type cannot be returned because of absent data then we have an exceptional case. We don't need a new tool, Null, to sit in place of that object because we already have a tool to say "I failed!" - an exception.

Null = repetition

Suppose $blogPostGateway->getPostByUrl($url) does return Null. Now the caller is left to interpret these magic values. Should callers be left to duplicate those conditionals everywhere? That's not good design.

Preferably the "absence scenario" here would be handled on the blog post gateway in a method like $blogPostGateway->blogPostExists($url) which returns a boolean. No Null. No new exceptions necessary. This decision not to use Null also forces the developer to make the right choice and write these extra methods which check for existence.
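Here's a minimal Python sketch of both scenarios running against one gateway. The in-memory BlogPostGateway, BlogPostNotFound, and the "-2" suffix scheme in choose_url are assumptions I'm making for illustration; a real gateway would sit on a database.

```python
class BlogPostNotFound(Exception):
    """Raised when no post is stored under the requested URL."""


class BlogPostGateway:
    """Hypothetical in-memory gateway; a real one would query a database."""

    def __init__(self, posts_by_url):
        self._posts = dict(posts_by_url)

    def blog_post_exists(self, url):
        """Answer the absence question directly with a boolean."""
        return url in self._posts

    def get_post_by_url(self, url):
        """Return the post, or raise because the post-condition cannot be met."""
        try:
            return self._posts[url]
        except KeyError:
            raise BlogPostNotFound(url) from None


def choose_url(gateway, slug):
    """Creating a post: absence is the expected case, so ask instead of catching."""
    url = f"/{slug}/"
    suffix = 2
    while gateway.blog_post_exists(url):
        url = f"/{slug}-{suffix}/"  # assumed naming scheme for duplicates
        suffix += 1
    return url


def serve(gateway, url):
    """Serving a post: the application decides a missing post means a 404."""
    try:
        return 200, gateway.get_post_by_url(url)
    except BlogPostNotFound:
        return 404, "Not found"
```

The gateway never decides whether a missing post matters. choose_url and serve each make that call for themselves, and neither of them ever checks for Null.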

What's the difference, really?

This issue is seen by some as a matter of style. Exceptions with try/catch blocks are seen as being identical to Nulls with if/else blocks. Is there a difference? I think so. The previously mentioned points are huge, but there are functional differences as well.

Exceptions can be handled immediately, eventually, or allowed to go uncaught and fail fast. Null values may not cause a problem until much later in a program's execution. That means cryptic error messages and more tracing to get back to where the Null originated.
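A tiny Python sketch of that difference, using None as the stand-in for Null. USERS, UserNotFound, and the function names are hypothetical.

```python
class UserNotFound(Exception):
    """Raised at the point of lookup, naming the missing user."""


USERS = {"alice": {"email": "alice@example.com"}}


def find_user_or_none(name):
    """Null-returning lookup: a missing user quietly becomes None."""
    return USERS.get(name)


def find_user(name):
    """Exception-raising lookup: a missing user fails right here."""
    try:
        return USERS[name]
    except KeyError:
        raise UserNotFound(name) from None


def email_domain(user):
    """Runs far from the lookup; a None user only blows up at this point."""
    return user["email"].split("@")[1]
```

Pass the result of find_user_or_none("bob") into email_domain and the failure is a TypeError about subscripting None, raised in code that has nothing to do with the lookup. With find_user("bob") the failure is a UserNotFound raised at the lookup itself, with the missing name attached.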

Is there anyone else who knows a lot about Null that doesn't like Null?

It seems the guy who invented it calls it his "Billion Dollar Mistake".

Summary

Don't use Null. Don't. Do not.