IOException.
I'm not even going to debate how checked exceptions are like communism (good idea in theory, totally fails in practice). Even if people don't get that, I wish they would at least stop the madness with this poor little IOException.
Let's review again what IOException is for:
"Signals that an I/O exception of some sort has occurred. This class is the general class of exceptions produced by failed or interrupted I/O operations."
In Hadoop everything is an IOException. Everything. Some assertion fails, IOException. A number exceeds the maximum allowed by the config, IOException. Some protocol versions don't match, IOException. Hadoop needs to fart, IOException.
How are you supposed to handle these exceptions? Everything is declared as throws IOException and everything is catching, wrapping, re-throwing, logging, eating, and ignoring IOExceptions. Impossible. No matter what goes wrong, you're left clueless. And it's not like there is a nice exception hierarchy to help you handle them. No, virtually everything is just a bare IOException.
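To make the problem concrete, here is a minimal sketch of what this looks like from the caller's side. The class and method names are made up for illustration and are not taken from Hadoop; the point is only that two unrelated failures end up in the same catch block with nothing to tell them apart.

```java
import java.io.IOException;

// Hypothetical stand-in for a library that reports every failure as a bare IOException.
class EverythingIsIo {
    static void submit(int protocolVersion, long splitSize, long maxSplitSize)
            throws IOException {
        if (protocolVersion != 2) {
            // Not an I/O failure at all, but reported as one.
            throw new IOException("Protocol version mismatch");
        }
        if (splitSize > maxSplitSize) {
            // A configuration limit, also reported as an I/O failure.
            throw new IOException("Size exceeded " + maxSplitSize);
        }
        // Real I/O, which may legitimately throw IOException, would go here.
    }

    // The caller gets exactly one catch block for every possible failure.
    static String trySubmit(int protocolVersion, long splitSize, long maxSplitSize) {
        try {
            submit(protocolVersion, splitSize, maxSplitSize);
            return "ok";
        } catch (IOException e) {
            // Retry? Fix the config? Upgrade the client? The type gives no hint.
            return "failed: " + e.getClass().getSimpleName();
        }
    }
}
```

A version mismatch and an exceeded limit both come back as the same "failed: IOException", which is all the type system can tell the caller here.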
Because of this, it's not uncommon to see code that inspects the message of the exception (a bare String) to try to figure out what's wrong and what to do with it. A friend of mine was recently explaining to me how Apache Kafka was "stringly typed" (a new cutting-edge paradigm whereby you show the middle finger to the type system and stuff everything in Strings). Well, Hadoop has invented something better than checked exceptions: stringed exceptions. Unfortunately, half of the time you can't even leverage this awesome new idiom because the message of the exception itself is useless. For example, when a MapReduce job chokes on a corrupted file, it will just throw an IOException without telling you the path of the problematic file. This way it's more fun: once you nail it down (with a binary search of course), you feel like you accomplished something. Or you'll get messages like "IOException: Split metadata size exceeded 10000000.". Figuring out what the actual value was is left as an exercise for the reader.
So, seriously Apache folks... Stop abusing IOException! Leave this poor little IOException alone!
Hadoop (0.20.2) currently has a whopping 1300+ lines of code creating bare IOExceptions. HBase (0.92.1) has over 400. Apache committers should consider every single one of these lines as a code smell that needs to be fixed, that's begging to be fixed. Please introduce a new base exception type, and create a sound exception hierarchy.
Updates:
- Apr 15: There is now an issue for HBase to fix their abuse of IOException (HBASE-5796).
- Will update if someone from Hadoop/HDFS/MapReduce files a similar issue on their side.