Let's review again what IOException is for:
"Signals that an I/O exception of some sort has occurred. This class is the general class of exceptions produced by failed or interrupted I/O operations."
In Hadoop everything is an IOException. Everything. Some assertion fails, IOException. A number exceeds the maximum allowed by the config, IOException. Some protocol versions don't match, IOException. Hadoop needs to fart, IOException.
How are you supposed to handle these exceptions? Everything is declared as throws IOException and everything is catching, wrapping, re-throwing, logging, eating, and ignoring IOExceptions. Impossible. No matter what goes wrong, you're left clueless. And it's not like there is a nice exception hierarchy to help you handle them. No, virtually everything is just a bare IOException.
Because of this, it's not uncommon to see code that inspects the message of the exception (a bare String) to try to figure out what's wrong and what to do with it. A friend of mine was recently explaining to me how Apache Kafka was "stringly typed" (a new cutting-edge paradigm whereby you show the middle finger to the type system and stuff everything in Strings). Well, Hadoop has invented something better than checked exceptions: stringed exceptions. Unfortunately, half of the time you can't even leverage this awesome new idiom because the message of the exception itself is useless. For example, when a MapReduce job chokes on a corrupted file, it will just throw an IOException without telling you the path of the problematic file. This way it's more fun: once you nail it down (with a binary search, of course), you feel like you accomplished something. Or you'll get messages like "IOException: Split metadata size exceeded 10000000.". Figuring out what the actual value was is left as an exercise for the reader.
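To make the stringed-exception idiom concrete, here is a minimal sketch of the kind of message scraping a caller ends up writing. The class name, method name, and regular expression are hypothetical and not part of Hadoop; the only thing taken from above is the quoted message text, and the sketch assumes that wording never changes.

```java
import java.io.IOException;
import java.util.OptionalLong;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical caller-side helper, not part of Hadoop. It illustrates the
// message scraping that a bare IOException forces on the caller.
final class StringedExceptions {
  // Assumes the message wording quoted above; if the wording ever changes,
  // this quietly stops matching.
  private static final Pattern SPLIT_LIMIT =
      Pattern.compile("Split metadata size exceeded (\\d+)");

  private StringedExceptions() {}

  /** Returns the limit scraped from the exception message, if it matches. */
  static OptionalLong splitMetadataLimit(IOException e) {
    String message = e.getMessage();
    if (message == null) {
      return OptionalLong.empty();
    }
    Matcher m = SPLIT_LIMIT.matcher(message);
    return m.find()
        ? OptionalLong.of(Long.parseLong(m.group(1)))
        : OptionalLong.empty();
  }
}
```

Even when the pattern matches, all it recovers is the configured limit. The actual size that tripped it was never put in the message, so no amount of parsing will get it back.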
So, seriously Apache folks...
Stop abusing IOException. Leave this poor little IOException alone.
Hadoop (0.20.2) currently has a whopping 1300+ lines of code creating bare IOExceptions. HBase (0.92.1) has over 400. Apache committers should consider every single one of these lines as a code smell that needs to be fixed, that's begging to be fixed. Please introduce a new base exception type, and create a sound exception hierarchy.
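As a rough sketch of what such a hierarchy could look like, the example below uses made-up class names; none of them exist in Hadoop or HBase. It also assumes the new base type extends IOException so that existing throws IOException signatures keep compiling, which is one possible design choice rather than the only one. Each subclass carries the context that the messages above leave out: the offending path, or the limit together with the actual value.

```java
import java.io.IOException;

// Hypothetical names throughout: these classes do not exist in Hadoop or HBase.
// Assumption: the base type extends IOException so existing signatures and
// catch blocks keep working.
class HadoopException extends IOException {
  HadoopException(String message, Throwable cause) {
    super(message, cause);
  }
}

/** A file could not be read or parsed; carries the offending path. */
class CorruptFileException extends HadoopException {
  private final String path;

  CorruptFileException(String path, Throwable cause) {
    super("Corrupt file: " + path, cause);
    this.path = path;
  }

  String getPath() { return path; }
}

/** A configured limit was exceeded; carries both the limit and the actual value. */
class LimitExceededException extends HadoopException {
  private final long limit;
  private final long actual;

  LimitExceededException(String what, long limit, long actual) {
    super(what + " exceeded: limit=" + limit + ", actual=" + actual, null);
    this.limit = limit;
    this.actual = actual;
  }

  long getLimit() { return limit; }
  long getActual() { return actual; }
}

final class HierarchyDemo {
  private HierarchyDemo() {}

  /** Callers can now dispatch on the exception type instead of parsing strings. */
  static String handle(IOException failure) {
    try {
      throw failure;
    } catch (CorruptFileException e) {
      return "skip " + e.getPath();
    } catch (LimitExceededException e) {
      return "raise limit above " + e.getActual();
    } catch (IOException e) {
      return "give up: " + e.getMessage();
    }
  }
}
```

With types like these, the catch clause picks the recovery strategy and the getters supply the details, so nothing has to be recovered from the message string.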
- Apr 15: There is now an issue for HBase to fix their abuse of IOException.
- Will update if someone from Hadoop/HDFS/MapReduce files a similar issue on their side.