MR Job with JSON serialization for I/O

Mar 23, 2014 at 6:02 PM
Hey there!

Simple MR examples using basic string serialization with the Mapper/ReducerCombiner base classes and the Mapper/ReducerCombiner contexts are working fine. However, I ran into issues when trying to work with the JSON classes. I wrote an MR job as follows:
JsonInOutMapperBase<MyIn, MyOut>
with JsonMapperContext<MyOut> for the Map method, and
JsonInOutReducerCombinerBase<MyIn, MyOut>
with JsonReducerCombinerContext<MyOut> for the Reduce method

(Of course, the mapper's MyOut type matches the ReducerCombiner's MyIn type.)

Running this job from VS2013 on an Azure HDInsight 2.1 cluster always fails with the following exception for the launched map task(s):

java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 255

and the whole job eventually gets killed. Any ideas what I might be doing wrong here? As I said, the other non-JSON examples are working.

Unfortunately, the samples in the source code do not use the JsonInOutXXX base classes, so they don't help much.

Thanks for your help.