When developing a map-reduce program, you often need to unit-test its components on small example inputs. The `StreamingUnit` class supports this with simple in-process execution of map-reduce components. It can also help you debug the logic of your mapper and reducer.

For example, if a map-reduce program defines `MyMapper` and `MyReducer`, it can be invoked in-process with

    // input is a string[] holding the example input lines
    var output = StreamingUnit.Execute<MyMapper, MyReducer>(input);


The output of this call is a `HadoopStreamingOutput` object that includes component output, job output, counters, and log messages. To perform a simple unit test of core functionality, use

    var output = StreamingUnit.Execute<MyMapper, MyReducer>(input);
    if (output.Result != expected) throw new Exception("output doesn't match expected");
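
If the job output is a collection of lines rather than a single value, a direct `!=` check compares references, not contents. The sketch below shows one way to compare line by line. It assumes the output can be enumerated as strings, which this page does not confirm. `OutputAssert` and `LinesEqual` are hypothetical names used only for illustration and are not part of the SDK.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of the SDK.
static class OutputAssert
{
    // Throws when the two line sequences differ in content or order.
    // Assumes both sides can be enumerated as strings.
    public static void LinesEqual(IEnumerable<string> expected, IEnumerable<string> actual)
    {
        var e = expected.ToList();
        var a = actual.ToList();
        if (!e.SequenceEqual(a))
        {
            // Include both sides in the message to make failures easier to debug.
            throw new Exception(
                "output doesn't match expected" + Environment.NewLine +
                "expected: " + string.Join(" | ", e) + Environment.NewLine +
                "actual:   " + string.Join(" | ", a));
        }
    }
}
```

Under that assumption, a test could call `OutputAssert.LinesEqual(expected, output.Result)` in place of the `!=` check. The comparison is order-sensitive, so sort both sides first if the order of output lines does not matter.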

Last edited Oct 13, 2012 at 8:36 AM by mwinkle, version 1
