The new Job Submission APIs are integrated with your Azure subscription, which lets them work with the storage accounts associated with a cluster and retrieve the output of jobs. To authenticate against the subscription you need to install an Azure management certificate; follow the instructions in the "How to install and configure Windows Azure PowerShell" guide. Once the subscription settings are imported, run the Get-AzureSubscription command in the Azure PowerShell prompt to retrieve the SubscriptionId and the thumbprint of the certificate. With this information you are ready to start creating the app.
  • First, add the HDInsight SDK NuGet package to your project:
    PM> Install-Package Microsoft.WindowsAzure.Management.HDInsight
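  • Then get the certificate object from the certificate store, using the thumbprint to identify it. The store must be opened before its certificates can be read. Cast and First are LINQ extension methods, so the file needs using System.Linq; in addition to using System.Security.Cryptography.X509Certificates;
            var store = new X509Store();
            store.Open(OpenFlags.ReadOnly);
            var cert = store.Certificates.Cast<X509Certificate2>().First(item => item.Thumbprint == "{Thumbprint of the certificate}");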
  • Create the Job Submission Client object using the factory method
            var creds = new JobSubmissionCertificateCredential(new Guid("{Your subscription Id}"), cert, "{Your cluster name}");
            var jobClient = JobSubmissionClientFactory.Connect(creds);
  • Create a job object that captures the details of the job. In this sample we’ll create a Hive job. In a similar manner you can create a Pig, MapReduce, or streaming job
            var hiveJob = new HiveJobCreateParameters()
            {
                Query = "select * from hivesampletable limit 10;",
                StatusFolder = "/samplequeryoutput"
            };
  • Submit the job to the cluster. This call returns as soon as the submission request is complete; it does not wait for the job itself to finish
            var jobResults = jobClient.CreateHiveJob(hiveJob);
  • To get the results of the completed job we need to poll for the job status until it reaches the Completed (or Failed) state. We’ll do this with a small helper function
            WaitForJobCompletion(jobResults, jobClient);

            private static void WaitForJobCompletion(JobCreationResults jobDetails, IJobSubmissionClient client)
            {
                var jobInProgress = client.GetJob(jobDetails.JobId);
                while (jobInProgress.StatusCode != JobStatusCode.Completed && jobInProgress.StatusCode != JobStatusCode.Failed)
                {
                    jobInProgress = client.GetJob(jobInProgress.JobId);
                    // Pause between polls so the service is not flooded (requires using System.Threading;)
                    Thread.Sleep(TimeSpan.FromSeconds(10));
                }
            }
  • When the job has completed we can retrieve its output using the GetJobOutput function. Similar functions are available to retrieve the job's standard error output (GetJobErrorLogs) and task logs (DownloadJobTaskLogs)
            using (var stream = jobClient.GetJobOutput(jobResults.JobId))
            {
                StreamReader reader = new StreamReader(stream);
                Console.WriteLine(reader.ReadToEnd());
            }
  • There are also other job management functions available to list jobs, get job details and cancel execution of the job:
    • ListJobs
    • GetJob
    • StopJob
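
The WaitForJobCompletion helper above loops until the job reports Completed or Failed, so it never returns if the job stays in some other state. One way to bound the wait is a small generic polling helper with a timeout. The sketch below is only an illustration: the Poller class, the WaitUntil method and its parameters are hypothetical names and are not part of the HDInsight SDK.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

// Hypothetical helper, not part of the HDInsight SDK.
public static class Poller
{
    // Calls poll() repeatedly until isDone(result) returns true, sleeping for
    // `interval` between calls. Returns the first result that satisfies isDone.
    // Throws TimeoutException when the next sleep would run past `timeout`.
    public static T WaitUntil<T>(Func<T> poll, Func<T, bool> isDone, TimeSpan interval, TimeSpan timeout)
    {
        var clock = Stopwatch.StartNew();
        while (true)
        {
            T current = poll();
            if (isDone(current))
            {
                return current;
            }
            if (clock.Elapsed + interval > timeout)
            {
                throw new TimeoutException("Condition was not met within " + timeout);
            }
            Thread.Sleep(interval);
        }
    }
}
```

With the objects created in the steps above, a call might look like Poller.WaitUntil(() => jobClient.GetJob(jobResults.JobId), j => j.StatusCode == JobStatusCode.Completed || j.StatusCode == JobStatusCode.Failed, TimeSpan.FromSeconds(10), TimeSpan.FromMinutes(30)). The 10-second interval and 30-minute timeout are arbitrary values chosen for the example.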

Last edited Oct 24, 2013 at 11:40 PM by maxluk, version 3


iambillmccann Mar 8 at 9:53 PM 
The code...

var cert = store.Certificates.Cast<X509Certificate2>().First(item => item.Thumbprint == "{Thumbrint of the certificate}");

throws a compiler error...

"Error 4 'System.Security.Cryptography.X509Certificates.X509Certificate2Collection' does not contain a definition for 'Cast' and no extension method 'Cast' accepting a first argument of type 'System.Security.Cryptography.X509Certificates.X509Certificate2Collection' could be found (are you missing a using directive or an assembly reference?) ..."

I'm running VS 2012 against .Net 4.5
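
The compiler error quoted above is what C# reports when the System.Linq namespace is not imported: Cast is an extension method defined there, so the file needs a using System.Linq; directive. A minimal sketch of the lookup with the required directives follows. The CertificateLookup class and its two methods are hypothetical names, not part of the SDK, and the choice of the current user's personal store is an assumption about where the management certificate was installed. The helper also strips spaces and other non-hex characters from the thumbprint, since a value copied from a certificate dialog may contain them.

```csharp
using System;
using System.Linq; // provides Cast<T>(), Where() and FirstOrDefault()
using System.Security.Cryptography.X509Certificates;

// Hypothetical helper, not part of the HDInsight SDK.
public static class CertificateLookup
{
    // Keeps only the hex digits of the input and upper-cases them.
    public static string NormalizeThumbprint(string raw)
    {
        return new string(raw.Where(c => Uri.IsHexDigit(c)).ToArray()).ToUpperInvariant();
    }

    // Searches the current user's personal store (an assumption about where the
    // certificate lives). Returns null when no certificate matches.
    public static X509Certificate2 FindByThumbprint(string thumbprint)
    {
        string wanted = NormalizeThumbprint(thumbprint);
        var store = new X509Store(StoreName.My, StoreLocation.CurrentUser);
        store.Open(OpenFlags.ReadOnly); // the store must be opened before it is read
        try
        {
            return store.Certificates
                .Cast<X509Certificate2>()
                .FirstOrDefault(c => string.Equals(c.Thumbprint, wanted, StringComparison.OrdinalIgnoreCase));
        }
        finally
        {
            store.Close();
        }
    }
}
```

Returning null instead of throwing lets the caller decide how to report a missing certificate; the First call in the original snippet throws InvalidOperationException when nothing matches.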