
Answer


With boto you can do something like this:

from boto.emr.connection import EmrConnection
from boto.emr.step import JarStep

# Step 1: install Hive 0.7 on the cluster
args1 = [u's3://us-east-1.elasticmapreduce/libs/hive/hive-script',
         u'--base-path',
         u's3://us-east-1.elasticmapreduce/libs/hive/',
         u'--install-hive',
         u'--hive-versions',
         u'0.7']
# Step 2: run the Hive script stored at s3_query_file_uri (supply your own S3 URI)
args2 = [u's3://us-east-1.elasticmapreduce/libs/hive/hive-script',
         u'--base-path',
         u's3://us-east-1.elasticmapreduce/libs/hive/',
         u'--hive-versions',
         u'0.7',
         u'--run-hive-script',
         u'--args',
         u'-f',
         s3_query_file_uri]
steps = []
for name, args in zip(('Setup Hive', 'Run Hive Script'), (args1, args2)):
    step = JarStep(name,
        's3://us-east-1.elasticmapreduce/libs/script-runner/script-runner.jar',
        step_args=args,
        #action_on_failure="CANCEL_AND_WAIT"
        )
    steps.append(step)  # must stay inside the loop so both steps are added

# Kick off the job flow. s3_log_uri, master_instance_type,
# slave_instance_type and num_instances are placeholders for your own values.
jobid = EmrConnection().run_jobflow('Hive job flow', s3_log_uri,
            steps=steps,
            master_instance_type=master_instance_type,
            slave_instance_type=slave_instance_type,
            num_instances=num_instances,
            hadoop_version="0.20")
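To know when the job flow has finished, you can poll its state with boto's `describe_jobflow`. The helper below is a minimal sketch, not part of the original answer; the 60-second polling interval and the set of terminal states are my assumptions for boto 2-era EMR:

```python
import time

# States in which the job flow will make no further progress (assumed
# terminal states for boto 2 / classic EMR job flows).
TERMINAL_STATES = frozenset(['COMPLETED', 'FAILED', 'TERMINATED'])

def is_terminal(state):
    """Return True once the job flow can no longer make progress."""
    return state in TERMINAL_STATES

def wait_for_jobflow(conn, jobid, poll_seconds=60):
    """Block until the job flow reaches a terminal state; return that state.

    `conn` is an EmrConnection and `jobid` is the id returned by
    run_jobflow() above.
    """
    while True:
        state = conn.describe_jobflow(jobid).state
        if is_terminal(state):
            return state
        time.sleep(poll_seconds)
```

You would call it as `wait_for_jobflow(EmrConnection(), jobid)` after kicking off the job.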

That worked - thanks unthingable! – poiuy 2011-07-29 23:17:19


My EMR job is terminating with a VALIDATION_ERROR.. any ideas? – vks 2017-06-19 13:47:13