Amazon Elastic MapReduce: Output directory

I'm working through Amazon's example of running Elastic MapReduce and keep hitting the following error:

Error launching job , Output path already exists.

Here is the command I'm using to run the job:

C:\ruby\elastic-mapreduce-cli>ruby elastic-mapreduce --create --stream \
    --mapper s3://elasticmapreduce/samples/wordcount/wordSplitter.py \
    --input s3://elasticmapreduce/samples/wordcount/input \
    --output [A path to a bucket you own on Amazon S3, such as, s3n://myawsbucket] \
    --reducer aggregate

The example comes from here

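I'm following Amazon's directions for the output directory. The bucket name is s3n://mp.maptester321mark/. I've already gone through all of their suggestions for this problem at this url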

Here is my credentials.json info:

{ 
"access_id": "1234123412", 
"private_key": "1234123412", 
"keypair": "markkeypair", 
"key-pair-file": "C:/Ruby/elastic-mapreduce-cli/markkeypair", 
"log_uri": "s3n://mp-mapreduce/", 
"region": "us-west-2" 
}

2012-07-29 Mark Peters

Why, oh why, must S3 force us to create a new directory every time? – 2014-06-15 05:45:49

Hadoop jobs won't overwrite a directory that already exists. You just need to run:

hadoop fs -rmr <output_dir>

before your job, or just use the AWS console to delete the directory.
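
A minimal sketch of that cleanup step as a bash function, assuming the `hadoop` launcher is on your PATH and can reach the output filesystem. The function name `remove_output_dir`, the `HADOOP_CMD` variable and the example path are made up for illustration; only `hadoop fs -test -e` and `hadoop fs -rmr` are real Hadoop shell commands.

```shell
#!/usr/bin/env bash
# Sketch: remove a stale job output directory only if it exists.
# HADOOP_CMD is a hypothetical override so the function can be pointed at
# a different launcher; it defaults to the standard "hadoop" command.
HADOOP_CMD="${HADOOP_CMD:-hadoop}"

remove_output_dir() {
    local dir="$1"
    # "hadoop fs -test -e" exits 0 when the path exists.
    if "$HADOOP_CMD" fs -test -e "$dir"; then
        # -rmr is the recursive delete shown above; newer Hadoop releases
        # spell it "hadoop fs -rm -r".
        "$HADOOP_CMD" fs -rmr "$dir"
    else
        echo "nothing to remove: $dir"
    fi
}

# Example (hypothetical path):
#   remove_output_dir "s3n://mybucket/wordcount/output"
```

Checking for existence first keeps the script from failing on the very first run, when there is nothing to delete yet.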

2012-07-30 00:49:36

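I had already deleted the directory before starting the job, but it still raises this error. – 2012-07-30 02:05:19

Were you able to verify that it was actually deleted? – 2012-07-30 03:17:55

Try a different output directory – 2012-07-30 04:08:53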

Use:

--output s3n://mp.maptester321mark/output

instead of:

--output s3n://mp.maptester321mark/

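I think EMR creates the output bucket before the job runs, which means your output directory / already exists if you specify --output s3n://mp.maptester321mark/, and that may be why you get this error.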
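
To catch this before launching, a small bash check along these lines could reject an --output value that points at the bucket root. This is only a sketch: the function name `has_output_subdir` is hypothetical, and it inspects the string only, without contacting S3.

```shell
#!/usr/bin/env bash
# Sketch: succeed (exit 0) only if the URI names a path below the bucket,
# e.g. s3n://bucket/output, and fail (exit 1) for a bare bucket root.
has_output_subdir() {
    local uri="$1"
    local rest="${uri#*://}"   # strip the scheme, e.g. "s3n://"
    rest="${rest%/}"           # drop one trailing slash, if any
    case "$rest" in
        */?*) return 0 ;;      # bucket name followed by a non-empty key
        *)    return 1 ;;      # bucket root only
    esac
}

# Example with the paths from this answer:
#   has_output_subdir "s3n://mp.maptester321mark/output"   # succeeds
#   has_output_subdir "s3n://mp.maptester321mark/"         # fails
```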

2012-12-01 03:06:45

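---> If the folder (bucket) already exists, delete it.

---> If you have deleted it and still get the error above, make sure your output path looks like s3n://some_bucket_name/your_output_bucket. If it looks like s3n://your_output_bucket/ instead, that is a problem for EMR, because I think it first creates the bucket in the path (some_bucket_name) and then tries to create (your_output_bucket).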

Thanks, Hari

2013-09-16 16:54:51 hbr
