2012-03-29

I have exported many tables from Hive to SQL Server before and have never faced this problem. Is something going wrong in sqoop-export?

I am using "," as the field delimiter, and I have also created the table in SQL Server.

[email protected]:~/sqoop-1.3.0-cdh3u1/bin$ ./sqoop-export --connect 'jdbc:sqlserver://192.168.1.1;username=abcd;password=12345;database=HadoopTest' --table tmptempmeasurereport --export-dir /user/hive/warehouse/tmptempmeasurereport 

12/03/29 16:20:21 INFO SqlServer.MSSQLServerManagerFactory: Using Microsoft's SQL Server - Hadoop Connector 
12/03/29 16:20:21 INFO manager.SqlManager: Using default fetchSize of 1000 
12/03/29 16:20:21 INFO tool.CodeGenTool: Beginning code generation 
12/03/29 16:20:21 INFO manager.SqlManager: Executing SQL statement: SELECT TOP 1 * FROM [tmptempmeasurereport] 
12/03/29 16:20:21 INFO manager.SqlManager: Executing SQL statement: SELECT TOP 1 * FROM [tmptempmeasurereport] 
12/03/29 16:20:21 INFO orm.CompilationManager: HADOOP_HOME is /home/hadoop/hadoop-0.20.2-cdh3u2 
12/03/29 16:20:22 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-hadoop/compile/1c5aae88cd7daca66aa665d4bab5b470/tmptempmeasurereport.jar 
12/03/29 16:20:22 INFO mapreduce.ExportJobBase: Beginning export of tmptempmeasurereport 
12/03/29 16:20:22 INFO manager.SqlManager: Executing SQL statement: SELECT TOP 1 * FROM [tmptempmeasurereport] 
12/03/29 16:20:22 WARN mapreduce.ExportJobBase: IOException checking SequenceFile header: java.io.EOFException 
12/03/29 16:20:23 INFO input.FileInputFormat: Total input paths to process : 2 
12/03/29 16:20:23 INFO input.FileInputFormat: Total input paths to process : 2 
12/03/29 16:20:23 INFO mapred.JobClient: Running job: job_201203291108_0645 
12/03/29 16:20:24 INFO mapred.JobClient: map 0% reduce 0% 
12/03/29 16:20:29 INFO mapred.JobClient: Task Id : attempt_201203291108_0645_m_000000_0, Status : FAILED 
java.util.NoSuchElementException 
    at java.util.AbstractList$Itr.next(AbstractList.java:350) 
    at tmptempmeasurereport.__loadFromFields(tmptempmeasurereport.java:383) 
    at tmptempmeasurereport.parse(tmptempmeasurereport.java:332) 
    at com.cloudera.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:79) 
    at com.cloudera.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:38) 
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144) 
    at com.cloudera.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:187) 
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:647) 
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:323) 
    at org.apache.hadoop.mapred.Child$4.run(Child.java:270) 
    at java.security.AccessController.doPrivileged(Native Method) 
    at javax.security.auth.Subject.doAs(Subject.java:396) 
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127) 
    at org.apache.hadoop.mapred.Child.main(Child.java:264) 

12/03/29 16:20:34 INFO mapred.JobClient: Task Id : attempt_201203291108_0645_m_000000_1, Status : FAILED 
java.util.NoSuchElementException 
    at java.util.AbstractList$Itr.next(AbstractList.java:350) 
    at tmptempmeasurereport.__loadFromFields(tmptempmeasurereport.java:383) 
    at tmptempmeasurereport.parse(tmptempmeasurereport.java:332) 
    at com.cloudera.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:79) 
    at com.cloudera.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:38) 
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144) 
    at com.cloudera.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:187) 
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:647) 
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:323) 
    at org.apache.hadoop.mapred.Child$4.run(Child.java:270) 
    at java.security.AccessController.doPrivileged(Native Method) 
    at javax.security.auth.Subject.doAs(Subject.java:396) 
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127) 
    at org.apache.hadoop.mapred.Child.main(Child.java:264) 

12/03/29 16:20:38 INFO mapred.JobClient: Task Id : attempt_201203291108_0645_m_000000_2, Status : FAILED 
java.util.NoSuchElementException 
    at java.util.AbstractList$Itr.next(AbstractList.java:350) 
    at tmptempmeasurereport.__loadFromFields(tmptempmeasurereport.java:383) 
    at tmptempmeasurereport.parse(tmptempmeasurereport.java:332) 
    at com.cloudera.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:79) 
    at com.cloudera.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:38) 
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144) 
    at com.cloudera.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:187) 
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:647) 
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:323) 
    at org.apache.hadoop.mapred.Child$4.run(Child.java:270) 
    at java.security.AccessController.doPrivileged(Native Method) 
    at javax.security.auth.Subject.doAs(Subject.java:396) 
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127) 
    at org.apache.hadoop.mapred.Child.main(Child.java:264) 

12/03/29 16:20:43 INFO mapred.JobClient: Job complete: job_201203291108_0645 
12/03/29 16:20:43 INFO mapred.JobClient: Counters: 7 
12/03/29 16:20:43 INFO mapred.JobClient: Job Counters 
12/03/29 16:20:43 INFO mapred.JobClient:  SLOTS_MILLIS_MAPS=18742 
12/03/29 16:20:43 INFO mapred.JobClient:  Total time spent by all reduces waiting after reserving slots (ms)=0 
12/03/29 16:20:43 INFO mapred.JobClient:  Total time spent by all maps waiting after reserving slots (ms)=0 
12/03/29 16:20:43 INFO mapred.JobClient:  Launched map tasks=4 
12/03/29 16:20:43 INFO mapred.JobClient:  Data-local map tasks=4 
12/03/29 16:20:43 INFO mapred.JobClient:  SLOTS_MILLIS_REDUCES=0 
12/03/29 16:20:43 INFO mapred.JobClient:  Failed map tasks=1 
12/03/29 16:20:43 INFO mapreduce.ExportJobBase: Transferred 0 bytes in 21.0326 seconds (0 bytes/sec) 
12/03/29 16:20:43 INFO mapreduce.ExportJobBase: Exported 0 records. 
12/03/29 16:20:43 ERROR tool.ExportTool: Error during export: Export job failed! 

[My versions are: Hadoop-0.20.2-cdh3, sqoop-1.3.0-cdh3u1, Hive 0.7.1]

Am I doing something wrong? Please help me resolve this.

Many thanks.


Are you exporting the data from an external Hive table? – WR10 2012-03-29 11:43:55


@WR10: Yes, I am using an external table. – 2012-03-29 11:48:28

Answers


I would suggest adding the --fields-terminated-by and --lines-terminated-by options to your sqoop command.
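For example, the export from the question might be rerun with the delimiters stated explicitly (a sketch only; the ',' field delimiter is the one the asker said they used, and '\n' for lines is an assumption):

```shell
# Same export as in the question, but with explicit delimiter options
# so Sqoop does not fall back on its defaults when parsing the files.
./sqoop-export \
  --connect 'jdbc:sqlserver://192.168.1.1;username=abcd;password=12345;database=HadoopTest' \
  --table tmptempmeasurereport \
  --export-dir /user/hive/warehouse/tmptempmeasurereport \
  --fields-terminated-by ',' \
  --lines-terminated-by '\n'
```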


@user1073129: Your external table in Hive does not actually store its data in the HDFS directory /user/hive/warehouse/tmptempmeasurereport, so sqoop cannot find any files in that directory. To export a Hive external table, you first need to run an "INSERT OVERWRITE DIRECTORY {give the HDFS path here}" statement that selects the data from the table; this copies the external table's data into the given HDFS directory. You can then run the sqoop job against that path. Alternatively, you can export the data by pointing sqoop at the data path you supplied when creating the external table. – WR10 2012-03-29 13:11:01
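The staging step described in the comment above could look something like this (a sketch; the output directory name is a hypothetical example):

```shell
# Copy the external table's data into a plain HDFS directory first,
# then point sqoop-export's --export-dir at that directory instead.
hive -e "INSERT OVERWRITE DIRECTORY '/user/hive/staging/tmptempmeasurereport'
         SELECT * FROM tmptempmeasurereport;"
```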


I got this error when exporting to a table that had extra columns not present in the file. If you inspect the auto-generated tmptempmeasurereport.java, you will see the logic Sqoop is using.
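The failure mode can be mimicked in a few lines. This is a loose Python sketch of the generated __loadFromFields() logic, not Sqoop's actual code: the parser advances an iterator once per target column, so a record with fewer fields than the table has columns exhausts the iterator, and Java raises the NoSuchElementException seen in the stack trace (Python's analogue is StopIteration).

```python
def parse_record(line, expected_fields):
    """Pull one value per table column from a comma-delimited record.

    Loosely mimics Sqoop's generated __loadFromFields(): iterating past
    the last field raises StopIteration (Java: NoSuchElementException).
    """
    it = iter(line.split(","))
    values = []
    for _ in range(expected_fields):
        values.append(next(it))  # raises StopIteration on a short record
    return values

print(parse_record("1,this,42", 3))  # a full record parses fine
# parse_record("1,this", 3) would raise StopIteration, because the table
# has more columns than the record has fields.
```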


I fixed this error by removing the trailing \n after the last record in the text input file.

  • "1,this,42\n2,that,100\n" - fails
  • "1,this,42\n2,that,100" - works
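The effect of the trailing newline can be seen with a quick Python sketch (an illustration of naive line splitting, not of Hadoop's actual record reader):

```python
# With a trailing '\n', a naive split yields an empty final record,
# which has no fields for the export mapper to parse.
bad = "1,this,42\n2,that,100\n".split("\n")
good = "1,this,42\n2,that,100".split("\n")
print(bad)   # ['1,this,42', '2,that,100', '']
print(good)  # ['1,this,42', '2,that,100']
```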