Solution
I changed sqoop-site.xml so that the metastore points at the endpoint of my MySQL instance.
Steps
Create a MySQL instance and run this query: CREATE TABLE SQOOP_ROOT (version INT, propname VARCHAR(128) NOT NULL, propval VARCHAR(256), CONSTRAINT SQOOP_ROOT_unq UNIQUE (version, propname));
and then: INSERT INTO SQOOP_ROOT VALUES(NULL, 'sqoop.hsqldb.job.storage.version', '0');
Edit the original sqoop-site.xml and add your MySQL endpoint, username and password.
<property>
<name>sqoop.metastore.client.enable.autoconnect</name>
<value>true</value>
<description>If true, Sqoop will connect to a local metastore
for job management when no other metastore arguments are
provided.
</description>
</property>
<!--
The auto-connect metastore is stored in ~/.sqoop/. Uncomment
these next arguments to control the auto-connect process with
greater precision.
-->
<property>
<name>sqoop.metastore.client.autoconnect.url</name>
<value>jdbc:mysql://your-mysql-instance-endpoint:3306/database</value>
<description>The connect string to use when connecting to a
job-management metastore. If unspecified, uses ~/.sqoop/.
You can specify a different path here.
</description>
</property>
<property>
<name>sqoop.metastore.client.autoconnect.username</name>
<value>${sqoop-user}</value>
<description>The username to bind to the metastore.
</description>
</property>
<property>
<name>sqoop.metastore.client.autoconnect.password</name>
<value>${sqoop-pass}</value>
<description>The password to bind to the metastore.
</description>
</property>
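The first time you run the command sqoop job --list
it will return no jobs. Once you have created jobs, though, shutting down the EMR cluster no longer loses the Sqoop job metadata, because it is stored in the MySQL instance.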
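For reference, here is a rough sketch of what creating such a saved job could look like, written as a small Python helper that only builds the command line. The job name, source JDBC URL, table and check column are placeholders I made up for illustration; the sqoop options themselves (job --create, --incremental append, --check-column, --last-value) are the standard ones for a saved incremental import. With the metastore in MySQL, the last imported value that Sqoop records for the job should survive a cluster restart.

```python
import shlex


def sqoop_job_create_args(job_name, source_jdbc_url, table, check_column="id"):
    """Build the argv for a saved incremental-append import job.

    All argument values are hypothetical placeholders supplied by the caller.
    """
    return [
        "sqoop", "job", "--create", job_name,
        "--",  # separator: what follows is the tool the saved job runs
        "import",
        "--connect", source_jdbc_url,
        "--table", table,
        "--incremental", "append",
        "--check-column", check_column,
        "--last-value", "0",  # starting point for the very first run
    ]


if __name__ == "__main__":
    args = sqoop_job_create_args(
        "orders_import",                          # hypothetical job name
        "jdbc:mysql://source-db-host:3306/shop",  # hypothetical source database
        "orders",                                 # hypothetical table
    )
    # Print a shell-safe command line instead of executing it.
    print(shlex.join(args))
```

After running the printed command, sqoop job --list should show the job, and sqoop job --exec with the same job name runs it.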
On EMR, we can use a Bootstrap Action to apply this change automatically when the cluster is created.
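A minimal sketch of how that automation might look, assuming the script is run on the node with the path to sqoop-site.xml and the metastore credentials passed as arguments. The function names and the argument order are my own choices for illustration, not something EMR or Sqoop prescribes. It sets the four properties shown above, updating them if they already exist and appending them otherwise.

```python
import sys
import xml.etree.ElementTree as ET

PREFIX = "sqoop.metastore.client."


def set_property(root, name, value):
    """Update the <property> called `name`, or append it if it is missing."""
    for prop in root.findall("property"):
        if prop.findtext("name") == name:
            value_el = prop.find("value")
            if value_el is None:
                value_el = ET.SubElement(prop, "value")
            value_el.text = value
            return
    prop = ET.SubElement(root, "property")
    ET.SubElement(prop, "name").text = name
    ET.SubElement(prop, "value").text = value


def patch_sqoop_site(path, jdbc_url, username, password):
    """Point the auto-connect metastore settings at a remote database."""
    # Note: ElementTree's default parser drops XML comments from the file.
    tree = ET.parse(path)
    root = tree.getroot()  # expected to be the <configuration> element
    set_property(root, PREFIX + "enable.autoconnect", "true")
    set_property(root, PREFIX + "autoconnect.url", jdbc_url)
    set_property(root, PREFIX + "autoconnect.username", username)
    set_property(root, PREFIX + "autoconnect.password", password)
    tree.write(path, encoding="utf-8", xml_declaration=True)


if __name__ == "__main__":
    # Hypothetical usage:
    #   python patch_sqoop_site.py <path-to-sqoop-site.xml> <jdbc-url> <user> <password>
    patch_sqoop_site(*sys.argv[1:5])
```

Where sqoop-site.xml lives on the cluster, and whether it already exists at the point the script runs, depends on your EMR release, so check that before wiring this in.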
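Yes, the problem is that the metastore is saved locally. Sometimes I need to shut down the ETL process, and when it resumes I need to restart from the last id. Reading the documentation, I came across sqoop-metastore and changed sqoop-site.xml to store these properties remotely in a MySQL instance. I will verify this approach tomorrow.
@CarlosEduardo this _(a remote MySQL instance as the metastore)_ will solve your problem.
@CarlosEduardo did you try it?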