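How to index all metatags in Nutch
I have installed Nutch 1.9 and configured it to crawl successfully with Solr 4.10.1. I am trying to set up Nutch to index metadata, as described at https://wiki.apache.org/nutch/IndexMetatags
How do I set it up to index all of the metadata on a site? I set metatags.names to * like this: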
<property>
<name>metatags.names</name>
<value>*</value>
<description>Names of the metatags to extract, separated by ','. Use '*' to extract all metatags. Prefixes the names with 'metatag.' in the parse-metadata. For instance to index description and keywords, you need to activate the plugin index-metadata and set the
value of the parameter 'index.parse.md' to 'metatag.description,metatag.keywords'.
</description>
</property>
but I am not sure how to set the index.parse.md value without listing the individual metatag names. I tried this:
<property>
<name>index.parse.md</name>
<value>meta*</value>
<description>Comma-separated list of keys to be taken from the parse metadata to generate fields. Can be used e.g. for 'description' or 'keywords' provided that these values are generated by a parser (see parse-metatags plugin)
</description>
</property>
but running
bin/nutch indexchecker http://nutch.apache.org/
does not show any metadata. I believe the site does have metadata, because Parse Metadata is returned when I run
bin/nutch parsechecker http://nutch.apache.org/
Any help would be greatly appreciated! Thanks.