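Node.js fs cheerio: reading and writing multiple files

I have the following code, adapted from another source, which uses Node.js and Cheerio to read an HTML file and split the large source file into smaller pieces. The code works for a single file.
Now I need to read several large HTML files, split each of them in turn, and write the resulting files to a folder. How can I read every file in a folder, split it, and write out the pieces?
Here is the code: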
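var cheerio = require('cheerio'),
    fs = require('fs');

// Read one source file and hand its contents to dataLoaded.
fs.readFile('./sourceHtml2/testone.html', 'utf8', dataLoaded);

function dataLoaded(err, data) {
    if (err) {
        console.error('Could not read source file: ' + err.message);
        return;
    }
    // Declare $ locally so it does not leak into the global scope.
    var $ = cheerio.load(data);
    // Write each direct child <div> of #toplevel to its own file, named after its id.
    $('#toplevel > div').each(function (i, elem) {
        var id = $(elem).attr('id'),
            filename = id + '.html',
            content = $.html(elem);
        fs.writeFile('./output2/' + filename, content, function (err) {
            if (err) {
                console.error('Could not write ' + filename + ': ' + err.message);
                return;
            }
            console.log('Written html to ' + filename);
        });
    });
}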
Here is my sample source file:
<!DOCTYPE html SYSTEM "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>Lorem Ipsum</title>
</head>
<body>
<div id="toplevel">
<div id="1-1">
<h1>HTML Ipsum Presents One</h1>
<p>
<strong>Pellentesque habitant morbi tristique</strong>senectus et netus et malesuada fames ac turpis egestas. Vestibulum tortor quam, feugiat vitae, ultricies eget, tempor sit amet, ante. Donec eu libero sit amet quam egestas semper.
<h2>Header Level 2</h2>
<ol>
<li>Lorem ipsum dolor sit amet, consectetuer adipiscing elit.</li>
<li>Aliquam tincidunt mauris eu risus.</li>
</ol>
<h3>Header Level 3</h3>
<ul>
<li>Lorem ipsum dolor sit amet, consectetuer adipiscing elit.</li>
<li>Aliquam tincidunt mauris eu risus.</li>
</ul>
</div>
<div id="1-2">
<h1>HTML Ipsum Presents Two</h1>
<p>
<strong>Pellentesque habitant morbi tristique</strong>senectus et netus et malesuada fames ac turpis egestas. Vestibulum tortor quam, feugiat vitae, ultricies eget, tempor sit amet, ante. Donec eu libero sit amet quam egestas semper.
<h2>Header Level 2</h2>
<ol>
<li>Lorem ipsum dolor sit amet, consectetuer adipiscing elit.</li>
<li>Aliquam tincidunt mauris eu risus.</li>
</ol>
<blockquote>
<p>Lorem ipsum dolor sit amet, consectetur adipiscing elit. Vivamus magna. Cras in mi at felis aliquet congue. Ut a est eget ligula molestie gravida. Curabitur massa. Donec eleifend, libero at sagittis mollis, tellus est malesuada tellus,
at luctus turpis elit sit amet quam. Vivamus pretium ornare est.</p>
</blockquote>
<h3>Header Level 3</h3>
<ul>
<li>Lorem ipsum dolor sit amet, consectetuer adipiscing elit.</li>
<li>Aliquam tincidunt mauris eu risus.</li>
</ul>
</div>
<div id="1-3">
<h1>HTML Ipsum Presents Three</h1>
<p>
<strong>Pellentesque habitant morbi tristique</strong>senectus et netus et malesuada fames ac turpis egestas. Vestibulum tortor quam, feugiat vitae, ultricies eget, tempor sit amet, ante. Donec eu libero sit amet quam egestas semper.
<h2>Header Level 2</h2>
<ol>
<li>Lorem ipsum dolor sit amet, consectetuer adipiscing elit.</li>
<li>Aliquam tincidunt mauris eu risus.</li>
</ol>
<blockquote>
<p>Lorem ipsum dolor sit amet, consectetur adipiscing elit. Vivamus magna. Cras in mi at felis aliquet congue. Ut a est eget ligula molestie gravida. Curabitur massa. Donec eleifend, libero at sagittis mollis, tellus est malesuada tellus,
at luctus turpis elit sit amet quam. Vivamus pretium ornare est.</p>
</blockquote>
<h3>Header Level 3</h3>
<ul>
<li>Lorem ipsum dolor sit amet, consectetuer adipiscing elit.</li>
<li>Aliquam tincidunt mauris eu risus.</li>
</ul>
</div>
</div>
</body>
</html>
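Any help would be greatly appreciated.
Take a look at [`fs.readdir`](https://nodejs.org/api/fs.html#fs_fs_readdir_path_options_callback). It gives you an array of the names of all files in a folder; you should be able to iterate over that array and pass each file to your function.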
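
The suggestion above could be sketched roughly as follows. This is only an illustration under a few assumptions: the helper names `listHtmlFiles` and `processFolder` are hypothetical, the source folder is assumed to be `./sourceHtml2` as in the question, and the listing uses `fs.readdirSync` (the synchronous counterpart of `fs.readdir`) to keep the example short.

```javascript
var fs = require('fs'),
    path = require('path');

// Hypothetical helper: return the full paths of all .html files that sit
// directly inside `dir`. Uses fs.readdirSync for brevity; fs.readdir is
// the callback-based equivalent.
function listHtmlFiles(dir) {
    return fs.readdirSync(dir)
        .filter(function (name) {
            // Keep only names ending in .html (case-insensitive).
            return path.extname(name).toLowerCase() === '.html';
        })
        .sort()
        .map(function (name) {
            return path.join(dir, name);
        });
}

// Hypothetical helper: read every HTML file in `dir` and call
// handler(err, data, file) once per file.
function processFolder(dir, handler) {
    listHtmlFiles(dir).forEach(function (file) {
        fs.readFile(file, 'utf8', function (err, data) {
            handler(err, data, file);
        });
    });
}

// Usage, assuming dataLoaded is the function from the question:
// processFolder('./sourceHtml2', dataLoaded);
```

Because `dataLoaded` names each output file after the `id` of its `div`, two source files that reuse the same ids would overwrite each other's output. If that can happen, one option is to use the third `file` argument to prefix the output name with the source file's base name.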