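Node.js web scraping and callback problem

I want to scrape a website for some content. The scraping itself works, but the scraped text only shows up in the console, and I want to print the scraped data in the browser. I think I am doing something wrong in the way I handle the callback. Can anyone help?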
My code is as follows:
app.get('/test', function(req, res) {
    //All the web scraping magic will happen here
    var url = 'https://www.mywebsite.com/path/to/abc';
    var allText;
    var getTheText = function() {
        request(url, function getText(error, response, html){
            // First we'll check to make sure no errors occurred when making the request
            if(!error){
                // Next, we'll utilize the cheerio library on the returned html which will essentially give us jQuery functionality
                var $ = cheerio.load(html);
                // Finally, we'll define the variables we're going to capture
                var allText = $('body').children().find('p').text()
                console.log('allText');
                console.log(allText);
                return allText;
            }
            else {
            }
            //return result;
        });
        console.log(allText);
    }
    getTheText();
    console.log('gettheText is ' + getTheText());
    res.send(allText);
})
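For what it's worth, two things in the code above keep the text out of the browser. `request` is asynchronous, so `res.send(allText)` runs before the callback has fired, and the `var allText` inside the callback shadows the outer `allText`, which therefore stays `undefined`. Returning a value from the callback does not help either, because nothing receives it. A minimal sketch of one possible fix is to send the response from inside the callback. The name `makeTestHandler` and the idea of passing `request` and `cheerio` in as arguments are my own assumptions, added only so the handler can be tried with stand-ins:

```javascript
// Sketch only: makeTestHandler is a hypothetical helper, not part of express,
// request, or cheerio. The two libraries are passed in as arguments.
function makeTestHandler(request, cheerio) {
  return function testHandler(req, res) {
    var url = 'https://www.mywebsite.com/path/to/abc';

    request(url, function (error, response, html) {
      if (error) {
        // Report the failure instead of leaving the browser waiting.
        res.status(500).send('Scraping failed: ' + error.message);
        return;
      }

      var $ = cheerio.load(html);
      var allText = $('body').children().find('p').text();

      // The scraped text only exists at this point, so respond from here.
      res.send(allText);
    });
  };
}

// Possible wiring, assuming express, request and cheerio are installed:
// var app = require('express')();
// app.get('/test', makeTestHandler(require('request'), require('cheerio')));
```

The design point is simply that anything depending on the scraped text has to run inside the callback (or be called from it), not after the `request(...)` call returns.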
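Just a tip: don't run cheerio while you are handling the request. Use redis or kue to push the work into a background job. Once the scraping is finished, push the result to a websocket, or send an event via ws, to deliver the result. – georoot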
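A rough sketch of the shape this comment describes, with loud caveats: it does not use redis, kue, or ws at all. An in-memory queue and Node's built-in `EventEmitter` stand in for them, and `createScrapeQueue`, its `scrape` and `schedule` parameters, and the `done` / `failed` event names are all hypothetical names chosen for illustration.

```javascript
const { EventEmitter } = require('events');

// Hypothetical in-memory stand-in for a job queue (such as kue) plus a push
// channel (such as a websocket). `scrape(url, callback)` does the real work;
// `schedule` decides when a queued job runs and defaults to setImmediate.
function createScrapeQueue(scrape, schedule) {
  const events = new EventEmitter();
  const run = schedule || setImmediate;
  let nextId = 1;

  return {
    events: events,

    // Returns a job id right away; the scrape itself runs later.
    enqueue: function (url) {
      const id = nextId++;
      run(function () {
        scrape(url, function (error, text) {
          if (error) {
            events.emit('failed', { id: id, error: error });
          } else {
            events.emit('done', { id: id, text: text });
          }
        });
      });
      return id;
    }
  };
}
```

With something like this, the route handler would only call `enqueue` and reply with the job id, while a listener on `done` forwards the text to whichever client is waiting for that id.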