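I'm currently building a Google Chrome extension that notifies me when a new grade is posted to any of my course gradebooks at my university. To do that, I'm trying to repeatedly scrape and crawl the URLs and compare each result with the previous iteration (...). When I call the request() function (even asynchronously), the callback currently receives undefined for both response and body, and I get the error below, plus other strange output, if I try to console.log any of them. In short, request() is giving me undefined values.
Here is the error I get: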
bundle.js:24 Uncaught TypeError: Cannot read property 'headers' of undefined
at Request._callback (bundle.js:24)
at self.callback (bundle.js:54273)
at Request.EventEmitter.emit (bundle.js:95413)
at Request.start (bundle.js:54842)
at Request.end (bundle.js:55610)
at end (bundle.js:54652)
at bundle.js:54666
at Item.run (bundle.js:103974)
at drainQueue (bundle.js:103944)
Here is my code (I changed the URL so you can't see my school's login URL):
var Crawler = require("simplecrawler"),
    url = require("url"),
    cheerio = require("cheerio"),
    request = require("request");

var initialURL = "https://www.fakeURL.com/";
var crawler = new Crawler(initialURL);

request("https://www.fakeURL.com/", {
    // The jar option isn't necessary for simplecrawler integration, but it's
    // the easiest way to have request remember the session cookie between this
    // request and the next
    jar: true,
    mode: 'no-cors'
}, function(error, response, body) {
    // Start by saving the cookies. We'll likely be assigned a session cookie
    // straight off the bat, and then the server will remember the fact that
    // this session is logged in as user "iamauser" after we've successfully
    // logged in
    crawler.cookies.addFromHeaders(response.headers["set-cookie"]);

    // We want to get the names and values of all relevant inputs on the page,
    // so that any CSRF tokens or similar things are included in the POST
    // request
    var $ = cheerio.load(body),
        formDefaults = {},
        // You should adapt these selectors so that they target the
        // appropriate form and inputs
        formAction = $("#login").attr("action"),
        loginInputs = $("input");

    // We loop over the input elements and extract their names and values so
    // that we can include them in the login POST request
    loginInputs.each(function(i, input) {
        var inputName = $(input).attr("name"),
            inputValue = $(input).val();
        formDefaults[inputName] = inputValue;
    });

    // Time for the login request!
    request.post(url.resolve(initialURL, formAction), {
        // We can't be sure that all of the input fields have a correct default
        // value. Maybe the user has to tick a checkbox or something similar in
        // order to log in. This is something you have to find this out manually
        // by logging in to the site in your browser and inspecting in the
        // network panel of your favorite dev tools what parameters are included
        // in the request.
        form: Object.assign(formDefaults, {
            username: "secretusername",
            password: "secretpassword"
        }),
        // We want to include the saved cookies from the last request in this
        // one as well
        jar: true
    }, function(error, response, body) {
        // That should do it! We're now ready to start the crawler
        crawler.interval = 10000 //600000 // 10 minutes
        crawler.maxConcurrency = 1; // 1 active check at a time
        crawler.maxDepth = 5;
        crawler.start();
    });
});

crawler.on("fetchcomplete", function(queueItem, responseBuffer, response) {
    console.log("Fetched", queueItem.url, responseBuffer.toString());
});

// crawler.interval = 600000 // 10 minutes
// crawler.maxConcurrency = 1; // 1 active check at a time
// crawler.maxDepth = 5;
//
// crawler.start();
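One thing to note is that I added the 'no-cors' mode to my request so that I would stop running into CORS problems while testing this. Could that be what is causing the issue?
Thanks!
EDIT: I'm using Browserify so that I can use require() in the browser. I can't post the actual code from bundle.js because it is extremely long and won't fit here. Just wanted to clarify that. Thanks!
EDIT2: Here is what I get when I try console.log(error):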
Error: Invalid value for opts.mode
at new module.exports (bundle.js:108605)
at Object.http.request (bundle.js:108428)
at Object.https.request (bundle.js:97056)
at Request.start (bundle.js:54843)
at Request.end (bundle.js:55613)
at end (bundle.js:54655)
at bundle.js:54669
at Item.run (bundle.js:103977)
at drainQueue (bundle.js:103947)
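Try to figure out what 'error' contains, and check 'response.status'. It looks like there is "some error" in your HTTP request. Without more information, that's all I can say. – James
I tried inspecting error, but the problem is that it gives me: Error: Invalid value for opts.mode (full trace in the original post). And I can't check response.status because response is undefined. –
@OmarBaradei So, did the final answer help you? –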
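To illustrate the point raised in the comments: request uses Node's error-first callback convention, so when error is set, response and body are undefined, and reading response.headers throws the TypeError shown at the top. Below is a minimal sketch of a guard that checks error before touching response. The names guardResponse, onSuccess and onFailure are hypothetical and not part of request; the two calls at the bottom only simulate what request would pass to the callback, so no network is involved.

```javascript
// Hypothetical helper (not part of the request library): wraps a handler so
// that it only runs when the request actually produced a usable response.
function guardResponse(onSuccess, onFailure) {
  return function (error, response, body) {
    // Error-first convention: if `error` is set, `response` and `body`
    // are undefined, so bail out before reading any of their properties.
    if (error) {
      onFailure(error);
      return;
    }
    // The status code lives on `response.statusCode`.
    if (!response || response.statusCode >= 400) {
      onFailure(new Error("Unexpected status: " + (response && response.statusCode)));
      return;
    }
    onSuccess(response, body);
  };
}

// Simulated callback invocations standing in for request().
var log = [];
var callback = guardResponse(
  function (response, body) { log.push("ok " + response.statusCode); },
  function (error) { log.push("failed: " + error.message); }
);

// Failure case, mirroring the error from EDIT2: no response, no body.
callback(new Error("Invalid value for opts.mode"), undefined, undefined);
// Success case: a response object with a status code and headers.
callback(null, { statusCode: 200, headers: {} }, "<html></html>");

console.log(log);
```

In the code from the question, the same wrapper could be passed as the last argument to request() and request.post(), with the cookie and form handling moved into onSuccess. As for the error itself, the message most likely means that the HTTP shim Browserify bundles in place of Node's http module does not accept 'no-cors' as a value for mode (that is a fetch() setting), so removing mode: 'no-cors' from the options is probably the first thing to try. Treat that as an assumption to verify rather than a confirmed diagnosis.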