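Link checker: how to avoid false positives

I'm working on a link checker / broken link finder and I'm getting a lot of false positives. After double-checking, I noticed that many URLs throw WebExceptions with an error code even though they are actually downloadable. In other cases the code is 404, yet I can open the page from a browser.
So here is the code. It's pretty ugly, and I'd like to have something more, I'd say, practical. The ifs are there to filter out all the status codes that I don't want added to _brokenlinks, because they are valid links (I tested them all). What I need to fix is the structure (if possible) and how to avoid getting false 404s.
Thanks!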
    try
    {
        HttpWebRequest request = (HttpWebRequest) WebRequest.Create (uri);
        request.Method = "Head";
        request.MaximumResponseHeadersLength = 32; // FOR IE SLOW SPEED
        request.AllowAutoRedirect = true;
        using (HttpWebResponse response = (HttpWebResponse) request.GetResponse())
        {
            request.Abort();
        }
        /* WebClient wc = new WebClient();
           wc.DownloadString(uri); */
        _validlinks.Add (strUri);
    }
    catch (WebException wex)
    {
        if ( !wex.Message.Contains ("The remote name could not be resolved:") &&
             wex.Status != WebExceptionStatus.ServerProtocolViolation)
        {
            if (wex.Status != WebExceptionStatus.Timeout)
            {
                HttpStatusCode code = ((HttpWebResponse) wex.Response).StatusCode;
                if (
                    code != HttpStatusCode.OK &&
                    code != HttpStatusCode.BadRequest &&
                    code != HttpStatusCode.Accepted &&
                    code != HttpStatusCode.InternalServerError &&
                    code != HttpStatusCode.Forbidden &&
                    code != HttpStatusCode.Redirect &&
                    code != HttpStatusCode.Found
                   )
                {
                    _brokenlinks.Add (new Href (new Uri (strUri , UriKind.RelativeOrAbsolute) , UrlType.External));
                }
                else _validlinks.Add (strUri);
            }
            else _brokenlinks.Add (new Href (new Uri (strUri , UriKind.RelativeOrAbsolute) , UrlType.External));
        }
        else _validlinks.Add (strUri);
    }
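
For illustration, here is a minimal sketch of one way the nested conditions could be flattened. It is not a definitive fix. The names `LinkState`, `LinkCheck`, `Classify`, `Probe` and `Check` are hypothetical, and so is the choice to count only 404 and 410 as broken. It also rests on two assumptions I have not verified against the failing URLs: that some servers answer HEAD differently from GET (so a failed HEAD is retried with GET), and that a browser-like User-Agent may matter. Note that `wex.Response` is null unless the status is `ProtocolError`, so the sketch checks for that before reading the status code.

```csharp
using System;
using System.Net;

// Hypothetical three-way verdict; "Unknown" leaves the decision to the caller.
public enum LinkState { Valid, Broken, Unknown }

public static class LinkCheck
{
    // Pure decision step: maps the transport status and an optional
    // HTTP status code to a verdict. The mapping is an assumption.
    public static LinkState Classify(WebExceptionStatus status, HttpStatusCode? code)
    {
        if (status == WebExceptionStatus.Success)
            return LinkState.Valid;

        if (status == WebExceptionStatus.ProtocolError && code.HasValue)
        {
            switch (code.Value)
            {
                case HttpStatusCode.NotFound:
                case HttpStatusCode.Gone:
                    return LinkState.Broken;
                default:
                    return LinkState.Unknown;
            }
        }

        // Timeouts, DNS failures and so on: no HTTP answer to judge by.
        return LinkState.Unknown;
    }

    // Sends one request with the given method and classifies the outcome.
    // Assumes an http or https URI; other schemes would fail the cast.
    public static LinkState Probe(Uri uri, string method)
    {
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(uri);
        request.Method = method;
        request.AllowAutoRedirect = true;
        // Assumption: some sites reject requests without a browser-like agent.
        request.UserAgent = "Mozilla/5.0 (compatible; LinkChecker)";

        try
        {
            using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
            {
                return Classify(WebExceptionStatus.Success, response.StatusCode);
            }
        }
        catch (WebException wex)
        {
            // Response is null unless the server actually sent an HTTP reply.
            HttpWebResponse response = wex.Response as HttpWebResponse;
            using (response)
            {
                HttpStatusCode? code = response != null
                    ? (HttpStatusCode?)response.StatusCode
                    : null;
                return Classify(wex.Status, code);
            }
        }
    }

    // Tries HEAD first and falls back to GET when HEAD does not succeed.
    public static LinkState Check(Uri uri)
    {
        LinkState head = Probe(uri, "HEAD");
        return head == LinkState.Valid ? head : Probe(uri, "GET");
    }
}
```

A caller could then add `Valid` results to `_validlinks` and `Broken` results to `_brokenlinks`, and decide separately whether `Unknown` results are retried, logged, or accepted as valid.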
Please indent the code properly! – 2010-06-10 14:55:56
@Anthony: LOL - corrected (sorry to spoil your joke). – 2010-06-10 15:01:45