Ignoring case in HtmlAgilityPack XPath

When I use

SelectSingleNode("//meta[@name='keywords']")

it doesn't work, but when I use the same case as in the original document, it works fine:

SelectSingleNode("//meta[@name='Keywords']")

So the question is: how can I make the match ignore case?

2012-02-05 kseen

XPath is deliberately case-sensitive. Shouldn't the newer LINQ syntax support case-insensitive matching? – CarneyCode 2012-02-05 08:05:30

@Carnotaurus Yes. – Tomalak 2012-02-05 08:11:22

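If you need a more comprehensive solution, you can write an extension function for the XPath processor that performs a case-insensitive comparison. It is quite a bit of code, but you only write it once.

After implementing the extension, you can write your query as follows: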

"//meta[@name[Extensions:CaseInsensitiveComparison('Keywords')]]"

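where Extensions:CaseInsensitiveComparison is the extension function implemented in the sample below.

NOTE: This is not well tested. I just threw it together for this answer, so error handling and the like is non-existent!

The following is the code for the custom XSLT context, which provides one or more extension functions: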

using System; 
using System.Xml.XPath; 
using System.Xml.Xsl; 
using System.Xml; 
using HtmlAgilityPack; 

public class XsltCustomContext : XsltContext
{
    public const string NamespaceUri = "http://XsltCustomContext";

    public XsltCustomContext()
    {
    }

    public XsltCustomContext(NameTable nt)
        : base(nt)
    {
    }

    public override IXsltContextFunction ResolveFunction(string prefix, string name, XPathResultType[] ArgTypes)
    {
        // Check that the function prefix is for the correct namespace
        if (this.LookupNamespace(prefix) == NamespaceUri)
        {
            // Lookup the function and return the appropriate IXsltContextFunction implementation
            switch (name)
            {
                case "CaseInsensitiveComparison":
                    return CaseInsensitiveComparison.Instance;
            }
        }

        return null;
    }

    public override IXsltContextVariable ResolveVariable(string prefix, string name)
    {
        return null;
    }

    public override int CompareDocument(string baseUri, string nextbaseUri)
    {
        return 0;
    }

    public override bool PreserveWhitespace(XPathNavigator node)
    {
        return false;
    }

    public override bool Whitespace
    {
        get { return true; }
    }

    // Class implementing the XSLT Function for Case Insensitive Comparison
    class CaseInsensitiveComparison : IXsltContextFunction
    {
        private static XPathResultType[] _argTypes = new XPathResultType[] { XPathResultType.String };
        private static CaseInsensitiveComparison _instance = new CaseInsensitiveComparison();

        public static CaseInsensitiveComparison Instance
        {
            get { return _instance; }
        }

        #region IXsltContextFunction Members

        public XPathResultType[] ArgTypes
        {
            get { return _argTypes; }
        }

        public int Maxargs
        {
            get { return 1; }
        }

        public int Minargs
        {
            get { return 1; }
        }

        public XPathResultType ReturnType
        {
            get { return XPathResultType.Boolean; }
        }

        public object Invoke(XsltContext xsltContext, object[] args, XPathNavigator navigator)
        {
            // Compare the value of the current node (here, the name attribute) to the string argument
            // NOTE: You should add some error checking here.
            string text = args[0] as string;
            return string.Equals(navigator.Value, text, StringComparison.InvariantCultureIgnoreCase);
        }

        #endregion
    }
}

You can then use the above extension function in your XPath queries. Here is an example for our case:

class Program
{
    static string html = "<html><meta name=\"keywords\" content=\"HTML, CSS, XML\" /></html>";

    static void Main(string[] args)
    {
        HtmlDocument doc = new HtmlDocument();
        doc.LoadHtml(html);

        XPathNavigator nav = doc.CreateNavigator();

        // Create the custom context and add the namespace to the context
        XsltCustomContext ctx = new XsltCustomContext(new NameTable());
        ctx.AddNamespace("Extensions", XsltCustomContext.NamespaceUri);

        // Build the XPath query using the new function
        XPathExpression xpath =
            XPathExpression.Compile("//meta[@name[Extensions:CaseInsensitiveComparison('Keywords')]]");

        // Set the context for the XPath expression to the custom context containing the
        // extensions
        xpath.SetContext(ctx);

        var element = nav.SelectSingleNode(xpath);

        // Now we have the element
    }
}

2012-02-05 09:19:46

Can this be applied to node names? – 2013-01-28 11:59:04

If the case of the actual value is unknown, I think you have to use translate. I believe it is:

SelectSingleNode("//meta[translate(@name,'ABCDEFGHIJKLMNOPQRSTUVWXYZ','abcdefghijklmnopqrstuvwxyz')='keywords']")

It's a hack, but it's the only option in XPath 1.0 (other than doing the reverse and converting to upper case).
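To avoid repeating the two alphabet strings in every query, the translate() approach can be wrapped in a small helper. The following is only a sketch: the TranslateExample class and the FindMeta method are hypothetical names, not part of HtmlAgilityPack, and the helper assumes the searched name contains no single quote, since that would break the XPath string literal. Like the query above, it only folds the ASCII letters A to Z.

```csharp
using System;
using HtmlAgilityPack;

public static class TranslateExample
{
    private const string Upper = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
    private const string Lower = "abcdefghijklmnopqrstuvwxyz";

    // Hypothetical helper: returns the first <meta> element whose name attribute
    // equals the given name, ignoring ASCII case, or null when nothing matches.
    public static HtmlNode FindMeta(HtmlDocument doc, string name)
    {
        // The attribute value is lower-cased inside XPath via translate();
        // the search term is lower-cased in C# before it is embedded.
        // Assumption: name contains no single quote.
        string xpath = "//meta[translate(@name,'" + Upper + "','" + Lower + "')='"
                       + name.ToLowerInvariant() + "']";
        return doc.DocumentNode.SelectSingleNode(xpath);
    }
}
```

After loading a document with doc.LoadHtml(...), calling TranslateExample.FindMeta(doc, "Keywords") should find the element regardless of how the attribute value is capitalized in the page.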

2012-02-05 08:09:55

This is how I do it:

HtmlNodeCollection MetaDescription = document.DocumentNode.SelectNodes("//meta[@name='description' or @name='Description' or @name='DESCRIPTION']"); 

string metaDescription = MetaDescription != null ? HttpUtility.HtmlDecode(MetaDescription.FirstOrDefault().Attributes["content"].Value) : string.Empty;

2012-05-13 18:59:55 formatc

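Your approach isn't as general as Chris Taylor's. Chris's answer handles any combination of character case. – kseen 2012-05-14 03:07:14

@kseen I know, but really, would anyone write something like "KeYwOrDs"? These are the three common spellings, and if someone writes meta names like that, I doubt you would be able to parse anything in that HTML document. It's an out-of-the-box solution that takes two lines of code and works well in most cases, but it all depends on your requirements. – formatc 2012-05-14 11:41:23

I try to stick to the rule "never trust user input", and I'd kindly suggest you do the same. – kseen 2012-05-14 12:21:42

Or use LINQ: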

 node = doc.DocumentNode.Descendants("meta") 
      .Where(meta => meta.Attributes["name"] != null) 
      .Where(meta => string.Equals(meta.Attributes["name"].Value, "keywords", StringComparison.OrdinalIgnoreCase)) 
      .Single();

But you have to do an ugly null check on the attribute to prevent a NullReferenceException...
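One way to avoid the explicit null check is HtmlNode.GetAttributeValue, which returns a supplied default when the attribute is missing. The sketch below is a variation on the query above, not the answer's original code: LinqExample and FindMeta are hypothetical names, and it uses FirstOrDefault instead of Single, so it returns null rather than throwing when no element (or more than one) matches.

```csharp
using System;
using System.Linq;
using HtmlAgilityPack;

public static class LinqExample
{
    // Hypothetical helper: returns the first <meta> element whose name attribute
    // matches the given name case-insensitively, or null when there is none.
    public static HtmlNode FindMeta(HtmlDocument doc, string name)
    {
        return doc.DocumentNode.Descendants("meta")
            .FirstOrDefault(meta => string.Equals(
                // A missing name attribute yields string.Empty instead of a null reference.
                meta.GetAttributeValue("name", string.Empty),
                name,
                StringComparison.OrdinalIgnoreCase));
    }
}
```

Note that with string.Empty as the default, searching for an empty name would also match meta elements that have no name attribute at all, so callers should pass a non-empty name.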

2012-05-14 15:14:35 jessehouwing
