PHP regex to get specific URLs

I want to get the URLs from a web page whose links start with "../category/", from tags like the ones below:

<a href="../category/product/pc.html" target="_blank">PC</a><br> 
<a href="../category/product/carpet.html" target="_blank">Carpet</a><br>

Any suggestions would be greatly appreciated.

Thanks!

2011-04-12 user704278

And what do you want to do with them? – mcgrailm 2011-04-12 14:37:53

You don't need a regex for this. A simple XPath query with DOM is enough:

$dom = new DOMDocument;
$dom->loadHTML($html); // $html holds the page source
$xpath = new DOMXPath($dom);

// Select every <a> whose href attribute starts with "../category/"
$nodes = $xpath->query('//a[starts-with(@href, "../category/")]');
foreach ($nodes as $node) {
    echo $node->nodeValue.' = '.$node->getAttribute('href').PHP_EOL;
}

This will print:

PC = ../category/product/pc.html 
Carpet = ../category/product/carpet.html

2011-04-12 14:45:33 netcoder

Great suggestion! – 2011-04-12 14:46:33

Sorry, but I haven't used this before. I want to get the content from the links, something like "http://www.example.com/p/carpet.html". How would I add that to the code? – user704278 2011-04-13 02:28:21

@user704278: If you want to rewrite the URLs, just do: $new_href = 'example.com/p/' . basename($node->getAttribute('href')); – netcoder 2011-04-13 15:01:02
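
Combining that comment with the XPath loop from the answer, a minimal sketch might look like the following. The function name rewriteCategoryLinks, the sample $html string, and the http://www.example.com/p/ base are assumptions made for illustration; the base simply follows the URL shape given in the question.

```php
<?php
// Hypothetical helper: collects links starting with "../category/"
// and maps each link text to a rewritten URL under $base.
function rewriteCategoryLinks(string $html, string $base): array
{
    $dom = new DOMDocument;
    // Collect libxml warnings internally instead of printing them,
    // since loadHTML() complains about imperfect markup.
    libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($dom);
    $result = [];
    foreach ($xpath->query('//a[starts-with(@href, "../category/")]') as $node) {
        // basename() keeps only the last path segment, e.g. "pc.html".
        $result[$node->nodeValue] = $base . basename($node->getAttribute('href'));
    }
    return $result;
}

// Sample input taken from the question.
$html = '<a href="../category/product/pc.html" target="_blank">PC</a><br>'
      . '<a href="../category/product/carpet.html" target="_blank">Carpet</a><br>';

$links = rewriteCategoryLinks($html, 'http://www.example.com/p/');
foreach ($links as $text => $url) {
    echo $text . ' = ' . $url . PHP_EOL;
}
```

Keying the result by link text keeps the sketch short, but two links with the same text would overwrite each other; a plain list of pairs would avoid that.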

This regex searches for your ../category/ strings:

preg_match_all('#......="(\.\./category/.*?)"#', $test, $matches);

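All of the literal text is used for matching. You can replace the ...... with something more specific. Only the \. needs escaping. The .*? looks for a string of variable length, and the () captures the matched path name so that it shows up in $matches. The manual explains the rest of the syntax: http://www.php.net/manual/en/book.pcre.php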
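
As a sketch of how the captured paths can be read back, the following runs the pattern against the two links from the question. The contents of $test are assumed here; index 1 of $matches holds whatever the first () group captured.

```php
<?php
// Sample input from the question; in practice $test would hold the page source.
$test = '<a href="../category/product/pc.html" target="_blank">PC</a><br>' . "\n"
      . '<a href="../category/product/carpet.html" target="_blank">Carpet</a><br>';

// "......" matches any six characters (here "a href"); (...) captures the path.
preg_match_all('#......="(\.\./category/.*?)"#', $test, $matches);

// $matches[0] holds the full matches, $matches[1] the captured paths.
foreach ($matches[1] as $path) {
    echo $path . PHP_EOL;
}
```

Because the six dots accept any characters, the pattern would also match other attributes whose value starts with ../category/, which is why tightening that part can be worthwhile.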

2011-04-12 14:48:04 mario
