How do I extract URLs from a text string?

I have a string that contains URLs mixed with other text. I want to store all of the URLs in the $matches array. However, the code below does not capture every URL:

$matches = array(); 
$text = "soundfly.us schoollife.edu hello.net some random news.yahoo.com text http://tinyurl.com/9uxdwc some http://google.com random text http://tinyurl.com/787988 and others will en.wikipedia.org/wiki/Country_music URL"; 
preg_match_all('$\b(https?|ftp|file)://[-A-Z0-9+&@#/%?=~_|!:,.;]*[-A-Z0-9+&@#/%=~_|]$i', $text, $matches); 
print_r($matches);

The code above returns:

http://tinyurl.com/9uxdwc 
http://google.com 
http://tinyurl.com/787988

However, it misses the following 4 URLs:

schoollife.edu 
hello.net 
news.yahoo.com 
en.wikipedia.org/wiki/Country_music

Could you show me, with an example, how to modify the code above so that it captures all of the URLs?

Source

2013-04-27 Justin k

Your regex requires an http/https/ftp/file scheme. Make it optional. – sevenseacat 2013-04-27 08:11:50

@sevenseacat I have a similar problem. Could you show an example with the modified regex? – 2013-04-27 08:45:00

See my updated answer – 2013-04-27 08:57:51

Is this what you need?

$matches = array(); 
$text = "soundfly.us schoollife.edu hello.net some random news.yahoo.com text http://tinyurl.com/9uxdwc some http://google.com random text http://tinyurl.com/787988 and others will en.wikipedia.org/wiki/Country_music URL"; 
preg_match_all('$\b((https?|ftp|file)://)?[-A-Z0-9+&@#/%?=~_|!:,.;]*\.[-A-Z0-9+&@#/%=~_|]+$i', $text, $matches); 
print_r($matches);

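I made the protocol part optional, added a literal dot to separate the domain from the TLD, and used "+" after that dot to capture the whole string that follows it (the TLD plus any extra information such as a path).

The result is: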

[0] => soundfly.us 
[1] => schoollife.edu 
[2] => hello.net 
[3] => news.yahoo.com 
[4] => http://tinyurl.com/9uxdwc 
[5] => http://google.com 
[6] => http://tinyurl.com/787988 
[7] => en.wikipedia.org/wiki/Country_music
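
Note that print_r($matches) on the code above prints a nested array: with the default PREG_PATTERN_ORDER flag, $matches[0] holds the full matches listed here, while $matches[1] and $matches[2] hold the two capture groups (the scheme with and without "://"). As a minimal sketch, assuming only the flat list is wanted, the groups can be made non-capturing with (?:...). The variable names $pattern and $urls are just for illustration:

```php
<?php
$text = "soundfly.us schoollife.edu hello.net some random news.yahoo.com text http://tinyurl.com/9uxdwc some http://google.com random text http://tinyurl.com/787988 and others will en.wikipedia.org/wiki/Country_music URL";

// Same pattern as above, but with non-capturing groups (?:...),
// so $matches contains only index 0 (the full matches).
$pattern = '$\b(?:(?:https?|ftp|file)://)?[-A-Z0-9+&@#/%?=~_|!:,.;]*\.[-A-Z0-9+&@#/%=~_|]+$i';

preg_match_all($pattern, $text, $matches);
$urls = $matches[0];

// Print the flat list in the same "[index] => value" layout as above.
foreach ($urls as $index => $url) {
    echo "[$index] => $url\n";
}
```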

It also works with IP addresses, because the pattern only requires a dot to be present. Tested with the strings "192.168.0.1" and "192.168.0.1/test/index.php".
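
The IP-address case can be checked with a short sketch like the following. The helper name extract_urls is hypothetical and the sample sentence is made up for illustration; the pattern itself is the one from the answer.

```php
<?php
// Hypothetical helper: returns every substring of $text that the
// answer's pattern treats as a URL (full matches only).
function extract_urls(string $text): array
{
    $pattern = '$\b((https?|ftp|file)://)?[-A-Z0-9+&@#/%?=~_|!:,.;]*\.[-A-Z0-9+&@#/%=~_|]+$i';
    preg_match_all($pattern, $text, $matches);
    return $matches[0]; // index 0 = full matches; 1 and 2 are the capture groups
}

$found = extract_urls("ping 192.168.0.1 then open 192.168.0.1/test/index.php");
print_r($found);
```

Because the pattern only asks for a dot between two runs of allowed characters, it also accepts strings that are not URLs, such as 3.14 or file.txt, so the results may need further filtering depending on the input.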

Source

2014-06-10 17:00:00 niconoe
