Extracting complex URLs from text

-3

The text contains something like https://www.yyyy.com/blablabla/https://www.foofoofoofoofoo/loremlorem/lorem/https:www.textext.net/

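The URLs are all adjacent, with nothing separating them. My regular expression did not help. I would like to solve it like this: search for https://www and, on each match, extract the text (only the first 10 characters) into an array.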
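The approach described above (find each marker, then slice) can be sketched without any regex. This is only a minimal sketch: the helper name `splitAtMarker` is hypothetical, and it assumes the marker is `https:` rather than `https://www`, because the third URL in the sample has no slashes after the scheme.

```php
<?php
// Hypothetical helper (not from the original post): cuts a run of
// concatenated URLs at every occurrence of $marker.
function splitAtMarker(string $text, string $marker = 'https:'): array
{
    // An empty marker cannot be searched for; return the text unchanged.
    if ($marker === '') {
        return [$text];
    }

    // Collect the offset of every marker occurrence.
    $offsets = [];
    $pos = 0;
    while (($pos = strpos($text, $marker, $pos)) !== false) {
        $offsets[] = $pos;
        $pos += strlen($marker);
    }

    // Slice the text from each offset up to the next one (or to the end).
    $urls = [];
    foreach ($offsets as $i => $start) {
        $end = $offsets[$i + 1] ?? strlen($text);
        $urls[] = substr($text, $start, $end - $start);
    }
    return $urls;
}

$text = 'https://www.yyyy.com/blablabla/https://www.foofoofoofoofoo/loremlorem/lorem/https:www.textext.net/';
print_r(splitAtMarker($text));
```

To keep only a fixed-length prefix of each URL, as the question suggests, substr($url, 0, 10) could be applied to each element afterwards.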

Source

2015-10-16 NSAN

Why isn't there at least a newline after each item? Add newlines, split the string on '\n', and loop. –

I didn't prepare the text. It contains up to 500 URLs with no newlines. – NSAN

Explode on http – 2015-10-16 20:50:55

One solution is:

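<?php
$str = "https://www.yyyy.com/blablabla/https://www.foofoofoofoofoo/loremlorem/lorem/https:www.textext.net/";
// Put a space before every "https:" so the string can be split on spaces.
$my_str = preg_replace("/https:/", " https:", $str);
// trim() drops the leading space so explode() does not return an empty first element.
$values = explode(' ', trim($my_str));
var_dump($values);
?>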

Edit:

<?php
// First, separate the URL string:
$str = "https://www.yyyy.com/blablabla/https://www.foofoofoofoofoo/loremlorem/lorem/https:www.textext.net/https://youtube.com/channels/uniqueID/about/foofoofoo/foo";
$breakpoint = "https:";
// Replace each "https:" (plus up to two slashes) with a space so the string can be split on spaces.
$my_str = preg_replace("~" . preg_quote($breakpoint, "~") . "/?/?~", " ", $str);
// trim() drops the leading space so explode() does not return an empty first element.
$values = explode(' ', trim($my_str));
var_dump($values);

// Now, for each URL, you can do whatever you want:
$end = "about/";
$a = array();
foreach ($values as $value) {
    if (strpos($value, $end) !== false) {
        // Split the string on $end and keep the part before it:
        $val = explode($end, $value);
        $a[] = $val[0];
    }
}

var_dump($a);
?>

Source

2015-10-16 21:13:48

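Is it possible to write a regular expression that grabs a specific part of a link? Given youtube.com/channels/uniqueID/about/foofoofoo/foo, how can I extract only the part between https: and about, i.e. www.youtube.com/channels/uniqueID/? – NSAN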

Based on the text in your sample, I think preg_split is your best option:

$urls = preg_split('/(http){1}s?\:(\/\/)?/i', $text);

$urls will be the array of split URLs you want. Test it on a few full texts and let us know.
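As a hedged illustration of the preg_split approach, the sketch below runs it on the sample string from the question. Two things are assumptions added here rather than part of the answer: the PREG_SPLIT_NO_EMPTY flag, which drops the empty piece that would otherwise appear before the first URL, and the lookahead variant, which keeps the scheme on each URL instead of consuming it. The pattern is written with "~" delimiters so the slashes need no escaping.

```php
<?php
$text = 'https://www.yyyy.com/blablabla/https://www.foofoofoofoofoo/loremlorem/lorem/https:www.textext.net/';

// Split on the scheme itself; the scheme is consumed, so the pieces start at "www".
$withoutScheme = preg_split('~https?:(?://)?~i', $text, -1, PREG_SPLIT_NO_EMPTY);

// Split at the position just before each scheme (zero-width lookahead),
// so every piece keeps its "https:" prefix.
$withScheme = preg_split('~(?=https?:)~i', $text, -1, PREG_SPLIT_NO_EMPTY);

print_r($withoutScheme);
print_r($withScheme);
```

Which variant is preferable depends on whether the scheme is needed downstream; the third sample URL (https:www.textext.net/) has no slashes, which is why the "//" part is optional in the first pattern.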

Source

2015-10-16 21:31:48 Arif

This part didn't work. Is it possible to write a regular expression that grabs a specific part of a link? Given https://www.youtube.com/channels/uniqueID/about/foofoofoo/foo, how can I extract only this part, www.youtube.com/channels/uniqueID/, between https: and about? – NSAN

http://sandbox.onlinephpfunctions.com/code/038e70049432593dcb2b48874ebc66835ed05e82 It works... remove the g from the end of the pattern - PHP does not understand that modifier – Arif
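On the g modifier: PCRE patterns in PHP have no global flag, and a pattern ending in /g is rejected as an unknown modifier. When every match is wanted, preg_match_all() is the usual counterpart. The following is a minimal sketch on the sample string; the pattern is an assumption for illustration, not the one from the sandbox link.

```php
<?php
$text = 'https://www.yyyy.com/blablabla/https://www.foofoofoofoofoo/loremlorem/lorem/https:www.textext.net/';

// No "g" flag: preg_match_all() already returns every match.
// Each match starts at a scheme and runs until the next scheme begins
// (the negative lookahead stops the match right before "http:" or "https:").
preg_match_all('~https?:(?:(?!https?:).)+~i', $text, $matches);

// $matches[0] holds the full text of each match.
$urls = $matches[0];
print_r($urls);
```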

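As for extracting other parts of the URL - yes, it is certainly possible if you can find a pattern. For example: only the domain, or up to 2 slashes after the domain, or a common keyword such as 'about', uniqueid, etc. Finding the right pattern is the trick - although that is not always possible, at least not in your sample URL text – Arif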
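For the specific case raised in the comments (everything between https: and about), one possible sketch is to split the text into URLs first and then cut each URL at the keyword. It assumes that the keyword is the literal string about/ and that URLs without it should be skipped; the variable names are illustrative only.

```php
<?php
$text = 'https://www.yyyy.com/blablabla/https://www.foofoofoofoofoo/loremlorem/lorem/https:www.textext.net/https://www.youtube.com/channels/uniqueID/about/foofoofoo/foo';
$keyword = 'about/'; // assumed keyword, taken from the example in the comments

$parts = [];
foreach (preg_split('~https?:(?://)?~i', $text, -1, PREG_SPLIT_NO_EMPTY) as $url) {
    // With the third argument set to true, strstr() returns the text
    // before the keyword, or false when the keyword is absent.
    $before = strstr($url, $keyword, true);
    if ($before !== false) {
        $parts[] = $before;
    }
}

print_r($parts);
```

Splitting first keeps the keyword search confined to a single URL, so a keyword in a later URL cannot pull earlier URLs into the result.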
