2016-11-07 99 views
-1

I have 2 arrays in my code, as shown below, and array_diff doesn't work (PHP)

<?php
$kalimat = "I just want to search something like visual odometry, dude";
$kata = array();
$eliminasi = " \n . ,;:-()?!";
$tokenizing = strtok($kalimat, $eliminasi);

while ($tokenizing !== false) {
    $kata[] = $tokenizing;
    $tokenizing = strtok($eliminasi);
}
$sumkata = count($kata);
print "<pre>";
print_r($kata);
print "</pre>";

//stop list
$file = fopen("stoplist.txt","r") or die("fail to open file");
$stoplist;
$i = 0;
while($row = fgets($file)){
    $data = explode(",", $row);
    $stoplist[$i] = $data;
    $i++;
}
fclose($file);
$count = count($stoplist);

//Cange 2 dimention array become 1 dimention
for($i=0;$i<$count;$i++){
    for($j=0; $j<1; $j++){
        $stopword[$i] = $stoplist[$i][$j];
    }
}

//Filtering process
$hasilfilter = array_diff($kata,$stopword);
var_dump($hasilfilter);
?>

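$stopword contains stop words like the ones listed at http://xpo6.com/list-of-english-stop-words/

What I want to do is keep only the elements that exist in the array $kata and do not exist in the array $stopword.

In other words, I want to remove from $kata every element that also appears in $stopword. I read some suggestions to use array_diff, but somehow it doesn't work for me. I really need your help :( Thanks.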

+1

Do you expect us to guess what '$kata' and '$stopword' contain? –

+0

I have edited it. Sorry. – Berlian

Answers

0

array_diff is what you need. Here is a simplified version of what you are trying to do:

<?php 

// Your string $kalimat as an array of words, this already works in your example. 
$kata = ['I', 'just', 'want', 'to', '...']; 

// I can't test $stopword code, because I don't have your file. 
// So let's say it's a array with the word 'just' 
$stopword = ['just']; 

// array_diff gives you what you want 
var_dump(array_diff($kata,$stopword)); 

// It will display your array minus "just": ['I', 'want', 'to', '...'] 

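You should also double-check the value of $stopword. I couldn't test that part because I don't have your file. If it doesn't work for you, my guess is that the problem is caused by this variable ($stopword).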
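One likely cause, shown here as a minimal sketch: array_diff only removes an element when its string value is exactly equal to a value in the second array. A stop word that still carries the trailing newline returned by fgets, or a leading space left after splitting on a comma, will therefore never match. The sample line below is a made-up stand-in for one line of stoplist.txt, since the real file format is not shown in the question.

```php
<?php
// Hypothetical line as fgets() might return it from stoplist.txt;
// the real file format is an assumption here.
$row = "just, to\n";

// Splitting on "," alone keeps the leading space and the trailing newline.
$raw = explode(",", $row);            // ["just", " to\n"]

$kata = ['I', 'just', 'want', 'to'];

// " to\n" is not equal to "to", so only "just" is removed.
$withRaw = array_diff($kata, $raw);

// Trimming every stop word first lets both words match.
$clean = array_map('trim', $raw);     // ["just", "to"]
$withClean = array_diff($kata, $clean);

// array_diff preserves the original keys, so re-index for display.
print_r(array_values($withRaw));      // I, want, to
print_r(array_values($withClean));    // I, want
```

Running var_dump on the untrimmed array makes this visible, because it prints the length of each string: " to\n" is reported with length 4 instead of 2.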

0

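There is a problem with your $stopword array; var_dump it and you will see the problem. array_diff is working correctly.

Try the code below, which I wrote to build your $stopword array correctly: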

<?php 

    $kalimat = "I just want to search something like visual odometry, dude"; 
    $kata = array(); 
    $eliminasi = " \n . ,;:-()?!"; 
    $tokenizing = strtok($kalimat, $eliminasi); 

    while ($tokenizing !== false) { 
     $kata[] = $tokenizing; 
     $tokenizing = strtok($eliminasi); 
    } 
    $sumkata = count($kata); 
    print "<pre>"; 
    print_r($kata); 
    print "</pre>"; 

    //stop list 
    $file = fopen("stoplist.txt","r") or die("fail to open file"); 
    $stoplist = array(); // initialize so count() works even if the file is empty
    $i = 0; 
    while($row = fgets($file)){ 
     $data = explode(",", $row); 
     $stoplist[$i] = $data; 
     $i++; 
    } 
    fclose($file); 
    $count = count($stoplist); 
    //Change the 2-dimensional array into a 1-dimensional one
    $stopword= call_user_func_array('array_merge', $stoplist); 
    $new = array(); 
    foreach($stopword as $st){ 
     $new[] = explode(' ', $st); 
    } 
    $new2= call_user_func_array('array_merge', $new); 
    foreach($new2 as &$n){ 
     $n = trim($n); 
    } 
    unset($n); // break the reference left over from the loop
    $new3 = array_unique($new2); 
    unset($stopword,$new,$new2); 
    $stopword = $new3; 
    unset($new3); 

    //Filtering process 
    $hasilfilter = array_diff($kata,$stopword); 
    print "<pre>"; 
    var_dump($hasilfilter); 
    print "</pre>"; 
    ?> 
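The same cleanup can probably be written more compactly. The sketch below is an alternative under stated assumptions, not a drop-in replacement: it assumes the stop words are separated by commas, spaces, or newlines, and it uses an inline string (the hypothetical $stoplistText) in place of file_get_contents("stoplist.txt") so that it runs without the file. It also compares case-insensitively with array_udiff and strcasecmp, because array_diff is case-sensitive and would keep "I" if the list only contains "i".

```php
<?php
// Stand-in for file_get_contents("stoplist.txt"); the contents are hypothetical.
$stoplistText = "a, about, i, just\nlike, to, want\n";

// Split on any run of commas or whitespace and drop empty pieces.
$stopword = preg_split('/[\s,]+/', $stoplistText, -1, PREG_SPLIT_NO_EMPTY);

// Same tokenizing as in the question.
$kalimat = "I just want to search something like visual odometry, dude";
$eliminasi = " \n . ,;:-()?!";
$kata = array();
for ($tok = strtok($kalimat, $eliminasi); $tok !== false; $tok = strtok($eliminasi)) {
    $kata[] = $tok;
}

// array_udiff with strcasecmp ignores case; array_values re-indexes the result.
$hasilfilter = array_values(array_udiff($kata, $stopword, 'strcasecmp'));

print_r($hasilfilter); // search, something, visual, odometry, dude
```

If the comparison should stay case-sensitive, array_diff($kata, $stopword) can be used on the same $stopword array instead.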

I hope it helps.