2012-02-14 92 views
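How do I extract text from a PDF using xpdf?

I have many PDF files in one folder, and I want to extract the text from them using xpdf. For example:

  • example1.pdf is extracted to example1.txt
  • example2.pdf is extracted to example2.txt
  • etc.

Here is my code: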

<?php 

$path = 'C:/AppServ/www/pdfs/'; 
$dir = opendir($path); 
$f = readdir($dir); 

while ($f = readdir($dir)) { 
    if (eregi("\.pdf",$f)){ 
     $content = shell_exec('C:/AppServ/www/pdfs/pdftotext '.$f.' '); 
     $read = strtok ($f,"."); 
     $testfile = "$read.txt"; 
     $file = fopen($testfile,"r"); 
     if (filesize($testfile)==0){} 
     else{ 
      $text = fread($file,filesize($testfile)); 
     fclose($file); 
     echo "</br>"; echo "</br>"; 
     } 
    } 
} 

I get a blank result. What is wrong with my code?

What have you tried? How about placing a few well-chosen echo statements? – 2012-02-15 00:15:57

Answers

Try:

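$path = 'C:/AppServ/www/pdfs/'; 
$dir  = opendir($path); 

while (($filename = readdir($dir)) !== false) { 
    // eregi() was removed in PHP 7; preg_match() with the i flag replaces it 
    if (preg_match('/\.pdf$/i', $filename)) { 
        // Pass the full path so pdftotext finds the PDF and writes the .txt next to it 
        shell_exec('C:/AppServ/www/pdfs/pdftotext ' . escapeshellarg($path . $filename)); 
        $testfile = $path . pathinfo($filename, PATHINFO_FILENAME) . '.txt'; 
        if (is_file($testfile) && filesize($testfile) > 0) { 
            $text = file_get_contents($testfile); 
            echo "<br>"; 
            echo $text . "<br>"; 
        } 
    } 
} 
closedir($dir); 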
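
For reference, here is a minimal sketch of the same batch conversion written as reusable functions for a current PHP version, where `eregi()` is no longer available (it was removed in PHP 7.0). It assumes the `pdftotext` binary is reachable at whatever path you pass in, and that writing each `.txt` next to its PDF is acceptable. The function names `buildPdftotextCommand` and `convertFolder` are made up for this example.

```php
<?php
// Hypothetical helper: builds the shell command for one PDF.
// escapeshellarg() keeps file names containing spaces from breaking the command.
function buildPdftotextCommand(string $binary, string $pdf, string $txt): string
{
    return escapeshellarg($binary) . ' ' . escapeshellarg($pdf) . ' ' . escapeshellarg($txt);
}

// Hypothetical helper: converts every *.pdf in $folder and returns
// an array mapping each PDF path to its extracted text.
function convertFolder(string $binary, string $folder): array
{
    $results = [];
    // glob() may return false on error, so fall back to an empty list.
    $pdfs = glob(rtrim($folder, '/\\') . '/*.pdf') ?: [];
    foreach ($pdfs as $pdf) {
        $txt = preg_replace('/\.pdf$/i', '.txt', $pdf);
        $output = [];
        exec(buildPdftotextCommand($binary, $pdf, $txt), $output, $status);
        // Skip files that pdftotext could not convert.
        if ($status !== 0 || !is_file($txt)) {
            continue;
        }
        $results[$pdf] = file_get_contents($txt);
    }
    return $results;
}
```

A call such as `convertFolder('C:/AppServ/www/pdfs/pdftotext', 'C:/AppServ/www/pdfs')` would then cover the folder from the question. Note that `glob()` matching is case-sensitive on some systems, so files named `*.PDF` may need a second pattern.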

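You don't have to create a temporary .txt file. Use this:

$command = '/AppServ/www/pdfs/pdftotext ' . escapeshellarg($filename) . ' -'; 
exec($command, $text, $retval); // $text receives an array of output lines 
echo implode("\n", $text); 

If it doesn't work, check the server's error log.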
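
Because `exec()` fills its second argument with an array of output lines rather than a string, the stdout approach could be wrapped roughly as follows. This is only a sketch: the function name `pdfToText` and the default binary name are assumptions for illustration. The trailing `-` tells pdftotext to write the text to standard output instead of a file.

```php
<?php
// Hypothetical helper: returns the text of one PDF, or null if pdftotext fails.
function pdfToText(string $pdf, string $binary = 'pdftotext'): ?string
{
    // The trailing "-" sends the extracted text to stdout.
    $command = escapeshellarg($binary) . ' ' . escapeshellarg($pdf) . ' -';

    $lines = [];
    exec($command, $lines, $status);

    // A non-zero exit status means the binary was not found or the PDF could not be read.
    if ($status !== 0) {
        return null;
    }

    // exec() collects output line by line, so join the lines back together.
    return implode("\n", $lines);
}
```

Checking the exit status gives a first hint before turning to the error log. If a single string is all you need, `shell_exec()` returns the whole output directly, at the cost of not exposing the exit status.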

The lines

echo "</br>"; 
echo "</br>"; 

should be

echo "<br>"; 
echo $text . "<br>"; 

Hope this helps.
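
One caveat when echoing extracted text into a page: it may contain characters such as `<` or `&`, and its line breaks are plain newlines that a browser collapses. A small sketch of one way to handle both, with `renderText` as a made-up helper name, and assuming the text is UTF-8 (for example because pdftotext was run with `-enc UTF-8`):

```php
<?php
// Hypothetical helper: makes plain extracted text safe to echo as HTML.
function renderText(string $text): string
{
    // Escape HTML special characters first; ENT_SUBSTITUTE replaces invalid
    // byte sequences instead of returning an empty string.
    $escaped = htmlspecialchars($text, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');

    // Then insert <br> before each newline (false selects <br> over <br />).
    return nl2br($escaped, false);
}

echo renderText("a < b\nsecond line");
```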