Import/update a large XML file into MySQL using PHP

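I have about 30K records in an XML file, and the file is updated constantly.

I want to insert each record into a MySQL DB, or update it if it already exists.

This is the code I am trying to use, but it runs very slowly. Does anyone have ideas for improving its performance?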

// getting xml file 
$dom = new DOMDocument(); 
$dom->load('products.xml'); 

// getting xml nodes using xpath 
$xpath = new DOMXPath($dom); 
$productid = $xpath->query('//NewDataSet/Product/ProductId'); 
$price = $xpath->query('//NewDataSet/Product/Price'); 

// Read all nodes; if a match is found in the DB update the price, else insert a new record 
for($i=0;$i<$productid->length;$i++){ 
    $testproductid = $productid->item($i)->nodeValue; 
    $testprice = $price->item($i)->nodeValue; 
    if(mysql_num_rows(mysql_query("Select productid from test where productid ='$testproductid'"))){ 
     mysql_query("UPDATE test SET price = '$testprice' WHERE productid = '$testproductid'"); 
    }else{ 
     mysql_query("INSERT INTO test (price, productid) VALUES ('$testprice','$testproductid')"); 
    } 
}

2011-04-05 Vaidas

Do you need the '//' in the XPath? It also slows things down... – Wrikken 2011-04-05 21:30:49

-1, SQL injection. Prepared statements would probably make this noticeably faster. – 2011-04-05 22:20:17

First of all, this line can lead to bad behavior:

if(mysql_num_rows(mysql_query("Select productid from test where productid ='$testproductid'")))

What happens if the mysql_query() call fails? Do something like this instead:

$res = mysql_query("Select productid from test where productid ='$testproductid'"); 
if ($res) { 
... CODE HERE ... 
}

Is productid indexed? Also, you could write your query as:

Select productid from test where productid ='$testproductid' LIMIT 1

In that case MySQL stops looking for more records after the first match. Also, try inserting multiple records in a single INSERT statement. See this:

http://dev.mysql.com/doc/refman/5.0/en/insert-speed.html

Take a look at the REPLACE command. It would replace the SELECT/UPDATE/INSERT logic, but it may not be a major performance improvement.

http://dev.mysql.com/doc/refman/5.0/en/replace.html
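
As a rough illustration of the multi-row INSERT advice, the helper below builds one statement for a whole chunk of rows. It is only a sketch and not part of the original answer: buildBatchUpsert() and flattenRows() are hypothetical names, the table and column names are taken from the question, and the ON DUPLICATE KEY UPDATE clause assumes productid has a UNIQUE or PRIMARY key.

```php
<?php
// Hypothetical helper (not from the thread): builds one multi-row upsert
// statement with positional placeholders for $rowCount (price, productid) pairs.
// Assumes a UNIQUE or PRIMARY key on test.productid.
function buildBatchUpsert(int $rowCount): string
{
    if ($rowCount < 1) {
        throw new InvalidArgumentException('rowCount must be at least 1');
    }
    // One "(?, ?)" tuple per row, separated by commas.
    $tuples = implode(', ', array_fill(0, $rowCount, '(?, ?)'));
    return "INSERT INTO test (price, productid) VALUES $tuples "
         . "ON DUPLICATE KEY UPDATE price = VALUES(price)";
}

// Flattens [[price, productid], ...] into the flat parameter list
// that the placeholders above expect, in the same order.
function flattenRows(array $rows): array
{
    $params = [];
    foreach ($rows as [$price, $productid]) {
        $params[] = $price;
        $params[] = $productid;
    }
    return $params;
}
```

With a prepared-statement API such as PDO, a chunk of a few hundred rows could then be sent as $pdo->prepare(buildBatchUpsert(count($chunk)))->execute(flattenRows($chunk)), so the 30K records need far fewer round trips than one query per row.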

2011-04-05 21:30:04 Raisen

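Hi, no, productid is not an index. It is unique in the XML, but the table has another index, so I can't use MySQL REPLACE or DUPLICATE KEY, or can I? – Vaidas 2011-04-05 21:36:57

Yes, you can. Why don't you create a unique index? It will speed up your queries. Also, disabling indexes before running a large number of INSERT statements and re-enabling them afterwards is a good technique; otherwise MySQL has to update the index on every INSERT statement. – Raisen 2011-04-05 22:06:12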

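First off, I'd suggest brushing up on some MySQL. Second off, by making your productid field the primary key, you can use a more advanced SQL statement called: insert ... on duplicate key update ...

It's gonna halve your database lookups for the first part, because you currently run an extra test before each insert/update.

Also, XML might not be the best solution for your cross-platform file. Any particular reason you are using it?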

2011-04-05 21:23:11 Khez

Why use two queries where one is enough?

$sql = "INSERT INTO test (price, productid) " . 
     "VALUES ('$testprice','$testproductid') " . 
     "ON DUPLICATE KEY UPDATE price = VALUES(price)"; 

if(!$query = mysql_query($sql)) 
    trigger_error(mysql_error());

You could also try SimpleXML instead of DOMDocument, but from what I could find on Google there doesn't seem to be any documented speed difference.

2011-04-05 21:27:13 cantlin

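30K UPDATE statements in a single transaction should complete in a reasonable time (for a waiting user). Maybe autocommit is on? Also, if you don't mind being MySQL-specific, REPLACE does the INSERT/UPDATE in one statement. Or you could do INSERT ... ON DUPLICATE KEY UPDATE. In particular, this eliminates the if(mysql_num_rows(mysql_query("Select productid from test where productid ='$testproductid'"))) check.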
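
To make the single-transaction idea concrete, here is a minimal sketch that swaps the question's mysql_* calls for PDO, so it is not the code from the thread. It assumes a PDO connection to MySQL, a transactional storage engine such as InnoDB, and a UNIQUE or PRIMARY key on test.productid; upsertProducts() is a hypothetical name. PDO::beginTransaction() suspends autocommit until commit() or rollBack() is called.

```php
<?php
// Sketch only: assumes a PDO connection to MySQL and a UNIQUE (or PRIMARY)
// key on test.productid. upsertProducts() is a hypothetical name.
// $products is any iterable of productid => price pairs.
function upsertProducts(PDO $pdo, iterable $products): int
{
    // Make failures throw so the catch block below can roll back.
    $pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

    // Prepared once, executed per row; values are bound, not interpolated.
    $stmt = $pdo->prepare(
        'INSERT INTO test (price, productid) VALUES (:price, :productid) '
        . 'ON DUPLICATE KEY UPDATE price = VALUES(price)'
    );

    $count = 0;
    $pdo->beginTransaction(); // autocommit is off until commit() or rollBack()
    try {
        foreach ($products as $productid => $price) {
            $stmt->execute([':price' => $price, ':productid' => $productid]);
            $count++;
        }
        $pdo->commit(); // one commit for the whole batch
    } catch (Throwable $e) {
        $pdo->rollBack();
        throw $e;
    }
    return $count;
}
```

Binding the values through a prepared statement also addresses the SQL injection raised in the comments on the question.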

2011-04-05 21:30:12 Bittrance

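Hi, no, productid is not an index. It is unique in the XML, but the table has another index, so I can't use MySQL REPLACE or DUPLICATE KEY, or can I? And how do I set autocommit from PHP? – Vaidas 2011-04-05 21:45:17

If 'productid' is unique, you should declare it as 'UNIQUE' in the table. Then you can use 'ON DUPLICATE KEY UPDATE'. – cantlin 2011-04-05 22:03:43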

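Also, if you don't mind being MySQL-specific, there is REPLACE, which does the INSERT/UPDATE in one statement. Or you could do INSERT ... ON DUPLICATE KEY UPDATE. In particular, this eliminates the if(mysql_num_rows(mysql_query("Select productid from test where productid ='$testproductid'"))) check.

30K UPDATE statements in a single transaction should complete in a reasonable time (for a waiting user). Maybe autocommit is on?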

2012-05-09 12:38:42 zxzx

REPLACE is a combined DELETE/INSERT in a single statement, and it creates a new auto-increment value for the inserted row. – Sven 2012-10-19 23:05:04

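Here is a script that loads a large file in chunks: it opens the XML file, reads a given number of entries at a time, and loads each chunk into the database.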

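$lot = 5000;                     // number of rows per chunk 
$tempFiledir = '.'; 
$tempFile = 'temp.xml'; 
$table = 'mytable'; 
$db_username = 'root'; 
$db_password = 'mysql'; 
// The script below uses the following variables without defining them; 
// the values here are placeholders to adjust. 
$xml_file = 'products.xml';      // input XML file 
$node_name = 'Data';             // row element name, matching ROWS IDENTIFIED BY '<Data>' below 
$mysql_exe = 'mysql';            // path to the mysql command-line client 
$database_temp = 'mydb';         // target database name 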

// count element 
    print(" Computing items..."); 
    $xml_reader = new XMLReader; 
    $xml_reader->open($xml_file); 
    while ($xml_reader->read() && $xml_reader->name != $node_name); 
    $totalItems =0; 
    while ($xml_reader->name == $node_name) { 
     $xml_reader->next($node_name); 
     $totalItems++; 
    } 
    $xml_reader->close(); 

    print("\r $totalItems items found.      "); 


// Truncate the table to load into 
$xmlload_cmd = sprintf ("$mysql_exe -u%s -p%s $database_temp -e \"TRUNCATE TABLE `%s`;\" ", $db_username, $db_password, $table); 
system($xmlload_cmd);       

// move the pointer to the first item 
$xml_reader = new XMLReader; 
$xml_reader->open($xml_file); 
while ($xml_reader->read() && $xml_reader->name != $node_name); 


// load by chunks 
$index = 0; 
while ($xml_reader->name == $node_name){ 

    $tempFileXMLOutput = fopen("$tempFiledir\\$tempFile", "w") or die("Unable to open file!"); 
    fwrite($tempFileXMLOutput,'<?xml version="1.0"?>'); 

    $index0=$index; 
    do {  
     // remove self-closing tags from the rendered XML output and store it in the temp file 
     $data = preg_replace('/\<(\w+)\s*\/\s*\>/i', '<$1></$1>', $xml_reader->readOuterXML()); 
     fwrite($tempFileXMLOutput, "\n\t$data");  

     // move the pointer to the next item 
     $xml_reader->next($node_name); 
     $index++; 
    } 
    while ($xml_reader->name == $node_name && ($index % $lot != 0)); 

    // close the temp file 
    fclose($tempFileXMLOutput); 

    echo sprintf("\r Processing items from %6s to %6s [%3.0f%%]", $index0, $index, $index/$totalItems*100); 

    // run the LOAD XML command on the temp XML file 
    $load_cmd = sprintf("LOAD XML LOCAL INFILE '%s' INTO TABLE `%s` ROWS IDENTIFIED BY '<Data>'", addslashes("$tempFiledir\\$tempFile"), $table);    

    $xmlload_cmd = sprintf ("$mysql_exe -u%s -p%s $database_temp -e \"$load_cmd\" ", $db_username, $db_password); 
    system($xmlload_cmd); 

    // remove the temp file 
    @unlink ("$tempFiledir\\$tempFile"); 
} 

$xml_reader->close();
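
The same XMLReader streaming pattern can also be applied to the question's file directly, without the temporary file and LOAD XML step. The following is a minimal sketch under assumptions: readProducts() is a hypothetical name, and the element names Product, ProductId and Price are taken from the XPath expressions in the question. Only one Product element is held in memory at a time.

```php
<?php
// Hypothetical helper: walks an already-opened XMLReader and yields
// productid => price pairs, one <Product> element at a time.
// Element names are assumed from the XPath queries in the question.
function readProducts(XMLReader $reader): Generator
{
    // Advance to the first <Product> element.
    while ($reader->read() && $reader->name !== 'Product');

    while ($reader->name === 'Product') {
        // Expand only the current element into a small SimpleXML tree.
        $product = new SimpleXMLElement($reader->readOuterXml());
        yield (string) $product->ProductId => (string) $product->Price;

        // Skip the subtree and jump to the next <Product> element, if any.
        $reader->next('Product');
    }
}
```

For a file on disk, the reader would be prepared with $reader = new XMLReader(); $reader->open('products.xml'); and the generator can then be consumed with foreach ($products as $productid => $price), passing each pair to whichever insert/update statement is used.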

2015-06-24 13:59:21 Rochdi
