2014-10-17

Inserting into a VARCHAR2 column by selecting from an XMLType column is extremely slow

I am using Oracle 10gR2 (10.2.0.4) on 64-bit Solaris 10.

I need to select data values from the XML stored in an XMLType column of one table (word.testmeta) and insert them into another table (word.testwordyy).

desc word.testmeta; 
Name     Null? Type 
-------------------------------------- 
FILENAME    CHAR(2000) 
XMLDATA    XMLTYPE 

desc word.testwordyy; 
Name     Null? Type 
--------------------------------------- 
ID     VARCHAR2(255) 
KEYWORD    VARCHAR2(4000) 

I use XMLTABLE and execute:

insert /*+append */ into word.testwordyy(KEYWORD) 
select /*+ gather_plan_statistics */ dbms_lob.substr(xmltype.getclobval(b.KEWOR),254) 
from word.testmeta , xmltable 
(
'$B/mets/KEWOR' 
passing 
word.testmeta.XMLDATA as B 
columns 
KEWOR xmltype path '/KEWOR/text()' 
) 
b 

Here is the execution plan from select * from table(dbms_xplan.display_cursor(null,null,'iostats last'));

PLAN_TABLE_OUTPUT 
----------------------------------------------------------------------------------------------------------------------------------- 
SQL_ID 37ua3npnxx8su, child number 0 
------------------------------------- 
insert /*+append */ into word.testwordyy(KEYWORD) select /*+ gather_plan_statistics */ 
dbms_lob.substr(xmltype.getclobval(b.KEWOR),254) from word.testmeta , xmltable ('$B/mets/KEWOR' passing 
word.testmeta.XMLDATA as B columns KEWOR xmltype path '/KEWOR/text()') b 

Plan hash value: 875848213 
----------------------------------------------------------------------------------------------------------------------------------- 

PLAN_TABLE_OUTPUT 
----------------------------------------------------------------------------------------------------------------------------------- 
| Id | Operation       | Name     | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads | Writes | 
----------------------------------------------------------------------------------------------------------------------------------- 
| 1 | LOAD AS SELECT      |      |  1 |  |  1 |00:10:32.72 | 16832 |  7 | 90 | 
| 2 | NESTED LOOPS      |      |  1 |  29M| 34688 |00:00:25.95 | 12639 |  5 | 0 | 
| 3 | TABLE ACCESS FULL    | TESTMETA    |  1 | 3638 | 3999 |00:00:00.08 |  909 |  0 | 0 | 
| 4 | COLLECTION ITERATOR PICKLER FETCH| XMLSEQUENCEFROMXMLTYPE | 3999 |  | 34688 |00:00:24.50 | 11730 |  5 | 0 | 

Note 
----- 
    - dynamic sampling used for this statement 


21 rows selected. 

The more rows there are in word.testmeta, the longer each row takes.

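My XML documents are simple and small, but there are a great many of them to process (5,000,000), and processing is very slow: once the row count exceeds 8000 it takes hours. Is there an optimization or a faster way to do this?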

Why is 'word.testmeta' in the 'from' clause? – Mat 2014-10-19 08:49:26

testmeta is in the word schema, hence word.testmeta – pangjiale 2014-10-21 13:12:08

Answers

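You have defined the KEWOR column as XMLTYPE. Why? The whole point of XMLTABLE is to turn XML structure into relational columns. If you define the column as a simple string, you avoid a lot of unnecessary conversion.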

"The tag's content exceeds 4000 characters >>> is there any way to substring the tag's content inside XMLTABLE"

There is an XPath substring function.

insert /*+append */ into word.testwordyy(KEYWORD) 
select /*+ gather_plan_statistics */ b.KEWOR 
from word.testmeta 
    , xmltable 
     (
     '$B/mets/KEWOR' 
     passing 
     word.testmeta.XMLDATA as B 
     columns 
     KEWOR varchar2(4000) path 'substring(KEWOR, 254, 4000)' 
    ) b 

Here I have started the substring at offset 254, following your original post. I have also explicitly set its length to 4000.
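One detail worth checking against the question's query: in DBMS_LOB.SUBSTR(lob_loc, amount, offset) the second argument is the amount and the offset defaults to 1, whereas in XPath substring(string, start, length) the second argument is the start position. So dbms_lob.substr(..., 254) returns the first 254 characters, while substring(KEWOR, 254, 4000) begins at character 254. A minimal sketch of the DBMS_LOB.SUBSTR argument order, using a literal CLOB as a stand-in for the real XML content (the literal and the column aliases are illustrative assumptions, not from the post):

```sql
-- Illustrative only: to_clob('abcdefghij') stands in for the real tag content.
-- DBMS_LOB.SUBSTR(lob_loc, amount, offset): amount comes second, offset defaults to 1.
select dbms_lob.substr(to_clob('abcdefghij'), 3) as first_three
from dual;   -- 'abc': 3 characters starting at offset 1

select dbms_lob.substr(to_clob('abcdefghij'), 3, 5) as three_from_five
from dual;   -- 'efg': 3 characters starting at offset 5
```

If the intent is to keep the first 254 characters as in the question, the XPath counterpart would be substring(KEWOR, 1, 254). Whether a given Oracle release accepts that expression in an XMLTABLE column path is a separate matter.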

I don't think you need to reference the text() node explicitly when declaring the column.

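The tag's content exceeds 4000 characters, so defining the KEWOR column as XMLTYPE avoids this error: ORA-01706: user function result value was too large. Is there any way to substring the tag's content inside xmltable, something like: KEWOR varchar2(4000) path '/KEWOR/text().substr(4000)' – pangjiale 2014-10-20 02:15:35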

I get LPX-00601: Invalid token in: 'substring(KEWOR, 254, 4000)' for KEWOR varchar2(4000) path 'substring(KEWOR, 254, 4000)' – pangjiale 2014-10-20 14:25:14

KEWOR varchar2(4000) path 'substring(KEWOR, 254, 4000)' does not work on version 10gR2 10.2.0.4 – pangjiale 2014-10-21 03:06:44