2013-03-20 101 views
0

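I want to split a file into two files using a Perl script. If the file is named example.txt, it should be split into two files, EX1.txt and EX2.txt.

The split depends on the second field of each line. For example, if an HDR line has TEA003890459 as its second field, the output goes to EX1.txt, but if the HDR line has TEA003886004, the output goes to EX2.txt. I also want to count the claim numbers.

I want to do this using the following logic: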

if Header-Row then 
    if Dummy cost center then 
     write to Gas file 
     keep in mind: Claim-Nummer (eg. Array or Hash) 
    else 
     write to normal file 
    end if 
else if Detail-Row then 
    if kept Claim-Nummer then 
     write to Gas file 
    else 
     write to normal file 
    end if 
end if 
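
The logic above can be sketched in Perl roughly as follows. This is only an illustration under stated assumptions: the question does not say how a dummy cost center is recognised, so the test is passed in as a hypothetical code reference (`$is_dummy`), and `route_lines` is an invented name. The rows are abbreviated, made-up lines in the same HDR/DTL shape as the sample data, and lines are collected in arrays rather than written to files to keep the sketch short.

```perl
use strict;
use warnings;

# Route lines to a "gas" or "normal" bucket, following the pseudocode.
# $is_dummy is a caller-supplied test on the header fields (hypothetical:
# the question does not define what a dummy cost center is).
sub route_lines {
    my ($lines, $is_dummy) = @_;
    my (%kept, @gas, @normal);
    for my $line (@$lines) {
        my @f = split /\^/, $line;
        next unless @f >= 2;              # skip blank or malformed lines
        my ($type, $claim) = @f[0, 1];
        if ($type eq 'HDR') {
            if ($is_dummy->(\@f)) {
                $kept{$claim} = 1;        # remember the claim number
                push @gas, $line;
            } else {
                push @normal, $line;
            }
        } elsif ($type eq 'DTL') {
            # Detail rows follow whatever was decided for their header.
            if ($kept{$claim}) { push @gas, $line } else { push @normal, $line }
        }
    }
    return (\@gas, \@normal, \%kept);
}

# Abbreviated, made-up rows in the same shape as the sample data.
my @sample = (
    "HDR^TEA003890459^082582^MY0BCC#6482362304\n",
    "DTL^TEA003890459^E^47000\n",
    "HDR^TEA003886004^082770^MY0BCC#6485163100\n",
    "DTL^TEA003886004^E^25000\n",
);

# Illustration only: treat one cost center value as the "dummy" one.
my $is_dummy = sub { $_[0][3] =~ /6482362304/ };

my ($gas, $normal, $kept) = route_lines(\@sample, $is_dummy);
printf "gas=%d normal=%d kept=%s\n",
    scalar @$gas, scalar @$normal, join(',', sort keys %$kept);
```

The hash `%kept` plays the role of the "keep in mind: Claim-Nummer" step, and its key count is one way to get the number of claims that were routed to the Gas file.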

The file contains the following data:

HDR^TEA003890459^082582^Mohd Jamil^Jamili Fahmi Bin^^458^+^92000^+^92000^+^0000^+^0000^+^0000^^0^^0^^0^^0^^0^^0^20130307^^^^^^^222^MY0BD^2^[email protected]^  MY0BCC#6482362304         
DTL^TEA003890459^E^MY0BCC#6482362304    641301137^+^47000^MFA^20130209^Medical Expenses [Family]^^^0^^0^^0^^0^^0^^0^^0^^0^^C16Medical 
DTL^TEA003890459^E^MY0BCC#6482362304    641301137^+^45000^MGE^20130304^Medical Expenses (Employee clinica^^^0^^0^^0^^0^^0^^0^^0^^0^^C16Medical 
HDR^TEA003886004^082770^Bin Omar^Mohamad Fadzlizam^^458^+^135800^+^135800^+^0000^+^0000^+^0000^^0^^0^^0^^0^^0^^0^20130307^^^^^^^222^MY0BD^4^[email protected]^  MY0BCC#6485163100         
DTL^TEA003886004^E^MY0BCC#6485163100    641301137^+^25000^MFA^20130221^Medical Expenses [Family]^^^0^^0^^0^^0^^0^^0^^0^^0^^C16Claim 
DTL^TEA003886004^E^MY0BCC#6485163100    641301137^+^37150^MFA^20130224^Medical Expenses [Family]^^^0^^0^^0^^0^^0^^0^^0^^0^^C16Claim 
DTL^TEA003886004^E^MY0BCC#6485163100    641301137^+^23650^MFA^20130226^Medical Expenses [Family]^^^0^^0^^0^^0^^0^^0^^0^^0^^C16Claim 
DTL^TEA003886004^E^MY0BCC#6485163100    641301137^+^50000^MGE^20130304^Medical Expenses (Employee clinica^^^0^^0^^0^^0^^0^^0^^0^^0^^C16Claim 
HDR^TEA003886162^082792^Lim^Jia Jieh^^458^+^280400^+^280400^+^0000^+^0000^+^0000^^0^^0^^0^^0^^0^^0^20130305^^^^^^^222^MY0BD^4^[email protected]^  MY0BCC#6482363474         
DTL^TEA003886162^E^MY0BCC#6482363474    641301137^+^110000^MGE^20130131^Medical Expenses (Employee clinica^^^0^^0^^0^^0^^0^^0^^0^^0^^C16Medical claim 31/1,20/2,28/2 
DTL^TEA003886162^E^MY0BCC#6482363474    641301137^+^60000^MGE^20130220^Medical Expenses (Employee clinica^^^0^^0^^0^^0^^0^^0^^0^^0^^C16Medical claim 31/1,20/2,28/2 
DTL^TEA003886162^E^MY0BCC#6482363474    641301137^+^50400^MGE^20130220^Medical Expenses (Employee clinica^^^0^^0^^0^^0^^0^^0^^0^^0^^C16Medical claim 31/1,20/2,28/2 
DTL^TEA003886162^E^MY0BCC#6482363474    641301137^+^60000^MGE^20130228^Medical Expenses (Employee clinica^^^0^^0^^0^^0^^0^^0^^0^^0^^C16Medical claim 31/1,20/2,28/2 
+1

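I tried to rewrite your question. Please check that I didn't accidentally change the meaning. **Questions:** ① What is "TDR"? ② What is a "dummy cost center"? ③ What is the "*Gas* file"? ④ Wouldn't your example file be split into three files? ⑤ Your file looks like comma-separated data (CSV) that uses `^` as the field separator. Can you confirm this? – amon 2013-03-20 13:40:01

Answers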

0

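Your explanation, your pseudocode, and your sample data each seem to tell a different story.

But to read the second field, once the file is open and sorted in the order described: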

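use strict;
use warnings;

# Input file name taken from the question; adjust as needed.
open(my $ifile, '<', 'example.txt') || die "example.txt $!";
open(my $ex1, '>', 'EX1.txt') || die "EX1.txt $!";
open(my $ex2, '>', 'EX2.txt') || die "EX2.txt $!";
my $wanted = "TEA003890459";
while (my $line = <$ifile>) {
    my @field = split(/\^/, $line);
    if ($field[1] eq $wanted) { # fields start from 0 so 1 is the second field
        print $ex1 $line;
    } else {
        print $ex2 $line;
    }
}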

EDIT: fixed the split argument

+0

The first argument to `split` is a regular expression, in which `^` is a metacharacter. → `split /\^/` or something like that. – amon 2013-03-20 14:03:04
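
The point about `^` can be checked with a short sketch. The line below is an abbreviated, made-up header row, and the variable names are illustrative only.

```perl
use strict;
use warnings;

my $line = "HDR^TEA003890459^082582";

# Unescaped, ^ is an anchor (split treats /^/ as /^/m, the start of each
# line), so a single line comes back as one field.
my @unescaped = split /^/, $line;

# Escaped, \^ matches a literal caret and the line is split into fields.
my @escaped = split /\^/, $line;

print scalar(@unescaped), " field(s) with /^/\n";
print scalar(@escaped), " fields with /\\^/; the second is $escaped[1]\n";
```

A single-quoted `'\^'` passed to `split` behaves the same as `/\^/`, because the backslash survives single quoting and the string is then used as the pattern.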

+0

Quite right, fixed. – Vorsprung 2013-03-21 08:43:09

+0

Instead of adding "EDIT: fix split arg", why don't you put that in the edit summary? – 2013-03-22 01:26:11

0

Something like:

#!/usr/bin/perl 

foreach (<>) { 
     my @out = split(/\^/,$_); 
     if ($out[0] eq 'HDR') { 
       close OUTFILE; 
       open OUTFILE,">>$out[1].txt" or die(); 
     } elsif ($out[0] eq 'DTL') { 
       print OUTFILE $_; 
     } 
} 

Run it with:

./split.pl < infile.txt 

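This splits the file by header, producing one output file per claim number. You can use the Linux `wc` command to count the entries in each file.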
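
If you would rather do the counting in Perl than with `wc`, a hash keyed on the second field is one option. This is a minimal sketch, not part of the script above: `count_details` is an invented name, and the demo reads abbreviated, made-up rows from an in-memory filehandle instead of a real file.

```perl
use strict;
use warnings;

# Count DTL rows per claim number (the second field).
sub count_details {
    my ($fh) = @_;
    my %count;
    while (my $line = <$fh>) {
        my ($type, $claim) = split /\^/, $line;
        next unless defined $claim;       # skip blank or malformed lines
        $count{$claim}++ if $type eq 'DTL';
    }
    return \%count;
}

# Demo data: abbreviated rows in the same HDR/DTL shape as the sample file.
my $data = "HDR^TEA003890459^082582\n"
         . "DTL^TEA003890459^E\n"
         . "DTL^TEA003890459^E\n"
         . "HDR^TEA003886004^082770\n"
         . "DTL^TEA003886004^E\n";

open my $fh, '<', \$data or die "Cannot open in-memory data: $!";
my $count = count_details($fh);
close $fh;

print "$_: $count->{$_}\n" for sort keys %$count;
```

To run it against a real file, open that file for reading and pass the handle to `count_details`.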

+1

Why are you using 2-arg [`open`](http://p3rl.org/open "perldoc -f open") instead of 3-arg [`open`](http://p3rl.org/open "perldoc -f open")? Why don't you include [`$!`](http://perldoc.perl.org/perlvar.html#%24! "perldoc -v '$!'") in the `or` [`die`](http://p3rl.org/die "perldoc -f die")? – 2013-03-22 01:22:14
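
For reference, here is a sketch of what the comment is asking for: the same per-claim split written with three-argument `open`, a lexical filehandle, and `$!` in the error message. It is an illustration rather than the answerer's code. `split_by_claim` is an invented name, the output goes to a temporary directory, and the input is a few abbreviated, made-up rows read from an in-memory filehandle. As in the answer, only DTL rows are written.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec;

# Append each claim's DTL rows to "<claim>.txt" inside $dir, using
# three-argument open and reporting $! on failure.
sub split_by_claim {
    my ($in, $dir) = @_;
    my $out;
    while (my $line = <$in>) {
        my ($type, $claim) = split /\^/, $line;
        next unless defined $claim;       # skip blank or malformed lines
        if ($type eq 'HDR') {
            close $out if $out;
            my $path = File::Spec->catfile($dir, "$claim.txt");
            open my $fh, '>>', $path or die "Cannot open $path: $!";
            $out = $fh;
        } elsif ($type eq 'DTL' && $out) {
            print {$out} $line;
        }
    }
    close $out if $out;
    return;
}

# Demo data: abbreviated rows in the same HDR/DTL shape as the sample file.
my $data = "HDR^TEA003890459^082582\n"
         . "DTL^TEA003890459^E\n"
         . "HDR^TEA003886004^082770\n"
         . "DTL^TEA003886004^E\n";

open my $in, '<', \$data or die "Cannot open in-memory data: $!";
my $dir = tempdir(CLEANUP => 1);
split_by_claim($in, $dir);
close $in;

# List the files that were created.
opendir my $dh, $dir or die "Cannot read $dir: $!";
my @files = sort grep { /\.txt\z/ } readdir $dh;
closedir $dh;
print "$_\n" for @files;
```

With three-argument `open`, the mode and the file name are separate arguments, so a claim number containing characters such as `>` or `|` cannot change how the file is opened.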