2010-11-04 47 views
0

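I have written a Perl script that inserts data from a text file into a database, and I want to know how to add a quality check to it: how can I verify whether the data was actually inserted, for example by printing a message saying that the data was inserted successfully? Also, when the date is loaded from the text file into the database, it shows up as 0000-00-00. What has to change, and how do I check whether the rows were inserted?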

My code is:

#!/usr/bin/perl 

#--------------------------------------------------------------------- 
# Description: Extract Lab data from text file and insert to database 
#--------------------------------------------------------------------- 

# Modules Required 
use DBI; # check drivers 


#print "vs2-001-001-ma-sampleFile\n"; 


my $filename = "vs2-001-001-ma-sampleFile.txt"; 

#initialize variable $count 
my $count = 0 ;  
#initialise variables for parameters 
my ($paraval, $paraname, $pararange, $paraunit); 
#uncomment it To use keyboard input. and type filename with extension 
# Ex: fileName.txt or fileName.csv  
#chomp($filename=<>);  
open (OUT,">>$filename.csv") || die "Cannot open $filename.csv: $!"; 
close OUT; 

open (IN,"$filename") || die "Cannot open $filename: $!"; 
my @file=<IN>; 

#join the lines with # delimiters 
my $string = join('#', @file); 

    $string =~s /[\r]//g; # To remove carriage returns 
    $string =~s /[\n]//g; # To remove newlines 
    $string =~s /[\t]//g; # To remove tab 


print "\n Parsing data now....\n"; 
# pattern under while loop will do the work. 
# it will take date as 13 Oct 2010 in $1 and rest values in $2 
# $string=~/Equine Profile Plus\s+#(.*?\s+)\s+.*?(Sample.*)##/g 

while($string=~/Equine Profile Plus\s+#(.*?\s+)\s+.*?(Sample.*?)##/g) 
{ 
    my($date,$line,$Sample_Type,$Patient_ID, $Sample_Id, 
     $Doctor_Id,$Location,$Rotor, $Serial,$para, 
     $QC,$HEM,$LIP,$ICT); 
    $count++; 

    $date=$1; 
    $line=$2;  
    if ($line=~/Sample Type:(.*?)#/gis){ 
     $Sample_Type=clean($1); 
    }if ($line=~/Patient ID:(.*?)#/gis){ 
     $Patient_ID=clean($1); 
    }if ($line=~/Sample ID:(.*?)#/gis){ 
     $Sample_Id=clean($1); 
    }if ($line=~/Doctor ID:(.*?)#/gis){ 
     $Doctor_Id=clean($1); 
    }if ($line=~/Location:(.*?)#/gis){ 
     $Location=clean($1); 
    }if ($line=~/Rotor Lot Number:(.*?)#/gis){ 
     $Rotor=clean($1); 
    }if ($line=~/Serial Number:(.*?)#/gis){ 
     $Serial=clean($1); 
    }if ($line=~/#(NA+.*?GLOB.*?)#/gis){ 

     $para=$1; 
     $para =~ s/#/;/g; 
     $para =~ s/\s\s/ /g; #remove spaces. 
     $para =~ s/\s\s/ /g; 
     $para =~ s/\s\s/ /g; 
     $para =~ s/\s\s/ /g; 
     $para =~ s/\s\s/ /g; 
     $para =~ s/\s\s/ /g; 
     $para =~ s/ /:/g; 

     if ($line=~/#QC(.*?) #HEM(.*?) LIP(.*?) ICT(.*?) /gis){ 
     $QC=clean($1); 
     $HEM=clean($2); 
     $LIP=clean($3); 
     $ICT=clean($4); 
    } 
     while($para =~ /(.*?):(.*?):(.*?);/g){ 
     $paraname = $1; 
     $paraval = $2; 
     $pararange = $3; 
     #$paraunit = $4;  

       #data from text file written to a CSV file. 
     open (OUT,">>$filename.csv") || die "Cannot open $filename.csv: $!";    
       print OUT "\"$count\",\"$date\",\"$Sample_Type\",\"$Patient_ID\", 
        \"$Sample_Id\",\"$Doctor_Id\",\"$Location\",\"$Rotor\", 
        \"$Serial\", \"$QC\",\"$HEM\",\"$LIP\",\"$ICT\", 
        \"$paraname\",\"$paraval\",\"$pararange\",\n"; 
     } 
    } 
} 
close OUT; 

#Load csv into mysql 
print "\n Inserting into data base \n"; 
# comment it while not loading into the database. 

loaddata("$filename.csv");  
print "\n Database insert completed \n"; 
sub clean 
{ 
my ($line) = shift (@_); 
$line =~ s/\n//g; 
$line =~ s/\r//g; 
$line =~ s/^\s+//g; 
$line =~ s/\s\s//g; 
$line =~ s/\s+$//g; 
$line =~ s/#//g; 
return ($line); 
} 



#init the mysql DB 
sub init_dbh{ 

$db="parameters"; 
$host="localhost"; 
$user="**"; 
$password="**"; 

my $dbh = DBI->connect ("DBI:mysql:database=$db:host=$host", 
          $user, 
          $password) 
          or die "Can't connect to database: $DBI::errstr\n"; 

     return $dbh; 

} 

#Load data to mysql table 
sub loaddata{ 
     my ($name) = @_; 
     my $DBH = init_dbh(); 
     my $STH_GO = $DBH->prepare(q{ 
      LOAD DATA LOCAL INFILE 'vs2-001-001-ma-sampleFile.txt.csv' 
      INTO TABLE parameter FIELDS TERMINATED BY ',' ENCLOSED BY 
      '"' LINES TERMINATED BY '\n'; })or die "ERROR: ". $DBI::errstr; 
     $STH_GO->execute(); 

     } 

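Unrelated, but since you are outputting CSV, you may want to look at *Text::CSV* to parse your input and produce your output. Also *use strict* and *use warnings*, and use lexical filehandles. Also, I don't see the advantage of joining all the lines of the file with *#* when you could just go through the file line by line to the same effect. – MkV 2010-11-04 14:50:54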

Answers

2

Check the return value of execute, for one thing.
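A minimal sketch of what that check could look like. The helper name `execute_checked` is made up for illustration; it relies only on the documented DBI behaviour that `execute` returns undef on error and, for non-SELECT statements, the number of affected rows on success (with "0E0" standing for zero rows).

```perl
use strict;
use warnings;

# Hypothetical helper (not part of DBI): run a prepared statement handle
# and report what happened. @bind holds optional placeholder values.
sub execute_checked {
    my ($sth, @bind) = @_;

    # execute returns undef on failure. On success it returns a true value,
    # which for non-SELECT statements is the number of rows affected
    # ("0E0" means zero rows but is still true).
    my $rv = $sth->execute(@bind);

    if (!defined $rv) {
        warn "Statement failed: " . $sth->errstr . "\n";
        return;
    }

    my $rows = $rv + 0;    # turns "0E0" into a plain 0
    print "Statement succeeded, $rows row(s) affected\n";
    return $rows;
}
```

In the question's loaddata sub, the bare `$STH_GO->execute();` could be replaced by `execute_checked($STH_GO)`, treating an undefined result as a failed load. Note that a load can succeed and still store 0000-00-00 dates, so a row count alone does not prove the values are right.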

2

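I usually load data programmatically from my own code rather than relying on the database to load it. That way I can validate records before inserting them. Another advantage is that I know when a record fails to insert, and I can choose to work out what went wrong and retry the insert, or push the record to another file for manual review later.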
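As one example of validating a record before inserting it: the comments in the question's code say the date is captured as `13 Oct 2010`. Assuming the target column is a MySQL DATE, that text is not in the YYYY-MM-DD form MySQL expects, and in non-strict SQL mode an unparseable date is stored as 0000-00-00, which would explain what the question describes. The sketch below uses a hypothetical `normalize_date` helper to convert the value or reject it; it does not check the number of days per month.

```perl
use strict;
use warnings;

my %MONTH = (
    jan => 1, feb => 2,  mar => 3,  apr => 4,
    may => 5, jun => 6,  jul => 7,  aug => 8,
    sep => 9, oct => 10, nov => 11, dec => 12,
);

# Hypothetical validator: turn "13 Oct 2010" into "2010-10-13".
# Returns undef when the text is not a day / month-name / year date.
sub normalize_date {
    my ($text) = @_;
    return unless defined $text;

    # Surrounding whitespace is tolerated because the question's regex
    # captures trailing spaces along with the date.
    my ($day, $mon, $year) =
        $text =~ /^\s*(\d{1,2})\s+([A-Za-z]{3})[A-Za-z]*\s+(\d{4})\s*$/
        or return;

    my $month = $MONTH{ lc $mon } or return;
    return if $day < 1 || $day > 31;

    return sprintf '%04d-%02d-%02d', $year, $month, $day;
}
```

A record whose date comes back undefined can be written to a reject file instead of being sent to the database.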

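In your code you process the data and then push it back out to a file for the DB to load. Why not load the rows as you process them? Letting the database do a bulk load will be faster, but it doesn't give you good granularity: it is usually an all-or-nothing affair, and if it is nothing, the error you get back won't tell you much beyond the fact that the file didn't load.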
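A rough sketch of loading rows one at a time, so that each failure is known individually. The table name `parameter` comes from the question; the column names and the `insert_rows` helper are assumptions made up for illustration, and the code assumes the handle was created with RaiseError off, so that a failed `execute` returns undef instead of dying.

```perl
use strict;
use warnings;

# Hypothetical sketch: insert rows one by one through an existing DBI handle.
# $rows is an array ref of array refs, one inner array per record.
sub insert_rows {
    my ($dbh, $rows) = @_;

    # Column names are assumed; adjust them to the real table definition.
    my $sth = $dbh->prepare(
        'INSERT INTO parameter (sample_date, patient_id, para_name, para_value) '
      . 'VALUES (?, ?, ?, ?)'
    ) or die "prepare failed: " . $dbh->errstr . "\n";

    my (@inserted, @rejected);
    for my $row (@$rows) {
        # With RaiseError off, execute returns undef when the insert fails.
        if (defined $sth->execute(@$row)) {
            push @inserted, $row;
        }
        else {
            # Keep the row and the error so it can be retried or reviewed.
            push @rejected, { row => $row, error => $sth->errstr };
        }
    }
    return (\@inserted, \@rejected);
}
```

The two returned lists give the per-row granularity a bulk load lacks: the script can print how many rows went in and write the rejected ones, with their error messages, to a separate file.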

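You are also slurping the file into memory, so I suggest you read PerlFaq 5, which has a good section on "How can I read in an entire file all at once?". The Perl Slurp Ease page may tell you more than you wanted to know.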
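As an alternative to slurping, here is a sketch that walks the file line by line with a lexical filehandle and hands each record to a callback as soon as it is complete. The `each_record` name is hypothetical, and the code assumes every record starts with a line containing "Equine Profile Plus", as the regex in the question suggests.

```perl
use strict;
use warnings;

# Hypothetical reader: group lines into records without loading the whole
# file. Calls $callback with an array ref of lines for each record and
# returns the number of records seen.
sub each_record {
    my ($path, $callback) = @_;

    open my $in, '<', $path or die "Cannot open $path: $!\n";

    my @current;
    my $count = 0;
    while (my $line = <$in>) {
        $line =~ s/[\r\n]+\z//;    # strip the line ending (LF or CRLF)
        my $is_header = $line =~ /Equine Profile Plus/;

        # A new header closes the record collected so far.
        if ($is_header && @current) {
            $callback->([@current]);
            $count++;
            @current = ();
        }

        # Lines before the first header are ignored.
        push @current, $line if $is_header || @current;
    }

    # Flush the last record in the file.
    if (@current) {
        $callback->([@current]);
        $count++;
    }

    close $in;
    return $count;
}
```

Only one record is held in memory at a time, and the callback is a natural place to parse the fields and insert the row.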
