2012-02-03 65 views
0

Remove quotes in FileHelpers

I have a .csv file (I have no control over the data), and for some reason it has everything wrapped in quotes.

"Date","Description","Original Description","Amount","Type","Category","Name","Labels","Notes" 
"2/02/2012","ac","ac","515.00","a","b","","javascript://" 
"2/02/2012","test","test","40.00","a","d","c",""," " 

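I am using FileHelpers, and I am wondering what the best way to remove all of these quotes would be. Is there something that says "if I see quotes, remove them; if no quotes are found, do nothing"?

This messes up the data: I end up with "\"515.00\"", with extra quotes that are not needed (especially since in this case I want the value to be a decimal, not a string).

I am also not sure what the "javascript://" value is all about or why it gets generated, but it comes from a service that I have no control over.

Edit: This is how I consume the CSV file.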

using (TextReader textReader = new StreamReader(stream))
{
    engine.ErrorManager.ErrorMode = ErrorMode.SaveAndContinue;

    object[] transactions = engine.ReadStream(textReader);
}

Can we see the code? – 2012-02-03 18:51:27

Answers

6

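You can use the FieldQuoted attribute, which is best described on the attributes page of the FileHelpers documentation. Note that the attribute can be applied to any FileHelpers field, even one typed as Decimal. (Remember that a FileHelpers class describes the specification of your import file. So when you mark a Decimal field as FieldQuoted, you are saying that this field will be quoted in the file.)

You can even specify whether the quotes are optional: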

[FieldQuoted('"', QuoteMode.OptionalForBoth)] 

Here is a console application that works with your data:

using System;
using FileHelpers;
using NUnit.Framework; // assumption: Assert is NUnit's; the original does not name the test framework

class Program
{
    [DelimitedRecord(",")]
    [IgnoreFirst(1)]
    public class Format1
    {
        [FieldQuoted]
        [FieldConverter(ConverterKind.Date, "d/M/yyyy")]
        public DateTime Date;
        [FieldQuoted]
        public string Description;
        [FieldQuoted]
        public string OriginalDescription;
        [FieldQuoted]
        public Decimal Amount;
        [FieldQuoted]
        public string Type;
        [FieldQuoted]
        public string Category;
        [FieldQuoted]
        public string Name;
        [FieldQuoted]
        public string Labels;
        [FieldQuoted]
        [FieldOptional]
        public string Notes;
    }

    static void Main(string[] args)
    {
        var engine = new FileHelperEngine(typeof(Format1));

        // read in the data (the first line is the header, skipped by IgnoreFirst)
        object[] importedObjects = engine.ReadString(@"""Date"",""Description"",""Original Description"",""Amount"",""Type"",""Category"",""Name"",""Labels"",""Notes""
""2/02/2012"",""ac"",""ac"",""515.00"",""a"",""b"","""",""javascript://""
""2/02/2012"",""test"",""test"",""40.00"",""a"",""d"",""c"","""","" """);

        // check that 2 records were imported
        Assert.AreEqual(2, importedObjects.Length);

        // check the values for the first record
        Format1 customer1 = (Format1)importedObjects[0];
        // built explicitly so the check does not depend on the current culture
        Assert.AreEqual(new DateTime(2012, 2, 2), customer1.Date);
        Assert.AreEqual("ac", customer1.Description);
        Assert.AreEqual("ac", customer1.OriginalDescription);
        // decimal literal, so the comparison is decimal to decimal
        Assert.AreEqual(515.00m, customer1.Amount);
        Assert.AreEqual("a", customer1.Type);
        Assert.AreEqual("b", customer1.Category);
        Assert.AreEqual("", customer1.Name);
        Assert.AreEqual("javascript://", customer1.Labels);
        Assert.AreEqual("", customer1.Notes);

        // check the values for the second record
        Format1 customer2 = (Format1)importedObjects[1];
        Assert.AreEqual(new DateTime(2012, 2, 2), customer2.Date);
        Assert.AreEqual("test", customer2.Description);
        Assert.AreEqual("test", customer2.OriginalDescription);
        Assert.AreEqual(40.00m, customer2.Amount);
        Assert.AreEqual("a", customer2.Type);
        Assert.AreEqual("d", customer2.Category);
        Assert.AreEqual("c", customer2.Name);
        Assert.AreEqual("", customer2.Labels);
        Assert.AreEqual(" ", customer2.Notes);
    }
}

(Note that your first line of data seems to have 8 fields instead of 9, so I marked the Notes field with FieldOptional.)

0

Here is one way to do it:

string[] lines = new string[] 
{ 
    "\"Date\",\"Description\",\"Original Description\",\"Amount\",\"Type\",\"Category\",\"Name\",\"Labels\",\"Notes\"", 
    "\"2/02/2012\",\"ac\",\"ac\",\"515.00\",\"a\",\"b\",\"\",\"javascript://\"", 
    "\"2/02/2012\",\"test\",\"test\",\"40.00\",\"a\",\"d\",\"c\",\"\",\" \"", 
}; 

string[][] values =
    lines.Select(line =>
        line.Trim('"')
            .Split(new string[] { "\",\"" }, StringSplitOptions.None)
    ).ToArray();

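The lines array represents the lines in your sample. In a C# string literal, each " character must be escaped as \".

For each line, we first strip the leading and trailing " characters, then split it into an array of substrings using the "," character sequence as the delimiter.

Note that the code above will not work if " characters occur naturally within your values (even if escaped). It also assumes that the first and last fields of a line are not empty, because Trim('"') strips every leading and trailing quote, not just one.

Edit: If your CSV is read from a stream, all you need to do is: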
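To make that limitation concrete, here is a minimal sketch of a splitter that walks each line character by character instead of splitting on ",". The names QuotedCsv and SplitLine are made up for illustration. It assumes the common CSV convention that a literal quote inside a quoted field is written as two quotes (""), and it does not handle fields that span several lines.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class QuotedCsv
{
    // Splits one CSV line into fields and drops the surrounding quotes.
    // Inside a quoted field, a doubled quote ("") is read as one literal quote,
    // and commas are kept as part of the value.
    public static List<string> SplitLine(string line)
    {
        var fields = new List<string>();
        var current = new StringBuilder();
        bool inQuotes = false;

        for (int i = 0; i < line.Length; i++)
        {
            char c = line[i];

            if (inQuotes)
            {
                if (c != '"')
                {
                    current.Append(c);
                }
                else if (i + 1 < line.Length && line[i + 1] == '"')
                {
                    // Escaped quote: keep one and skip its partner.
                    current.Append('"');
                    i++;
                }
                else
                {
                    // Closing quote of the field.
                    inQuotes = false;
                }
            }
            else if (c == '"')
            {
                // Opening quote of a field.
                inQuotes = true;
            }
            else if (c == ',')
            {
                // Delimiter outside quotes: the current field is complete.
                fields.Add(current.ToString());
                current.Clear();
            }
            else
            {
                current.Append(c);
            }
        }

        // The last field has no trailing delimiter.
        fields.Add(current.ToString());
        return fields;
    }
}
```

Calling QuotedCsv.SplitLine on the first data line of your sample gives eight unquoted fields, including an empty Name, so an amount such as 515.00 can then be passed to decimal.Parse.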

var lines = new List<string>();
using (var streamReader = new StreamReader(stream))
{
    while (!streamReader.EndOfStream)
    {
        lines.Add(streamReader.ReadLine());
    }
}

The rest of the code above will work unchanged.

Edit: Given your new code, check whether this is the kind of thing you are looking for:

for (int i = 0; i < transactions.Length; ++i)
{
    object oTrans = transactions[i];
    string sTrans = oTrans as string;
    if (sTrans != null &&
        sTrans.StartsWith("\"") &&
        sTrans.EndsWith("\""))
    {
        transactions[i] = sTrans.Substring(1, sTrans.Length - 2);
    }
}

The code I gave is an example of the .csv file; it gets uploaded and read from a stream. – chobo2 2012-02-03 19:18:25


Well, they have some built-in "engine" method that returns an array of objects. See my changes. – chobo2 2012-02-03 19:54:46

0

I had the same dilemma, and I replaced the quotes while loading the values into my list object:

using System;
using System.Collections.Generic;
using System.IO;
using System.Windows.Forms;

namespace WindowsFormsApplication6
{
    public partial class Form1 : Form
    {
        public Form1()
        {
            InitializeComponent();
        }

        private void Form1_Load(object sender, EventArgs e)
        {
            LoadCSV();
        }

        private void LoadCSV()
        {
            List<string> Rows = new List<string>();
            string m_CSVFilePath = "<Path to CSV File>";

            using (StreamReader r = new StreamReader(m_CSVFilePath))
            {
                string row;

                // Strip every quote character from each line as it is read.
                while ((row = r.ReadLine()) != null)
                {
                    Rows.Add(row.Replace("\"", ""));
                }

                foreach (var Row in Rows)
                {
                    if (Row.Length > 0)
                    {
                        // Note: a plain Split(',') also splits on commas inside values.
                        string[] RowValue = Row.Split(',');

                        //Do something with values here
                    }
                }
            }
        }
    }
}

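I have been looking through the options, and it seems like this field attribute might do the trick: FieldQuoted(QuoteMode.OptionalForBoth). I think they are missing an option (one that would just ignore quotes on both read and write). – chobo2 2012-02-03 20:33:30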


@chobo2 - That is fine, but if you use FileHelpers, don't you still need to install a DLL on the client machine? My solution just uses the framework, without any extra files. – 2012-02-03 20:41:23

0

This code helped me during development:

using (StreamReader r = new StreamReader("C:\\Projects\\Mactive\\Audience\\DrawBalancing\\CSVFiles\\Analytix_ABC_HD.csv"))
{
    string row;

    while ((row = r.ReadLine()) != null)
    {
        StringBuilder line = new StringBuilder();

        for (int i = 0; i < row.Length; i++)
        {
            char chr = row[i];

            if (chr != '"')
            {
                // Outside quotes: copy the character (including delimiter commas) as-is.
                line.Append(chr);
            }
            else
            {
                // Opening quote: collect everything up to the closing quote.
                StringBuilder token = new StringBuilder();
                for (i++; i < row.Length; i++)
                {
                    chr = row[i];
                    if (chr == '"')
                    {
                        break;
                    }

                    token.Append(chr);
                }

                // Drop commas inside the quoted value so they are not mistaken for delimiters.
                line.Append(token.ToString().Replace(",", ""));
            }
        }

        Console.WriteLine(line.ToString());
    }
}