Removing quotes inside fields in a CSV file
Let's say we have a comma-separated (CSV) file like this:
"name of movie","starring","director","release year"
"dark knight rises","christian bale, anna hathaway","christopher nolan","2012"
"the dark knight","christian bale, heath ledger","christopher nolan","2008"
"The "day" when earth stood still","Michael Rennie,the 'strong' man","robert wise","1951"
"the 'gladiator'","russel "the awesome" crowe","ridley scott","2000"
As you can see above, lines 4 & 5 have quotes within quotes. The output should look like this:
"name of movie","starring","director","release year"
"dark knight rises","christian bale, anna hathaway","christopher nolan","2012"
"the dark knight","christian bale, heath ledger","christopher nolan","2008"
"The day when earth stood still","Michael Rennie,the strong man","robert wise","1951"
"the gladiator","russel the awesome crowe","ridley scott","2000"
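How can I get rid of quotes like these (single and double) that appear inside the fields of a CSV file? Note that a comma inside a single field is fine, because the parser sees that it is within quotes and treats the whole thing as one field. This is only a preprocessing step to tidy up the CSV file so that it can be fed into several parsers and converted into whatever format we need. Bash, awk, or Python would all work. Please no Perl, I'm tired of that language :D Thanks in advance!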
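One possible way to do this preprocessing in Python is sketched below. It is not a general CSV parser: it assumes every field is wrapped in double quotes, fields are separated by exactly `","`, and no field contains that three-character sequence itself. The function name `strip_inner_quotes` is made up for illustration. Note that it also removes apostrophes (for example in "don't"), since the question asks for single quotes to go as well.

```python
def strip_inner_quotes(line: str) -> str:
    """Remove single and double quotes found inside the fields of one CSV line.

    Assumption: the line looks like "a","b","c" (every field double-quoted,
    fields joined by the exact sequence '","').
    """
    line = line.rstrip("\r\n")
    # Leave lines that do not match the assumed shape untouched.
    if len(line) < 2 or not (line.startswith('"') and line.endswith('"')):
        return line
    # Drop the outermost quotes, then split on the '","' delimiter sequence.
    fields = line[1:-1].split('","')
    # Delete every quote character that is still left inside a field.
    cleaned = [f.replace('"', "").replace("'", "") for f in fields]
    # Wrap each cleaned field in double quotes again.
    return ",".join('"' + f + '"' for f in cleaned)


if __name__ == "__main__":
    sample = [
        '"name of movie","starring","director","release year"',
        '"the \'gladiator\'","russel "the awesome" crowe","ridley scott","2000"',
    ]
    for row in sample:
        print(strip_inner_quotes(row))
```

Commas inside a field survive because the split only happens on the full `","` sequence, not on a bare comma.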
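I'm not clear on how removing the first and last quote would help. The requirement is to have double quotes around every field in the CSV file. Without quotes around each field, field values that contain commas cannot be parsed. – crazyim5 2012-08-17 17:53:17
My thinking was that a CSV reader would fail to parse the file because of the quotes that aren't doubled. I figured you would have to parse it yourself, hence my suggestion. Although, since they would be getting removed anyway, removing the first and last quotes is unnecessary as well. I assumed you were already using the csv module... I guess not. – 2012-08-17 18:18:03
I don't understand why my question got a -1 :/ – crazyim5 2012-08-17 18:18:47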