2017-10-11

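CSV to YAML conversion using a Python script

I need to convert a CSV specification file to a YAML file for a project. I wrote a small piece of Python code for it, but it does not work as expected. I cannot use an online converter because the client I am working for would not accept that. Here is the Python code I have: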

import csv

csvfile = open('custInfo.csv', 'r')

datareader = csv.reader(csvfile, delimiter=',', quotechar='"')
data_headings = []

yaml_pretext = "sourceTopic : 'BIG_PARTY'"
yaml_pretext += "\n"+'validationRequired : true'+"\n"
yaml_pretext += "\n"+'columnMappingEntityList :'+"\n"
for row_index, row in enumerate(datareader):
    if row_index == 0:
        # the first row holds the column headings
        data_headings = row
    else:
        # new_yaml = open('outfile.yaml', 'w')
        yaml_text = ""
        for cell_index, cell in enumerate(row):
            lineSeperator = " "
            cell_heading = data_headings[cell_index].lower().replace(" ", "_").replace("-", "")
            if (cell_heading == "source"):
                # "source" starts a new list item
                lineSeperator = ' - '

            cell_text = lineSeperator+cell_heading + " : " + cell.replace("\n", ", ") + "\n"

            yaml_text += cell_text
        print(yaml_text)

csvfile.close()

The CSV file has four columns:

source               destination       type    childFields
fra:AppData          app_data          array   application_id,institute_nm
fra:ApplicationId    application_id    string  null
fra:InstituteName    institute_nm      string  null
fra:CustomerData     customer_data     array   name,customer_address,telephone_number
fra:Name             name              string  null
fra:CustomerAddress  customer_address  array   street,pincode
fra:Street           street            string  null
fra:Pincode          pincode           string  null
fra:TelephoneNumber  telephone_number  string  null

Here is the output I get in the YAML file:

    - source : fra:AppData 
    destination : app_data 
    type : array 
    childfields : application_id,institute_nm 

    - source : fra:ApplicationId 
    destination : application_id 
    type : string 
    childfields : null 

    - source : fra:InstituteName 
    destination : institute_nm 
    type : string 
    childfields : null 

    - source : fra:CustomerData 
    destination : customer_data 
    type : array 
    childfields : name,customer_address,telephone_number 

    - source : fra:Name 
    destination : name 
    type : string 
    childfields : null 

    - source : fra:CustomerAddress 
    destination : customer_address 
    type : array 
    childfields : street,pincode 

    - source : fra:Street 
    destination : street 
    type : string 
    childfields : null 

    - source : fra:Pincode 
    destination : pincode 
    type : string 
    childfields : null 

    - source : fra:TelephoneNumber 
    destination : telephone_number 
    type : string 
    childfields : null 

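When the type is array, I need its child fields to be output nested under childfields rather than as new top-level entries. So the expected output would be:

    - source : fra:AppData 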
    destination : app_data 
    type : array 
    childfields : application_id,institute_nm 
     - source : fra:ApplicationId 
     destination : application_id 
     type : string 
     childfields : null 

     - source : fra:InstituteName 
     destination : institute_nm 
     type : string 
     childfields : null 

    - source : fra:CustomerData 
    destination : customer_data 
    type : array 
    childfields : name,customer_address,telephone_number 
     - source : fra:Name 
     destination : name 
     type : string 
     childfields : null 

     - source : fra:CustomerAddress 
     destination : customer_address 
     type : array 
     childfields : street,pincode 
      - source : fra:Street 
      destination : street 
      type : string 
      childfields : null 

      - source : fra:Pincode 
      destination : pincode 
      type : string 
      childfields : null 

     - source : fra:TelephoneNumber 
     destination : telephone_number 
     type : string 
     childfields : null 

Can anyone please help me with how to achieve this? Thanks in advance for your help.

Krishna

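So you only need two main headings, AppData or CustomerData?

Not exactly. It is not about having main headings. If the type is array, the entry has child fields, and those child fields should get some indentation. – user3444971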

Answer


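You are currently not using any YAML library to generate the output. This is bad practice, because you never check whether the strings you write contain YAML special characters that would need to be quoted.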
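To illustrate the quoting problem, here is a minimal sketch that uses only the standard library. The helper names `naive_pair` and `quoted_pair` and the sample value are made up for this example. It relies on a JSON string literal also being readable as a YAML double-quoted scalar; a YAML library would normally take care of this for you.

```python
import json


def naive_pair(key, value):
    # Plain concatenation, as in the question's script: no quoting at all.
    return key + " : " + value


def quoted_pair(key, value):
    # json.dumps wraps the value in double quotes and escapes quotes,
    # backslashes and control characters inside it.
    return key + " : " + json.dumps(value)


# A value containing ": " and " #", both of which are special in plain scalars.
value = "street: main # not a comment"
print(naive_pair("childfields", value))
print(quoted_pair("childfields", value))
```

The first printed line is ambiguous to a YAML parser, while the second keeps the whole value inside one quoted scalar.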

Next, this is not valid YAML:

childfields : application_id,institute_nm 
     - source : fra:ApplicationId 
     destination : application_id 
     type : string 
     childfields : null 

childfields cannot have both a scalar value (application_id,institute_nm) and a sequence value (starting with the item - source : fra:ApplicationId).

Try building your structure out of lists and dicts, and then dump that structure:

import csv
import yaml

result = []
type_index = -1
child_fields_index = -1

with open('custInfo.csv', 'r') as csvfile:
    datareader = csv.reader(csvfile, delimiter=",", quotechar='"')
    for row_index, row in enumerate(datareader):
        if row_index == 0:
            # let's do this once here
            data_headings = []
            for heading_index, heading in enumerate(row):
                fixed_heading = heading.lower().replace(" ", "_").replace("-", "")
                data_headings.append(fixed_heading)
                if fixed_heading == "type":
                    type_index = heading_index
                elif fixed_heading == "childfields":
                    child_fields_index = heading_index
        else:
            content = {}
            is_array = False
            for cell_index, cell in enumerate(row):
                if cell_index == child_fields_index and is_array:
                    # expand the comma-separated names into a list of dicts
                    content[data_headings[cell_index]] = [{
                        "source" : "fra:" + value.capitalize(),
                        "destination" : value,
                        "type" : "string",
                        "childfields" : "null"
                    } for value in cell.split(",")]
                else:
                    content[data_headings[cell_index]] = cell
                # true only when the cell just handled is the type column
                # with the value "array", so childFields must directly follow type
                is_array = (cell_index == type_index) and (cell == "array")
            result.append(content)

print(yaml.dump(result))
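
The snippet above builds the child entries from the names in childFields rather than from the matching CSV rows. As a rough sketch of the nesting the question asks for, the following looks up each child by its destination and attaches the full row under a separate key, so that childfields does not have to be a scalar and a sequence at once. The key name `children`, the helper `build_tree`, and the comma-delimited, quoted sample data are assumptions made for illustration; the question only shows the columns.

```python
import csv
import io

# Assumed file layout: comma-delimited, with multi-valued childFields quoted.
SAMPLE = '''source,destination,type,childFields
fra:AppData,app_data,array,"application_id,institute_nm"
fra:ApplicationId,application_id,string,null
fra:InstituteName,institute_nm,string,null
fra:CustomerData,customer_data,array,"name,customer_address,telephone_number"
fra:Name,name,string,null
fra:CustomerAddress,customer_address,array,"street,pincode"
fra:Street,street,string,null
fra:Pincode,pincode,string,null
fra:TelephoneNumber,telephone_number,string,null
'''


def build_tree(csv_text):
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    by_destination = {row["destination"]: row for row in rows}

    # Names listed in any array row's childFields are nested, not top-level.
    nested = set()
    for row in rows:
        if row["type"] == "array":
            nested.update(row["childFields"].split(","))

    def to_node(row):
        node = {
            "source": row["source"],
            "destination": row["destination"],
            "type": row["type"],
        }
        if row["type"] == "array":
            # "children" is a hypothetical key holding the nested rows.
            node["children"] = [
                to_node(by_destination[name])
                for name in row["childFields"].split(",")
            ]
        return node

    return [to_node(row) for row in rows if row["destination"] not in nested]


tree = build_tree(SAMPLE)
print([node["destination"] for node in tree])
```

If PyYAML is installed, passing `tree` to `yaml.safe_dump(tree, default_flow_style=False)` should render it as nested block-style YAML. The sketch raises a KeyError when a name in childFields has no matching row, and it assumes the rows do not reference each other in a cycle.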