2014-10-27

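Python lxml: reordering XML tags

I have some sections in my XML that need to be reordered. I know XML doesn't require a particular element order, but this is what I need to do, and I can't figure out the "right" way to do it. I'm using lxml and have been reordering with the .insert method. I need to rearrange every <asset type="preview"> so that the tags inside it look like this: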

<asset type="preview">
    <territories>
        <territory>SE</territory>
    </territories>
    <data_file role="source">
        <locale name="es"/>
        <file_name>some_name_nor-preview-sv.mov</file_name>
        <size>1715119116</size>
        <checksum type="md5">55cd94d051700be34014b2892e925fa1</checksum>
        <attribute name="crop.top">25</attribute>
        <attribute name="crop.bottom">25</attribute>
        <attribute name="crop.left">4</attribute>
        <attribute name="crop.right">4</attribute>
        <attribute name="image.burned_subtitles.locale">sv</attribute>
        <attribute name="image.textless_master">false</attribute>
    </data_file>
</asset>

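Sometimes I have several <asset type="preview"> elements, and sometimes none at all. Also, not every <asset type="preview"> contains all of the tags listed here. This is the section of XML that I'm trying to reorder into the order shown above.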

<asset type="preview">
    <data_file role="source">
        <size>1657800204</size>
        <file_name>some_name_nor-preview.mov</file_name>
        <checksum type="md5">c61dfa7139ab04560cac41cf5ba8a1f2</checksum>
        <locale name="es"/>
        <attribute name="crop.top">25</attribute>
        <attribute name="crop.right">4</attribute>
        <attribute name="crop.bottom">25</attribute>
        <attribute name="crop.left">4</attribute>
    </data_file>
    <territories>
        <territory>WW</territory>
    </territories>
    <data_file role="notes">
        <size>9642</size>
        <file_name>some_name_nor-preview-notes.pdf</file_name>
        <checksum type="md5">4d0dc3534cd1d0f9885afbfda9be8b71</checksum>
    </data_file>
</asset>
<asset type="preview">
    <data_file role="source">
        <size>1715119116</size>
        <file_name>some_name_nor-preview-sv.mov</file_name>
        <checksum type="md5">55cd94d051700be34014b2892e925fa1</checksum>
        <locale name="es"/>
        <attribute name="image.burned_subtitles.locale">sv</attribute>
        <attribute name="crop.top">25</attribute>
        <attribute name="crop.right">4</attribute>
        <attribute name="image.textless_master">false</attribute>
        <attribute name="crop.left">4</attribute>
        <attribute name="crop.bottom">25</attribute>
    </data_file>
    <territories>
        <territory>SE</territory>
    </territories>
</asset>
<asset type="preview">
    <data_file role="source">
        <size>1709158524</size>
        <file_name>some_name_nor-preview-fi.mov</file_name>
        <checksum type="md5">58c5fcfa718393f76cb9b2d8f7c10362</checksum>
        <locale name="es"/>
        <attribute name="crop.bottom">25</attribute>
        <attribute name="crop.top">25</attribute>
        <attribute name="crop.left">4</attribute>
        <attribute name="image.textless_master">false</attribute>
        <attribute name="crop.right">4</attribute>
        <attribute name="image.burned_subtitles.locale">fi</attribute>
    </data_file>
    <territories>
        <territory>FI</territory>
    </territories>
</asset>
<asset type="preview">
    <territories>
        <territory>NO</territory>
    </territories>
    <data_file role="source">
        <size>1718632572</size>
        <file_name>some_name_nor-preview-no.mov</file_name>
        <checksum type="md5">41734d9d8dd4165416a4369f4ce9c8e1</checksum>
        <locale name="es"/>
        <attribute name="crop.left">4</attribute>
        <attribute name="crop.top">25</attribute>
        <attribute name="crop.bottom">25</attribute>
        <attribute name="image.textless_master">false</attribute>
        <attribute name="image.burned_subtitles.locale">no</attribute>
        <attribute name="crop.right">4</attribute>
    </data_file>
</asset>
<asset type="preview">
    <territories>
        <territory>DK</territory>
    </territories>
    <data_file role="source">
        <size>1721312028</size>
        <file_name>some_name_nor-preview-da.mov</file_name>
        <checksum type="md5">919abd17baf680161a220dbae8409918</checksum>
        <locale name="es"/>
        <attribute name="image.textless_master">false</attribute>
        <attribute name="crop.bottom">25</attribute>
        <attribute name="image.burned_subtitles.locale">da</attribute>
        <attribute name="crop.right">4</attribute>
        <attribute name="crop.left">4</attribute>
        <attribute name="crop.top">25</attribute>
    </data_file>
</asset>

Here is my current, non-working code. It does not reorder the attribute[@name=...] tags, and I'm not sure this is the right way to go about it:

     a = 0 
     b = 0 
     for node_search in tree.xpath("//video/assets/asset[@type='preview']/data_file[@role='source']"): 
      for element in node_search.iter(tag='locale'): 
       node_products = tree.xpath("//video/assets/asset[@type='preview']/data_file[@role='source']")[b] 
       node_type = tree.xpath("//video/assets/asset[@type='preview']/data_file[@role='source']/locale")[b] 
       node_products.insert(a, node_type) 
       b = b+1 
      a = a+1 
      b = 0 
     for node_search in tree.xpath("//video/assets/asset[@type='preview']/data_file[@role='source']"): 
      for element in node_search.iter(tag='file_name'): 
       node_products = tree.xpath("//video/assets/asset[@type='preview']/data_file[@role='source']")[b] 
       node_type = tree.xpath("//video/assets/asset[@type='preview']/data_file[@role='source']/file_name")[b] 
       node_products.insert(a, node_type) 
       b = b+1 
      a = a+1   
      b = 0 
     for node_search in tree.xpath("//video/assets/asset[@type='preview']/data_file[@role='source']"): 
      for element in node_search.iter(tag='size'): 
       node_products = tree.xpath("//video/assets/asset[@type='preview']/data_file[@role='source']")[b] 
       node_type = tree.xpath("//video/assets/asset[@type='preview']/data_file[@role='source']/size")[b] 
       node_products.insert(a, node_type) 
       b = b+1 
      a = a+1 
      b = 0 
     for node_search in tree.xpath("//video/assets/asset[@type='preview']/data_file[@role='source']"): 
      for element in node_search.iter(tag='checksum'): 
       node_products = tree.xpath("//video/assets/asset[@type='preview']/data_file[@role='source']")[b] 
       node_type = tree.xpath("//video/assets/asset[@type='preview']/data_file[@role='source']/checksum")[b] 
       node_products.insert(a, node_type) 
       b = b+1 
      a = a+1 
      b = 0 
     for node_search in tree.xpath("//video/assets/asset[@type='preview']/data_file[@role='source']"): 
      for element in node_search.iter(tag="attribute[@name='crop.top']"): 
       node_products = tree.xpath("//video/assets/asset[@type='preview']/data_file[@role='source']")[b] 
       node_type = tree.xpath("//video/assets/asset[@type='preview']/data_file[@role='source']/attribute[@name='crop.top']")[b] 
       node_products.insert(a, node_type) 
       b = b+1 
      a = a+1 
      b = 0 
     for node_search in tree.xpath("//video/assets/asset[@type='preview']/data_file[@role='source']"): 
      for element in node_search.iter(tag="attribute[@name='crop.bottom']"): 
       node_products = tree.xpath("//video/assets/asset[@type='preview']/data_file[@role='source']")[b] 
       node_type = tree.xpath("//video/assets/asset[@type='preview']/data_file[@role='source']/attribute[@name='crop.bottom']")[b] 
       node_products.insert(a, node_type) 
       b = b+1 
      a = a+1 
      b = 0 
     for node_search in tree.xpath("//video/assets/asset[@type='preview']/data_file[@role='source']"): 
      for element in node_search.iter(tag="attribute[@name='crop.left']"): 
       node_products = tree.xpath("//video/assets/asset[@type='preview']/data_file[@role='source']")[b] 
       node_type = tree.xpath("//video/assets/asset[@type='preview']/data_file[@role='source']/attribute[@name='crop.left']")[b] 
       node_products.insert(a, node_type) 
       b = b+1 
      a = a+1 
      b = 0 
     for node_search in tree.xpath("//video/assets/asset[@type='preview']/data_file[@role='source']"): 
      for element in node_search.iter(tag="attribute[@name='crop.right']"): 
       node_products = tree.xpath("//video/assets/asset[@type='preview']/data_file[@role='source']")[b] 
       node_type = tree.xpath("//video/assets/asset[@type='preview']/data_file[@role='source']/attribute[@name='crop.right']")[b] 
       node_products.insert(a, node_type) 
       b = b+1 
      a = a+1 
      b = 0 
     for node_search in tree.xpath("//video/assets/asset[@type='preview']/data_file[@role='source']"): 
      for element in node_search.iter(tag="attribute[@name='image.burned_forced_narrative.locale']"): 
       node_products = tree.xpath("//video/assets/asset[@type='preview']/data_file[@role='source']")[b] 
       node_type = tree.xpath("//video/assets/asset[@type='preview']/data_file[@role='source']/attribute[@name='image.burned_forced_narrative.locale']")[b] 
       node_products.insert(a, node_type) 
       b = b+1 
      a = a+1 
      b = 0 
     for node_search in tree.xpath("//video/assets/asset[@type='preview']/data_file[@role='source']"): 
      for element in node_search.iter(tag="attribute[@name='image.burned_subtitles.locale']"): 
       node_products = tree.xpath("//video/assets/asset[@type='preview']/data_file[@role='source']")[b] 
       node_type = tree.xpath("//video/assets/asset[@type='preview']/data_file[@role='source']/attribute[@name='image.burned_subtitles.locale']")[b] 
       node_products.insert(a, node_type) 
       b = b+1 
      a = a+1 
      b = 0 
     for node_search in tree.xpath("//video/assets/asset[@type='preview']/data_file[@role='source']"): 
      for element in node_search.iter(tag="attribute[@name='image.textless_master']"): 
       node_products = tree.xpath("//video/assets/asset[@type='preview']/data_file[@role='source']")[b] 
       node_type = tree.xpath("//video/assets/asset[@type='preview']/data_file[@role='source']/attribute[@name='image.textless_master']")[b] 
       node_products.insert(a, node_type) 
       b = b+1 
      a = a+1 
      b = 0 
Show us your code. – 2014-10-27 15:04:36

I've updated the question to include my current code. – speedyrazor 2014-10-27 15:14:22

Possible duplicate: http://stackoverflow.com/questions/25039195/python-lxml-write-to-file-in-predefined-order – 2014-10-27 15:27:32

Answer

I'm not entirely clear on your requirements. The code below sorts the children of each preview asset in this order:

unknown tags 
<territories> 
unknown <data_file> roles 
<data_file role=source> 
<data_file role=notes> 

and sorts the children of each data_file like this:

unknown tags 
<locale> 
<file_name> 
<size> 
<checksum> 
unknown attributes 
<attribute name="crop.top"> 
other <attributes>, in a specific order. 
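
An ordering like the ones listed above can be expressed as a two-part sort key: a rank for the tag, then a rank for the attribute name, with -1 for anything unknown so that it sorts first. The following is only a sketch using plain tuples in place of XML elements; `rank`, `MAJOR`, `MINOR`, and the sample items are illustrative names, not part of any library.

```python
# Illustrative orderings (shortened); unknown values get rank -1.
MAJOR = ['locale', 'file_name', 'size', 'checksum', 'attribute']
MINOR = ['crop.top', 'crop.bottom']

def rank(value, ordering):
    """Return the position of value in ordering, or -1 if it is not listed."""
    try:
        return ordering.index(value)
    except ValueError:
        return -1

def key(item):
    """Build a (major, minor) tuple; tuples compare element by element."""
    tag, name = item
    return rank(tag, MAJOR), rank(name, MINOR)

items = [
    ('attribute', 'crop.bottom'),
    ('size', None),
    ('attribute', 'some.other'),
    ('attribute', 'crop.top'),
    ('notes', None),
    ('locale', None),
]
ordered = sorted(items, key=key)
for item in ordered:
    print(item)
```

Because `sorted()` is stable, items that end up with the same key keep their original relative order.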

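The key to understanding this technique is to realize that an element's list of child nodes can be reordered the same way you would reorder any other list. In my case, I use sorted() with a custom key.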
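
As a minimal illustration of that idea, the sketch below reorders the children of a single element by assigning a sorted list back to the element's full slice. The sample XML and the `wanted` list are made up for the example, and the import falls back to the standard library's ElementTree when lxml is not installed, since the calls used here exist in both.

```python
try:
    from lxml import etree
except ImportError:
    # Fallback assumption: ElementTree supports the same calls used below.
    import xml.etree.ElementTree as etree

root = etree.fromstring(
    "<data_file>"
    "<size>1</size>"
    "<file_name>a.mov</file_name>"
    "<locale name='es'/>"
    "</data_file>"
)

# Desired tag order for this toy example.
wanted = ['locale', 'file_name', 'size']

# An element behaves like a list of its children, so slice assignment
# replaces the children with the same elements in sorted order.
root[:] = sorted(root, key=lambda child: wanted.index(child.tag))

reordered_tags = [child.tag for child in root]
print(reordered_tags)
```

This toy key would raise ValueError for a tag missing from `wanted`; handling unknown tags needs a fallback rank.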

Here you go:

from lxml import etree

def preview_key(et):
    # Sort key for the children of <asset type="preview">.
    # Unknown tags or roles get -1, so they sort before the known ones.
    major_ordering = ['territories', 'data_file']
    minor_ordering = ['source', 'notes']
    try:
        major = major_ordering.index(et.tag)
    except ValueError:
        major = -1
    try:
        minor = minor_ordering.index(et.get('role', None))
    except ValueError:
        minor = -1
    return major, minor

def data_file_key(et):
    # Sort key for the children of <data_file>: first by tag,
    # then by the "name" attribute of <attribute> elements.
    major_ordering = ['locale', 'file_name', 'size', 'checksum', 'attribute']
    minor_ordering = [
        "crop.top",
        "crop.bottom",
        "crop.left",
        "crop.right",
        "image.burned_subtitles.locale",
        "image.textless_master"]
    try:
        major = major_ordering.index(et.tag)
    except ValueError:
        major = -1
    try:
        minor = minor_ordering.index(et.get('name', None))
    except ValueError:
        minor = -1
    return major, minor


# Read in binary mode and let the parser deal with the encoding.
with open('input.xml', 'rb') as input_file:
    # Dropping whitespace-only text lets pretty_print re-indent the output.
    parser = etree.XMLParser(remove_blank_text=True)
    tree = etree.parse(input_file, parser)

# Replace each element's children with the same children in sorted order.
for preview in tree.xpath("//asset[@type='preview']"):
    preview[:] = sorted(preview, key=preview_key)

for data_file in tree.xpath("//data_file"):
    data_file[:] = sorted(data_file, key=data_file_key)

# etree.tostring() returns bytes by default, so write in binary mode.
with open('output.xml', 'wb') as output_file:
    output_file.write(etree.tostring(tree, pretty_print=True))
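
A likely reason the code in the question leaves the attribute elements untouched is that iter(tag=...) compares plain tag names only. A string such as "attribute[@name='crop.top']" is an XPath-style expression rather than a tag name, so it matches nothing and the loop body never runs. The sketch below contrasts the two calls on made-up sample data; the import falls back to the standard library's ElementTree when lxml is not installed, since both provide iter() and findall().

```python
try:
    from lxml import etree
except ImportError:
    # Fallback assumption: ElementTree supports the same calls used below.
    import xml.etree.ElementTree as etree

root = etree.fromstring(
    "<data_file>"
    "<attribute name='crop.right'>4</attribute>"
    "<attribute name='crop.top'>25</attribute>"
    "</data_file>"
)

# iter() treats its argument as a literal tag name, so nothing matches.
by_iter = list(root.iter(tag="attribute[@name='crop.top']"))

# findall() understands the [@attr='value'] predicate.
by_path = root.findall("attribute[@name='crop.top']")

print(len(by_iter), len(by_path))
```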