2017-04-13

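Parsing devicetree into a structured dictionary with pyparsing

For my C++ RTOS I'm writing a parser for devicetree "source" files (.dts) in Python, using the pyparsing module. I'm able to parse the structure of the devicetree into a (nested) dictionary, where the property name or node name is the dictionary key (a string), and the property value or node is the dictionary value (either a string or a nested dictionary).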

Let's assume I have the following example devicetree structure:

/ { 
    property1 = "string1"; 
    property2 = "string2"; 
    node1 { 
     property11 = "string11"; 
     property12 = "string12"; 
     node11 { 
      property111 = "string111"; 
      property112 = "string112"; 
     }; 
    }; 
    node2 { 
     property21 = "string21"; 
     property22 = "string22"; 
    }; 
}; 

I'm able to parse that into something like this:

{'/': {'node1': {'node11': {'property111': ['string111'], 'property112': ['string112']}, 
       'property11': ['string11'], 
       'property12': ['string12']}, 
     'node2': {'property21': ['string21'], 'property22': ['string22']}, 
     'property1': ['string1'], 
     'property2': ['string2']}} 

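However, for my needs I'd prefer to have this data structured differently. I'd like all properties to be a nested dictionary under the key "properties", and all child nodes to be a nested dictionary under the key "children" (written as 'nodes' in the example below). The reason is that the devicetree (nodes especially) carries some "metadata" that I'd like to keep as plain key-value pairs, which requires moving the actual "contents" of a node one level "lower" to avoid any key name conflicts. So I'd prefer the example above to look like this: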

{'/': {
    'properties': {
        'property1': ['string1'],
        'property2': ['string2']
    },
    'nodes': {
        'node1': {
            'properties': {
                'property11': ['string11'],
                'property12': ['string12']
            },
            'nodes': {
                'node11': {
                    'properties': {
                        'property111': ['string111'],
                        'property112': ['string112']
                    },
                    'nodes': {
                    }
                }
            }
        },
        'node2': {
            'properties': {
                'property21': ['string21'],
                'property22': ['string22']
            },
            'nodes': {
            }
        }
    }
}
}

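I've tried adding "names" to the parse tokens, but this results in "doubled" dictionary elements (which is expected, since this behaviour is described in the pyparsing documentation). This might not be a problem, but technically a node or property can be named "properties" or "children" (or whatever I choose), so I don't think such a solution is robust.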

I've also tried using setParseAction() to convert the tokens into dictionary fragments (I hoped I could transform {'key': 'value'} into {'properties': {'key': 'value'}}), but this didn't work at all...

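Is this possible at all directly with pyparsing? I'm prepared to just add a second phase that transforms the original dictionary into whatever structure I need, but as a perfectionist I'd prefer a single-run, pyparsing-only solution, if possible.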
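
For comparison, the "second phase" fallback mentioned above could look roughly like the sketch below. It is only an illustration under stated assumptions: the function name restructure is made up, and it assumes that in the raw dictionary every child node is a dict and every property value is a list, as in the sample output above. A node with no contents at all may not come out of asDict() as a dict, and that case is not handled here.

```python
def restructure(node):
    """Split a raw node dict into 'properties' and 'children' sub-dicts.

    Assumption: child nodes are dicts, property values are lists.
    """
    properties = {}
    children = {}
    for name, value in node.items():
        if isinstance(value, dict):
            # A nested dict is a child node: restructure it recursively.
            children[name] = restructure(value)
        else:
            # Anything else is treated as a property value.
            properties[name] = value
    return {"properties": properties, "children": children}


# Raw dictionary in the shape produced by the question's parser.
raw = {
    "/": {
        "node1": {
            "node11": {"property111": ["string111"], "property112": ["string112"]},
            "property11": ["string11"],
            "property12": ["string12"],
        },
        "node2": {"property21": ["string21"], "property22": ["string22"]},
        "property1": ["string1"],
        "property2": ["string2"],
    }
}

structured = {name: restructure(node) for name, node in raw.items()}
```

Because properties and children live in separate sub-dictionaries, a property or node that happens to be called "properties" or "children" cannot collide with the two structural keys.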

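For reference, here is sample code (Python 3) that converts devicetree source into an "unstructured" dictionary. Note that this code is just a simplification and doesn't support all the features found in .dts (any data type other than strings, value lists, unit addresses, labels and so on) - it only supports string properties and node nesting.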

#!/usr/bin/env python 

import pyparsing 
import pprint 

nodeName = pyparsing.Word(pyparsing.alphas, pyparsing.alphanums + ',._+-', max = 31) 
propertyName = pyparsing.Word(pyparsing.alphanums + ',._+?#', max = 31) 
propertyValue = pyparsing.dblQuotedString.setParseAction(pyparsing.removeQuotes) 
property = pyparsing.Dict(pyparsing.Group(propertyName + pyparsing.Group(pyparsing.Literal('=').suppress() + 
     propertyValue) + pyparsing.Literal(';').suppress())) 
childNode = pyparsing.Forward() 
rootNode = pyparsing.Dict(pyparsing.Group(pyparsing.Literal('/') + pyparsing.Literal('{').suppress() + 
     pyparsing.ZeroOrMore(property) + pyparsing.ZeroOrMore(childNode) + 
     pyparsing.Literal('};').suppress())) 
childNode <<= pyparsing.Dict(pyparsing.Group(nodeName + pyparsing.Literal('{').suppress() + 
     pyparsing.ZeroOrMore(property) + pyparsing.ZeroOrMore(childNode) + 
     pyparsing.Literal('};').suppress())) 

dictionary = rootNode.parseString(""" 
/{ 
    property1 = "string1"; 
    property2 = "string2"; 
    node1 { 
     property11 = "string11"; 
     property12 = "string12"; 
     node11 { 
      property111 = "string111"; 
      property112 = "string112"; 
     }; 
    }; 
    node2 { 
     property21 = "string21"; 
     property22 = "string22"; 
    }; 
}; 
""").asDict() 
pprint.pprint(dictionary, width = 120) 

Answer

You're really close. I just did the following:

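  • added Groups and results names for your "properties" and "children" subsections
  • changed some of the punctuation literals to constants (Literal("};") will fail to match if there is whitespace between the closing brace and the semicolon, but RBRACE + SEMI will accommodate whitespace)
  • removed the outermost Dict on rootNode

Code: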

LBRACE,RBRACE,SLASH,SEMI,EQ = map(pyparsing.Suppress, "{}/;=") 
nodeName = pyparsing.Word(pyparsing.alphas, pyparsing.alphanums + ',._+-', max = 31) 
propertyName = pyparsing.Word(pyparsing.alphanums + ',._+?#', max = 31) 
propertyValue = pyparsing.dblQuotedString.setParseAction(pyparsing.removeQuotes) 
property = pyparsing.Dict(pyparsing.Group(propertyName + EQ 
              + pyparsing.Group(propertyValue) 
              + SEMI)) 
childNode = pyparsing.Forward() 
rootNode = pyparsing.Group(SLASH + LBRACE 
          + pyparsing.Group(pyparsing.ZeroOrMore(property))("properties") 
          + pyparsing.Group(pyparsing.ZeroOrMore(childNode))("children") 
          + RBRACE + SEMI) 
childNode <<= pyparsing.Dict(pyparsing.Group(nodeName + LBRACE 
              + pyparsing.Group(pyparsing.ZeroOrMore(property))("properties") 
              + pyparsing.Group(pyparsing.ZeroOrMore(childNode))("children") 
              + RBRACE + SEMI)) 
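
The whitespace point from the second bullet can be checked in isolation. The sketch below is only an illustration: the helper matches and the names fused and separate are made up for this example, and it relies on pyparsing's default behaviour of skipping whitespace between the elements of an expression.

```python
import pyparsing as pp

# One literal token: the two characters must be adjacent in the input.
fused = pp.Literal("};")

# Two separate tokens: whitespace between them is skipped by default.
RBRACE, SEMI = map(pp.Suppress, "};")
separate = RBRACE + SEMI


def matches(parser, text):
    """Return True if parser accepts text starting at its beginning."""
    try:
        parser.parseString(text)
        return True
    except pp.ParseException:
        return False


for text in ("};", "} ;", "}\n;"):
    print(repr(text), matches(fused, text), matches(separate, text))
```

Only the two-token version accepts the inputs that have a space or a newline between the brace and the semicolon.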

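Converting to a dict with asDict and printing with pprint gives the following (here result is what rootNode.parseString() returns for the sample input):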

pprint.pprint(result[0].asDict()) 
{'children': {'node1': {'children': {'node11': {'children': [], 
               'properties': {'property111': ['string111'], 
                   'property112': ['string112']}}}, 
         'properties': {'property11': ['string11'], 
             'property12': ['string12']}}, 
       'node2': {'children': [], 
         'properties': {'property21': ['string21'], 
             'property22': ['string22']}}}, 
'properties': {'property1': ['string1'], 'property2': ['string2']}} 

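You can also use the dump() method that comes with pyparsing's ParseResults class to help visualize both the list and the dict/namespace-style access to the results as-is, without requiring any conversion call: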

print(result[0].dump()) 

[[['property1', ['string1']], ['property2', ['string2']]], [['node1', [['property11', ['string11']], ['property12', ['string12']]], [['node11', [['property111', ['string111']], ['property112', ['string112']]], []]]], ['node2', [['property21', ['string21']], ['property22', ['string22']]], []]]] 
- children: [['node1', [['property11', ['string11']], ['property12', ['string12']]], [['node11', [['property111', ['string111']], ['property112', ['string112']]], []]]], ['node2', [['property21', ['string21']], ['property22', ['string22']]], []]] 
    - node1: [[['property11', ['string11']], ['property12', ['string12']]], [['node11', [['property111', ['string111']], ['property112', ['string112']]], []]]] 
    - children: [['node11', [['property111', ['string111']], ['property112', ['string112']]], []]] 
     - node11: [[['property111', ['string111']], ['property112', ['string112']]], []] 
     - children: [] 
     - properties: [['property111', ['string111']], ['property112', ['string112']]] 
      - property111: ['string111'] 
      - property112: ['string112'] 
    - properties: [['property11', ['string11']], ['property12', ['string12']]] 
     - property11: ['string11'] 
     - property12: ['string12'] 
    - node2: [[['property21', ['string21']], ['property22', ['string22']]], []] 
    - children: [] 
    - properties: [['property21', ['string21']], ['property22', ['string22']]] 
     - property21: ['string21'] 
     - property22: ['string22'] 
- properties: [['property1', ['string1']], ['property2', ['string2']]] 
    - property1: ['string1'] 
    - property2: ['string2'] 

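Thanks a lot! One more question - is it possible to have the empty keys come out as empty dictionaries '{}' instead of empty lists '[]' (as can be seen here - 'node11': {'children': [] ...)? Or maybe to have no such keys at all if they are empty?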
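
One possible way to get either behaviour, sketched here as a plain post-processing step rather than as a pyparsing feature, is to walk the dictionary returned by asDict() and normalise the empty sections. The function name normalize and its drop_empty flag are made up for this illustration, and the input is assumed to have the 'properties'/'children' shape shown in the answer.

```python
def normalize(node, drop_empty=False):
    """Return a copy of a node dict whose empty sections are {} or omitted."""
    # An empty section shows up as [] in the output above, so fall back to {}.
    properties = dict(node.get("properties") or {})
    children = {
        name: normalize(child, drop_empty)
        for name, child in dict(node.get("children") or {}).items()
    }
    result = {}
    if properties or not drop_empty:
        result["properties"] = properties
    if children or not drop_empty:
        result["children"] = children
    return result


# Shape taken from the asDict() output shown in the answer (trimmed).
parsed = {
    "children": {
        "node2": {"children": [], "properties": {"property21": ["string21"]}},
    },
    "properties": {"property1": ["string1"]},
}

print(normalize(parsed))
print(normalize(parsed, drop_empty=True))
```

With drop_empty left at False every node carries both keys, with empty dictionaries where needed; with drop_empty=True the empty sections are left out entirely.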