Parsing devicetree into a structured dictionary with pyparsing

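For my C++ RTOS I'm writing a parser for devicetree "source" files (.dts) in Python, using the pyparsing module. I'm able to parse the structure of the devicetree into a (nested) dictionary, where the property name or node name is the dictionary key (a string) and the property value or node is the dictionary value (either a string or a nested dictionary).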

Let's assume I have the following example devicetree structure:

/ { 
    property1 = "string1"; 
    property2 = "string2"; 
    node1 { 
     property11 = "string11"; 
     property12 = "string12"; 
     node11 { 
      property111 = "string111"; 
      property112 = "string112"; 
     }; 
    }; 
    node2 { 
     property21 = "string21"; 
     property22 = "string22"; 
    }; 
};

I'm able to parse that into something like this:

{'/': {'node1': {'node11': {'property111': ['string111'], 'property112': ['string112']}, 
       'property11': ['string11'], 
       'property12': ['string12']}, 
     'node2': {'property21': ['string21'], 'property22': ['string22']}, 
     'property1': ['string1'], 
     'property2': ['string2']}}

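However, I would prefer the data to be structured differently. I would like to have all properties as a nested dictionary under the key "properties", and all child nodes as a nested dictionary under the key "children". The reason is that a devicetree (especially its nodes) carries some "metadata" which I would like to keep as plain key-value pairs, and this requires moving the actual "contents" of a node one level "lower" to avoid any name conflicts between keys. So I would prefer the example above to look like this: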

{'/': {
    'properties': {
        'property1': ['string1'],
        'property2': ['string2']
    },
    'nodes': {
        'node1': {
            'properties': {
                'property11': ['string11'],
                'property12': ['string12']
            },
            'nodes': {
                'node11': {
                    'properties': {
                        'property111': ['string111'],
                        'property112': ['string112']
                    },
                    'nodes': {
                    }
                }
            }
        },
        'node2': {
            'properties': {
                'property21': ['string21'],
                'property22': ['string22']
            },
            'nodes': {
            }
        }
    }
}
}

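I tried adding "names" to the parsed tokens, but this results in "doubled" dictionary elements (which is expected, as this behaviour is described in the pyparsing documentation). This might not be a problem, but technically a node or a property can be named "properties" or "children" (or whatever I choose), so I don't think such a solution is robust.

I also tried using setParseAction() to convert tokens into dictionary fragments (I hoped I could transform {'key': 'value'} into {'properties': {'key': 'value'}}), but this did not work at all...

Is this possible at all directly with pyparsing? I'm prepared to just add a second stage that transforms the raw dictionary into whatever structure I need, but as a perfectionist I would rather use a single-run, pyparsing-only solution if possible.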
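
The second-stage fallback mentioned above could be sketched roughly as follows. This is only an illustration: restructure is a hypothetical helper name, the key names follow the prose of the question, and it assumes that every node arrives as a dict while every property value arrives as something else (a list here). An empty node that does not come out of asDict() as a dict would be misclassified as a property.

```python
def restructure(node):
    """Split one raw node dict into 'properties' and 'children' sub-dicts.

    Assumption: nested dicts are child nodes, anything else is a property value.
    """
    properties = {}
    children = {}
    for name, value in node.items():
        if isinstance(value, dict):
            children[name] = restructure(value)  # recurse into child nodes
        else:
            properties[name] = value
    return {'properties': properties, 'children': children}


# A shortened version of the raw dictionary shown earlier.
raw = {'/': {'node1': {'node11': {'property111': ['string111']},
                       'property11': ['string11']},
             'property1': ['string1']}}

# The top level maps node names to nodes, so each one is restructured in turn.
structured = {name: restructure(node) for name, node in raw.items()}
```

Because the split is by value type rather than by name, a property that happens to be called "properties" or "children" does not collide with the two container keys.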

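For reference, here is sample code (Python 3) which transforms devicetree source into an "unstructured" dictionary. Please note that this code is just a simplification and does not support all the features found in .dts (data types other than strings, value lists, unit addresses, labels and so on) - it supports only string properties and node nesting.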

#!/usr/bin/env python 

import pyparsing 
import pprint 

nodeName = pyparsing.Word(pyparsing.alphas, pyparsing.alphanums + ',._+-', max = 31) 
propertyName = pyparsing.Word(pyparsing.alphanums + ',._+?#', max = 31) 
propertyValue = pyparsing.dblQuotedString.setParseAction(pyparsing.removeQuotes) 
property = pyparsing.Dict(pyparsing.Group(propertyName + pyparsing.Group(pyparsing.Literal('=').suppress() + 
     propertyValue) + pyparsing.Literal(';').suppress())) 
childNode = pyparsing.Forward() 
rootNode = pyparsing.Dict(pyparsing.Group(pyparsing.Literal('/') + pyparsing.Literal('{').suppress() + 
     pyparsing.ZeroOrMore(property) + pyparsing.ZeroOrMore(childNode) + 
     pyparsing.Literal('};').suppress())) 
childNode <<= pyparsing.Dict(pyparsing.Group(nodeName + pyparsing.Literal('{').suppress() + 
     pyparsing.ZeroOrMore(property) + pyparsing.ZeroOrMore(childNode) + 
     pyparsing.Literal('};').suppress())) 

dictionary = rootNode.parseString(""" 
/{ 
    property1 = "string1"; 
    property2 = "string2"; 
    node1 { 
     property11 = "string11"; 
     property12 = "string12"; 
     node11 { 
      property111 = "string111"; 
      property112 = "string112"; 
     }; 
    }; 
    node2 { 
     property21 = "string21"; 
     property22 = "string22"; 
    }; 
}; 
""").asDict() 
pprint.pprint(dictionary, width = 120)


2017-04-13 Freddie Chopin

You were really close. I just did the following:

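added Groups and results names to your "properties" and "children" sub-sections
changed some of the punctuation literals to constants (Literal("};") will fail to match if there is a space between the closing brace and the semicolon, but RBRACE + SEMI will accommodate whitespace)
removed the outermost Dict on rootNode

Code: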

import pyparsing

# Suppressed punctuation tokens; keeping '}' and ';' separate tolerates whitespace between them
LBRACE,RBRACE,SLASH,SEMI,EQ = map(pyparsing.Suppress, "{}/;=")
nodeName = pyparsing.Word(pyparsing.alphas, pyparsing.alphanums + ',._+-', max = 31)
propertyName = pyparsing.Word(pyparsing.alphanums + ',._+?#', max = 31)
propertyValue = pyparsing.dblQuotedString.setParseAction(pyparsing.removeQuotes)
# Each property becomes a dictionary entry: name -> [value]
property = pyparsing.Dict(pyparsing.Group(propertyName + EQ
              + pyparsing.Group(propertyValue)
              + SEMI))
childNode = pyparsing.Forward()
# Root node: a plain Group (no outer Dict) holding the two named sub-sections
rootNode = pyparsing.Group(SLASH + LBRACE
          + pyparsing.Group(pyparsing.ZeroOrMore(property))("properties")
          + pyparsing.Group(pyparsing.ZeroOrMore(childNode))("children")
          + RBRACE + SEMI)
# Child nodes are keyed by node name and use the same two named sub-sections
childNode <<= pyparsing.Dict(pyparsing.Group(nodeName + LBRACE
              + pyparsing.Group(pyparsing.ZeroOrMore(property))("properties")
              + pyparsing.Group(pyparsing.ZeroOrMore(childNode))("children")
              + RBRACE + SEMI))
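
The remark about Literal("};") versus RBRACE + SEMI can be checked in isolation. The following is a minimal sketch, assuming pyparsing is installed; matches is a hypothetical helper introduced only for this illustration and is not part of the grammar above.

```python
import pyparsing

# One literal covering both characters: the input must contain exactly '};'
closing_literal = pyparsing.Literal('};')

# Two separate suppressed tokens: pyparsing skips whitespace between them
RBRACE, SEMI = map(pyparsing.Suppress, '};')
closing_tokens = RBRACE + SEMI


def matches(expr, text):
    """Return True if expr parses the whole of text, False otherwise."""
    try:
        expr.parseString(text, parseAll=True)
        return True
    except pyparsing.ParseException:
        return False


print(matches(closing_literal, '};'))   # brace directly followed by semicolon
print(matches(closing_literal, '} ;'))  # space between brace and semicolon
print(matches(closing_tokens, '} ;'))   # same input, two-token version
```

Only the two-token version accepts the input with a space between the brace and the semicolon.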

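With the input parsed by rootNode.parseString() into result, converting to a dict with asDict and printing it with pprint gives: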

pprint.pprint(result[0].asDict()) 
{'children': {'node1': {'children': {'node11': {'children': [], 
               'properties': {'property111': ['string111'], 
                   'property112': ['string112']}}}, 
         'properties': {'property11': ['string11'], 
             'property12': ['string12']}}, 
       'node2': {'children': [], 
         'properties': {'property21': ['string21'], 
             'property22': ['string22']}}}, 
'properties': {'property1': ['string1'], 'property2': ['string2']}}

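You can also use the dump() method that comes with pyparsing's ParseResults class to help visualize both the list and the dict/namespace-style access to the results as they are, without requiring any conversion call: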

print(result[0].dump()) 

[[['property1', ['string1']], ['property2', ['string2']]], [['node1', [['property11', ['string11']], ['property12', ['string12']]], [['node11', [['property111', ['string111']], ['property112', ['string112']]], []]]], ['node2', [['property21', ['string21']], ['property22', ['string22']]], []]]] 
- children: [['node1', [['property11', ['string11']], ['property12', ['string12']]], [['node11', [['property111', ['string111']], ['property112', ['string112']]], []]]], ['node2', [['property21', ['string21']], ['property22', ['string22']]], []]] 
    - node1: [[['property11', ['string11']], ['property12', ['string12']]], [['node11', [['property111', ['string111']], ['property112', ['string112']]], []]]] 
    - children: [['node11', [['property111', ['string111']], ['property112', ['string112']]], []]] 
     - node11: [[['property111', ['string111']], ['property112', ['string112']]], []] 
     - children: [] 
     - properties: [['property111', ['string111']], ['property112', ['string112']]] 
      - property111: ['string111'] 
      - property112: ['string112'] 
    - properties: [['property11', ['string11']], ['property12', ['string12']]] 
     - property11: ['string11'] 
     - property12: ['string12'] 
    - node2: [[['property21', ['string21']], ['property22', ['string22']]], []] 
    - children: [] 
    - properties: [['property21', ['string21']], ['property22', ['string22']]] 
     - property21: ['string21'] 
     - property22: ['string22'] 
- properties: [['property1', ['string1']], ['property2', ['string2']]] 
    - property1: ['string1'] 
    - property2: ['string2']


2017-04-14 03:38:28 PaulMcG

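Thanks a lot! One more question - is it possible to get the empty keys as empty dictionaries '{}' instead of empty lists '[]' (as can be seen here - 'node11': {'children': [] ...')? Or, if they are empty, maybe not have such keys at all?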
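
One way to get either behaviour is a small post-processing pass over the asDict() output, rather than a change to the grammar itself. The sketch below is only an illustration under that assumption: normalize_empty and its drop_empty parameter are hypothetical names, and the input is assumed to have the 'properties'/'children' layout printed in the answer.

```python
def normalize_empty(node, drop_empty=False):
    """Return a copy of a node dict whose empty sections are {} or removed.

    Assumption: node has optional 'properties' and 'children' keys, and an
    empty section may appear as an empty list [] instead of an empty dict.
    """
    result = {}
    for key in ('properties', 'children'):
        section = node.get(key) or {}  # [] and a missing key both become {}
        if key == 'children':
            # Recurse so nested nodes are normalized the same way
            section = {name: normalize_empty(child, drop_empty)
                       for name, child in section.items()}
        if section or not drop_empty:
            result[key] = dict(section)
    return result


# A shortened version of the dictionary printed in the answer.
tree = {'children': {'node2': {'children': [],
                               'properties': {'property21': ['string21']}}},
        'properties': {'property1': ['string1']}}

with_empty_dicts = normalize_empty(tree)
without_empty_keys = normalize_empty(tree, drop_empty=True)
```

With the default arguments the empty 'children' entry of node2 becomes an empty dictionary; with drop_empty=True the key is omitted.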
