How to parse a hierarchy based on indentation with Python

I have an accounting tree that is stored with indents/spaces in the source:

Income
   Revenue
      IAP
      Ads
   Other-Income
Expenses
   Developers
      In-house
      Contractors
   Advertising
   Other Expenses

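There is a fixed number of levels, so I'd like to flatten the hierarchy into 3 fields (the actual data has 6 levels; this is simplified for the example). This is my current code: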

# ws is the worksheet being read and w is the CSV writer; both are set up elsewhere
for rownum in range(6, ws.max_row + 1):
    accountName = str(ws.cell(row=rownum, column=1).value)
    # The number of leading spaces determines the level
    indent = len(accountName) - len(accountName.lstrip(' '))
    if indent == 0:
        l1 = accountName
        l2 = ''
        l3 = ''
    elif indent == 3:
        l2 = accountName
        l3 = ''
    else:
        l3 = accountName

    w.writerow([l1, l2, l3])

It produces output like this:

L1        L2            L3
Income
Income    Revenue
Income    Revenue       IAP
Income    Revenue       Ads
Income    Other-Income
Expenses  Developers    In-house
... etc

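I can do this by checking the number of spaces before the account name.

Is there a more flexible way to achieve this, based on the indentation of the current row compared to the previous row, rather than assuming it is always 3 spaces per level? L1 will always have no indent, and we can trust that lower levels are indented further than their parent, but it may not always be 3 spaces per level.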

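Update: I ended up with this as the meat of the logic. Since I ultimately wanted the account list alongside the content, the easiest approach seemed to be using the indent to decide whether to reset, append to, or pop the list: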

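from operator import itemgetter

if indent == 0:
    # Top-level account: start a fresh list
    accountList = []
    accountList.append((indent, accountName))
elif indent > prev_indent:
    # Deeper than the previous row: child of the previous account
    accountList.append((indent, accountName))
elif indent <= prev_indent:
    # Same level or shallower: pop until every remaining entry is shallower
    max_indent = int(max(accountList, key=itemgetter(0))[0])
    while max_indent >= indent:
        accountList.pop()
        max_indent = int(max(accountList, key=itemgetter(0))[0])
    accountList.append((indent, accountName))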

So at each row of output, accountList is complete.
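A minimal, self-contained sketch of the update above, under a few assumptions: the function name account_paths and the sample rows are made up for illustration, and rows arrive as plain strings rather than worksheet cells. Because every appended entry is deeper than the one before it, the last tuple always carries the largest indent, so the max(...) lookup can probably be replaced by a peek at the end of the list:

```python
def account_paths(names):
    """Yield, for each row, the list of ancestor names ending with the row itself."""
    account_list = []  # stack of (indent, name) pairs, shallowest first
    for raw in names:
        name = raw.strip()
        if not name:
            continue  # skip blank rows
        indent = len(raw) - len(raw.lstrip(" "))
        # Indents grow strictly along the stack, so the last entry holds the
        # largest indent; pop until it is shallower than the current row.
        while account_list and account_list[-1][0] >= indent:
            account_list.pop()
        account_list.append((indent, name))
        yield [n for _, n in account_list]


# Hypothetical sample with uneven indent widths
rows = [
    "Income",
    "   Revenue",
    "      IAP",
    "      Ads",
    "   Other-Income",
    "Expenses",
    "  Developers",
    "       In-house",
]
for path in account_paths(rows):
    print(path)
```

Each yielded list could then be padded to the fixed number of levels before being handed to w.writerow.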

2017-08-30 Hart CO

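You can mimic the way Python actually parses indentation. First, create a stack that will contain the indentation levels. At each line:

If the indentation is greater than the top of the stack, push it and increase the depth level.
If it is the same, continue at the same level.
If it is lower, pop the top of the stack while it is higher than the new indentation. If you find a lower indentation level before finding exactly the same one, there is an indentation error.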

indentation = [0]  # stack of the indentation widths currently open
depth = 0

with open("test.txt", "r") as f:
    for line in f:
        # Drop only the newline; line[:-1] would eat a real character
        # when the last line has no trailing newline.
        line = line.rstrip("\n")

        content = line.strip()
        # Count leading whitespace only, so trailing spaces are ignored
        indent = len(line) - len(line.lstrip())
        if indent > indentation[-1]:
            depth += 1
            indentation.append(indent)

        elif indent < indentation[-1]:
            while indent < indentation[-1]:
                depth -= 1
                indentation.pop()

            if indent != indentation[-1]:
                raise RuntimeError("Bad formatting")

        print(f"{content} (depth: {depth})")

With a "test.txt" file whose content is the one you provided:

Income
   Revenue
      IAP
      Ads
   Other-Income
Expenses
   Developers
      In-house
      Contractors
   Advertising
   Other Expenses

Here is the output:

Income (depth: 0)
Revenue (depth: 1)
IAP (depth: 2)
Ads (depth: 2)
Other-Income (depth: 1)
Expenses (depth: 0)
Developers (depth: 1)
In-house (depth: 2)
Contractors (depth: 2)
Advertising (depth: 1)
Other Expenses (depth: 1)
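To see the error branch fire, here is a sketch that wraps the same stack logic in a function. The name depths is hypothetical, and it takes a list of strings instead of a file so that it runs on its own. The depth is read off as len(indentation) - 1, which should always agree with the counter used above:

```python
def depths(lines):
    """Return (content, depth) pairs, raising on an inconsistent dedent."""
    indentation = [0]  # stack of open indentation widths
    result = []
    for line in lines:
        content = line.strip()
        if not content:
            continue  # ignore blank lines
        indent = len(line) - len(line.lstrip(" "))
        if indent > indentation[-1]:
            indentation.append(indent)
        else:
            while indent < indentation[-1]:
                indentation.pop()
            # After popping, the indent must match a level that was open
            if indent != indentation[-1]:
                raise RuntimeError(f"Bad formatting at {content!r}")
        result.append((content, len(indentation) - 1))
    return result


good = ["Income", "    Revenue", "      IAP", "Expenses"]
print(depths(good))

# An indent of 2 matches neither open level (0 or 4), so this is rejected
bad = ["Income", "    Revenue", "  Other-Income"]
try:
    depths(bad)
except RuntimeError as exc:
    print("rejected:", exc)
```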

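So, what can you do with this? Suppose you want to build nested lists. First, create a data stack.

When you find an indentation, append a new list at the end of the data stack.
When you find an unindentation, pop the top list and append it to the new top.

And, in any case, for each line, append the content to the list at the top of the data stack.

Here is the corresponding implementation: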

indentation = [0]
depth = 0
data = [[]]  # stack of lists being built; data[0] is the root

with open("test.txt", "r") as f:
    for line in f:
        line = line.rstrip("\n")  # drop only the newline

        content = line.strip()
        indent = len(line) - len(line.lstrip())
        if indent > indentation[-1]:
            depth += 1
            indentation.append(indent)
            data.append([])

        elif indent < indentation[-1]:
            while indent < indentation[-1]:
                depth -= 1
                indentation.pop()
                top = data.pop()
                data[-1].append(top)

            if indent != indentation[-1]:
                raise RuntimeError("Bad formatting")

        data[-1].append(content)

# Fold any lists still open at the end of the file into their parents
while len(data) > 1:
    top = data.pop()
    data[-1].append(top)

Your nested list is at the top of your data stack. The output for the same file is:

['Income',
    ['Revenue',
        ['IAP',
         'Ads'
        ],
     'Other-Income'
    ],
 'Expenses',
    ['Developers',
        ['In-house',
         'Contractors'
        ],
     'Advertising',
     'Other Expenses'
    ]
]

This is fairly easy to manipulate, although quite deeply nested. You can access the data by chaining item accesses:

>>> l = data[0]
>>> l
['Income', ['Revenue', ['IAP', 'Ads'], 'Other-Income'], 'Expenses', ['Developers', ['In-house', 'Contractors'], 'Advertising', 'Other Expenses']]
>>> l[1]
['Revenue', ['IAP', 'Ads'], 'Other-Income']
>>> l[1][1]
['IAP', 'Ads']
>>> l[1][1][0]
'IAP'
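If the goal is one flat row per account, the nested list can be walked recursively. The sketch below assumes the shape produced above, where a sub-list always holds the children of the string just before it. The name flatten and the shortened sample are only illustrative:

```python
def flatten(nested, prefix=()):
    """Yield one tuple per account: its ancestors followed by its own name."""
    parent = None
    for item in nested:
        if isinstance(item, list):
            # A sub-list holds the children of the string just before it
            yield from flatten(item, prefix + (parent,))
        else:
            parent = item
            yield prefix + (item,)


# Shortened, hypothetical sample in the same shape as data[0]
nested = ["Income", ["Revenue", ["IAP", "Ads"], "Other-Income"],
          "Expenses", ["Developers"]]
for path in flatten(nested):
    print(path)
```

A list that starts with a sub-list (a first line that is itself indented) is not handled here; the parent would be None.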

2017-08-30 15:54:39

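Thanks for this. I ultimately wanted to output the hierarchy on each row alongside the row's content, so I modified it slightly, but this got me headed in the right direction.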

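If the indentation is a fixed number of spaces (here, 3 spaces), you can simplify the calculation of the indentation level.

Note: I use StringIO to simulate a file.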

import io
import itertools

content = u"""\
Income
   Revenue
      IAP
      Ads
   Other-Income
Expenses
   Developers
      In-house
      Contractors
   Advertising
   Other Expenses
"""

stack = []
for line in io.StringIO(content):
    text = line.rstrip()  # drop \n
    # Each level is exactly 3 spaces, so splitting on 3 spaces gives one
    # empty string per level followed by the account name.
    row = text.split("   ")
    stack[:] = stack[:len(row) - 1] + [row[-1]]
    print("\t".join(stack))

You get:

Income 
Income Revenue 
Income Revenue IAP 
Income Revenue Ads 
Income Other-Income 
Expenses 
Expenses Developers 
Expenses Developers In-house 
Expenses Developers Contractors 
Expenses Advertising 
Expenses Other Expenses
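For the fixed-width case, the level can also be computed as indent // width, and each row padded to a fixed number of columns for a CSV writer. This is only a sketch: INDENT_WIDTH, LEVELS and flatten_fixed are names invented here, and the sample lines are a shortened version of the data above.

```python
import csv
import io

INDENT_WIDTH = 3  # assumed fixed indent per level
LEVELS = 3        # number of output columns (the real data would use 6)


def flatten_fixed(lines, width=INDENT_WIDTH, levels=LEVELS):
    """Return one row of `levels` columns per account, padded with ''."""
    stack = []
    rows = []
    for line in lines:
        name = line.strip()
        if not name:
            continue
        level = (len(line) - len(line.lstrip(" "))) // width
        # Drop everything at this level or deeper, then set this level.
        # (If a level is skipped, the name is simply appended.)
        stack[level:] = [name]
        rows.append(stack + [""] * (levels - len(stack)))
    return rows


sample = ["Income", "   Revenue", "      IAP", "   Other-Income", "Expenses"]
out = io.StringIO()  # stands in for a real output file
csv.writer(out).writerows(flatten_fixed(sample))
print(out.getvalue())
```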

Edit: indentation is not fixed

If the indentation is not fixed (you don't always have 3 spaces), as in the example below:

content = u"""\
Income
   Revenue
       IAP
       Ads
   Other-Income
Expenses
   Developers
     In-house
     Contractors
   Advertising
   Other Expenses
"""

You need to estimate the shift at each new line:

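stack = []
last_indent = u""
for line in io.StringIO(content):
    # Leading run of spaces on this line
    indent = "".join(itertools.takewhile(lambda c: c == " ", line))
    # -1, 0 or +1 compared with the previous line. Note that a dedent is
    # always treated as a single level, so jumping back two or more levels
    # at once is not handled.
    shift = 0 if indent == last_indent else (-1 if len(indent) < len(last_indent) else 1)
    index = len(stack) + shift
    stack[:] = stack[:index - 1] + [line.strip()]
    last_indent = indent
    print("\t".join(stack))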

2017-08-30 16:17:12
