2015-01-21 76 views
-1

Python randomizer - get random text from curly braces with two levels of nesting

Hey, I need to create a simple Python randomizer. Example input:

{{hey|hello|hi}|{privet|zdravstvuy|kak dela}|{bonjour|salut}}, can {you|u} give me advice? 

And the output should be something like:

hello, can you give me advice 

I have a script that can do this, but only for one level of nesting:

import random
import re

with open('text.txt', 'r') as text:
    # matches only runs of text that contain no braces, so nesting is lost
    matches = re.findall(r'([^{}]+)', text.read())
words = []
for match in matches:
    parts = match.split('|')
    if parts[0]:
        words.append(random.choice(parts))
message = ''.join(words)

That isn't enough for me :)

+0

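It seems to me that your input follows a grammar that is a bit too complex for a simple regex. I'd say build a proper lexer, called by a parser, to produce the output. If you're not familiar with the concept, I suggest reading up on the theory first :) – 2015-01-21 11:28:34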
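
For what it's worth, here is a minimal sketch of what a lexer plus a small recursive-descent parser could look like for this brace syntax. The names tokenize, parse_sequence, parse_group, render and spin are made up for illustration and do not come from the thread. It assumes '{', '}' and '|' are always syntax (no escaping) and treats a missing closing brace as if it sat at the end of the input.

```python
import random
import re

# Lexer: a token is a single '{', '}' or '|', or a run of any other characters.
TOKEN_RE = re.compile(r'[{}|]|[^{}|]+')

def tokenize(text):
    return TOKEN_RE.findall(text)

def parse_sequence(tokens, pos, inside_group):
    """Parse plain text and groups; inside a group, stop at '|' or '}'.

    Returns (nodes, new_pos). A node is either a str or a list of
    alternatives, where each alternative is itself a list of nodes.
    """
    nodes = []
    while pos < len(tokens):
        tok = tokens[pos]
        if tok == '{':
            group, pos = parse_group(tokens, pos + 1)
            nodes.append(group)
        elif inside_group and tok in ('|', '}'):
            break
        else:
            nodes.append(tok)  # plain text, or a stray '|' / '}' at top level
            pos += 1
    return nodes, pos

def parse_group(tokens, pos):
    """Parse the alternatives that follow an opening '{'."""
    alternatives = []
    while True:
        seq, pos = parse_sequence(tokens, pos, inside_group=True)
        alternatives.append(seq)
        if pos < len(tokens) and tokens[pos] == '|':
            pos += 1  # skip the pipe and parse the next alternative
            continue
        if pos < len(tokens):
            pos += 1  # skip the closing '}'
        return alternatives, pos  # end of input: tolerate the missing '}'

def render(nodes, rng=random):
    """Walk the tree, picking one alternative at random for every group."""
    out = []
    for node in nodes:
        if isinstance(node, str):
            out.append(node)
        else:
            out.append(render(rng.choice(node), rng))
    return ''.join(out)

def spin(text, rng=random):
    nodes, _ = parse_sequence(tokenize(text), 0, inside_group=False)
    return render(nodes, rng)

print(spin('{{hey|hello|hi}|{privet|zdravstvuy|kak dela}|{bonjour|salut}}, '
           'can {you|u} give me advice?'))
```

Parsing once into a tree means the same template can be rendered many times without re-scanning the text, which is probably overkill for a one-off script.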

+1

You're looking for recursive regex matching. See: http://stackoverflow.com/questions/1656859/how-can-a-recursive-regexp-be-implemented-in-python – 2015-01-21 11:29:30

+0

@KarelKubat Oh no, I don't need that. I just want to get random text from curly braces that contain other curly braces – 2015-01-21 11:29:57

Answers

2

Python's built-in re module doesn't support nested (recursive) structures, so you'll have to find some other way to parse the string.

Here's my quick kludge:

import random

def randomize(text):
    start = text.find('{')
    if start == -1:  # no curly braces, so there's nothing to randomize
        return text

    # parse the choices we have
    end = start
    word_start = start + 1
    nesting_level = 0
    choices = []  # list of |-separated values
    while True:
        end += 1
        try:
            char = text[end]
        except IndexError:
            break  # no matching closing brace: pretend there is one at the end
        if char == '{':
            nesting_level += 1
        elif char == '}':
            if nesting_level == 0:  # matching closing brace found - stop parsing
                break
            nesting_level -= 1
        elif char == '|' and nesting_level == 0:
            # put all text up to this pipe into the list
            choices.append(text[word_start:end])
            word_start = end + 1
    # there's no pipe character after the last choice, so add it to the list now
    choices.append(text[word_start:end])
    # recursively call this function on each choice
    choices = [randomize(t) for t in choices]
    # return the text up to the opening brace, a randomly chosen string, and
    # the randomized text after the closing brace
    return text[:start] + random.choice(choices) + randomize(text[end+1:])
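
The function above randomizes every alternative before choosing one, so most of that work is thrown away. Below is a sketch of a variant that picks an alternative first and only then recurses into it. The helper names split_top_level, find_matching_brace and randomize_lazy are made up for this example, and the optional rng parameter is an added assumption so that a seeded random.Random can be passed in for repeatable output.

```python
import random

def split_top_level(body):
    """Split on '|' characters that are not inside nested braces."""
    parts, depth, start = [], 0, 0
    for i, ch in enumerate(body):
        if ch == '{':
            depth += 1
        elif ch == '}':
            depth -= 1
        elif ch == '|' and depth == 0:
            parts.append(body[start:i])
            start = i + 1
    parts.append(body[start:])  # the last alternative has no trailing pipe
    return parts

def find_matching_brace(text, start):
    """Return the index of the '}' matching the '{' at start, or len(text)."""
    depth = 0
    for i in range(start, len(text)):
        if text[i] == '{':
            depth += 1
        elif text[i] == '}':
            depth -= 1
            if depth == 0:
                return i
    return len(text)  # unbalanced input: treat the end as the closing brace

def randomize_lazy(text, rng=random):
    start = text.find('{')
    if start == -1:
        return text
    end = find_matching_brace(text, start)
    # choose one raw alternative, then expand only that one
    chosen = rng.choice(split_top_level(text[start + 1:end]))
    return (text[:start]
            + randomize_lazy(chosen, rng)
            + randomize_lazy(text[end + 1:], rng))

rng = random.Random(42)  # seeded only to make runs repeatable
print(randomize_lazy('{{hey|hello|hi}|{bonjour|salut}}, can {you|u} help?', rng))
```

The output distribution should be the same as with the original function, since each group is still resolved by one uniform choice among its top-level alternatives.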
+0

Oh, thank you! :) – 2015-01-21 12:47:26

1

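As I said earlier, nesting is basically useless here, but if you want to keep your current syntax, one way to handle it is to replace the innermost braces in a loop until none are left: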

import random
import re

msg = '{{hey|hello|hi}|{privet|zdravstvuy|kak dela}|{bonjour|salut}}, can {you|u} give me advice?'

# each pass replaces every innermost {...} group with one of its options
while re.search(r'{.*}', msg):
    msg = re.sub(
        r'{([^{}]*)}',
        lambda m: random.choice(m.group(1).split('|')),
        msg)

print(msg)
# e.g. zdravstvuy, can u give me advice?
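
If you want to reuse the loop above, it could be wrapped in a function along these lines. This is only a sketch: the name spin_regex and the rng parameter are made up here, and it uses re.subn to stop as soon as a pass makes no replacement, instead of running a separate re.search first.

```python
import random
import re

# matches a brace group that contains no other braces, i.e. an innermost one
INNERMOST = re.compile(r'\{([^{}]*)\}')

def spin_regex(msg, rng=random):
    while True:
        # subn returns the new string and the number of replacements made
        msg, count = INNERMOST.subn(
            lambda m: rng.choice(m.group(1).split('|')), msg)
        if count == 0:  # no brace groups left (or only unbalanced braces)
            return msg

print(spin_regex('{{hey|hello|hi}|{bonjour|salut}}, can {you|u} give me advice?'))
```

Unbalanced braces are simply left in the output, because the pattern never matches them.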