2016-08-20 112 views

Convert keys to values from a dictionary

I have a TXT file with the following contents:

water=45 
melon=8 
apple=35 
pineapple=67 
I=43 
to=90 
eat=12 
tastes=100 
sweet=21 
it=80 
watermelon=98 
want=70 

And I have another file with the following text:

I want to eat watermelon 
it tastes sweet pineapple 

I want the output to be:

I want to eat watermelon = 43,70,90,12,98 
it tastes sweet pineapple = 80,100,21,67 

This is what I have so far:

import nltk 
f = open(r'C:\folder\dic\file.txt','r') 
answer = {} 
for line in f: 
    k, v = line.strip().split('=') 
    answer[k.strip()] = v.strip() 

f.close() 

print answer.values() 

h = open(r'C:\folder\dic\file2.txt','r') 
raw=h.read() 
tokens = nltk.sent_tokenize(raw) 
text = nltk.Text(tokens) 


for line in text: 
    word = line 
    for value in answer.values(): 
     if value == word: 
      word=answer[keys] 
     else: 
      word="not found" 

print word 

What is the best way to do this in Python?


`[answer[c] for c in 'adfac']` –

Answers


Please check this code.

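import re

# Build the word -> value lookup from lines such as "water=45".
f = open(r'C:\Users\dinesh_pundkar\Desktop\val.txt','r')
val_dict = {}
for line in f:
    k, v = line.strip().split('=')
    val_dict[k.strip()] = v.strip()
f.close()

print val_dict

# Read the sentences to convert, one per line.
h = open(r'C:\Users\dinesh_pundkar\Desktop\str_txt.txt','r')
str_list = []
for line in h:
    str_list.append(str(line).strip())
h.close()

print str_list

for val in str_list:
    tmp_str = val
    for k in val_dict.keys():
        if k in val:
            replace_str = str(val_dict[k]).strip() + ","
            # \b limits the match to whole words, so "water" is not replaced inside "watermelon".
            # re.escape keeps keys containing regex metacharacters from being misread as patterns.
            tmp_str = re.sub(r'\b{0}\b'.format(re.escape(k)), replace_str, tmp_str, flags=re.IGNORECASE)

    # Drop the comma left after the last replaced word.
    tmp_str = tmp_str.strip(",")
    print val, " = ", tmp_str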

Output:

C:\Users\dinesh_pundkar\Desktop>python demo.py 
{'apple': '35', 'I': '43', 'sweet': '21', 'it': '80', 'water': '45', 'to': '90', 
'taste': '100', 'watermelon': '98', 'want': '70', 'pineapple': '67', 'melon': ' 
8', 'eat': '12'} 
['I want to eat watermelon', 'it taste sweet pineapple'] 
I want to eat watermelon = 43, 70, 90, 12, 98 
it taste sweet pineapple = 80, 100, 21, 67 
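
As an alternative to substituting inside the string, each line can be split on whitespace and every word looked up directly. The following is only a minimal sketch in Python 3 (the code above is Python 2). The helper names `load_values` and `convert_line` are hypothetical, and in-memory lists stand in for the two files from the question.

```python
def load_values(lines):
    """Parse 'word=number' lines into a dict."""
    values = {}
    for line in lines:
        key, sep, value = line.partition("=")
        if sep:  # skip blank lines and lines without '='
            values[key.strip()] = value.strip()
    return values


def convert_line(sentence, values, missing="not found"):
    """Look up each whitespace-separated word and format the result line."""
    words = sentence.split()
    numbers = [values.get(word, missing) for word in words]
    return "{} = {}".format(" ".join(words), ",".join(numbers))


# Stand-ins for the contents of the two files described in the question.
dictionary_lines = [
    "water=45", "melon=8", "apple=35", "pineapple=67",
    "I=43", "to=90", "eat=12", "tastes=100",
    "sweet=21", "it=80", "watermelon=98", "want=70",
]
sentences = ["I want to eat watermelon", "it tastes sweet pineapple"]

values = load_values(dictionary_lines)
results = [convert_line(sentence, values) for sentence in sentences]
for result in results:
    print(result)
```

Because `load_values` accepts any iterable of lines, an open file object can be passed in place of the list. The lookup here is case-sensitive, and words missing from the dictionary show up as the `missing` placeholder.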

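Thanks Dinesh, it works, but there is one problem: if the numbers have two digits, for example a = 23, b = 56, c = 21, the result is split digit by digit with commas: 2,3,5,6,2,1. –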


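@RiskaNanda - Please check the edited code now. Earlier, I replaced 'a' with '26' and so on, and then split the string and joined it with commas ','. That is why '26' was being converted to '2,6'. Now I replace 'a' with '1' and 'z' with '26', and then remove the ',' from the end. –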
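
The earlier revision of the answer is not shown in this thread, so the following Python 3 snippet is only an illustrative reconstruction of the effect described in that comment. It uses the a/b/c values from the comment above it; the variable names are hypothetical.

```python
values = {"a": "23", "b": "56", "c": "21"}
text = "abc"

# Substituting first and then joining character by character
# puts a comma between every digit.
substituted = "".join(values[ch] for ch in text)
split_per_character = ",".join(substituted)

# Joining one value per key keeps multi-digit numbers intact.
joined_per_value = ",".join(values[ch] for ch in text)

print(split_per_character)  # 2,3,5,6,2,1
print(joined_per_value)     # 23,56,21
```

The difference is only in what gets joined: the characters of the already substituted string in the first case, the list of looked-up values in the second.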


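I tried another case, with this data for val_dict: 'water=45 melon=8 apple=35 pineapple=67 I=43 to=90 eat=12 tastes=100 sweet=21 it=80 watermelon=98 want=70' and this data for str_list: 'I want to eat watermelon it tastes sweet pineapple'. The result I got was: 'I want to eat watermelon = 43,70,90,12, 45,8 it tastes sweet pineapple = 43,t 100,s 21,p43,ne35'. Is it possible to match by whole word? –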
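
The result in that comment is what substring matching produces: "water" and "melon" are found inside "watermelon", and "apple" inside "pineapple". One way to match whole words only is a single regular expression with `\b` on both sides of an alternation of the keys. This is a hedged sketch in Python 3, not the answerer's code; the function name `whole_word_values` is hypothetical, and matching is deliberately case-sensitive.

```python
import re

values = {
    "water": "45", "melon": "8", "apple": "35", "pineapple": "67",
    "I": "43", "to": "90", "eat": "12", "tastes": "100",
    "sweet": "21", "it": "80", "watermelon": "98", "want": "70",
}


def whole_word_values(sentence, values):
    """Return the values of all dictionary words found as whole words."""
    alternatives = "|".join(re.escape(key) for key in values)
    # \b on both sides: a key only matches when it is a complete word.
    pattern = re.compile(r"\b(?:{})\b".format(alternatives))
    return ",".join(values[word] for word in pattern.findall(sentence))


# Substring test versus whole-word test for the same pair of strings.
substring_hit = "water" in "watermelon"
whole_word_hit = re.search(r"\bwater\b", "watermelon")

first = whole_word_values("I want to eat watermelon", values)
second = whole_word_values("it tastes sweet pineapple", values)
print(first)
print(second)
```

Words that are not in the dictionary are silently skipped in this version, since `findall` only returns the keys it matched.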