2017-11-10 132 views

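Replacing values in a DataFrame column using a dictionary

I have a DataFrame column of strings. I want to replace specific words in those strings with values from another DataFrame, which maps each word to be replaced (the synonym) to its replacement. I am currently using iterrows(), which takes about 2 minutes for 25,000 rows. Is there a more efficient way to do this?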

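import pandas as pd

syn = pd.ExcelFile("C:/Key-Value.xlsx")
df_syn = syn.parse("Keys")

# One full pass over df['col'] per synonym row
for idx, row in df_syn.iterrows():
    # regex=True is required on pandas 2.0+, where str.replace is literal by default
    df['col'] = df['col'].str.replace(r"\b"+row['synonym']+r"\b", row['word'], regex=True)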

Answer

If I understand correctly:

Setup

df_syn = pd.DataFrame(dict(synonym=['hug', 'kiss'], word=['warm', 'tender'])) 
df = pd.DataFrame(dict(col=['I want a hug', 'a kiss would be great'])) 

print(df_syn, df, sep='\n\n') 

  synonym    word
0     hug    warm
1    kiss  tender

                     col
0           I want a hug
1  a kiss would be great

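# Wrap each synonym in \b word boundaries, then build {pattern: replacement}
mapping = df_syn.assign(
    synonym=df_syn.synonym.radd(r'\b').add(r'\b')
).set_index('synonym').word.to_dict()

# Apply every pattern to 'col'; this returns a new DataFrame
df.replace({'col': mapping}, regex=True)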

                       col
0            I want a warm
1  a tender would be great
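
If the synonyms may contain regex metacharacters, or the list is long, a variation worth trying is to compile all synonyms into one alternation pattern and look up each match in a plain dict. This is only a sketch under assumptions, not something from the original answer: the names `lookup`, `pattern` and `result` are made up for illustration, the synonyms are assumed to start and end with word characters so that `\b` behaves as expected, and whether it is actually faster on your data is something you would need to time yourself.

```python
import re
import pandas as pd

df_syn = pd.DataFrame(dict(synonym=['hug', 'kiss'], word=['warm', 'tender']))
df = pd.DataFrame(dict(col=['I want a hug', 'a kiss would be great']))

# Plain dict: synonym -> replacement word (hypothetical helper name)
lookup = dict(zip(df_syn['synonym'], df_syn['word']))

# One pattern for all synonyms. re.escape guards against regex
# metacharacters; longest-first ordering lets longer synonyms win.
alternation = '|'.join(
    re.escape(s) for s in sorted(lookup, key=len, reverse=True)
)
pattern = re.compile(r'\b(?:' + alternation + r')\b')

# Each string is scanned once; every match is swapped via the dict
result = pd.Series(
    [pattern.sub(lambda m: lookup[m.group(0)], s) for s in df['col']],
    index=df.index,
)
print(result)
```

Because the pattern is built once and each string is scanned a single time, the work no longer grows with one full column pass per synonym, which is what the iterrows() loop in the question does.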