2014-09-21 104 views
1

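Match lines between two files and mark the matching strings

Given two files A and B, is there a way to edit the font, color, etc. of the strings that match in both files wherever a string in B overlaps with a string in A? Strings that do not match should be left as they are, so the output file should stay the same length as the input.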

Example:

File A

NM_134083 mmu-miR-96-5p NM_134083  0.96213 -0.054 
NM_177305 mmu-miR-96-5p NM_177305  0.95707 -0.099 
NM_026184 mmu-miR-93-3p NM_026184  0.9552 -0.01 

File B

NM_134083 
NM_177305 
NM_17343052324 

Output

**NM_134083** mmu-miR-96-5p **NM_134083**  0.96213 -0.054 
**NM_177305** mmu-miR-96-5p **NM_177305**  0.95707 -0.099 
+2

How am I supposed to picture this? Is there no example? – user1767754 2014-09-21 13:22:36

+1

Please follow this: http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example – lilster 2014-09-21 13:54:33

+0

Why is this tagged R? – 2014-09-21 15:24:25

Answers

1

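You gave raw text and did not specify what kind of formatting you want. Leaving the formatting details aside: yes, you can replace the text in FileA that also appears in FileB with formatted content.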

import re

# Read both files, dropping surrounding whitespace from every line.
with open('fileA.txt') as A:
    A_content = [x.strip() for x in A]
with open('fileB.txt') as B:
    # Skip blank lines: an empty pattern would match at every position.
    B_content = [x.strip() for x in B if x.strip()]

output = []
for line_A in A_content:
    for line_B in B_content:
        # Do whatever formatting you need on the text;
        # I am just surrounding it with *'s here.
        replace = "**" + line_B + "**"

        # Use re.sub, details here: https://docs.python.org/2/library/re.html#re.sub
        # re.escape makes the string from file B match literally.
        line_A = re.sub(re.escape(line_B), replace, line_A)
    # I am adding everything to the output array, but you can check whether
    # the line differs from the initial content. I leave that for you to do.
    output.append(line_A)

Output

**NM_134083** mmu-miR-96-5p **NM_134083**  0.96213 -0.054 
**NM_177305** mmu-miR-96-5p **NM_177305**  0.95707 -0.099 
NM_026184 mmu-miR-93-3p NM_026184  0.9552 -0.01 
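
The loop above matches substrings, so an ID from file B would also be marked when it occurs inside a longer ID in file A. Below is a minimal sketch of a stricter variant, assuming the IDs are whitespace-delimited tokens. The function name `mark_matches` and its `left`/`right` parameters are made up for illustration and are not part of the original answer; the sample lists stand in for the contents of the two files.

```python
import re

def mark_matches(lines, ids, left="**", right="**"):
    """Wrap each whole-token occurrence of an ID with the given markers."""
    ids = [i for i in ids if i]  # ignore blank entries
    if not ids:
        return list(lines)
    # Longest IDs first, so a longer ID wins over one that is its prefix.
    alternation = "|".join(
        re.escape(i) for i in sorted(ids, key=len, reverse=True)
    )
    # (?<!\S) and (?!\S) require whitespace or a line edge around the match.
    pattern = re.compile(r"(?<!\S)(?:" + alternation + r")(?!\S)")
    return [
        pattern.sub(lambda m: left + m.group(0) + right, line)
        for line in lines
    ]

# Sample data copied from the question (assumed to be already read from disk).
file_a = [
    "NM_134083 mmu-miR-96-5p NM_134083  0.96213 -0.054",
    "NM_177305 mmu-miR-96-5p NM_177305  0.95707 -0.099",
    "NM_026184 mmu-miR-93-3p NM_026184  0.9552 -0.01",
]
file_b = ["NM_134083", "NM_177305", "NM_17343052324"]

marked = mark_matches(file_a, file_b)
for line in marked:
    print(line)
```

Compiling one alternation and substituting in a single pass also avoids re-scanning text that an earlier replacement has already wrapped in markers.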
+0

How would the bold formatting be displayed? – user3741035 2014-09-21 16:32:21
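
A plain text file carries no font information, so how bold text shows up depends on where the output is viewed. As one possible sketch, assuming the output is printed to a terminal that honors ANSI escape sequences, the `**...**` markers can be turned into bold text. The helper name `to_ansi_bold` is hypothetical; for HTML output the same substitution could emit `<b>` tags instead.

```python
import re

BOLD = "\033[1m"   # ANSI SGR code: bold on
RESET = "\033[0m"  # ANSI SGR code: reset all attributes

def to_ansi_bold(text):
    """Replace each **...** pair with ANSI bold escape sequences."""
    return re.sub(
        r"\*\*(.+?)\*\*",
        lambda m: BOLD + m.group(1) + RESET,
        text,
    )

line = "**NM_134083** mmu-miR-96-5p **NM_134083**  0.96213 -0.054"
print(to_ansi_bold(line))
```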