2017-04-25 131 views

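Split on multiple delimiters while keeping the delimiters as dictionary keys

I want to split a text on several given delimiters. However, instead of getting back a plain list, I want to keep the delimiter that precedes each piece of text as a dictionary key.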

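I tried applying my list of delimiters one after another, but that produced a list of lists, so I switched to regular expressions (re) instead. With re, however, I can no longer tell which delimiter each piece came after. Is there a way to split the string on the delimiters while keeping them as keys?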
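To illustrate the "list of lists" problem, here is a minimal sketch of what a sequential attempt might look like. The original sequential code was not shown, so this is a hypothetical reconstruction on a short made-up `text`, not the real abstract.

```python
# Hypothetical sample text and delimiters, for illustration only.
text = "BACKGROUND\nfoo\nMETHODS\nbar\n"
delimiters = ["BACKGROUND\n", "METHODS\n"]

# First pass: split the whole text on the first delimiter.
pieces = text.split(delimiters[0])

# Second pass: each piece is split again, so every piece becomes its own list.
nested = [piece.split(delimiters[1]) for piece in pieces]

print(nested)
```

Each further delimiter adds another level of nesting, and the delimiters themselves are discarded along the way.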

Here is my current solution using re, which returns a list.

import re 

abstract = """ 
BACKGROUND\nN-Methyl-D-aspartate (NMDA) receptors are glutamate-activated ion channels that are assembled from NR1 and NR2 subunits. 
These receptors are highly enriched in brain neurons and are considered to be an important target for the acute and chronic effects of ethanol. 
NR2 subunits (A-D) arise from separate genes and are expressed in a developmental and brain region-specific manner. 
The NR1 subunit has 8 isoforms that are generated by alternative splicing of a single gene. 
The heteromeric subunit makeup of the NMDA receptor determines the pharmacological and biophysical properties of the receptor and provides for functional receptor heterogeneity. 
Although results from previous studies suggest that NR2 subunits affect the ethanol sensitivity of NMDA receptors, the role of the NR1 subunit and its multiple splice variants is less well known. 
\n\n\nMETHODS\nIn this study, all 8 NR1 splice variants were individually coexpressed with each NR2 subunit in human embryonic kidney 293 (HEK293) cells and tested for inhibition by ethanol using patch-clamp electrophysiology. 
\n\n\nRESULTS\nAll 32 subunit combinations tested gave reproducible glutamate-activated currents and all receptors were inhibited to some degree by 100 mM ethanol. 
The sensitivity of individual receptors to ethanol was affected by the specific NR1 splice variant expressed with receptors containing the NR1-3 and NR1-4 subunits among the least inhibited by ethanol. 
\n\n\nCONCLUSIONS\nThese results suggest that regional, developmental, or compensatory changes in the expression of NR1 splice variants may significantly affect ethanol inhibition of NMDA receptors. 
""" 

delimiters = ['BACKGROUND\n', 'CONCLUSIONS\n', 'OBJECTIVES\n', 
       'METHODS\n', 'OBJECTIVE\n', 'RESULTS\n'] 

sections = re.split('|'.join(delimiters), abstract) 

Output

['', 
'N-Methyl-D-as ...', 
'In this study, ...', 
'All 32 subunit ...', 
...] 

Desired output

{'BACKGROUND\n': 'N-Methyl-D-as ...', 
'METHODS\n': 'In this study, ...', 
...} 

Answer

import re 

abstract = """ 
BACKGROUND\nN-Methyl-D-aspartate (NMDA) receptors are glutamate-activated ion channels that are assembled from NR1 and NR2 subunits. 
These receptors are highly enriched in brain neurons and are considered to be an important target for the acute and chronic effects of ethanol. 
NR2 subunits (A-D) arise from separate genes and are expressed in a developmental and brain region-specific manner. 
The NR1 subunit has 8 isoforms that are generated by alternative splicing of a single gene. 
The heteromeric subunit makeup of the NMDA receptor determines the pharmacological and biophysical properties of the receptor and provides for functional receptor heterogeneity. 
Although results from previous studies suggest that NR2 subunits affect the ethanol sensitivity of NMDA receptors, the role of the NR1 subunit and its multiple splice variants is less well known. 
\n\n\nMETHODS\nIn this study, all 8 NR1 splice variants were individually coexpressed with each NR2 subunit in human embryonic kidney 293 (HEK293) cells and tested for inhibition by ethanol using patch-clamp electrophysiology. 
\n\n\nRESULTS\nAll 32 subunit combinations tested gave reproducible glutamate-activated currents and all receptors were inhibited to some degree by 100 mM ethanol. 
The sensitivity of individual receptors to ethanol was affected by the specific NR1 splice variant expressed with receptors containing the NR1-3 and NR1-4 subunits among the least inhibited by ethanol. 
\n\n\nCONCLUSIONS\nThese results suggest that regional, developmental, or compensatory changes in the expression of NR1 splice variants may significantly affect ethanol inhibition of NMDA receptors. 
""" 

delimiters = ['BACKGROUND\n', 'CONCLUSIONS\n', 'OBJECTIVES\n', 
       'METHODS\n', 'OBJECTIVE\n', 'RESULTS\n'] 

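# Build the alternation once; re.escape keeps each delimiter literal.
pattern = '|'.join(re.escape(d) for d in delimiters)
values = re.split(pattern, abstract)
values.pop(0)  # drop whatever precedes the first delimiter
keys = re.findall(pattern, abstract)  # delimiters in order of appearance
output = dict(zip(keys, values))  # pair each delimiter with the text after it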

print(output) 
# {'BACKGROUND\n': 'N-Methyl-D-aspartate (NMDA) receptors are glutamate-activated ion channels that are assembled from NR1 and NR2 subunits. \nThese receptors are highly enriched in brain neurons and are considered to be an important target for the acute and chronic effects of ethanol. \nNR2 subunits (A-D) arise from separate genes and are expressed in a developmental and brain region-specific manner. \nThe NR1 subunit has 8 isoforms that are generated by alternative splicing of a single gene. \nThe heteromeric subunit makeup of the NMDA receptor determines the pharmacological and biophysical properties of the receptor and provides for functional receptor heterogeneity. \nAlthough results from previous studies suggest that NR2 subunits affect the ethanol sensitivity of NMDA receptors, the role of the NR1 subunit and its multiple splice variants is less well known.\n\n\n\n', 'METHODS\n': 'In this study, all 8 NR1 splice variants were individually coexpressed with each NR2 subunit in human embryonic kidney 293 (HEK293) cells and tested for inhibition by ethanol using patch-clamp electrophysiology.\n\n\n\n', 'RESULTS\n': 'All 32 subunit combinations tested gave reproducible glutamate-activated currents and all receptors were inhibited to some degree by 100 mM ethanol. \nThe sensitivity of individual receptors to ethanol was affected by the specific NR1 splice variant expressed with receptors containing the NR1-3 and NR1-4 subunits among the least inhibited by ethanol.\n\n\n\n', 'CONCLUSIONS\n': 'These results suggest that regional, developmental, or compensatory changes in the expression of NR1 splice variants may significantly affect ethanol inhibition of NMDA receptors.\n'} 
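As a possible alternative, `re.split` also returns the delimiters themselves when the pattern is wrapped in a capturing group, so a single pass can produce both keys and values. The sketch below assumes that behaviour. The helper name `split_to_dict` and the short `sample` text are made up for illustration and are not part of the original answer.

```python
import re


def split_to_dict(text, delimiters):
    """Hypothetical helper: map each delimiter found to the text that follows it."""
    # The capturing group makes re.split keep the matched delimiters.
    pattern = "(" + "|".join(re.escape(d) for d in delimiters) + ")"
    parts = re.split(pattern, text)
    # parts looks like [leading_text, delim1, body1, delim2, body2, ...],
    # so delimiters sit at odd indices and their bodies at the following even ones.
    return dict(zip(parts[1::2], parts[2::2]))


# Made-up sample text, much shorter than the real abstract.
sample = "BACKGROUND\nfoo\n\nMETHODS\nbar\n"
result = split_to_dict(sample, ["BACKGROUND\n", "METHODS\n", "RESULTS\n"])
print(result)
```

Both this sketch and the `findall` approach build a dictionary, so if the same delimiter occurs more than once in the text, only the last section for it is kept.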

Ah, thanks @Philip! I'll be able to accept your answer shortly! – titipata