2016-07-26 59 views
-1

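R gsub: remove a pattern and return the result as a data frame

I have the data frame below, whose cells contain strings and numbers, and I want to remove the characters that follow the value EE. However, the output of gsub or sub is a vector, not a data frame.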

Input:

C01 C02 C03 C04 C05 C06 C07 C08 
98 EE|0.3302 EE|0.3302 EE|0.3302 EE|0.3302 EE|0.3302 EE|0.3302 EE|0.3302 EE|0.3302 
99 EE|0.4050 EE|0.4050 EE|0.4050 EE|0.4050 EE|0.4050 EE|0.3818 EE|0.4050 EE|0.4050 
100 EE|0.2199 EE|0.0000 EE|0.2199 EE|0.2176 EE|0.2199 EE|0.2199 EE|0.2199 EE|0.2199 
102 EE|0.3449 EE|0.3449 EE|0.3449 EE|0.3449 EE|0.3449 EE|0.3449 EE|0.3449 EE|0.3449 
105 EE|0.6669 EE|0.6669 EE|0.6669 EE|0.6669 EE|0.6669 EE|0.6669 EE|0.6669 EE|0.6669 
107 EE|0.8352 EE|0.8352 EE|0.8352 EE|0.8352 EE|0.8352 EE|0.8352 EE|0.8352 EE|0.8352 
108 EE|0.4309 EE|0.4309 EE|0.4309 EE|0.4309 EE|0.4309 EE|0.4309 EE|0.4309 EE|0.4309 
109 EE|0.5634 EE|0.5634 EE|0.5634 EE|0.5634 EE|0.5634 EE|0.5634 EE|0.5634 EE|0.5634 
110 EE|0.5969 EE|0.5969 EE|0.5969 EE|0.5969 EE|0.5969 EE|0.5969 EE|0.5969 EE|0.5969 
111 EE|0.6486 EE|0.6486 EE|0.6486 EE|0.6486 EE|0.6486 EE|0.6486 EE|0.6486 EE|0.6486 
112 EE|0.3616 EE|0.3616 EE|0.3616 EE|0.3616 EE|0.3616 EE|0.3616 EE|0.3616 EE|0.3616 
113 EE|0.3770 EE|0.3770 EE|0.3770 EE|0.3770 EE|0.3770 EE|0.3770 EE|0.3770 EE|0.3770 
114 EE|0.3616 EE|0.3616 EE|0.3616 EE|0.3616 EE|0.3616 EE|0.3616 EE|0.3616 EE|0.3616 
115 EE|0.3218 EE|0.3218 EE|0.3218 EE|0.3218 EE|0.3218 EE|0.3218 EE|0.3218 EE|0.3218 
116 EE|0.6402 EE|0.6402 EE|0.6402 EE|0.6402 EE|0.6402 EE|0.6402 EE|0.6402 EE|0.6402 
120 EE|0.2944 EE|0.2944 EE|0.2944 EE|0.2944 EE|0.2944 EE|0.2944 EE|0.2944 EE|0.2944 
121 EE|0.3616 EE|0.3616 EE|0.3616 EE|0.3616 EE|0.3616 EE|0.3616 EE|0.3616 EE|0.3616 

Output:

C01 C02 C03 C04 C05 C06 C07 C08 
98 EE EE EE EE EE EE EE EE 
99 EE EE EE EE EE EE EE EE 
100 EE EE EE EE EE EE EE EE 
102 EE EE EE EE EE EE EE EE 
105 EE EE EE EE EE EE EE EE 
107 EE EE EE EE EE EE EE EE 
108 EE EE EE EE EE EE EE EE 
109 EE EE EE EE EE EE EE EE 
110 EE EE EE EE EE EE EE EE 
111 EE EE EE EE EE EE EE EE 
112 EE EE EE EE EE EE EE EE 
113 EE EE EE EE EE EE EE EE 
114 EE EE EE EE EE EE EE EE 
115 EE EE EE EE EE EE EE EE 
116 EE EE EE EE EE EE EE EE 
120 EE EE EE EE EE EE EE EE 
121 EE EE EE EE EE EE EE EE 
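
For illustration, the behaviour described in the question can be reproduced on a two-row, two-column excerpt of the input. This is a minimal sketch: the name `df` and the reduced size are assumptions made for the example, and the columns are assumed to be character rather than factor. `sub` coerces a non-character argument with `as.character`, so a whole data frame collapses to one string per column and the table structure is lost.

```r
# Hypothetical two-column excerpt of the input (rows 98 and 99)
df <- data.frame(
  C01 = c("EE|0.3302", "EE|0.4050"),
  C02 = c("EE|0.3302", "EE|0.4050"),
  row.names = c("98", "99"),
  stringsAsFactors = FALSE
)

# Passing the whole data frame: sub() coerces it with as.character(),
# so the result is a plain character vector, not a data frame
res <- sub("\\|.*", "", df)

is.data.frame(res)        # FALSE
is.character(res)         # TRUE
length(res) == ncol(df)   # TRUE: one element per column, not per cell
```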
+3

`dat[] <- lapply(dat, gsub, pattern = "yourpattern", replacement = "")` – dayne

+2

`library(dplyr); df %>% mutate_all(sub, pattern = '\\|.*', replacement = '')` or `df %>% mutate_all(funs(sub('\\|.*', '', .)))` – alistaire

+0

`dat[] <- sub("\\|.*$", "", as.matrix(dat))` – Jota

Answer

3

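We can loop over the columns with lapply and use sub to match a literal | (escaped as \\|) followed by zero or more characters (.*), replacing the match with an empty string (''). Assigning the result to df1[] rather than to df1 keeps the data frame structure.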

df1[] <- lapply(df1, sub, pattern = "\\|.*", replacement = "") 
df1 
#  C01 C02 C03 C04 C05 C06 C07 C08 
#98 EE EE EE EE EE EE EE EE 
#99 EE EE EE EE EE EE EE EE 
#100 EE EE EE EE EE EE EE EE 
#102 EE EE EE EE EE EE EE EE 
#105 EE EE EE EE EE EE EE EE 
#107 EE EE EE EE EE EE EE EE 
#108 EE EE EE EE EE EE EE EE 
#109 EE EE EE EE EE EE EE EE 
#110 EE EE EE EE EE EE EE EE 
#111 EE EE EE EE EE EE EE EE 
#112 EE EE EE EE EE EE EE EE 
#113 EE EE EE EE EE EE EE EE 
#114 EE EE EE EE EE EE EE EE 
#115 EE EE EE EE EE EE EE EE 
#116 EE EE EE EE EE EE EE EE 
#120 EE EE EE EE EE EE EE EE 
#121 EE EE EE EE EE EE EE EE
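
As a quick check of the approach above, here is a self-contained sketch on a three-row excerpt of the data. The excerpt and the names `cleaned` and `as_list` are assumptions made for the example, and character columns are assumed. It contrasts assigning into `cleaned[]`, which replaces the column contents in place, with a plain assignment, which would leave only the list returned by `lapply`.

```r
# Hypothetical excerpt: rows 98, 99, 100 and columns C01, C06 of the input
df1 <- data.frame(
  C01 = c("EE|0.3302", "EE|0.4050", "EE|0.2199"),
  C06 = c("EE|0.3302", "EE|0.3818", "EE|0.2199"),
  row.names = c("98", "99", "100"),
  stringsAsFactors = FALSE
)

# Work on a copy so the original stays available for comparison
cleaned <- df1

# The empty brackets keep the data frame's class, dimensions and row names;
# only the column contents are replaced
cleaned[] <- lapply(cleaned, sub, pattern = "\\|.*", replacement = "")

# Without the brackets, the result is just the list returned by lapply()
as_list <- lapply(df1, sub, pattern = "\\|.*", replacement = "")

is.data.frame(cleaned)   # TRUE
is.data.frame(as_list)   # FALSE
```

Because `pattern` and `replacement` are passed by name, each column that `lapply` supplies is matched to the remaining argument `x` of `sub`.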