2015-10-16
0

Remove duplicate strings from a character vector

How can I efficiently remove the duplicates from this character vector?

> dput(data[1:30]) 
c("AT2G27020 AT3G26340", "AT1G56450 AT3G26340", "AT1G13060 AT3G26340", 
"AT3G22630 AT3G26340", "AT3G22110 AT3G26340", "AT2G05840 AT3G26340", 
"AT1G47250 AT3G26340", "AT1G79210 AT3G26340", "AT2G27020 AT5G40580", 
"AT3G27430 AT5G40580", "AT4G31300 AT5G40580", "AT3G14290 AT5G40580", 
"AT3G22630 AT5G40580", "AT3G22110 AT5G40580", "AT5G35590 AT5G40580", 
"AT2G05840 AT5G40580", "AT3G60820 AT5G40580", "AT1G79210 AT5G40580", 
"AT2G27020 AT3G27430", "AT2G27020 AT4G31300", "AT1G53850 AT2G27020", 
"AT2G27020 AT5G66140", "AT2G27020 AT3G51260", "AT1G21720 AT2G27020", 
"AT1G56450 AT2G27020", "AT1G13060 AT2G27020", "AT2G27020 AT3G22630", 
"AT2G27020 AT4G14800", "AT2G27020 AT3G22110", "AT2G27020 AT5G35590" 
) 

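I tried simple functions such as unique and duplicated, but unfortunately they did not work.

My mistake. By duplicates I mean identical AGIs, so it does not matter that some of them are stored together inside one "" string. I want each "ATXG..." to appear only once in my vector. At first I did not realize that the vector contained them as pairs... sorry.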

+3

What exactly did you do (code), and what did not work? 'unique' and 'duplicated' work on the _entire_ string. Which "duplicates" do you want to remove? – hrbrmstr

+1

Your example does not contain any duplicates... – Cath

+1

Your strings have the form "text1 text2". Do you want to check whether the two values are equal, i.e. 'text1 == text2'? –
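To illustrate the point raised in the comments, here is a minimal sketch of why 'unique' and 'duplicated' seem to do nothing on this data. It uses a hypothetical three-element vector called `pairs` (not the asker's full data): each element is a whole "ID ID" string, and no whole string repeats, even though individual IDs do.

```r
# Hypothetical subset of the vector from the question, for illustration only.
pairs <- c("AT2G27020 AT3G26340", "AT1G56450 AT3G26340", "AT1G56450 AT2G27020")

# duplicated() compares whole elements, so no pair is flagged.
any(duplicated(pairs))
# [1] FALSE

# Once the pairs are split on the space, repeated IDs become visible.
ids <- unlist(strsplit(pairs, " ", fixed = TRUE))
any(duplicated(ids))
# [1] TRUE
```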

Answers

3
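# x is the character vector from the question (data[1:30]).
# Split each "ID ID" string on the space, flatten the list, keep each ID once.
unique(unlist(strsplit(x, " ")))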
#[1] "AT2G27020" "AT3G26340" "AT1G56450" "AT1G13060" "AT3G22630" "AT3G22110" 
#[7] "AT2G05840" "AT1G47250" "AT1G79210" "AT5G40580" "AT3G27430" "AT4G31300" 
#[13] "AT3G14290" "AT5G35590" "AT3G60820" "AT1G53850" "AT5G66140" "AT3G51260" 
#[19] "AT1G21720" "AT4G14800"
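A slightly more defensive variant of the one-liner above, offered as a sketch rather than as part of the original answer. It assumes the IDs might be separated by more than one whitespace character or padded with leading or trailing blanks, which the question does not state. The helper name `unique_ids` and the sample vector are hypothetical.

```r
# Hypothetical helper: split every element on runs of whitespace and keep
# each ID once, in order of first appearance.
unique_ids <- function(x) {
  tokens <- unlist(strsplit(trimws(x), "[[:space:]]+"))
  unique(tokens[nzchar(tokens)])  # drop any empty tokens
}

# Made-up input with a double space and padding, for illustration only.
x <- c("AT2G27020 AT3G26340", "AT1G56450  AT3G26340", " AT1G56450 AT2G27020 ")
unique_ids(x)
# [1] "AT2G27020" "AT3G26340" "AT1G56450"
```

If the result should be in alphabetical order rather than order of first appearance, wrapping the call in `sort()` would do that.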