2016-01-06 127 views

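I have this table of character codes and counts, and I want to keep only the top 3 frequencies together with their labels. Based on the table below, the resulting table should contain only AZ 520, then AE 488, then AU 399 (the top/maximum values, in R).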

   Var1 Freq
1    AE  488
2    AR   12
3    AU  399
4    AW   56
5    AZ  520
6    BA    2
7    BB   84
8    BG  246
9    BH   85
10   BI    6




as.data.frame(table(training.data.raw$destinationcountry)) 

Answer


Recreating your data as follows, assuming the column names are name and value:

library(dplyr)  # load first: tibble() (successor to the deprecated data_frame()) is re-exported by dplyr
training.data.raw <- tibble(name = c("IN", "IS", "IT", "JO", "JP", "KZ", "MA", "MZ", "NG", "NO", "NZ", "PE", "PH", "PR", "RO", "RU", "SA", "SE", "SY", "TM", "TN", "TR", "UK", "US", "WS"),
                            value = c(999, 1, 1885, 1098, 2, 584, 858, 11, 10, 522, 193, 29, 2, 1, 1603, 353, 6, 2, 4, 33, 228, 3201, 852, 1363, 1))

You can use the top_n function from the dplyr package to easily get the result you want (see the help file ?top_n for details):

library(dplyr)
# With no wt argument, top_n() ranks by the last column (value here)
top_3 <- top_n(x = training.data.raw, n = 3)
top_3
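Note that top_n() only filters rows and leaves them in their original order, while the question asks for the largest value first. As a minimal sketch using base R only, and assuming the question's ten rows are stored under the column names name and value (the object names freq and top_3_sorted are made up for illustration), the sorted result could be obtained like this:

```r
# The ten rows shown in the question, stored under assumed column names
freq <- data.frame(
  name  = c("AE", "AR", "AU", "AW", "AZ", "BA", "BB", "BG", "BH", "BI"),
  value = c(488, 12, 399, 56, 520, 2, 84, 246, 85, 6),
  stringsAsFactors = FALSE
)

# Sort by value, largest first, then keep the first three rows
top_3_sorted <- head(freq[order(freq$value, decreasing = TRUE), ], 3)
top_3_sorted
```

This returns AZ, AE and AU in that order, matching the result requested in the question.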

Edit based on the comments: if your names are factors rather than a regular character vector, you can first mutate them to character:

training.data.characters <- mutate(training.data.raw, name = as.character(name))

# Now top_n() will take it
# You can also state the wt argument explicitly to rank by value
top_3 <- top_n(x = training.data.characters, n = 3, wt = value)
top_3
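The tbl_vars error reported in the comments mentions an object of class "factor", which suggests that a bare factor column, rather than a data frame, may have been passed to top_n(). This is a guess from the message alone. A hedged sketch of going from a raw factor column to the top 3 follows; the destinationcountry vector here is a small invented stand-in for training.data.raw$destinationcountry, and the names freq, name and value are assumptions:

```r
library(dplyr)

# Invented stand-in for training.data.raw$destinationcountry
destinationcountry <- factor(c(rep("AZ", 5), rep("AE", 4), rep("AU", 3),
                               "AR", "BA", "BA"))

# Build the frequency table as a data frame, keeping labels as characters
freq <- as.data.frame(table(destinationcountry), stringsAsFactors = FALSE)
names(freq) <- c("name", "value")

# Pass the whole data frame (not a single column) to top_n(), then sort
top_3 <- freq %>%
  top_n(n = 3, wt = value) %>%
  arrange(desc(value))
top_3
```

Setting the column names explicitly avoids depending on whether table() labels the first column Var1 or after the input variable.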

Thanks, but I get this message: 'Error in UseMethod("tbl_vars"): no applicable method for 'tbl_vars' applied to an object of class "factor"'


OK, that means your name variable is a factor. That's awkward. You can convert it with mutate first. I'll update the answer.


Thanks! I'll check it out.