2017-10-13

Converting numbers into a scale

I have a data frame that looks like this:

> example
                            Country Urban
1                       Afghanistan     0
2                           Albania    40
3                           Algeria    50
4                           Andorra    50
5                            Angola    60
6               Antigua and Barbuda    32
7                         Argentina    60
8                           Armenia    90
9                         Australia    50
10                          Austria    50
11                       Azerbaijan    60
12                          Bahrain    60
13                       Bangladesh     0
14                         Barbados    80
15                          Belarus    60
16                          Belgium    50
17                           Belize    40
18                            Benin
19                           Bhutan    30
20 Bolivia (Plurinational State of)    40

I want to recode the values in the range 0-49 as 2. So, after getting rid of the empty rows, I tried:

example <- as.data.frame(sapply(example, gsub, pattern = c(0:49), replacement = 2)) 

It did not work.

Here is a reproducible sample produced with dput:

structure(list(Country = structure(1:20, .Label = c("Afghanistan", 
"Albania", "Algeria", "Andorra", "Angola", "Antigua and Barbuda", 
"Argentina", "Armenia", "Australia", "Austria", "Azerbaijan", 
"Bahrain", "Bangladesh", "Barbados", "Belarus", "Belgium", "Belize", 
"Benin", "Bhutan", "Bolivia (Plurinational State of)"), class = "factor"), 
    Urban = structure(c(2L, 7L, 12L, 12L, 14L, 5L, 14L, 19L, 
    12L, 12L, 14L, 14L, 2L, 18L, 14L, 12L, 8L, 1L, 4L, 7L), .Label = c("", 
    "0", "100", "30", "32", "35", "40", "40 ", "45", "48", "48 ", 
    "50", "56 ", "60", "64 ", "65", "70", "80", "90"), class = "factor")), .Names = c("Country", 
"Urban"), row.names = c(NA, 20L), class = "data.frame") 

Answers


Use ifelse (after converting Urban from factor to numeric, as in the Data section below):

df$Urban = with(df, ifelse(Urban > 0 & Urban < 49, 2, Urban)) 

Result:

> df
                            Country Urban
1                       Afghanistan     0
2                           Albania     2
3                           Algeria    50
4                           Andorra    50
5                            Angola    60
6               Antigua and Barbuda     2
7                         Argentina    60
8                           Armenia    90
9                         Australia    50
10                          Austria    50
11                       Azerbaijan    60
12                          Bahrain    60
13                       Bangladesh     0
14                         Barbados    80
15                          Belarus    60
16                          Belgium    50
17                           Belize     2
18                            Benin    NA
19                           Bhutan     2
20 Bolivia (Plurinational State of)     2
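
Note that the condition above is strict at both ends (Urban > 0 & Urban < 49), so 0 and 49 themselves are left unchanged, as the Afghanistan and Bangladesh rows show. If the intent is the inclusive 0-49 range from the question, a minimal sketch could look like the following. The vector `urban` and its values are made up for illustration and are not the full data set:

```r
# Hypothetical illustrative values (not the full data set)
urban <- c(0, 40, 50, 32, 49, NA, 90)

# Inclusive bounds: 0 and 49 are recoded as well.
# Where the test is NA, ifelse() returns NA, so missing values stay missing.
recoded <- ifelse(urban >= 0 & urban <= 49, 2, urban)
recoded
```

Whether 0 should be recoded is a judgment call about the data; the question's wording suggests it should, but the original answer keeps it.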

Data:

df = structure(list(Country = structure(1:20, .Label = c("Afghanistan", 
                 "Albania", "Algeria", "Andorra", "Angola", "Antigua and Barbuda", 
                 "Argentina", "Armenia", "Australia", "Austria", "Azerbaijan", 
                 "Bahrain", "Bangladesh", "Barbados", "Belarus", "Belgium", "Belize", 
                 "Benin", "Bhutan", "Bolivia (Plurinational State of)"), class = "factor"), 
        Urban = structure(c(2L, 7L, 12L, 12L, 14L, 5L, 14L, 19L, 
             12L, 12L, 14L, 14L, 2L, 18L, 14L, 12L, 8L, 1L, 4L, 7L), .Label = c("", 
                              "0", "100", "30", "32", "35", "40", "40 ", "45", "48", "48 ", 
                              "50", "56 ", "60", "64 ", "65", "70", "80", "90"), class = "factor")), .Names = c("Country", 
                                                   "Urban"), row.names = c(NA, 20L), class = "data.frame") 

df$Urban = as.numeric(as.character(df$Urban)) 
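
The conversion above goes through as.character() because Urban is stored as a factor, and calling as.numeric() directly on a factor returns its internal level codes rather than the printed values. A small sketch of the difference, using a made-up factor `f` (an assumption for illustration only):

```r
# Hypothetical factor mimicking the Urban column, including an empty entry
f <- factor(c("0", "40", "50", "", "90"))

codes  <- as.numeric(f)                # internal level codes, not the values
values <- as.numeric(as.character(f))  # the actual numbers; "" becomes NA

codes
values
```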

Using extract and replace ('[]') is also an option: 'df$Urban[df$Urban %in% 0:49] <- 2' – AkselA
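
The approach in the comment above can be sketched on a small numeric vector as follows. The name `urban` and its values are hypothetical; the sketch assumes the column has already been converted to numeric. Since %in% never returns NA, missing values are simply left as they are, and because 0:49 contains only whole numbers, a value such as 12.5 would not be matched:

```r
# Hypothetical illustrative values, assumed already numeric
urban <- c(0, 40, 50, 32, 49, NA, 90)

# Replace every whole number from 0 to 49 (inclusive) with 2, in place.
# NA %in% 0:49 is FALSE, so the NA entry is not touched.
urban[urban %in% 0:49] <- 2
urban
```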