2016-07-30 65 views
Incorrect labels in a data column (R)

I have a dataset named full, and one of its columns is Breed, as shown below.

Breed 

Shetland Sheepdog Mix 
Domestic Shorthair Mix 
Pit Bull Mix 
Domestic Shorthair Mix 
Lhasa Apso/Miniature Poodle 
Cairn Terrier/Chihuahua Shorthair 
Domestic Shorthair Mix 
Domestic Shorthair Mix 
American Pit Bull Terrier Mix 
Cairn Terrier 
Domestic Shorthair Mix 
Miniature Schnauzer Mix 
Pit Bull Mix 
Yorkshire Terrier Mix 
Great Pyrenees Mix 
Domestic Shorthair Mix 
Domestic Shorthair Mix 
Pit Bull Mix 
Angora Mix 
Flat Coat Retriever Mix 
Queensland Heeler Mix 
Domestic Shorthair Mix 
Plott Hound/Boxer 

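What I need is the frequency of each unique value in this column.

I have extracted BreedType and its frequency as shown below. (The breed column is named BreedType.) Then, using an if condition, I need a new column that holds 'F' when the frequency of a BreedType is less than 66, and the value of the 'Breedtype' column when the frequency is greater than 66.

Assign FALSE to a breed where the breed frequency is less than 66: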

df$Breed <- data.frame(full$Breed) 

setDT(df) 
dt1 <- copy(df) 

dt1[, c("Frequency", "TrueFalse") := .(.N, ifelse(.N < 66, "FALSE", Breed)), by = Breed] 

dt1<-data.frame(dt1) 

But my result set comes out like this, together with an error.

enter image description here

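Error in `[.data.table`(dt1, , `:=`(c("Frequency", "TrueFalse"), .(.N, : Type of RHS ('integer') must match LHS ('character'). To check and coerce would impact performance too much for the fastest cases. Either change the type of the target column, or coerce the RHS of := yourself (e.g. by using 1L instead of 1)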

I have tried several times but cannot get the result I am looking for. Could someone please help?

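When I use full$Breed again instead, the result set looks like this. It is not what I expected, although the frequency is computed correctly: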

df$Breed <- data.frame(full$Breed) 

setDT(df) 
dt1 <- copy(df) 

dt1[, c("Frequency", "TrueFalse") := .(.N, ifelse(.N < 66, "FALSE", full$Breed)), by = full$Breed] 

dt1<-data.frame(dt1) 

Full<-cbind2(dt1, full) 

enter image description here

Can anyone help me figure out what the problem is?

Did you try `dt1[, c("Frequency", "TrueFalse") := .(.N, ifelse(.N < 66, FALSE, Breed)), by = Breed]` (omitting the quotes around `FALSE`)? – Jaap

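Yes, when I tested it, it gave the same kind of error: Error in `[.data.table`(dt1, , `:=`(c("Frequency", "TrueFalse"), .(.N, : Type of RHS ('logical') must match LHS ('character'). To check and coerce would impact performance too much for the fastest cases. Either change the type of the target column, or coerce the RHS of := yourself (e.g. by using 1L instead of 1) – user3789200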

I tested it on the example data, and the code **with the quotes** works on my PC. – Jaap
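
One possible explanation for the type mismatch reported above, assuming `Breed` is stored as a factor in `full` (the question does not confirm this): base R's `ifelse()` does not keep the factor class of its `yes`/`no` arguments, so a factor comes back as its integer codes. Within a grouped `:=`, some groups would then produce a character value and others an integer. The vector `breed` below is hypothetical illustration data, not the asker's.

```r
# Hypothetical two-element factor standing in for the Breed column
breed <- factor(c("Pit Bull Mix", "Domestic Shorthair Mix"))

# ifelse() drops the factor class: the result holds integer codes, not labels
codes <- ifelse(c(FALSE, FALSE), "FALSE", breed)
print(typeof(codes))

# Converting to character first keeps the labels
labels <- ifelse(c(FALSE, FALSE), "FALSE", as.character(breed))
print(labels)
```

If `Breed` is already a character column, this explanation does not apply and the cause lies elsewhere.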

Answers

You can use dplyr:

library(dplyr) 
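# Count rows per breed, then keep the breed name only when it occurs at least 66 times
df %>%
  group_by(Breed) %>%
  summarize(Frequency = n()) %>%
  mutate(TrueFalse = ifelse(Frequency < 66, "F", as.character(Breed)))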

which results in:

Source: local data frame [14 x 3]

    Breed                              Frequency  TrueFalse
    <fctr>                                 <int>  <chr>
 1  American Pit Bull Terrier Mix              4  F
 2  Angora Mix                                 2  F
 3  Cairn Terrier                              4  F
 4  Cairn Terrier/Chihuahua Shorthair          4  F
 5  Domestic Shorthair Mix                   519  Domestic Shorthair Mix
 6  Flat Coat Retriever Mix                    2  F
 7  Great Pyrenees Mix                         4  F
 8  Lhasa Apso/Miniature Poodle                4  F
 9  Miniature Schnauzer Mix                    4  F
10  Pit Bull Mix                              10  F
11  Plott Hound/Boxer                         73  Plott Hound/Boxer
12  Queensland Heeler Mix                      2  F
13  Yorkshire Terrier Mix                      4  F
14  Shetland Sheepdog Mix                     75  Shetland Sheepdog Mix

where df is:

df<-structure(list(Breed = structure(c(14L, 5L, 10L, 5L, 8L, 4L, 
5L, 5L, 1L, 3L, 5L, 9L, 10L, 13L, 7L, 5L, 5L, 10L, 2L, 6L, 12L, 
5L, 11L, 14L, 5L, 10L, 5L, 8L, 4L, 5L, 5L, 1L, 3L, 5L, 9L, 10L, 
13L, 7L, 5L, 5L, 10L, 2L, 6L, 12L, 5L, 11L, 14L, 5L, 10L, 5L, 
8L, 4L, 5L, 5L, 1L, 3L, 5L, 9L, 10L, 13L, 7L, 14L, 5L, 10L, 5L, 
8L, 4L, 5L, 5L, 1L, 3L, 5L, 9L, 10L, 13L, 7L, 5L, 11L, 14L, 5L, 
5L, 11L, 14L, 5L, 5L, 11L, 14L, 5L, 5L, 11L, 14L, 5L, 5L, 11L, 
14L, 5L, 5L, 11L, 14L, 5L, 5L, 11L, 14L, 5L, 5L, 11L, 14L, 5L, 
5L, 11L, 14L, 5L, 5L, 11L, 14L, 5L, 5L, 11L, 14L, 5L, 5L, 11L, 
14L, 5L, 5L, 11L, 14L, 5L, 5L, 11L, 14L, 5L, 5L, 11L, 14L, 5L, 
5L, 11L, 14L, 5L, 5L, 11L, 14L, 5L, 5L, 11L, 14L, 5L, 5L, 11L, 
14L, 5L, 5L, 11L, 14L, 5L, 5L, 11L, 14L, 5L, 5L, 11L, 14L, 5L, 
5L, 11L, 14L, 5L, 5L, 11L, 14L, 5L, 5L, 11L, 14L, 5L, 5L, 11L, 
14L, 5L, 5L, 11L, 14L, 5L, 5L, 11L, 14L, 5L, 5L, 11L, 14L, 5L, 
5L, 11L, 14L, 5L, 5L, 11L, 14L, 5L, 5L, 11L, 14L, 5L, 5L, 11L, 
14L, 5L, 5L, 11L, 14L, 5L, 5L, 11L, 14L, 5L, 5L, 11L, 14L, 5L, 
5L, 11L, 14L, 5L, 5L, 11L, 14L, 5L, 5L, 11L, 14L, 5L, 5L, 11L, 
14L, 5L, 5L, 11L, 14L, 5L, 5L, 11L, 14L, 5L, 5L, 11L, 14L, 5L, 
5L, 11L, 14L, 5L, 5L, 11L, 14L, 5L, 5L, 11L, 14L, 5L, 5L, 11L, 
14L, 5L, 5L, 11L, 14L, 5L, 5L, 11L, 14L, 5L, 5L, 11L, 14L, 5L, 
5L, 11L, 14L, 5L, 5L, 11L, 14L, 5L, 5L, 11L, 14L, 5L, 5L, 11L, 
14L, 5L, 5L, 11L, 14L, 5L, 5L, 11L, 14L, 5L, 5L, 11L, 14L, 5L, 
5L, 11L, 14L, 5L, 5L, 11L, 14L, 5L, 5L, 11L, 14L, 5L, 5L, 11L, 
14L, 5L, 5L, 11L, 14L, 5L, 5L, 11L, 14L, 5L, 5L, 11L, 14L, 5L, 
5L, 11L, 14L, 5L, 5L, 11L, 14L, 5L, 5L, 11L, 14L, 5L, 5L, 11L, 
14L, 5L, 5L, 11L, 14L, 5L, 5L, 11L, 14L, 5L, 5L, 11L, 14L, 5L, 
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L), .Label = c(" American Pit Bull Terrier Mix", 
" Angora Mix", " Cairn Terrier", " Cairn Terrier/Chihuahua Shorthair", 
" Domestic Shorthair Mix", " Flat Coat Retriever Mix", " Great Pyrenees Mix", 
" Lhasa Apso/Miniature Poodle", " Miniature Schnauzer Mix", " Pit Bull Mix", 
" Plott Hound/Boxer", " Queensland Heeler Mix", " Yorkshire Terrier Mix", 
"Shetland Sheepdog Mix"), class = "factor")), .Names = "Breed", class = "data.frame", row.names = c(NA, 
-711L))
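
For anyone who prefers to stay with data.table, as in the question, the same idea can be sketched as follows. This is only a minimal sketch under assumptions: the table `dt` and its toy contents are hypothetical, and `Breed` is assumed to be a factor. The count is added first, and the label is then derived in a separate ungrouped step with `as.character(Breed)`, so every row yields a character value.

```r
library(data.table)

# Hypothetical toy data: one frequent breed (70 rows) and one rare breed (3 rows)
dt <- data.table(
  Breed = factor(c(rep("Domestic Shorthair Mix", 70), rep("Pit Bull Mix", 3)))
)

# Step 1: per-breed row count, added by reference
dt[, Frequency := .N, by = Breed]

# Step 2: keep the breed name for frequent breeds, "F" otherwise;
# as.character() avoids getting the factor's integer codes back from ifelse()
dt[, TrueFalse := ifelse(Frequency < 66, "F", as.character(Breed))]

print(unique(dt))
```

Unlike the dplyr pipeline above, which returns one row per breed, this keeps one row per original observation, which is closer to what the question's code was attempting.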