2015-04-05

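How to find the second-highest value from output created in a loop in R

Beginning R programmer here. I want to find the person in a data frame who is most compatible with a given individual. Compatibility is based on an algorithm that assigns points according to certain values in the data frame. I have a data frame called kewl.d00dz, which looks like this: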

       name  dream.name birth.state birth.month birth.date major
1   stephen       butch          CO         oct         11  ELEC
2     clark     richard          VA         jan         19  BUAD
3   anthony          bo          NJ         mar         26  BUAD
4      jack     kordell          VA         jul         27  BUAD
5      eric      adrian          ND         jun         17  GEOG
6     tyler     anthony          VA         apr         12  CPSC
7    olivia    isabella          VA         may         29  MATH
8      brad      harvey          HI         aug         21  BUAD
9    hannah     charlie          VA         aug         28  PSYC
10     will      ronald          VA         may         11  BUAD
11     noor         ani          CA         apr         14  BUAD
12 victoria   elizabeth          VA         jan         11  MATH
13 morgan c      lauren          FL         jun         15  BUAD
14 morgan w   elizabeth          VA         feb         21  ARTS
15   helena      helena          VA         apr         26  BIOL
16    amber amber leigh          VA         dec          6  PSCI
17     ekta        kate          VA         apr         14  ARTH
18 caroline     georgia          DC         jun         20  BUAD
19     anna        abby          VA         sep         21  BUAD
20     nate       julio          VA         sep          5  ECON
21  jessica    jeanette          VA         oct          7  BUAD
22   shaina      skylar          VA         sep          2  BUAD
23     ruth        lucy          VA         jan          4  CPSC
24   sohyun    caroline       Seoul         nov         16  PSYC
25    aaron         don          VA         sep          1  ECON
26     alex        axel          VA         sep          6  BIOL
       cell num.bills num.states
1      none         5         41
2     apple         8         14
3     apple         4         14
4     apple        19         10
5     apple         6         19
6   samsung         1         10
7     apple         3          8
8     apple         1         18
9     apple         2         16
10    apple         5         20
11    apple         3         19
12    apple         5         17
13    apple         3         15
14    apple         4         24
15  android         0         18
16    apple         1         12
17    apple         1         19
18    apple         0         22
19    apple         0         27
20  samsung         4         32
21  samsung         5         11
22    apple         0         15
23    apple         7         30
24    apple        10         10
25 motorola         8         18
26      htc         3         20

I need to find the person most compatible with whoever I pass to my function, which looks like this:

source("compatibility.R") 
find.most.compatible<-function(x){ 
    a<-which(kewl.d00dz$name==x) 
    x<-as.list(kewl.d00dz[a,]) 
    pts<-list() 
    namez<-list() 
    for (i in 1:nrow(kewl.d00dz)){ 
        y<-as.list(kewl.d00dz[i,])
        pts[i]<-compatibility(x,y)
        namez[i]<-kewl.d00dz[i,"name"]
        names(pts)<-namez
    } 
    n<-length(pts) 
    (which(pts == sort(pts,partial=n-1)[n-1])) 
} 

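I want it to return the second-highest value, because if it returned the highest, each person would come out as most compatible with themselves. However, it gives me this error message: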

> find.most.compatible("stephen") 
02727312231332325212224261723292219149302611312321 
Error in sort.int(x, na.last = na.last, decreasing = decreasing, ...) : 
    'x' must be atomic 
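
For what it's worth, the error seems reproducible without the data frame. The sketch below uses made-up scores (`scores.list`, `scores.vec` and `second.highest` are hypothetical names, not part of my code) and assumes the problem is that `pts` is a list rather than an atomic vector:

```r
# Hypothetical scores stored the way pts is built above: as a list.
scores.list <- list(7, 14, 10)

# sort() on a list fails; tryCatch captures the error instead of stopping.
res <- tryCatch(sort(scores.list), error = function(e) e)
failed <- inherits(res, "error")
print(failed)

# The same values as an atomic numeric vector can be sorted.
scores.vec <- unlist(scores.list)
second.highest <- sort(scores.vec, decreasing = TRUE)[2]
print(second.highest)
```

If that assumption holds, the scores probably need to be kept in (or converted to) a numeric vector before sorting.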

Here is the function called inside it, which I mentioned earlier. I don't want to change this function's code:

compatibility<-function(x,y){
    #start point bag
    com.points<-0

    #number of bills compatibility points
    com.points<-com.points +(10-abs(as.integer(x[["num.bills"]] - y[["num.bills"]])))

    #different number of states compatibility points
    diff.states<-abs(as.integer(x[["num.states"]]-y[["num.states"]]))
    cat(diff.states)
    if(diff.states<5){
        com.points<-com.points+5
    } else if(diff.states<10){
        com.points<-com.points+3
    } else {
        com.points<-com.points
    }
    #birth month compatibility points
    if(x[["birth.month"]]== "dec"||x[["birth.month"]]== "jan"||x[["birth.month"]]== "feb"){
        season1<-"winter"
    } else if(x[["birth.month"]]== "mar"|| x[["birth.month"]]== "apr" || x[["birth.month"]]== "may"){
        season1<-"spring"
    } else if(x[["birth.month"]]== "jun"||x[["birth.month"]]== "jul"||x[["birth.month"]]== "aug"){
        season1<-"summer"
    } else {
        season1<-"fall"
    }

    if(y[["birth.month"]]== "dec" || y[["birth.month"]]== "jan" || y[["birth.month"]] == "feb"){
        season2<-"winter"
    } else if(y[["birth.month"]]== "mar"||y[["birth.month"]]== "apr"||y[["birth.month"]]== "may"){
        season2<-"spring"
    } else if(y[["birth.month"]]== "jun"||y[["birth.month"]]== "jul"||y[["birth.month"]]== "aug"){
        season2<-"summer"
    } else {
        season2<-"fall"
    }
    if (x[["birth.month"]] == y[["birth.month"]]){
        com.points<-com.points + 3
    } else if(season1==season2){
        com.points<-com.points + 1
    } else {
        com.points<-com.points
    }
    #birth state compatibility points
    if (x[["birth.state"]]==y[["birth.state"]]){
        com.points<-com.points + 1
    } else {
        com.points<-com.points
    }
    #major compatibility points
    if (x[["major"]]==y[["major"]]){
        com.points<-com.points + 4
    } else {
        com.points<-com.points
    }

    #cellular provider compatibility points
    if(x[["cell"]] == y[["cell"]]){
        com.points<-com.points + 2
    } else {
        com.points<-com.points
    }
    return(com.points)
}

Could someone please fix my code without using any special functions such as apply, subset, etc.?

Only which.max and the like are allowed.

Answers

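I haven't tried all of your code, but I can see that you need to change your loop to something like this; otherwise your function will return on the first iteration.

I commented out the names(pts) line because it can also go outside the loop, once all the items are in.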

pts <- list() # if you actually want a list. You could also do c() for a vector 

for (i in 1:nrow(kewl.d00dz)) { 
    y <- as.list(kewl.d00dz[i,]) 
    pts[i] <- compatibility(x,y) 
    # names(pts) <- sprintf(kewl.d00dz[i,"name"],1:length(pts)) 
} 

return(pts)
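
To make the idea concrete, here is a minimal self-contained sketch using only a loop, which and which.max. Everything in it is an assumption made for illustration: `people` is a small made-up data frame, `toy.compatibility` is a simplified stand-in for your `compatibility` function, and `find.best.match` is a hypothetical name. The scores go into a numeric vector rather than a list, the names are attached once after the loop, and the person's own score is set to -Inf so that which.max picks the best of everyone else, which is the second-highest score overall as long as the self-score is the top one.

```r
# Hypothetical stand-in data; the real kewl.d00dz has more rows and columns.
people <- data.frame(
  name      = c("stephen", "clark", "anthony", "jack"),
  major     = c("ELEC", "BUAD", "BUAD", "BUAD"),
  num.bills = c(5, 8, 4, 19),
  stringsAsFactors = FALSE
)

# Simplified stand-in for compatibility(); only the call shape matters here.
toy.compatibility <- function(x, y) {
  pts <- 10 - abs(x[["num.bills"]] - y[["num.bills"]])
  if (x[["major"]] == y[["major"]]) {
    pts <- pts + 4
  }
  pts
}

find.best.match <- function(who, df, score) {
  a <- which(df$name == who)
  x <- as.list(df[a, ])
  pts <- numeric(nrow(df))        # atomic numeric vector, not a list
  for (i in seq_len(nrow(df))) {
    y <- as.list(df[i, ])
    pts[i] <- score(x, y)
  }
  names(pts) <- df$name           # name the scores once, after the loop
  pts[a] <- -Inf                  # exclude the person themselves
  names(pts)[which.max(pts)]      # first best match if there are ties
}

result <- find.best.match("clark", people, toy.compatibility)
print(result)  # "anthony"
```

With the real data you would pass kewl.d00dz and compatibility instead of the stand-ins. Note that which.max returns only the first maximum, so tied best matches would need something like which(pts == max(pts)).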