2017-07-02 186 views

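Association rules using arules and arulesViz on my data

I have an R data frame with customer_id and product_name columns. A customer can have multiple products, so the customer column contains duplicate customer_id values.

I am trying to run a basic apriori analysis to find association rules for products that are purchased together. I want to do this with the arules and arulesViz packages in R.

When I run it, I usually get either 0 rules or rules of the form lhs product -> rhs customer_id. So I don't think I am loading the data correctly: the multiple products need to be grouped under a single customer before associations can be derived.

Any help would be appreciated!

Basic example data frame: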

library(arules)

df <- data.frame(
  cust_id = as.factor(c("1aa2j", "1aa2j", "2b345", "2b345", "g78a8", "y67r3")),
  product = as.factor(c("Bat", "Sock", "Hat", "Shirt", "Ball", "Shorts"))
)

# Each row of df is treated as its own transaction here
rules <- apriori(df)
inspect(rules)

  lhs                 rhs              support confidence lift
1 {product=Bat}    => {cust_id=1aa2j}  0.167   1          3
2 {product=Sock}   => {cust_id=1aa2j}  0.167   1          3
3 {product=Hat}    => {cust_id=2b345}  0.167   1          3
4 {product=Shirt}  => {cust_id=2b345}  0.167   1          3
5 {cust_id=g78a8}  => {product=Ball}   0.167   1          6
6 {product=Ball}   => {cust_id=g78a8}  0.167   1          6
7 {cust_id=y67r3}  => {product=Shorts} 0.167   1          6
8 {product=Shorts} => {cust_id=y67r3}  0.167   1          6

Answer


This is taken from the examples in the transactions documentation (slightly modified):

library(arules)

df <- data.frame(
  cust_id = as.factor(c("1aa2j", "1aa2j", "2b345", "2b345", "g78a8", "y67r3")),
  product = as.factor(c("Bat", "Sock", "Hat", "Shirt", "Ball", "Shorts"))
)

# Group products by customer so that each customer becomes one transaction
trans <- as(split(df[, "product"], df[, "cust_id"]), "transactions")
inspect(trans)

    items       transactionID
[1] {Bat,Sock}  1aa2j
[2] {Hat,Shirt} 2b345
[3] {Ball}      g78a8
[4] {Shorts}    y67r3

Now you can pass trans to apriori().
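
As a minimal sketch of that last step, the transactions can be mined as shown here. The thresholds (support = 0.1, confidence = 0.8) and minlen = 2 are assumptions chosen for this six-row toy data, not values from the original post; minlen = 2 only excludes rules with an empty left-hand side.

```r
library(arules)

df <- data.frame(
  cust_id = as.factor(c("1aa2j", "1aa2j", "2b345", "2b345", "g78a8", "y67r3")),
  product = as.factor(c("Bat", "Sock", "Hat", "Shirt", "Ball", "Shorts"))
)

# One transaction per customer: split the product column by customer id
trans <- as(split(df[, "product"], df[, "cust_id"]), "transactions")

# Thresholds are illustrative assumptions; tune them for real data
rules <- apriori(
  trans,
  parameter = list(support = 0.1, confidence = 0.8, minlen = 2),
  control = list(verbose = FALSE)
)

# Show the rules, strongest lift first
inspect(sort(rules, by = "lift"))
```

On this toy data only the Bat/Sock and Hat/Shirt pairs occur together, so the rules now relate products to each other instead of to cust_id. The resulting rules object can then be handed to arulesViz, for example plot(rules, method = "graph"), though a data set this small will not show much.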


This works, thank you! – Andre