R_Studio(关联)对dvdtrans.csv数据进行关联规则分析

 

 


  dvdtrans.csv数据:该原始数据仅仅包含了两个字段(ID, Item) 用户ID,商品名称(共30条)

  

 

  

#导入arules包
#install.packages("arules")
library (arules)

setwd('D:\\data') 
Gary=read.csv(file="dvdtrans.csv",header=T)

# 将数据转换为arules关联规则方法apriori 可以处理的数据形式.交易数据
# transactions "事务"
Gary<- as(split(Gary$Item, Gary$ID),"transactions")

# 查看一下数据
#attributes(Gary)
summary(Gary)

# 使用apriori函数生成关联规则
rules <- apriori(Gary, parameter=list(support=0.3,confidence=0.5))

# 查看一下数据
inspect(rules)
Gary.R

 

 

实现过程

 

  导入arules包

  对数据进行预处理

#导入arules包
#install.packages("arules")
library (arules)

setwd('D:\\data') 
Gary=read.csv(file="dvdtrans.csv",header=T)

# 将数据转换为arules关联规则方法apriori 可以处理的数据形式.交易数据
# transactions "事务"
Gary<- as(split(Gary$Item, Gary$ID),"transactions")

 

> # 查看一下数据
> #attributes(Gary)
> summary(Gary)
transactions as itemMatrix in sparse format with
 10 rows (elements/itemsets/transactions) and            10行(元素/项集/事务)
 10 columns (items) and a density of 0.3                10列(项)和0.3的密度

most frequent items:                           最常见的项目(频率):
    Gladiator       Patriot   Sixth Sense    Green Mile Harry Potter1       (Other) 
            7             6             6             2             2             7 

element (itemset/transaction) length distribution:          元素(项集/事务)长度分布:
sizes
2 3 4 5 
3 5 1 1 

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
   2.00    2.25    3.00    3.00    3.00    5.00 

includes extended item information - examples:
      labels
1 Braveheart
2  Gladiator
3 Green Mile

includes extended transaction information - examples:
  transactionID
1             1
2             2
3             3

 

  生成关联规则

 

> # 使用apriori函数生成关联规则
> rules <- apriori(Gary, parameter=list(support=0.3,confidence=0.5))
Apriori

Parameter specification:
 confidence minval smax arem  aval originalSupport maxtime support minlen maxlen target   ext
        0.5    0.1    1 none FALSE            TRUE       5     0.3      1     10  rules FALSE

Algorithmic control:
 filter tree heap memopt load sort verbose
    0.1 TRUE TRUE  FALSE TRUE    2    TRUE

Absolute minimum support count: 3 

set item appearances ...[0 item(s)] done [0.00s].
set transactions ...[10 item(s), 10 transaction(s)] done [0.00s].
sorting and recoding items ... [3 item(s)] done [0.00s].
creating transaction tree ... done [0.00s].
checking subsets of size 1 2 3 done [0.00s].
writing ... [12 rule(s)] done [0.00s].
creating S4 object  ... done [0.00s].
> 
> # 查看一下数据
> inspect(rules)
     lhs                        rhs           support confidence lift     count
[1]  {}                      => {Patriot}     0.6     0.6000000  1.000000 6    
[2]  {}                      => {Sixth Sense} 0.6     0.6000000  1.000000 6    
[3]  {}                      => {Gladiator}   0.7     0.7000000  1.000000 7    
[4]  {Patriot}               => {Sixth Sense} 0.4     0.6666667  1.111111 4    
[5]  {Sixth Sense}           => {Patriot}     0.4     0.6666667  1.111111 4    
[6]  {Patriot}               => {Gladiator}   0.6     1.0000000  1.428571 6    
[7]  {Gladiator}             => {Patriot}     0.6     0.8571429  1.428571 6    
[8]  {Sixth Sense}           => {Gladiator}   0.5     0.8333333  1.190476 5    
[9]  {Gladiator}             => {Sixth Sense} 0.5     0.7142857  1.190476 5    
[10] {Patriot,Sixth Sense}   => {Gladiator}   0.4     1.0000000  1.428571 4    
[11] {Gladiator,Patriot}     => {Sixth Sense} 0.4     0.6666667  1.111111 4    
[12] {Gladiator,Sixth Sense} => {Patriot}     0.4     0.8000000  1.333333 4   

 

posted @ 2018-11-02 23:15  Cynical丶Gary  阅读(582)  评论(0编辑  收藏  举报