Movie Recommendation System - [Offline Recommendation] (1) Preparation

Posted on 2020-09-20 19:35 by MissRong


1. Create a new sub-project (Maven module)

 

2. Modify the pom.xml configuration file

<properties>
    <mongodb-spark.version>2.0.0</mongodb-spark.version>
    <casbah.version>3.1.1</casbah.version>
    <jblas.version>1.2.1</jblas.version>
</properties>

<dependencies>
    <!-- Linear algebra library used for numerical computation from Scala -->
    <dependency>
        <groupId>org.scalanlp</groupId>
        <artifactId>jblas</artifactId>
        <version>${jblas.version}</version>
    </dependency>
    <!-- Spark dependencies (no version here; it is expected to come from the parent pom) -->
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-core_2.11</artifactId>
    </dependency>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-sql_2.11</artifactId>
    </dependency>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-mllib_2.11</artifactId>
    </dependency>
    <!--scala-->
    <dependency>
        <groupId>org.scala-lang</groupId>
        <artifactId>scala-library</artifactId>
    </dependency>
    <!--mongoDB-->
    <!-- Driver for connecting to MongoDB directly from code -->
    <dependency>
        <groupId>org.mongodb</groupId>
        <artifactId>casbah-core_2.11</artifactId>
        <version>${casbah.version}</version>
    </dependency>
    <!-- Connector between Spark and MongoDB -->
    <dependency>
        <groupId>org.mongodb.spark</groupId>
        <artifactId>mongo-spark-connector_2.11</artifactId>
        <version>${mongodb-spark.version}</version>
    </dependency>
</dependencies>

3. Add the Scala plugin, rename the java source directory to scala, and create the package and class

4. Copy the model classes from the earlier module

Note: to avoid confusion with the Rating class that Spark MLlib provides, the Rating class from the earlier module is renamed to MovieRating here.
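To make the naming clash concrete: Spark MLlib ships its own `org.apache.spark.mllib.recommendation.Rating(user, product, rating)`, and the project's rating record has to be turned into that type before it can be passed to MLlib's recommendation API. The sketch below assumes the `MovieRating` shape from this step and the Spark dependencies from the pom.xml above; `RatingConversion` and `toMllibRating` are hypothetical names used only for illustration.

```scala
import org.apache.spark.mllib.recommendation.Rating

// Redeclared here so the snippet compiles on its own;
// same shape as the project's MovieRating model class.
case class MovieRating(uid: Int, mid: Int, score: Double, timestamp: Int)

object RatingConversion {
  // Hypothetical helper: drop the timestamp and map the remaining
  // fields onto MLlib's Rating(user, product, rating).
  def toMllibRating(r: MovieRating): Rating = Rating(r.uid, r.mid, r.score)

  def main(args: Array[String]): Unit = {
    val sample = MovieRating(1, 31, 2.5, 1260759144)
    println(toMllibRating(sample))
  }
}
```

With two distinct names, both types can be imported in the same file without a rename import.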

package offlineRecommender

/**
  * @Author : ASUS and xinrong
  * @Version : 2020/9/4
  *  Data format conversion classes
  *  --------------- Movie dataset ------------------------
  *  1
  *  Toy Story (1995)
  *
  *  81 minutes
  *  March 20, 2001
  *  1995
  *  English
  *  Adventure|Animation|Children|Comedy|Fantasy
  *  Tom Hanks|Tim Allen|Don Rickles|Jim Varney|Wallace Shawn|John Ratzenberger|Annie Potts|John Morris|Erik von Detten|Laurie Metcalf|R. Lee Ermey|Sarah Freeman|Penn Jillette|Tom Hanks|Tim Allen|Don Rickles|Jim Varney|Wallace Shawn
  *  John Lasseter
  */
case class Movie(mid: Int, name: String, descri: String,
                 timelong: String, cal_issue: String, shoot: String,
                 language: String, genres: String, actors: String, directors: String)
/**
  * ----- User-to-movie rating dataset --------
  * 1,31,2.5,1260759144
  */
case class MovieRating(uid: Int, mid: Int, score: Double, timestamp: Int)

/**
  * -------- User-to-movie tag dataset --------
  * 15,339,sandra 'boring' bullock,1138537770
  */
case class Tag(uid: Int, mid: Int, tag: String, timestamp: Int)

/**
  * MongoDB configuration object
  * @param uri MongoDB connection URI
  * @param db  database name
  */
case class MongoConfig(uri: String, db: String)

/**
  * Elasticsearch (ES) configuration object
  * @param httpHosts      HTTP hosts
  * @param transportHosts information for all ES nodes
  * @param index          index name
  * @param clusterName    cluster name
  */
case class EsConfig(httpHosts: String, transportHosts: String, index: String, clusterName: String)

/**
  * Wrapper class for a single entry in recs
  * @param mid
  * @param res
  */
case class Recommendation(mid:Int,res:Double)

/**
  * Key-value wrapper class
  * @param genres
  * @param recs
  */
case class GenresRecommendation(genres:String,recs:Seq[Recommendation])
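As a quick check that the model classes match the sample records quoted in the comments, the following minimal sketch parses one rating line and one tag line into them. It assumes the comma-separated layout shown in those samples; `SampleParsing`, `parseRating`, and `parseTag` are hypothetical helper names, not part of the original project.

```scala
import scala.util.Try

// Redeclared so the snippet compiles on its own; same shape as the model classes.
case class MovieRating(uid: Int, mid: Int, score: Double, timestamp: Int)
case class Tag(uid: Int, mid: Int, tag: String, timestamp: Int)

object SampleParsing {
  // Parse "uid,mid,score,timestamp"; returns None for malformed input.
  def parseRating(line: String): Option[MovieRating] =
    line.split(",").map(_.trim) match {
      case Array(uid, mid, score, ts) =>
        Try(MovieRating(uid.toInt, mid.toInt, score.toDouble, ts.toInt)).toOption
      case _ => None
    }

  // Parse "uid,mid,tag,timestamp"; a tag that itself contains a comma
  // yields more than four fields and is therefore rejected (None).
  def parseTag(line: String): Option[Tag] =
    line.split(",").map(_.trim) match {
      case Array(uid, mid, tag, ts) =>
        Try(Tag(uid.toInt, mid.toInt, tag, ts.toInt)).toOption
      case _ => None
    }

  def main(args: Array[String]): Unit = {
    // prints Some(MovieRating(1,31,2.5,1260759144))
    println(parseRating("1,31,2.5,1260759144"))
    // prints Some(Tag(15,339,sandra 'boring' bullock,1138537770))
    println(parseTag("15,339,sandra 'boring' bullock,1138537770"))
  }
}
```

Returning `Option` keeps malformed lines from throwing, so a caller can drop them with `flatMap` instead of wrapping every call in a try block.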

 
