Summary: Testing regression/lasso in spark-shell: import org.apache.spark.SparkContext; import org.apache.spark.mllib.regression.LassoModel; import org.apache.spark.mllib.regression.LassoWithSGD; import org.apache.spark.mllib.util.LinearDataGenerator ... a simple introduction to the principle. Read more
posted @ 2014-03-11 06:27 enyun Views (288) Comments (0) Recommended (0)
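To make the lasso principle mentioned in the summary above a little more concrete, here is a minimal pure-Python sketch. It is not the MLlib `LassoWithSGD` code: it swaps in full-batch proximal gradient descent (ISTA) on the objective (1/2n)·||Xw − y||² + reg_param·||w||₁, and every name in it (`soft_threshold`, `lasso_proximal_gradient`, `reg_param`, `step_size`, the toy data) is an assumption made for illustration.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the proximal step for the L1 penalty)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_proximal_gradient(rows, targets, reg_param, step_size=0.1, iterations=2000):
    """Hypothetical helper: minimize (1/2n)*||Xw - y||^2 + reg_param*||w||_1."""
    n = len(rows)
    d = len(rows[0])
    weights = [0.0] * d
    for _ in range(iterations):
        # Residuals of the current linear model on every row.
        residuals = [
            sum(w * x for w, x in zip(weights, row)) - y
            for row, y in zip(rows, targets)
        ]
        # Gradient of the squared-error term only.
        gradient = [
            sum(r * row[j] for r, row in zip(residuals, rows)) / n
            for j in range(d)
        ]
        # Gradient step, then soft-thresholding for the L1 term.
        weights = [
            soft_threshold(w - step_size * g, step_size * reg_param)
            for w, g in zip(weights, gradient)
        ]
    return weights


# Toy data (made up for this sketch): the target depends only on the first feature.
rows = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]]
targets = [3.0, 0.0, 3.0, 6.0]
unregularized = lasso_proximal_gradient(rows, targets, reg_param=0.0)
heavily_regularized = lasso_proximal_gradient(rows, targets, reg_param=5.0)
```

With `reg_param=0.0` the sketch reduces to plain least squares, while a large enough `reg_param` drives every weight to exactly zero, which is the sparsity effect lasso is known for.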
Summary: By default, Spark is built against Hadoop 1.x. I replaced hadoop*1.0.4.jar in lib_managed with hadoop*2.2.0.jar, but all the jar files are reverted to 1.0.4 after sbt assembly. The following command builds successfully: SPARK_HADOOP_VERSION=2.2.0 sbt/sbt clean compile package assembly Read more
posted @ 2014-03-09 15:22 enyun Views (208) Comments (0) Recommended (0)
Summary: (1) Operation history: SPARK_HADOOP_VERSION=2.2.0 SPARK_YARN=true sbt/sbt assembly, taken from the official tutorial, fails, but with the part marked in red removed it compiles easily. What does this command bring in? Changing it to SPARK_HADOOP_VERSION=2.2.0 SPARK_YARN=true sbt/sbt clean compile package assembly passes normally. (2) Launching the test program: export YARN_CONF_DIR=$HADOOP_HOME SPARK_JAR=./assembly/target/scala-2.9.3/spark-assembly Read more
posted @ 2014-02-27 00:29 enyun Views (437) Comments (0) Recommended (0)
Summary: On Mac OS X; a record of some problems I ran into.
export JAVA_HOME="$(/usr/libexec/java_home)"
export HADOOP_HOME=/Users/admin/work/hadoop/hadoop.tar.2.2.0/hadoop-2.2.0/
export HADOOP_MAPRED_HOME=/Users/admin/work/hadoop/hadoop.tar.2.2.0/hadoop-2.2.0/
export HADOOP_COMMON_HOME=$HADOOP_HOME
exp Read more
posted @ 2014-02-26 20:13 enyun Views (335) Comments (0) Recommended (0)
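Summary: I don't hand out praise lightly, but this one deserves it: work through a few of its examples and all three algorithms become completely clear. It seems I am the learning-by-example/learning-by-doing type. An Introduction to Hidden Markov Models (Viterbi algorithm). Keywords: Introduction, hidden Markov model, Viterbi algorithm. I want to recommend this website to everyone: http://www.comp.leeds.ac.uk/roger/HiddenMarkovModels/html_dev/main.html It is a web page introducing HMMs; the content is concise yet accessible, and in particular it covers, for the first-order HMM model, Read more
posted @ 2011-06-29 17:15 enyun Views (269) Comments (0) Recommended (0)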
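Since the entry above singles out the Viterbi algorithm, here is a minimal sketch of it for a first-order HMM. It is not taken from the recommended site; the function name `viterbi`, the dictionary-based model layout, and the two-state Healthy/Fever toy model are all assumptions chosen for illustration.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return (probability, path) of the most likely hidden-state sequence."""
    # best[t][s]: probability of the best path that ends in state s at time t.
    best = [{s: start_p[s] * emit_p[s][observations[0]] for s in states}]
    # back[t][s]: predecessor of s on that best path.
    back = [{}]
    for t in range(1, len(observations)):
        best.append({})
        back.append({})
        for s in states:
            prev, prob = max(
                ((p, best[t - 1][p] * trans_p[p][s]) for p in states),
                key=lambda pair: pair[1],
            )
            best[t][s] = prob * emit_p[s][observations[t]]
            back[t][s] = prev
    # Pick the best final state, then follow the back-pointers to the start.
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for t in range(len(observations) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return best[-1][last], path


# Toy model (illustrative numbers only).
states = ["Healthy", "Fever"]
start_p = {"Healthy": 0.6, "Fever": 0.4}
trans_p = {
    "Healthy": {"Healthy": 0.7, "Fever": 0.3},
    "Fever": {"Healthy": 0.4, "Fever": 0.6},
}
emit_p = {
    "Healthy": {"normal": 0.5, "cold": 0.4, "dizzy": 0.1},
    "Fever": {"normal": 0.1, "cold": 0.3, "dizzy": 0.6},
}
probability, path = viterbi(["normal", "cold", "dizzy"], states, start_p, trans_p, emit_p)
# path is ["Healthy", "Healthy", "Fever"], with probability 0.01512
```

For long observation sequences the products underflow, so a practical version would usually work with log-probabilities and sums instead.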
Summary: Pseudocode. In the following algorithm, the code u := vertex in Q with smallest dist[], searches for the vertex u in the vertex set Q that has the least dist[u] value. That vertex is removed from the set Q and returned to the user. dist_between(u, v) calculates the length between the two neighbor-nodes u and v. The var Read more
posted @ 2011-06-29 08:04 enyun Views (561) Comments (0) Recommended (0)
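The pseudocode described in the entry above can be sketched in Python roughly as follows. It keeps the linear scan for "vertex in Q with smallest dist[]" rather than a priority queue; the adjacency-dictionary input (standing in for dist_between(u, v)) and the name `previous` are assumptions, since the summary is cut off before the remaining variables are explained.

```python
import math


def dijkstra(graph, source):
    """graph maps each vertex u to {v: length of edge u -> v}; lengths must be non-negative."""
    dist = {v: math.inf for v in graph}
    previous = {v: None for v in graph}
    dist[source] = 0
    q = set(graph)  # vertices whose shortest distance is not final yet
    while q:
        # u := vertex in Q with smallest dist[]
        u = min(q, key=lambda v: dist[v])
        if dist[u] == math.inf:
            break  # everything left in Q is unreachable from the source
        q.remove(u)
        for v, length in graph[u].items():
            alt = dist[u] + length  # length of the path to v that goes through u
            if alt < dist[v]:
                dist[v] = alt
                previous[v] = u
    return dist, previous


# Small directed example graph (made up for this sketch).
graph = {
    "a": {"b": 4, "c": 1},
    "b": {"d": 1},
    "c": {"b": 2, "d": 5},
    "d": {},
}
dist, previous = dijkstra(graph, "a")
# dist is {"a": 0, "b": 3, "c": 1, "d": 4}
```

The linear scan makes this O(|V|²); replacing the scan with a binary heap is the usual refinement for sparse graphs.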
Summary: In statistics, analysis of variance (ANOVA) is a collection of statistical models, and their associated procedures, in which the observed variance in a particular variable is partitioned into components attributable to different sources of variation. In its simplest form ANOVA provides a statistical test of Read more
posted @ 2011-06-28 18:52 enyun Views (269) Comments (0) Recommended (0)
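The partition of variance described in the entry above can be illustrated with a small one-way ANOVA computation. This is only a sketch: the function name `one_way_anova` and the sample groups are made up, and it stops at the F statistic without looking up a p-value.

```python
def one_way_anova(groups):
    """Return (ss_between, ss_within, f_statistic) for a list of numeric samples."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation attributable to differences between the group means.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation left over inside each group.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f_statistic = (ss_between / df_between) / (ss_within / df_within)
    return ss_between, ss_within, f_statistic


# Three made-up samples of three observations each.
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
ss_between, ss_within, f_statistic = one_way_anova(groups)
# ss_between is about 26, ss_within is 6, and F is about 13
```

The two sums of squares add up to the total sum of squares around the grand mean (32 for this data), which is the "partitioning" the definition refers to.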
Summary:
bash-3.2$ cat drg.data | python bellman_ford.py
3 2
0 10000
1 0
2 2
import sys
lineNum = 0
nodeNum = 0
vetexList = []
dist = []
pre = []
sourceNode = -1
dataFile = open('drg.data', 'r')
[ nodeNum, sourceNode ] = dataFile.readline().split("\t")
dataFile.close()
print (nodeNum, source
Read more
posted @ 2011-06-28 18:39 enyun Views (668) Comments (0) Recommended (0)
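The script in the entry above is cut off before the relaxation loop, so here is a separate, self-contained sketch of the Bellman-Ford algorithm rather than a reconstruction of that script. It reuses the names `dist` and `pre` from the excerpt; the edge-list input and the function signature are assumptions, because the layout of drg.data beyond its first line is not shown.

```python
import math


def bellman_ford(num_nodes, edges, source):
    """edges is a list of (u, v, weight) tuples with nodes numbered 0..num_nodes-1."""
    dist = [math.inf] * num_nodes
    pre = [None] * num_nodes
    dist[source] = 0
    # Relax every edge up to num_nodes - 1 times.
    for _ in range(num_nodes - 1):
        changed = False
        for u, v, weight in edges:
            if dist[u] + weight < dist[v]:
                dist[v] = dist[u] + weight
                pre[v] = u
                changed = True
        if not changed:
            break  # distances are already stable
    # If any edge can still be relaxed, a negative cycle is reachable.
    for u, v, weight in edges:
        if dist[u] + weight < dist[v]:
            raise ValueError("negative-weight cycle reachable from the source")
    return dist, pre


# Small made-up graph with one negative edge.
edges = [(0, 1, 4), (0, 2, 5), (2, 1, -2), (1, 3, 3)]
dist, pre = bellman_ford(4, edges, 0)
# dist is [0, 3, 5, 6] and pre is [None, 2, 0, 1]
```

Unlike Dijkstra's algorithm, this handles negative edge weights, at the cost of O(|V|·|E|) running time.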