Post category - spark
Summary: PLSA.py
# coding:utf8
from pyspark import SparkContext
from pyspark import RDD
import numpy as np
from numpy.random import RandomState ...
Summary: Setting up a standalone Spark environment on Windows 7
+ Follow this link: "how to run apache spark on windows7"
Accessing the local Spark installation from PyCharm
+ Install py4j
+ Configure PyCharm: under PYTHON_HOME\lib\site packa...
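Summary: Main practical content of the course:
1. Setting up the Spark lab environment
2. Contents of the 4 labs
3. Commonly used functions
4. Shared variables
1. Setting up the Spark lab environment (Windows)
a. Download and install VirtualBox, and run it as administrator. The course requires the latest version, 4.3.28; if the virtual machine fails to start in step c, version 4.2.12 works just as well.
b. Download and install vagrant...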
Summary: The official API documentation does not explain this function very clearly:
aggregate(zeroValue, seqOp, combOp)
Aggregate the elements of each partition, and then the results for all the partitions, using a given combine functions and a neutral “zero value.” T...
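To make the two-stage behaviour in the quoted description concrete, here is a minimal sketch that emulates `aggregate` in plain Python rather than calling Spark. The helper name `aggregate`, the list-of-lists stand-in for partitions, and the sample data are assumptions made for illustration; this is not how Spark implements it internally.

```python
import copy
from functools import reduce


def aggregate(partitions, zero_value, seq_op, comb_op):
    """Emulate the semantics of RDD.aggregate on plain Python lists.

    `partitions` is a hypothetical stand-in for an RDD: one list per partition.
    """
    # Stage 1: within each partition, fold the elements into a fresh copy
    # of zero_value using seq_op(accumulator, element).
    per_partition = [
        reduce(seq_op, part, copy.deepcopy(zero_value)) for part in partitions
    ]
    # Stage 2: merge the per-partition accumulators with
    # comb_op(accumulator, accumulator), again starting from zero_value.
    return reduce(comb_op, per_partition, copy.deepcopy(zero_value))


# Hypothetical data: the numbers 1..4 split across two partitions.
partitions = [[1, 2], [3, 4]]

# The accumulator is a (sum, count) pair, so its type differs from the elements.
seq_op = lambda acc, x: (acc[0] + x, acc[1] + 1)
comb_op = lambda a, b: (a[0] + b[0], a[1] + b[1])

total, count = aggregate(partitions, (0, 0), seq_op, comb_op)
mean = total / count
print(total, count, mean)

# With a live SparkContext `sc`, the corresponding PySpark call would be:
#   sc.parallelize([1, 2, 3, 4], 2).aggregate((0, 0), seq_op, comb_op)
```

Because the zero value is used once per partition and once more in the final merge, it should be neutral for both functions (for example `(0, 0)` for sums and counts); a non-neutral value would be counted several times.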
Summary:
import random as rd
import math
class LogisticRegressionPySpark:
    def __init__(self,MaxItr=100,eps=0.01,c=0.1):
        self.max_itr = MaxItr
        se...
