Post archive for July 12, 2017 - bonelee

July 12, 2017

Spark: grouping by key and computing the max, min, and mean for each key, using groupby or reduceby

Summary: example.groupByKey().mapValues(list) Read more

posted @ 2017-07-12 16:28 bonelee Views (9317) Comments (0) Recommended (1)
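A minimal sketch of the reduceByKey route named in the title above, not the post's own code. It assumes a PySpark pair RDD of `(key, number)` records; the names `to_acc`, `merge`, `finalize`, `per_key_stats`, and `local_stats` are hypothetical helpers introduced for illustration. Each value becomes a `(max, min, sum, count)` tuple, and tuples for the same key are merged pairwise, so no per-key list is ever materialized the way `groupByKey().mapValues(list)` would.

```python
from functools import reduce


def to_acc(value):
    # Turn one value into an accumulator: (max, min, sum, count).
    return (value, value, value, 1)


def merge(a, b):
    # Combine two accumulators for the same key; associative and commutative.
    return (max(a[0], b[0]), min(a[1], b[1]), a[2] + b[2], a[3] + b[3])


def finalize(acc):
    # Convert an accumulator into the statistics of interest.
    mx, mn, total, count = acc
    return {"max": mx, "min": mn, "mean": total / count}


def per_key_stats(pair_rdd):
    # Assumption: pair_rdd is a PySpark RDD of (key, number) pairs.
    return pair_rdd.mapValues(to_acc).reduceByKey(merge).mapValues(finalize)


def local_stats(values):
    # Same logic on a plain Python list, handy for checking without a cluster.
    return finalize(reduce(merge, (to_acc(v) for v in values)))
```

With a live SparkContext `sc` (assumed here), `per_key_stats(sc.parallelize([("a", 3), ("a", 1), ("b", 4)])).collect()` would yield one statistics dict per key.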

Summary: distinct(numPartitions=None) Return a new RDD containing the distinct elements in this RDD. >>> sorted(sc.parallelize([1, 1, 2, 3]).distinct().collect Read more

posted @ 2017-07-12 14:07 bonelee Views (2855) Comments (0) Recommended (0)

Spark RDD median: computing the median

Summary: lookup(key) Return the list of values in the RDD for key key. This operation is done efficiently if the RDD has a known partitioner by only searching Read more

posted @ 2017-07-12 10:47 bonelee Views (3206) Comments (0) Recommended (0)
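The summary above mentions `lookup(key)`, which suggests one possible way to find a median: sort the RDD, key each value by its rank, and look up the middle rank or ranks. The sketch below is an assumption about how that could be done, not the post's actual code; `median_positions` and `rdd_median` are hypothetical names, and `rdd` is assumed to be a non-empty PySpark RDD of numbers.

```python
def median_positions(n):
    # 0-based ranks whose values average to the median of n sorted items.
    if n <= 0:
        raise ValueError("median of an empty collection is undefined")
    mid = n // 2
    return (mid,) if n % 2 == 1 else (mid - 1, mid)


def rdd_median(rdd):
    # Assumption: rdd is a PySpark RDD of numbers.
    n = rdd.count()
    # Sort, attach each value's rank, then key by rank: (rank, value).
    by_rank = (
        rdd.sortBy(lambda x: x)
        .zipWithIndex()
        .map(lambda pair: (pair[1], pair[0]))
    )
    by_rank.cache()  # reused for one or two lookups below
    values = [by_rank.lookup(i)[0] for i in median_positions(n)]
    return sum(values) / len(values)
```

Because the `map` step drops any partitioner, each `lookup` here scans the RDD rather than searching a single partition, so this is better suited to illustrating the idea than to very large datasets.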

Python Spark: computing the max, min, and mean

Summary: rdd = sc.parallelizeDoubles(testData); rdd = sc.parallelizeDoubles(testData); rdd = sc.parallelizeDoubles(testData); Now we’ll calculate the mean of o Read more

posted @ 2017-07-12 10:15 bonelee Views (595) Comments (0) Recommended (0)

Python Spark: computing the max, min, mean, and median

Summary: The above is the brute-force approach. The simple approach: Read more

posted @ 2017-07-12 09:50 bonelee Views (1286) Comments (0) Recommended (0)

A commander stands for wisdom, trustworthiness, benevolence, courage, and discipline.

Hi, I'm Li Zhihua, a security AI algorithm expert at Huawei. Welcome to the fascinating world of security attack and defense.
