December 2017 Post Archive - crr121

Summary: Maven's settings.xml needs the JDK configured. Global: jdk-1.8 true 1.8 1.8 1.8 1.8 Local: ...

posted @ 2017-12-31 18:47 crr121 Views(673) Comments(0) Recommendations(0)

Local Word Count with MapReduce

Summary: 1. Turn the input text into key-value pairs by extending the Mapper class: package com.cr.hdfs; import org.apache.hadoop.io.IntWritable; import org.apache.hadoop.io.LongWritable; import org.a...

posted @ 2017-12-31 11:23 crr121 Views(145) Comments(0) Recommendations(0)
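
The post's own code is Java on Hadoop and is truncated in this listing. As a rough illustration of the same idea, here is a minimal plain-Python sketch of the map and reduce phases of a word count. The function names are hypothetical and nothing here uses the Hadoop API.

```python
from collections import defaultdict


def map_phase(lines):
    """Mapper: emit a (word, 1) pair for every word in every input line."""
    for line in lines:
        for word in line.split():
            yield word, 1


def reduce_phase(pairs):
    """Shuffle and reduce: group the pairs by word and sum their counts."""
    counts = defaultdict(int)
    for word, one in pairs:
        counts[word] += one
    return dict(counts)


def word_count(lines):
    """Run the two phases back to back on an in-memory list of lines."""
    return reduce_phase(map_phase(lines))


print(word_count(["hello world", "hello hadoop"]))
```

In a real job the mapper and reducer run on different nodes and the framework does the grouping; the sketch only shows the data flow.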

Hadoop Fails to Upload a File to HDFS

Summary: Error message: INFO hdfs.DFSClient: Exception in createBlockOutputStream java.net.NoRouteToHostException: No route to host at sun.nio.ch.S...

posted @ 2017-12-29 16:56 crr121 Views(179) Comments(0) Recommendations(0)

ERROR util.Shell: Failed to locate the winutils binary in the hadoop binary path

Summary: winutils.exe is missing.

posted @ 2017-12-29 16:55 crr121 Views(345) Comments(0) Recommendations(0)

Unable to load native-hadoop library for your platform... using builtin-java classes where applica

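Summary: I read many blog posts about this problem, but none of the fixes worked. In the end I downloaded a fresh Hadoop package from the official site, configured the environment variables, added the Hadoop jars to the project, and recompiled, which resolved it. Note the source of the jars added here. Because they include sources and tests jars, we separate those out...

posted @ 2017-12-29 16:55 crr121 Views(559) Comments(0) Recommendations(0)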

Accessing HDFS from Hadoop via URL

Summary: Create a new Java project: package com.cr.java; import org.apache.hadoop.fs.FsUrlStreamHandlerFactory; import org.junit.Test; import javax.print.DocFlavor; i...

posted @ 2017-12-29 16:54 crr121 Views(1224) Comments(0) Recommendations(0)

Accessing HDFS through the Hadoop API

Summary: 1. version_1 /** * Access HDFS through the Hadoop API * @throws IOException */ @Test public void readFileByAPI() throws IOException {...

posted @ 2017-12-29 16:53 crr121 Views(219) Comments(0) Recommendations(0)

Testing HDFS File Reads in a Maven Project

Summary: 1. Read a file /** * Test reading a file * @throws IOException */ @Test public void testSave() throws IOException { Configuratio...

posted @ 2017-12-29 16:53 crr121 Views(117) Comments(0) Recommendations(0)

Viewing the Image Files

Summary: 1. View the image files: [xiaoqiu@s150 /home/xiaoqiu/hadoop_tmp/dfs/name/current]$ ls -h |grep fsimage fsimage_0000000000000000000 fsimage_00000000000000...

posted @ 2017-12-29 16:53 crr121 Views(145) Comments(0) Recommendations(0)

Hadoop Startup Commands

Summary: start-all.sh ---> start-dfs.sh + start-yarn.sh; start-dfs.sh ----> hadoop-daemon.sh start namenode + hadoop-daemons.sh start datano...

posted @ 2017-12-29 16:52 crr121 Views(1436) Comments(0) Recommendations(0)

Managing HDFS Quotas

Summary: 1. Directory quota: [xiaoqiu@s150 /home/xiaoqiu/hadoop_tmp]$ hadoop fs -lsr / lsr: DEPRECATED: Please use 'ls -R' instead. drwxr-xr-x - xiaoqiu supe...

posted @ 2017-12-29 16:52 crr121 Views(115) Comments(0) Recommendations(0)

HDFS Snapshots

Summary: Enable snapshots on the data directory: [xiaoqiu@s150 /home/xiaoqiu/hadoop_tmp]$ hdfs dfsadmin -allowSnapshot data Create a snapshot: [xiaoqiu@s150 /home/xiaoqiu/hadoop_tmp]$ hdf...

posted @ 2017-12-29 16:51 crr121 Views(129) Comments(0) Recommendations(0)

IntelliJ: Auto-Completing Variable Names and Variable Attributes

Summary: CTRL+ALT+V

posted @ 2017-12-28 10:22 crr121 Views(160) Comments(0) Recommendations(0)

Hadoop Ports 50010, 50070, and 50090 Already in Use

Summary: Log in as root and kill the processes: [xiaoqiu@s150 /soft/hadoop/etc/hadoop]$ su root Password: su: Authentication failure [xiaoqiu@s150 /soft/hadoop/etc/hadoop]...

posted @ 2017-12-27 18:00 crr121 Views(254) Comments(0) Recommendations(0)

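Python: Installing a .whl File Fails

Summary: Install wheel: pip install wheel. Taking scipy as an example, download the package from the official site https://pypi.python.org/pypi/scipy. Make sure the wheel's version matches the version your Python supports; otherwise you will see: C:\Users\xiaoqiu>pip...

posted @ 2017-12-27 14:26 crr121 Views(380) Comments(0) Recommendations(0)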

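Linux Scripts

Summary: Put the script in the /usr/local/bin directory (typically on the PATH, so it can be run from anywhere): [xiaoqiu@s150 /usr/local/bin]$ sudo touch xcall.sh [sudo] password for xiaoqiu: Sorry, try again. [sudo] pas...

posted @ 2017-12-26 22:23 crr121 Views(105) Comments(0) Recommendations(0)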

list indices must be integers or slices, not tuple

Summary: File "E:\Python36\regtree.py", line 45, in chooseBestSplit if len(set(dataSet[:,-1].T.tolist()[0])) == 1: #exit cond 1 TypeError: l...

posted @ 2017-12-26 17:11 crr121 Views(558) Comments(0) Recommendations(0)
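
This error usually means that a plain Python list is being indexed with NumPy-style syntax such as `dataSet[:,-1]`. The post's fix is cut off in this listing, so the sketch below only reproduces the error on a nested list and shows one plain-Python way to read the last column. `last_column` is a hypothetical helper; converting the data to a NumPy matrix first is the other common remedy.

```python
def last_column(rows):
    """Return the last element of every row of a nested list."""
    return [row[-1] for row in rows]


data_set = [[1.0, 0.5], [2.0, 0.5], [3.0, 0.5]]

try:
    data_set[:, -1]  # a tuple index works on NumPy arrays, not on lists
except TypeError as err:
    message = str(err)

print(message)
# The "all targets are equal" exit condition, written for a nested list.
print(len(set(last_column(data_set))) == 1)
```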

Converting a Matrix to a List

Summary: Matrix to list: >>> testMat matrix([[ 1., 0., 0., 0.], [ 0., 1., 0., 0.], [ 0., 0., 1., 0.], [ 0., 0., 0., 1.]]) >...

posted @ 2017-12-26 15:05 crr121 Views(186) Comments(0) Recommendations(0)

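Splitting Samples by the Value of a Column

Summary: Split samples by the value of a column: ''' dataSet: the data set; feature: the feature to split on; value: the feature value to split at ''' def binSplitDataSet(dataSet, feature, value): # dataSet[:,feature] extracts that feature column ...

posted @ 2017-12-26 11:34 crr121 Views(129) Comments(0) Recommendations(0)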
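
The body of `binSplitDataSet` is truncated in this listing. The following is a minimal plain-Python sketch of a binary split on one column, assuming that rows with a feature value greater than the threshold go to the first group and all other rows to the second. That convention and the snake_case name are assumptions, not taken from the post.

```python
def bin_split_data_set(data_set, feature, value):
    """Split rows into two groups by comparing column `feature` with `value`."""
    greater = [row for row in data_set if row[feature] > value]
    not_greater = [row for row in data_set if row[feature] <= value]
    return greater, not_greater


rows = [[1.0, 10.0], [0.2, 20.0], [0.7, 30.0]]
high, low = bin_split_data_set(rows, 0, 0.5)
print(high)
print(low)
```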

The HDFS File System

Summary: hadoop fs is equivalent to hdfs dfs: [xiaoqiu@s150 bin]$ hdfs dfs Usage: hadoop fs [generic options] fs run a generic filesystem u...

posted @ 2017-12-25 15:47 crr121 Views(91) Comments(0) Recommendations(0)

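Remote Synchronization with rsync

Summary: scp turns symbolic links into directories, so we use rsync for remote synchronization instead. Script that starts the Hadoop processes on all slave nodes: [xiaoqiu@s150 bin]$ cat xcall.sh #!/usr/bin/env bash i=150 params=$@ for((i=150;...

posted @ 2017-12-22 18:03 crr121 Views(104) Comments(0) Recommendations(0)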

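Writing Scripts for a Fully Distributed Hadoop Cluster

Summary: Write a script that shows the hostname of every host in one go. Create a script in the /usr/local/bin directory: [root@s130:/usr/local/bin]cat xcall.sh #!/bin/sh i=130 # pass through all arguments params=$@ for((i=130;i<=133;i=$i...

posted @ 2017-12-22 17:41 crr121 Views(104) Comments(0) Recommendations(0)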

Converting a Matrix to a List

Summary: Matrix to list: from numpy import * a = mat([[1,34,3],[2,3,41],[2,34,41],[2,53,41]]) print(a.flatten()) print(a.flatten().A) # convert the matrix to a list print(a.flatten(...

posted @ 2017-12-22 11:42 crr121 Views(173) Comments(0) Recommendations(0)

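Matrix Dimension Reduction: Sorting a Matrix by a Column

Summary: Convert a three-dimensional matrix to a two-dimensional one. Matrix dimension reduction: sort a matrix by a column: from numpy import * a = mat([[1,34,3],[2,3,41],[2,34,41],[2,53,41]]) print(a) srtInd=a[:,1].argsort(0) print(s...

posted @ 2017-12-22 11:23 crr121 Views(280) Comments(0) Recommendations(0)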
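
The post sorts a NumPy matrix with `argsort`, and the rest of its code is cut off. As a rough plain-Python equivalent that needs no NumPy, the sketch below computes the row order for one column and then applies it to a nested list. `sort_rows_by_column` is a hypothetical helper name.

```python
def sort_rows_by_column(rows, column):
    """Return the row order (like argsort) and the rows sorted by one column."""
    order = sorted(range(len(rows)), key=lambda i: rows[i][column])
    return order, [rows[i] for i in order]


a = [[1, 34, 3], [2, 3, 41], [2, 34, 41], [2, 53, 41]]
order, sorted_rows = sort_rows_by_column(a, 1)
print(order)
print(sorted_rows)
```

Python's `sorted` is stable, so rows with equal values in the chosen column keep their original relative order.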

Fully Distributed Hadoop

Summary: Change the hostname: [root@localhost:/soft/hadoop2.7/etc/hadoop]nano /etc/hostname [root@localhost:/soft/hadoop2.7/etc/hadoop] [root@localhost:/soft/had...

posted @ 2017-12-21 23:05 crr121 Views(128) Comments(0) Recommendations(0)

Checking Whether a Port Is Open

Summary: Check whether the port is open: [root@localhost:/soft/hadoop2.7/etc/hadoop]netstat -ano |grep 50070 tcp 0 0 0.0.0.0:50070 0.0.0.0:* ...

posted @ 2017-12-21 22:40 crr121 Views(242) Comments(0) Recommendations(0)
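
The post checks the port with `netstat`. As a complementary sketch, the same question can be asked from Python by attempting a TCP connection. `is_port_open` is a hypothetical helper, and the host and port in the usage line are only examples.

```python
import socket


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an error code otherwise.
        return sock.connect_ex((host, port)) == 0


print(is_port_open("127.0.0.1", 50070))
```

Unlike `netstat`, this also fails when a firewall blocks the connection, so a False result does not by itself prove that nothing is listening.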

Hadoop NameNode Fails to Start

Summary: Redefine Hadoop's temporary storage directory by editing core-site.xml. Create a new folder in the home directory: [root@localhost:/root]mkdir hadoop_tmp Edit core-site.xml: [root@localhost:/soft/hadoo...

posted @ 2017-12-21 22:22 crr121 Views(188) Comments(0) Recommendations(0)

Hadoop 2.7.5 Pseudo-Distributed Installation

Summary: Copy the package to the /soft directory and extract it: [hadoop@localhost soft]$ sudo tar -zxvf hadoop-2.7.5.tar.gz Delete the package: [hadoop@localhost soft]$ sudo rm -rf hadoop-2.7.5.t...

posted @ 2017-12-21 16:00 crr121 Views(142) Comments(0) Recommendations(0)

Granting a User sudo Privileges

Summary: Switch to root and edit /etc/sudoers: [hadoop@localhost /]$ su root Password: ABRT has detected 1 problem(s). For more info run: abrt-cli list [root@loc...

posted @ 2017-12-21 15:38 crr121 Views(145) Comments(0) Recommendations(0)

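Installing the JDK on Linux

Summary: 1. JDK installation: copy the package to the /soft directory and extract it: sudo tar -zxvf jdk-8u66-linux-x64.gz Delete the package: [hadoop@localhost soft]$ sudo rm -rf jdk-8u66-linux-x64.gz Create a symbolic link: [hado...

posted @ 2017-12-21 15:30 crr121 Views(122) Comments(0) Recommendations(0)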

Starting Hadoop Processes

Summary: Hadoop ports: 50070 => namenode http port; 50075 => datanode http port; 50090 => secondary namenode http port; 8020 => namenode ...

posted @ 2017-12-19 21:50 crr121 Views(456) Comments(0) Recommendations(0)

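Cannot Access the Hadoop Web UI on Port 50070

Summary: After Hadoop is installed successfully, the web UI on port 50070 cannot be reached. First check whether the port is open: [hadoop@s128 sbin]$ netstat -ano |grep 50070 Then check whether the firewall is turned off; if it is not, force it off. Check the firewall status: [hadoop@s128 sbin...

posted @ 2017-12-19 21:07 crr121 Views(658) Comments(0) Recommendations(0)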

Hadoop Troubleshooting Notes

Summary: 1. Start the datanode: hadoop-daemon.sh start datanode

posted @ 2017-12-19 20:41 crr121 Views(95) Comments(0) Recommendations(0)

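Linux Network Connection Modes and Setting a Static IP

Summary: Network connection modes: 1. Bridged mode: the CentOS guest behaves like a physical machine. It can reach the external network directly and can connect to bridged guests on other hosts in the same LAN. 2. NAT mode: the guest reaches the external network through the host and can access other physical hosts in the same LAN, but those hosts cannot access the guest. 3. Host-only mode: the guest cannot reach the external network; it can conn...

posted @ 2017-12-17 15:41 crr121 Views(444) Comments(0) Recommendations(0)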

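Basic Linux Commands

Summary: Switch users: [root@localhost ~]# su hadoop [hadoop@localhost root]$ su root Password: [root@localhost ~]# Show the current directory: pwd. Return to the previous directory: cd -. List a directory in long format: ls -l, equivalent to ...

posted @ 2017-12-17 15:39 crr121 Views(185) Comments(0) Recommendations(0)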

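Submitting a Job to Spark in localhost Mode

Summary: Change the arguments (the code is in the previous section; see the code link). Java version: JavaRDD rdd1 = sc.textFile(args[0]) Scala version: val rdd1 = sc.textFile(args(0)) Compile. Add the Spark dependency: org.apache.spa...

posted @ 2017-12-15 15:32 crr121 Views(210) Comments(0) Recommendations(0)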

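Spark: Getting Started (Word Count)

Summary: 1. Features. Lightning-fast cluster computing: claimed to be up to 100 times faster than Hadoop in memory and 10 times faster on disk. A fast, general-purpose engine for large-scale computation: supports Java/Scala/Python/R and provides 80+ operators, which makes parallel applications easy to build. Combines SQL, stream processing, and complex analytics. Runtime environments: Hadoop, Mesos, standa...

posted @ 2017-12-14 17:52 crr121 Views(224) Comments(0) Recommendations(0)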

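Logistic Regression

Summary: 1. What is regression: given a data set, the process of finding the function expression that fits the data. 2. Data types for logistic regression: numeric and nominal. 3. Pros: computationally inexpensive, easy to understand and implement. Cons: prone to underfitting; classification accuracy may be low. 4. How it works: multiply each feature value by a regression coefficient, add the products together, and feed the sum into the sigmoid function...

posted @ 2017-12-14 11:48 crr121 Views(1219) Comments(0) Recommendations(0)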
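
Step 4 of the summary can be sketched in a few lines of plain Python. The summary is cut off right after the sigmoid, so the 0.5 decision threshold used here is the conventional choice rather than something confirmed by the post, and the weights in the usage line are made-up values, not trained coefficients.

```python
import math


def sigmoid(z):
    """Map any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def classify(features, weights):
    """Weighted sum of the features, passed through the sigmoid, then thresholded."""
    z = sum(x * w for x, w in zip(features, weights))
    return 1 if sigmoid(z) > 0.5 else 0


print(sigmoid(0.0))
print(classify([1.0, 2.0], [0.5, -1.0]))
```

Training consists of finding weights that make these predictions match the labels, for example by gradient ascent on the log-likelihood.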

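Linux Troubleshooting Notes

Summary: Extracting an archive kept failing with "can't mkdir"; switching to the root user fixed it. Delete a folder: rm -rf xxx. mv can rename files. vi /etc/profile edits the environment variables; afterwards remember to run source /etc/profile so the changes take effect. cat ...

posted @ 2017-12-12 20:53 crr121 Views(103) Comments(0) Recommendations(0)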

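Naive Bayes Classifier

Summary: 1. Load the training data set used to train the classifier: # load the data set used to train the classifier def loadDataSet(): # tokenized data, six vectors in total postingList=[['my', 'dog', 'has', 'flea', 'problems', 'help',...

posted @ 2017-12-12 18:14 crr121 Views(240) Comments(0) Recommendations(0)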

Regular Expressions

Summary: str = "thon.exe H:/python_workspace/test/test.py" import re #\\w* : \ + \w + * # ...

posted @ 2017-12-12 16:01 crr121 Views(148) Comments(0) Recommendations(0)
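
The post's own code is truncated after the comment about `\\w*`. The sketch below is one reading of that comment, not the post's code: in an ordinary Python string literal `"\\w*"` is the two-character escape for a backslash followed by `w*`, which makes it the same regex as the raw string `r"\w*"`. The variable is named `text` here to avoid shadowing the built-in `str`.

```python
import re

text = "thon.exe H:/python_workspace/test/test.py"

# \w matches a letter, digit, or underscore; * repeats it zero or more times.
same_pattern = "\\w*" == r"\w*"
first = re.match(r"\w*", text).group()
words = re.findall(r"\w+", text)

print(same_pattern)
print(first)
print(words)
```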

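Decision Tree Algorithm

Summary: 1. How a decision tree works: (1) Find the feature on which to split the data and use it as the decision node. (2) Use that feature to split the data into n subsets. (3) If all the data in a subset belong to the same class, stop splitting that subset; otherwise keep splitting it by feature. (4) Stop when the data in every subset belong to the same class. 2. Advantages of decision trees: ...

posted @ 2017-12-07 21:09 crr121 Views(232) Comments(0) Recommendations(0)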
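
The four steps above can be sketched as a short recursive function. The summary does not say how the splitting feature is found, so this sketch assumes the common choice of minimizing the weighted Shannon entropy of the subsets. The function names and the tiny data set are illustrative only; each row is a list of feature values followed by a class label.

```python
import math
from collections import Counter


def entropy(rows):
    """Shannon entropy of the class labels (last item of each row)."""
    counts = Counter(row[-1] for row in rows)
    total = len(rows)
    return -sum(c / total * math.log2(c / total) for c in counts.values())


def split(rows, feature):
    """Group rows into subsets by the value of one feature column."""
    subsets = {}
    for row in rows:
        subsets.setdefault(row[feature], []).append(row)
    return subsets


def build_tree(rows, features):
    labels = [row[-1] for row in rows]
    # Steps (3) and (4): stop when the subset is pure or no features remain.
    if len(set(labels)) == 1 or not features:
        return Counter(labels).most_common(1)[0][0]

    # Step (1): pick the feature whose split leaves the least weighted entropy.
    def remaining(feature):
        return sum(len(s) / len(rows) * entropy(s)
                   for s in split(rows, feature).values())

    best = min(features, key=remaining)
    rest = [f for f in features if f != best]
    # Step (2): split on that feature and recurse into every subset.
    return {best: {value: build_tree(subset, rest)
                   for value, subset in split(rows, best).items()}}


data = [[1, 1, "yes"], [1, 1, "yes"], [1, 0, "no"], [0, 1, "no"], [0, 1, "no"]]
print(build_tree(data, [0, 1]))
```

The result is a nested dictionary keyed by feature index and then by feature value, with class labels at the leaves.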

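Python Matrices

Summary: B=min(A): returns the minimum of each column of matrix A; the result B is a row vector whose i-th entry is the minimum of the i-th column of A. C=max(A): returns the maximum of each column of matrix A; the result C is a row vector whose i-th entry is the maximum of the i-th column of A. import numpy as np a = np.ar...

posted @ 2017-12-07 10:24 crr121 Views(421) Comments(0) Recommendations(0)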
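
The column-wise behavior described above is not what Python's built-in `min` and `max` do on a nested list, where whole rows are compared instead. A minimal plain-Python sketch of the per-column version follows; the helper names are hypothetical. With NumPy arrays, `a.min(axis=0)` and `a.max(axis=0)` give the same per-column results.

```python
def column_min(matrix):
    """Minimum of every column of a nested list."""
    return [min(column) for column in zip(*matrix)]


def column_max(matrix):
    """Maximum of every column of a nested list."""
    return [max(column) for column in zip(*matrix)]


a = [[1, 5, 3], [4, 2, 6]]
print(column_min(a))
print(column_max(a))
```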

matplotlib

Summary: import matplotlib.pyplot as plt from numpy import * fig = plt.figure() ax = fig.add_subplot(223) ax.plot(x,y) plt.show() The meaning of the argument 223...

posted @ 2017-12-05 15:01 crr121 Views(102) Comments(0) Recommendations(0)
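
The post's explanation of 223 is cut off in this listing. In matplotlib, a three-digit argument to `add_subplot` is read digit by digit as the number of rows, the number of columns, and the 1-based position of the subplot in that grid. The hypothetical helper below only decodes such a number and does not call matplotlib.

```python
def decode_subplot_code(code):
    """Split a three-digit subplot code into (rows, columns, index)."""
    rows, rest = divmod(code, 100)
    cols, index = divmod(rest, 10)
    return rows, cols, index


print(decode_subplot_code(223))
```

So `add_subplot(223)` asks for the third cell of a 2 x 2 grid, counting across rows first, which is the lower-left panel.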

The NumPy Library

Summary: 1. Create a random matrix: >>> from numpy import * >>> random.rand(4,4) array([[ 0.1801566 , 0.02580119, 0.02685281, 0.52768083], [ 0.4541100...

posted @ 2017-12-05 09:26 crr121 Views(110) Comments(0) Recommendations(0)

Python Numbers

Summary: 1. Slicing: #!/usr/bin/env python # -*- coding: utf-8 -*- # slicing names = ('aa','bb','cc','dd','ee'); print names[0]; print names[2]; print na...

posted @ 2017-12-04 14:26 crr121 Views(158) Comments(0) Recommendations(0)
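
The post uses Python 2 `print` statements and is cut off after the first two lookups. A Python 3 sketch of indexing and slicing on the same tuple might look like this; the particular slices chosen are illustrative and not taken from the post.

```python
names = ('aa', 'bb', 'cc', 'dd', 'ee')

first = names[0]       # single element by index
third = names[2]
middle = names[1:3]    # start is inclusive, stop is exclusive
last_two = names[-2:]  # negative indices count from the end
every_other = names[::2]

print(first, third)
print(middle)
print(last_two)
print(every_other)
```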

Python Numbers

Summary: 1. Complex numbers: aComplex = -1.33 + 2.44j; print aComplex; # (-1.33+2.44j) print aComplex.real; print aComplex.imag; # -1.33 # 2.44 print aComplex.co...

posted @ 2017-12-04 11:45 crr121 Views(164) Comments(0) Recommendations(0)
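
The same complex-number attributes in Python 3 syntax, as a minimal sketch. The post's last call is cut off at `aComplex.co...`; it is assumed here to be `conjugate()`.

```python
a_complex = -1.33 + 2.44j

print(a_complex)              # the complex number itself
print(a_complex.real)         # real part as a float
print(a_complex.imag)         # imaginary part as a float
print(a_complex.conjugate())  # same real part, negated imaginary part
```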

Look up at the stars, keep your feet on the ground

Follow my WeChat public account: 小秋的博客
