Summary:
Basic format of HDFS commands: hadoop fs -cmd <args>. ls command: hadoop fs -ls / lists the directories and files under the root of the HDFS file system; hadoop fs -ls -R / lists all directories and files in the HDFS file system
posted @ 2017-11-21 08:12
papering
Views (222)
Recommended (0)
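The command format in the summary above can be sketched in Python as a small helper that assembles the argument list. The helper name `hdfs_command` is hypothetical and not part of Hadoop; only the `hadoop fs -ls` forms quoted in the summary are used, and nothing is executed here.

```python
import shlex


def hdfs_command(cmd, *args):
    """Build the argv list for `hadoop fs -<cmd> <args>` (hypothetical helper)."""
    return ["hadoop", "fs", f"-{cmd}", *args]


# The two listings mentioned in the summary.
ls_root = hdfs_command("ls", "/")
ls_recursive = hdfs_command("ls", "-R", "/")

# Print the commands as they would be typed in a shell.
print(shlex.join(ls_root))
print(shlex.join(ls_recursive))
```

On a machine where the `hadoop` client is installed, a list built this way could be passed to `subprocess.run`; that step is left out because it depends on the local cluster setup.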
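Summary:
[Manually verify that bidirectional passwordless SSH login works between any two nodes.] Understand the communication principles and the cluster's fault tolerance. Set up bidirectional passwordless SSH login between any two nodes; by default under the ~ directory. [Once that is done, after installing and configuring Hadoop on any one node, the whole installation directory can be copied to every node with scp: the file contents are identical on all nodes.] 3-node Spark
posted @ 2017-11-21 00:11
papering
Views (2450)
Recommended (0)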
Summary:
https://github.com/apache/spark/tree/master/core/src/main/scala/org/apache/spark/network https://github.com/apache/spark/blob/master/core/src/main/sca
posted @ 2017-11-20 19:39
papering
Views (439)
Recommended (0)
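Summary:
http://hadoop.apache.org/docs/r1.0.4/cn/hdfs_design.html Communication protocols: all HDFS communication protocols are layered on top of TCP/IP. A client connects to the Namenode through a configurable TCP port and interacts with it using the ClientProtocol. The Da
posted @ 2017-11-20 18:55
papering
Views (2627)
Recommended (0)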
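Summary:
HDFS is designed to reliably store very large files across machines in a large cluster. It stores each file as a sequence of blocks; all blocks except the last one are the same size. For fault tolerance, every block of a file is replicated. The block size and replication factor are configurable per file. An application can specify the number of replicas of a file. The replication factor can be specified when the file is created, or later
posted @ 2017-11-20 18:52
papering
Views (432)
Recommended (0)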
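The block layout described in the summary above (equal-sized blocks except the last, each replicated) can be illustrated with a little arithmetic. This is a minimal sketch: the function name `block_sizes`, the 128 MB block size, the 300 MB file and the replication factor of 3 are all assumptions chosen for illustration, not values taken from the post.

```python
def block_sizes(file_size, block_size):
    """Return the sizes of the blocks that a file of file_size bytes splits into."""
    if file_size < 0 or block_size <= 0:
        raise ValueError("file_size must be >= 0 and block_size must be > 0")
    full_blocks, remainder = divmod(file_size, block_size)
    sizes = [block_size] * full_blocks
    if remainder:
        # Only the last block may be smaller than the configured block size.
        sizes.append(remainder)
    return sizes


MB = 1024 * 1024
replication_factor = 3  # assumed value; configurable per file

sizes = block_sizes(300 * MB, 128 * MB)
print([s // MB for s in sizes])  # [128, 128, 44]

# Raw storage consumed once every block has all of its replicas.
raw_bytes = sum(sizes) * replication_factor
print(raw_bytes // MB)  # 900
```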
posted @ 2017-11-20 18:15
papering
Views (189)
Recommended (0)
Summary:
https://github.com/mongodb/mongo-hadoop https://github.com/mongodb/mongo-hadoop/wiki/Spark-Usage
posted @ 2017-11-20 17:22
papering
Views (166)
Recommended (0)
Summary:
Apache Spark is built around a distributed collection of immutable Java Virtual Machine (JVM) objects called Resilient Distributed Datasets (RDDs for
posted @ 2017-11-20 17:12
papering
Views (239)
Recommended (0)
Summary:
https://github.com/google/protobuf/
posted @ 2017-11-20 13:46
papering
Views (358)
Recommended (0)
Summary:
cat /proc/version Linux version 3.10.0-327.el7.x86_64 (builder@kbuilder.dev.centos.org) (gcc version 4.8.3 20140911 (Red Hat 4.8.3-9) (GCC) ) #1 SMP Th
posted @ 2017-11-20 12:49
papering
Views (869)
Recommended (0)
Summary:
Java Virtual Machine Process Status Tool
posted @ 2017-11-20 11:28
papering
Views (180)
Recommended (0)
Summary:
If the port is "0", then the OS is looking for any free port, so the port-in-use and port-below-1024 problems are highly unlikely to be the cause of t
posted @ 2017-11-20 02:01
papering
Views (1674)
Recommended (0)
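The port "0" behavior mentioned in the summary above can be observed with a plain TCP socket. This is a minimal sketch using Python's standard `socket` module; the helper name `bind_free_port` and the loopback address are assumptions for illustration.

```python
import socket


def bind_free_port(host="127.0.0.1"):
    """Bind a TCP socket to port 0 and return it with the port the OS picked."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Port 0 asks the operating system to choose any free port.
    sock.bind((host, 0))
    return sock, sock.getsockname()[1]


sock, port = bind_free_port()
print(f"OS assigned port {port}")
sock.close()
```

Because the OS picks the port, the number differs between runs, which is why the example prints it rather than assuming a value.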
Summary:
https://www.ibm.com/developerworks/library/bd-yarn-intro/
posted @ 2017-11-19 16:24
papering
Views (126)
Recommended (0)
Summary:
https://curl.haxx.se/docs/manpage.html curl is a tool to transfer data from or to a server, using one of the supported protocols (DICT, FILE, FTP, FTP
posted @ 2017-11-18 01:02
papering
Views (216)
Recommended (0)
Summary:
975.45 MB (1,022,836,736)
posted @ 2017-11-17 21:35
papering
Views (193)
Recommended (0)
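Summary:
[Same moment] Two or more events. Concurrency: they occur at different moments within the same time interval. Parallelism: they occur at the same moment. [Compute program, I/O program] In a system without processes, the compute program and the I/O program belonging to the same application can only run sequentially, not simultaneously; but once a separate process is created for each of them, the two
posted @ 2017-11-17 00:35
papering
Views (324)
Recommended (0)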
Summary:
https://www.liaoxuefeng.com/wiki/
posted @ 2017-11-16 20:58
papering
Views (260)
Recommended (0)
Summary:
celery http://docs.celeryproject.org/en/latest/userguide/index.html
posted @ 2017-11-16 20:02
papering
Views (163)
Recommended (0)
Summary:
http://www.sohu.com/a/204244301_99961855
posted @ 2017-11-16 13:12
papering
Views (337)
Recommended (0)
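Summary:
l=[1,2,3] def f(l): l[1]=123 print(l) f(l) print(l) s="abc" def f1(s): s="x" print(s) f1(s) print(s) Mutable object: the function receives a reference to the same object (no copy is made), so it modifies the original value. Immutable object: the original is unaffected. For immutable types (numbers, strings, tuples), because the value cannot be modified, the op
posted @ 2017-11-16 10:52
papering
Views (328)
Recommended (0)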
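The difference between mutating an argument and rebinding it, as described in the summary above, can be shown with a short runnable version. This is a sketch rather than the post's exact code: the function names `replace_second` and `rebind` are made up for readability.

```python
def replace_second(items):
    """Mutate the list in place; the caller sees the change."""
    items[1] = 123


def rebind(text):
    """Rebind the local name only; the caller's string is untouched."""
    text = "x"
    return text


numbers = [1, 2, 3]
replace_second(numbers)
print(numbers)  # [1, 123, 3]

word = "abc"
rebind(word)
print(word)  # abc
```

In both calls the function receives a reference to the caller's object. The list changes because item assignment mutates that object, while `text = "x"` merely points the local name at a different string.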
Summary:
https://blog.golang.org/go-concurrency-patterns-timing-out-and https://blog.golang.org/concurrency-timeouts Go Concurrency Patterns: Timing out, movin
posted @ 2017-11-15 23:43
papering
Views (213)
Recommended (0)
Summary:
https://blog.golang.org/pipelines
posted @ 2017-11-15 23:40
papering
Views (198)
Recommended (0)
Summary:
https://docs.oracle.com/javase/tutorial/essential/concurrency/pools.html Most of the executor implementations in java.util.concurrent use thread pools
posted @ 2017-11-15 22:54
papering
Views (232)
Recommended (0)
Summary:
New Elastic Load Balancing Feature: Sticky Sessions | AWS News Blog https://amazonaws-china.com/cn/blogs/aws/new-elastic-load-balancing-feature-sticky
posted @ 2017-11-15 20:05
papering
Views (1236)
Recommended (0)
Summary:
http://hbase.apache.org/acid-semantics.html Apache HBase (TM) is not an ACID compliant database. However, it does guarantee certain specific propertie
posted @ 2017-11-15 13:04
papering
Views (215)
Recommended (0)
Summary:
http://web.mit.edu/kerberos/ What is Kerberos? Kerberos is a network authentication protocol. It is designed to provide strong authentication for clie
posted @ 2017-11-15 12:52
papering
Views (192)
Recommended (0)
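Summary:
class Student(): ... name = 'Student' ... s = Student() # create instance s print(s.name) # print the name attribute; the instance has no name attribute, so lookup continues to the class's name attribute Student print(Student.name) # print the class
posted @ 2017-11-15 09:51
papering
Views (205)
Recommended (0)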
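The attribute lookup described in the summary above (instance first, then class) can be run end to end as follows. This is a minimal sketch; the value "Michael" and the steps after the first `print` are assumptions added to show shadowing, since the summary is cut off.

```python
class Student:
    name = "Student"  # class attribute


s = Student()
print(s.name)  # Student: the instance has no name attribute, so the class's is found

s.name = "Michael"  # assumed value; creates an instance attribute that shadows the class one
print(s.name)        # Michael
print(Student.name)  # Student: the class attribute is unchanged

del s.name     # remove the instance attribute
print(s.name)  # Student: lookup falls back to the class again
```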
Summary:
http://www.sogou.com/labs/resource/list_yuliao.php
posted @ 2017-11-15 00:25
papering
Views (173)
Recommended (0)
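Summary:
Summary points: 1. No matter how many objects a class instantiates, there is only one copy of each of its static variables; a static field belongs to the class, not to the objects constructed from it, and all instances of the class share the static field. class Employee { private static int nextId = 1; private int id; ... } Static variables, static constants p
posted @ 2017-11-14 22:15
papering
Views (313)
Recommended (0)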
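The shared-field idea in the summary above is stated for Java; a rough Python analogue uses a class-level attribute. This is only a sketch: the constructor that hands out ids is an assumption, because the Java snippet elides the class body with "...".

```python
class Employee:
    next_id = 1  # class-level attribute: one copy shared by all instances

    def __init__(self):
        # Assumed usage: take the shared counter as this instance's id, then advance it.
        self.id = Employee.next_id
        Employee.next_id += 1


a = Employee()
b = Employee()
print(a.id, b.id, Employee.next_id)  # 1 2 3
```

Each instance keeps its own `id`, while `next_id` exists once on the class, which mirrors the one-copy-per-class property the summary describes for static fields.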
Summary:
LocalDate.plusDays String.toUpperCase GregorianCalendar.add
posted @ 2017-11-14 19:12
papering
Views (458)
Recommended (0)
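Summary:
[Data Tip 1] A step-by-step guide to implementing drill-down with containers https://mp.weixin.qq.com/s/l0iIUi6Jl9UuXxgQoXXM-A In data analysis work, especially when preparing data reports, precise multi-level reports are often needed, so data drill-down in dashboards becomes especially important. Until now, we have been used to drilling down through filters, or filtering through views; have you thought of oth
posted @ 2017-11-14 12:01
papering
Views (365)
Recommended (0)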
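Summary:
Identification: visitor origin, session, visitor. HTTP Secure Sockets Layer (SSL): covers the visitor's login activity and the exchange of encryption keys. Dynamic web pages: maintain visitor state by embedding a hidden session-ID field in every page returned to the visitor.
posted @ 2017-11-13 22:59
papering
Views (220)
Recommended (0)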
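The hidden session-ID technique mentioned in the summary above can be sketched with the standard library alone. The function names, the field name `session_id`, the form markup and the page body are all hypothetical; a real application would normally rely on its web framework's session handling.

```python
import html
import secrets


def hidden_session_field(session_id):
    """Return a hidden form field carrying the session ID (hypothetical field name)."""
    value = html.escape(session_id, quote=True)
    return f'<input type="hidden" name="session_id" value="{value}">'


def render_page(body, session_id):
    """Embed the hidden field in every page so the next request returns the ID."""
    return f'<form method="post">{body}{hidden_session_field(session_id)}</form>'


session_id = secrets.token_hex(16)  # random 32-character hex identifier
page = render_page("<p>Cart: 2 items</p>", session_id)
print(page)
```

When the visitor submits the form, the server reads the field back and uses it to look up that visitor's state.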
Summary:
[root@bigdata-server-02 /]# ps --help all Usage: ps [options] Basic options: -A, -e all processes -a all with tty, except session leaders a all with t
posted @ 2017-11-13 19:27
papering
Views (314)
Recommended (0)
Summary:
https://en.wikipedia.org/wiki/SHA-3
posted @ 2017-11-13 19:01
papering
Views (199)
Recommended (0)
Summary:
[Entropy increase] From disorder to order http://spark.apache.org/docs/latest/rdd-programming-guide.html#shuffle-operations Shuffle operations Certain operations within Spark trigg
posted @ 2017-11-13 17:46
papering
Views (189)
Recommended (0)
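Summary:
Each job is divided into multiple stages. A main criterion for dividing stages is whether the input of the current operator is determinate; if it is, the operator is placed in the same stage, which avoids the message-passing overhead between stages. http://spark.apache.org/docs/latest/rdd-programming-guide.html
posted @ 2017-11-13 16:15
papering
Views (425)
Recommended (0)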
posted @ 2017-11-13 14:16
papering
Views (128)
Recommended (0)
Summary:
Apache HBase™ is the Hadoop database, a distributed, scalable, big data store. Use Apache HBase™ when you need random, realtime read/write access to y
posted @ 2017-11-13 14:12
papering
Views (177)
Recommended (0)
Summary:
https://en.wikipedia.org/wiki/DevOps DevOps (a clipped compound of "development" and "operations") is a software engineering practice that aims at uni
posted @ 2017-11-13 11:29
papering
Views (266)
Recommended (0)
Summary:
5 Ways to Make Your Hive Queries Run Faster Technique #1: Use Tez Hive can use the Apache Tez execution engine instead of the venerable Map-reduce eng
posted @ 2017-11-12 13:19
papering
Views (202)
Recommended (0)