Python Study Notes, Organized and Summarized [Language Basics]

I. Variable Assignment and Naming Rules
① Declaring a variable and assigning a value

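#!/usr/bin/env python
# -*- coding:utf-8 -*-
# _author_soloLi
name1 = "solo"
name2 = name1           # name2 now refers to the same string object as name1
print(name1, name2)     # solo solo
name1 = "hehe"          # rebinds name1 only; name2 is unaffected
print(name1, name2)     # hehe solo

# name1 is now "hehe"; name2 is still "solo"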
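The behaviour above follows from the fact that assignment binds a name to an object rather than copying a value. A minimal sketch of that idea follows; the list example and the names `items1` and `items2` are illustrative additions, not part of the original notes.

```python
name1 = "solo"
name2 = name1
same_before = name2 is name1      # True: both names refer to one object

name1 = "hehe"                    # rebinding makes no copy and does not touch name2
same_after = name2 is name1       # False

# With a mutable object, a change made through one name is visible through the other.
items1 = ["solo"]
items2 = items1
items1.append("hehe")

print(same_before, same_after, name2, items2)
```

Rebinding a name and mutating an object are different operations, which is why `name2` keeps its old value while `items2` sees the appended item.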

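② Rules for naming variables

1. A variable name may contain only letters, digits, and underscores (Python 3 additionally accepts most Unicode letters, such as Chinese characters).
2. The first character of a variable name cannot be a digit.
3. Keywords cannot be used as variable names. In Python 2 these are: ['and', 'as', 'assert', 'break', 'class', 'continue', 'def', 'del', 'elif', 'else', 'except', 'exec', 'finally', 'for', 'from', 'global', 'if', 'import', 'in', 'is', 'lambda', 'not', 'or', 'pass', 'print', 'raise', 'return', 'try', 'while', 'with', 'yield']. In Python 3, 'exec' and 'print' are no longer keywords, while 'False', 'None', 'True', 'nonlocal', and (from 3.7) 'async' and 'await' have been added.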
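The naming rules can be checked programmatically. The sketch below uses the standard `keyword` module and `str.isidentifier()`; the helper name `is_valid_name` is a hypothetical name chosen for this illustration.

```python
import keyword

def is_valid_name(name):
    """Return True if name is a legal identifier and not a reserved keyword."""
    return name.isidentifier() and not keyword.iskeyword(name)

for candidate in ["name1", "_private", "1name", "class", "my-name"]:
    print(candidate, is_valid_name(candidate))

# The keyword list of the interpreter that is actually running.
print(keyword.kwlist)
```

Printing `keyword.kwlist` is the most reliable way to see the keywords of the Python version in use, since the list differs between releases.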

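II. Character Encoding
When the Python interpreter loads the code in a .py file, it decodes the file's contents using a source encoding (ASCII by default in Python 2, UTF-8 by default in Python 3).

ASCII: a 7-bit code that defines 2**7 = 128 characters. Each character is usually stored in one byte (8 bits), and the 8-bit extended variants can represent at most 2**8 = 256 symbols. Clearly ASCII cannot represent all of the world's scripts and symbols.

Unicode: assigns every character of every language a single, unique code point. Early Unicode encodings represented each character with 16 bits (2 bytes), i.e. 2**16 = 65536 values. The code space has since grown beyond that, so 2 bytes is a minimum (as in UTF-16) and some characters need more.

UTF-8: a variable-length encoding of Unicode that saves space. Instead of using at least 2 bytes per character, it sorts characters into ranges: ASCII characters take 1 byte, most European characters take 2 bytes, common East Asian characters take 3 bytes, and the remaining characters take 4 bytes.
Note: Python 2.x uses ASCII as its default source encoding. Python 3.x uses UTF-8 as its default source encoding and stores str as Unicode, so Chinese text can be displayed directly without an encoding declaration.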
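The byte counts described above can be observed directly. This is a small sketch; the four sample characters are chosen here for illustration (a Latin letter, an accented European letter, a Chinese character, and an emoji written as an escape sequence).

```python
samples = ["A", "\u00e9", "\u91cc", "\U0001F600"]   # A, e-acute, the character li, an emoji

for ch in samples:
    utf8 = ch.encode("utf-8")
    utf16 = ch.encode("utf-16-le")      # little-endian, no byte-order mark
    # ascii() keeps the output printable on any console
    print(ascii(ch), hex(ord(ch)), len(utf8), len(utf16))

utf8_lengths = [len(ch.encode("utf-8")) for ch in samples]
utf16_lengths = [len(ch.encode("utf-16-le")) for ch in samples]
```

UTF-8 grows from 1 to 4 bytes as the code point increases, while UTF-16 uses 2 bytes for the first three samples and 4 bytes for the emoji.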

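Extension: encoding and transcoding, and the difference between bytes and str

Perhaps the most important new feature of Python 3 is its much clearer distinction between text and binary data. Text is always Unicode and is represented by the str type; binary data is represented by the bytes type. Python 3 never mixes str and bytes implicitly (there is no automatic conversion of the kind Python 2 performed between int and long), and this is what makes the distinction so clear. You cannot concatenate a string with a bytes object, search for a string inside a bytes object (or vice versa), or pass a string to a function that expects bytes (or vice versa). This is a good thing. Either way, the boundary between strings and bytes is unavoidable, and the following relationship is important to remember:

A string can be encoded into bytes, and bytes can be decoded into a string: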

#!/usr/bin/env python
# -*- coding:utf-8 -*-
#-Author-solo
msg = "里约奥运"

print(msg.encode("utf-8"))                      # if no encoding is given, the default is utf-8
# b'\xe9\x87\x8c\xe7\xba\xa6\xe5\xa5\xa5\xe8\xbf\x90'
print(b'\xe9\x87\x8c\xe7\xba\xa6\xe5\xa5\xa5\xe8\xbf\x90'.decode("utf-8"))
# 里约奥运
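The rule that Python 3 never mixes str and bytes implicitly can be seen in a short experiment. This is a sketch only; the variable names are illustrative.

```python
text = "里约奥运"                # str: Unicode text
data = text.encode("utf-8")      # bytes: the UTF-8 encoded form

try:
    text + data                  # mixing str and bytes is rejected
    mixed_ok = True
except TypeError as exc:
    mixed_ok = False
    print("TypeError:", exc)

# Convert explicitly so that both operands have the same type.
joined_text = text + data.decode("utf-8")
joined_bytes = text.encode("utf-8") + data

print(len(text), len(data))      # number of characters vs number of bytes
```

The two lengths differ because each of the four Chinese characters occupies three bytes in UTF-8.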


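Why encode and transcode?
Character encodings are not uniform across countries (for example, GBK in China), so the same software can display garbled text when it runs on computers in different countries. How can this be solved? Since Unicode covers the characters of all these encodings, it can serve as a pivot: first decode the text from its original encoding into Unicode, then encode the Unicode text into the target encoding (for example, a Korean one). The text then displays correctly, provided the target encoding contains the characters in question. Note that this only converts between character sets; it does not translate the text into Korean.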

① Encoding conversion in Python 3 (str is Unicode by default)

name = "李伟"                     # name is a Unicode str
name1 = name.encode("utf-8")      # Unicode -> UTF-8 bytes
name2 = name1.decode("utf-8")     # UTF-8 bytes -> Unicode

name3 = name.encode("gbk")        # Unicode -> GBK bytes
name4 = name3.decode("gbk")       # GBK bytes -> Unicode
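Using Unicode as the pivot between two encodings can be sketched as follows, assuming the input arrives as GBK bytes and must be delivered as UTF-8. The sketch also shows what happens when the wrong codec is used for decoding.

```python
name = "李伟"
gbk_bytes = name.encode("gbk")            # what a GBK-based program would store

# Pivot through Unicode: decode with the source codec, encode with the target codec.
utf8_bytes = gbk_bytes.decode("gbk").encode("utf-8")

# Decoding with the wrong codec either fails or yields garbled text.
try:
    wrong = gbk_bytes.decode("utf-8")
except UnicodeDecodeError:
    wrong = gbk_bytes.decode("utf-8", errors="replace")

print(len(gbk_bytes), len(utf8_bytes), ascii(wrong))
```

The same two characters take 4 bytes in GBK and 6 bytes in UTF-8, so the byte strings are not interchangeable even though they represent the same text.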


② Encoding conversion in Python 2 (ASCII by default)

# Case 1: declare the source encoding (utf-8)

# -*- coding:utf-8 -*-
name = "李伟"                   # ASCII has no such characters; here name is a UTF-8 byte string

name1 = name.decode("utf-8")    # UTF-8 -> Unicode
name2 = name1.encode("gbk")     # Unicode -> GBK

# Case 2: use the default encoding (ascii)
name = "nihao"                  # ASCII-only text, with the coding declaration on line 2 removed; name is ASCII

name1 = name.decode("ascii")    # ASCII -> Unicode
name2 = name1.encode("utf-8")   # Unicode -> UTF-8
name3 = name1.encode("gbk")     # Unicode -> GBK


III. User Input and String Concatenation

#!/usr/bin/env python
# -*- coding:utf-8 -*-
# _author_soloLi
# Python 2.x vs Python 3.x: raw_input() in Python 2.x corresponds to input() in Python 3.x
# Prompt the user for name, age, job, and salary, then print them as an info sheet

name = input("Please input your name:")
age = int(input("Please input your age:"))  # convert the input str to int
job = input("Please input your job:")
salary = input("Please input your salary:")

info1 = '''
------------  Info of %s  ---------
Name:%s
Age:%d
Job:%s
Salary:%s
''' % (name, name, age, job, salary)     # %s formats any value with str(); %d requires a number and formats it as an integer; %f requires a number and formats it as a float
print(info1)

# info2 = '''
# ------------  Info of {_Name}  ---------
# Name:{_Name}
# Age:{_Age}
# Job:{_Job}
# Salary:{_Salary}
# ''' .format(_Name=name,
#             _Age=age,
#             _Job=job,
#             _Salary=salary)
# print(info2)

# info3 = '''
# ------------  Info of {0}  ---------
# Name:{0}
# Age:{1}
# Job:{2}
# Salary:{3}
# ''' .format(name,age,job,salary)
# print(info3)


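Comparison:
1. % : a lone tuple on the right-hand side is unpacked as the argument list, so to format a tuple as a single value you must wrap it in a one-element tuple, (value,); otherwise a TypeError is raised.
2. + : strings are immutable, so each + creates a new string object in memory.
3. .format : avoids both problems. % formatting is still seen where compatibility matters; for example, the logging library formats its message arguments in % style.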
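The three points in the comparison can be reproduced in a few lines. This is a sketch with made-up values; the f-string line is an addition that requires Python 3.6 or later and is not mentioned in the original notes.

```python
name, age = "solo", 18
point = (1, 2)

# % with a tuple: wrap it in a one-element tuple to format it as one value.
try:
    bad = "point: %s" % point          # two arguments supplied for one %s
except TypeError:
    bad = None
good = "point: %s" % (point,)

# Three equivalent ways to build the same text.
by_percent = "Name:%s Age:%d" % (name, age)
by_format = "Name:{} Age:{}".format(name, age)
by_fstring = f"Name:{name} Age:{age}"  # Python 3.6+

# join builds the result once instead of creating a new string for every +.
by_join = " ".join(["Name:" + name, "Age:" + str(age)])

print(good, by_percent, by_format, by_fstring, by_join, sep="\n")
```

For a handful of short pieces any of these is fine; `join` matters mainly when many pieces are concatenated in a loop.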

IV. Control Flow (if, while, for, conditional expression)

#!/usr/bin/env python
# -*- coding:utf-8 -*-
# _author_soloLi
################## if statement ######################
# A = 66
#
# B = int(input("Enter a lucky number from 0 to 100:"))
#
# if B == A:                                       # the if line itself starts at the left margin
#     print("Congratulations, you guessed it!")    # its body must be indented
# elif B > A:
#     print("Too high")
# else:
#     print("Too low")


################## while statement ######################
# A = 66
# count = 0                    # start the attempt counter at 0
#
# while count < 3:
#
#     B = int(input("Enter a number from 0 to 100:"))
#
#     if B == A:
#         print("Congratulations, you guessed it!")
#         break
#     elif B > A:
#         print("Too high")
#     else:
#         print("Too low")
#     count += 1
# else:                        # runs only if the loop ended without break
#     print("You have used up all your guesses!")


################## for statement ######################
A = 66
for i in range(3):             # i takes the values 0, 1, 2, so there are at most 3 attempts
    print("i=", i)
    B = int(input("Enter a number from 0 to 100:"))
    if B == A:
        print("Congratulations, you guessed it!")
        break
    elif B > A:
        print("Too high")
    else:
        print("Too low")
else:                          # runs only if the loop ended without break
    print("You have used up all your guesses!")


################## conditional expression (ternary) ######################
# result = value1 if condition else value2
# If the condition is true, value1 is assigned to result; otherwise value2 is assigned to result.
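The conditional expression and the loop `else` clause can be tried without keyboard input. In the sketch below, the function names `check_guess` and `play` are hypothetical, and the guesses are passed in as a list instead of being read with `input()`.

```python
def check_guess(guess, answer=66):
    # Conditional expressions can be nested, but more than two levels becomes hard to read.
    return "correct" if guess == answer else ("too high" if guess > answer else "too low")

def play(guesses, answer=66, max_tries=3):
    """Replay a fixed list of guesses and return the final message."""
    for guess in guesses[:max_tries]:
        if check_guess(guess, answer) == "correct":
            result = "guessed it"
            break
    else:                       # reached only when the loop finished without break
        result = "out of guesses"
    return result

print(check_guess(70), check_guess(10), play([1, 66]), play([1, 2, 3, 66]))
```

The last call fails even though 66 is in the list, because only the first three guesses are considered.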


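V. Basic Data Types

# ① Numbers
# 1. Integer (int)
#     Use: age, level, ID number, QQ number, and other whole-number values
# 2. Floating point (float)
#     Use: salary, height, weight, body measurements, and other decimal values
# 3. Long integer (long)
#     Python 3 has no separate long type; int has arbitrary precision
# Base conversion: bin (to binary), oct (to octal), hex (to hexadecimal)
#
# ② Strings
# Use: name, gender, nationality, address, and other descriptive information
# Common operations:
#           1. Strip whitespace:   strip   (arg 1: the characters to strip; whitespace by default)
#           2. Split:              split   (arg 1: the separator; arg 2: the maximum number of splits); returns a list
#           3. Length:             len     counts the characters
#           4. Slice:              [start:stop:step]   includes start, excludes stop
#           5. Replace:            replace (arg 1: old; arg 2: new; arg 3: the maximum number of replacements)
#           6. Concatenate/format: format
#           7. Index:              read-only
#           8. Test prefix/suffix: startswith, endswith
#           9. Others...
#
# ③ Lists
# Use: several pieces of equipment, several hobbies, several courses, and so on
# Common operations:
#           1. Index:              values can be both stored and read
#           2. Slice:              []      produces a new list
#           3. Append:             append
#           4. Delete:             remove (by value) / pop (by index)
#           5. Length:             len     counts the elements
#           6. Membership:         in
#           7. Loop
#           8. Insert:             insert
#           9. Others...
#
# ④ Tuples
# Use: store several values; unlike a list, a tuple is immutable and is mainly used for reading
# Common operations: the same as a list, except that a tuple cannot be modified
#           1. Index:              [] is read-only; index() returns the position of a value
#           2. Slice:              []
#           3. Length:             len
#           4. Loop
#           5. Membership:         in
#           6. Count:              count
#
# ⑤ Dictionaries
# Use: store several values as key-value pairs; lookup by key is fast
# Common operations:
#           1. Store and read by key
#           2. Loop:               insertion order in Python 3.7+ (unordered in earlier versions)
#           3. Length:             len
#           4. Delete:             pop     pass None as the second argument so that a missing key does not raise an error
#           5. Membership:         in
#           6. Read:               get (no error if the key is missing) / d[key] (raises KeyError if the key is missing)
#           7. Others...
#
# ⑥ Sets
# Use: deduplication, relational operations
# Common operations:
#           1. Length:             len
#           2. Add:                add (adds one element) / update (adds several elements)
#           3. Delete:             remove (removes the given element) / pop (removes an arbitrary element) / discard (removes the given element; unlike remove, no error if it is absent)
#           4. Intersection:       &
#           5. Difference:         -
#           6. Symmetric difference: ^
#           7. Membership:         in
#           8. Union:              |
#           9. Superset:           >, >=
#          10. Subset:             <, <=
#
# Summary:
# By number of values stored
#     Scalar / atomic types: numbers, strings
#     Container types: lists, tuples, dictionaries, sets
#
# By mutability
#     Mutable: lists, dictionaries, sets
#     Immutable: numbers, strings, tuples
#
# By access method
#     Direct access: numbers
#     By index (sequence types): strings, lists, tuples
#     By key (mapping type): dictionaries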
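The operations listed in the outline above can be exercised in one short script. This is a minimal sketch; all sample values and variable names are illustrative.

```python
# Strings: every method returns a new string; the original is unchanged.
s = "  my name is solo  "
stripped = s.strip()
parts = stripped.split(" ", 2)                 # at most 2 splits
replaced = stripped.replace("o", "0", 1)       # replace only the first match
sliced = stripped[0:7:2]                       # start 0, stop 7 (excluded), step 2

# Lists are mutable; tuples are not.
hobbies = ["read", "run"]
hobbies.append("swim")
hobbies.insert(0, "code")
hobbies.remove("run")                          # delete by value
last = hobbies.pop()                           # delete by index (default: last item)
frozen = tuple(hobbies)
try:
    frozen[0] = "x"
    tuple_mutable = True
except TypeError:
    tuple_mutable = False

# Dictionaries: get and pop accept a default, indexing raises KeyError.
info = {"name": "solo", "age": 18}
missing_get = info.get("job")                  # None, no error
missing_pop = info.pop("job", None)            # None, no error
try:
    info["job"]
    key_error = False
except KeyError:
    key_error = True

# Sets: deduplication and relational operators.
a, b = {1, 2, 3, 3}, {3, 4}
a.discard(99)                                  # no error if the element is absent
set_ops = (a & b, a - b, a ^ b, a | b, {1, 2} <= a)

print(parts, replaced, sliced, hobbies, last, set_ops)
```

Note that `replace` takes a maximum count rather than the position of the occurrence to replace, and that a slice excludes its stop index.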

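[Summary] What are the common data types, and what are their main operations?

I. Integers
For example: 18, 73, 84
Commonly used operations:

abs(x)              # absolute value
x+y, x-y, x*y, x/y  # addition, subtraction, multiplication, division
x/y                 # division: true division in Python 3 (always returns a float); in Python 2, int/int floors the result
x//y                # floor division: the quotient rounded down
x%y                 # remainder
x**y                # power
cmp(x,y)            # Python 2 only: returns -1, 0, or 1 for x < y, x == y, x > y
coerce(x,y)         # Python 2 only: returns a tuple of the two numbers converted to a common type
divmod(x,y)         # returns the tuple (quotient, remainder)
float(x)            # convert to float
str(x)              # convert to string
hex(x)              # convert to a hexadecimal string
oct(x)              # convert to an octal string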
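Several entries in the list above behave differently in Python 3. The sketch below shows the Python 3 results; the `cmp` function defined here is a common stand-in for the removed built-in, not something Python 3 provides.

```python
x, y = 7, 2
true_div = x / y                     # 3.5
floor_div = x // y                   # 3
neg_floor = -7 // 2                  # -4: floor division rounds toward negative infinity
quotient, remainder = divmod(x, y)   # (3, 1)

def cmp(a, b):
    """Python 3 stand-in for the Python 2 built-in cmp()."""
    return (a > b) - (a < b)

bases = (bin(10), oct(10), hex(10))  # ('0b1010', '0o12', '0xa')

print(true_div, floor_div, neg_floor, quotient, remainder, cmp(1, 2), bases)
```

`coerce()` has no Python 3 replacement because mixed-type arithmetic converts operands automatically.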


More operations (Python 2 class stub):

  1 class int(object):
  2     """
  3     int(x=0) -> int or long
  4     int(x, base=10) -> int or long
  5     
  6     Convert a number or string to an integer, or return 0 if no arguments
  7     are given.  If x is floating point, the conversion truncates towards zero.
  8     If x is outside the integer range, the function returns a long instead.
  9     
 10     If x is not a number or if base is given, then x must be a string or
 11     Unicode object representing an integer literal in the given base.  The
 12     literal can be preceded by '+' or '-' and be surrounded by whitespace.
 13     The base defaults to 10.  Valid bases are 0 and 2-36.  Base 0 means to
 14     interpret the base from the string as an integer literal.
 15     >>> int('0b100', base=0)
 16     """
 17     def bit_length(self): 
 18         """ Return the minimum number of bits needed to represent this number """
 19         """
 20         int.bit_length() -> int
 21         
 22         Number of bits necessary to represent self in binary.
 23         >>> bin(37)
 24         '0b100101'
 25         >>> (37).bit_length()
 26         """
 27         return 0
 28 
 29     def conjugate(self, *args, **kwargs): # real signature unknown
 30         """ Return the complex conjugate of this number """
 31         """ Returns self, the complex conjugate of any int. """
 32         pass
 33 
 34     def __abs__(self):
 35         """ Return the absolute value """
 36         """ x.__abs__() <==> abs(x) """
 37         pass
 38 
 39     def __add__(self, y):
 40         """ x.__add__(y) <==> x+y """
 41         pass
 42 
 43     def __and__(self, y):
 44         """ x.__and__(y) <==> x&y """
 45         pass
 46 
 47     def __cmp__(self, y): 
 48         """ Compare two numbers """
 49         """ x.__cmp__(y) <==> cmp(x,y) """
 50         pass
 51 
 52     def __coerce__(self, y):
 53         """ Coerce the two numbers to a common type and return them as a tuple """
 54         """ x.__coerce__(y) <==> coerce(x, y) """
 55         pass
 56 
 57     def __divmod__(self, y): 
 58         """ Divide and return a tuple of the quotient and the remainder """
 59         """ x.__divmod__(y) <==> divmod(x, y) """
 60         pass
 61 
 62     def __div__(self, y): 
 63         """ x.__div__(y) <==> x/y """
 64         pass
 65 
 66     def __float__(self): 
 67         """ Convert to float """
 68         """ x.__float__() <==> float(x) """
 69         pass
 70 
 71     def __floordiv__(self, y): 
 72         """ x.__floordiv__(y) <==> x//y """
 73         pass
 74 
 75     def __format__(self, *args, **kwargs): # real signature unknown
 76         pass
 77 
 78     def __getattribute__(self, name): 
 79         """ x.__getattribute__('name') <==> x.name """
 80         pass
 81 
 82     def __getnewargs__(self, *args, **kwargs): # real signature unknown
 83         """ Used internally to supply the arguments passed to __new__ when the object is created """
 84         pass
 85 
 86     def __hash__(self): 
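 87         """ If the object is hashable, return its hash value, which is an integer. Dictionary lookups use hash values to compare keys quickly. Two numeric values that compare equal have the same hash value. """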
 88         """ x.__hash__() <==> hash(x) """
 89         pass
 90 
 91     def __hex__(self): 
 92         """ Return the hexadecimal representation of this number """
 93         """ x.__hex__() <==> hex(x) """
 94         pass
 95 
 96     def __index__(self): 
 97         """ Used when the object serves as an index or slice bound; for an int it simply returns the value itself """
 98         """ x[y:z] <==> x[y.__index__():z.__index__()] """
 99         pass
100 
101     def __init__(self, x, base=10): # known special case of int.__init__
102         """ Constructor; called automatically when x = 123 or x = int(10) is executed; ignore for now """
103         """
104         int(x=0) -> int or long
105         int(x, base=10) -> int or long
106         
107         Convert a number or string to an integer, or return 0 if no arguments
108         are given.  If x is floating point, the conversion truncates towards zero.
109         If x is outside the integer range, the function returns a long instead.
110         
111         If x is not a number or if base is given, then x must be a string or
112         Unicode object representing an integer literal in the given base.  The
113         literal can be preceded by '+' or '-' and be surrounded by whitespace.
114         The base defaults to 10.  Valid bases are 0 and 2-36.  Base 0 means to
115         interpret the base from the string as an integer literal.
116         >>> int('0b100', base=0)
117         # (copied from class doc)
118         """
119         pass
120 
121     def __int__(self): 
122         """ Convert to an integer """
123         """ x.__int__() <==> int(x) """
124         pass
125 
126     def __invert__(self): 
127         """ x.__invert__() <==> ~x """
128         pass
129 
130     def __long__(self): 
131         """ Convert to a long integer """
132         """ x.__long__() <==> long(x) """
133         pass
134 
135     def __lshift__(self, y): 
136         """ x.__lshift__(y) <==> x<<y """
137         pass
138 
139     def __mod__(self, y): 
140         """ x.__mod__(y) <==> x%y """
141         pass
142 
143     def __mul__(self, y): 
144         """ x.__mul__(y) <==> x*y """
145         pass
146 
147     def __neg__(self): 
148         """ x.__neg__() <==> -x """
149         pass
150 
151     @staticmethod # known case of __new__
152     def __new__(S, *more): 
153         """ T.__new__(S, ...) -> a new object with type S, a subtype of T """
154         pass
155 
156     def __nonzero__(self): 
157         """ x.__nonzero__() <==> x != 0 """
158         pass
159 
160     def __oct__(self): 
161         """ Return the octal representation of this value """
162         """ x.__oct__() <==> oct(x) """
163         pass
164 
165     def __or__(self, y): 
166         """ x.__or__(y) <==> x|y """
167         pass
168 
169     def __pos__(self): 
170         """ x.__pos__() <==> +x """
171         pass
172 
173     def __pow__(self, y, z=None): 
174         """ Power (exponentiation) """
175         """ x.__pow__(y[, z]) <==> pow(x, y[, z]) """
176         pass
177 
178     def __radd__(self, y): 
179         """ x.__radd__(y) <==> y+x """
180         pass
181 
182     def __rand__(self, y): 
183         """ x.__rand__(y) <==> y&x """
184         pass
185 
186     def __rdivmod__(self, y): 
187         """ x.__rdivmod__(y) <==> divmod(y, x) """
188         pass
189 
190     def __rdiv__(self, y): 
191         """ x.__rdiv__(y) <==> y/x """
192         pass
193 
194     def __repr__(self): 
195         """ Convert to a form the interpreter can read """
196         """ x.__repr__() <==> repr(x) """
197         pass
198 
199     def __str__(self): 
200         """ Convert to a human-readable form; if there is none, return the interpreter-readable form """
201         """ x.__str__() <==> str(x) """
202         pass
203 
204     def __rfloordiv__(self, y): 
205         """ x.__rfloordiv__(y) <==> y//x """
206         pass
207 
208     def __rlshift__(self, y): 
209         """ x.__rlshift__(y) <==> y<<x """
210         pass
211 
212     def __rmod__(self, y): 
213         """ x.__rmod__(y) <==> y%x """
214         pass
215 
216     def __rmul__(self, y): 
217         """ x.__rmul__(y) <==> y*x """
218         pass
219 
220     def __ror__(self, y): 
221         """ x.__ror__(y) <==> y|x """
222         pass
223 
224     def __rpow__(self, x, z=None): 
225         """ y.__rpow__(x[, z]) <==> pow(x, y[, z]) """
226         pass
227 
228     def __rrshift__(self, y): 
229         """ x.__rrshift__(y) <==> y>>x """
230         pass
231 
232     def __rshift__(self, y): 
233         """ x.__rshift__(y) <==> x>>y """
234         pass
235 
236     def __rsub__(self, y): 
237         """ x.__rsub__(y) <==> y-x """
238         pass
239 
240     def __rtruediv__(self, y): 
241         """ x.__rtruediv__(y) <==> y/x """
242         pass
243 
244     def __rxor__(self, y): 
245         """ x.__rxor__(y) <==> y^x """
246         pass
247 
248     def __sub__(self, y): 
249         """ x.__sub__(y) <==> x-y """
250         pass
251 
252     def __truediv__(self, y): 
253         """ x.__truediv__(y) <==> x/y """
254         pass
255 
256     def __trunc__(self, *args, **kwargs): 
257         """ Return the value truncated to an integer; has no effect on an int """
258         pass
259 
260     def __xor__(self, y): 
261         """ x.__xor__(y) <==> x^y """
262         pass
263 
264     denominator = property(lambda self: object(), lambda self, v: None, lambda self: None)  # default
265     """ Denominator = 1 """
266     """the denominator of a rational number in lowest terms"""
267 
268     imag = property(lambda self: object(), lambda self, v: None, lambda self: None)  # default
269     """ Imaginary part; not meaningful for an int (always 0) """
270     """the imaginary part of a complex number"""
271 
272     numerator = property(lambda self: object(), lambda self, v: None, lambda self: None)  # default
273     """ Numerator = the number itself """
274     """the numerator of a rational number in lowest terms"""
275 
276     real = property(lambda self: object(), lambda self, v: None, lambda self: None)  # default
277     """ Real part; not meaningful for an int (it is the number itself) """
278     """the real part of a complex number"""
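A few of the methods and attributes from the stub above can be tried directly in Python 3. This is a small sketch, with sample inputs taken from the docstrings where possible.

```python
# int(x, base): base=0 reads the prefix of the literal; floats truncate toward zero.
parsed = [int("0b100", base=0), int("ff", 16), int("  -42  "), int(3.9), int(-3.9)]

bits = (37).bit_length()        # bin(37) == '0b100101', so 6 bits

n = 7
attrs = (n.real, n.imag, n.numerator, n.denominator, n.conjugate())

print(parsed, bits, attrs)
```

The attributes `real`, `imag`, `numerator`, and `denominator` exist so that an int can be used wherever a complex or rational number is expected.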

II. Long Integers (Python 2 only)
For example: 2147483649, 9223372036854775807
Commonly used operations:

# long supports essentially the same operations as int


More operations (Python 2 class stub):

  1 class long(object):
  2     """
  3     long(x=0) -> long
  4     long(x, base=10) -> long
  5     
  6     Convert a number or string to a long integer, or return 0L if no arguments
  7     are given.  If x is floating point, the conversion truncates towards zero.
  8     
  9     If x is not a number or if base is given, then x must be a string or
 10     Unicode object representing an integer literal in the given base.  The
 11     literal can be preceded by '+' or '-' and be surrounded by whitespace.
 12     The base defaults to 10.  Valid bases are 0 and 2-36.  Base 0 means to
 13     interpret the base from the string as an integer literal.
 14     >>> int('0b100', base=0)
 15     4L
 16     """
 17     def bit_length(self): # real signature unknown; restored from __doc__
 18         """
 19         long.bit_length() -> int or long
 20         
 21         Number of bits necessary to represent self in binary.
 22         >>> bin(37L)
 23         '0b100101'
 24         >>> (37L).bit_length()
 25         """
 26         return 0
 27 
 28     def conjugate(self, *args, **kwargs): # real signature unknown
 29         """ Returns self, the complex conjugate of any long. """
 30         pass
 31 
 32     def __abs__(self): # real signature unknown; restored from __doc__
 33         """ x.__abs__() <==> abs(x) """
 34         pass
 35 
 36     def __add__(self, y): # real signature unknown; restored from __doc__
 37         """ x.__add__(y) <==> x+y """
 38         pass
 39 
 40     def __and__(self, y): # real signature unknown; restored from __doc__
 41         """ x.__and__(y) <==> x&y """
 42         pass
 43 
 44     def __cmp__(self, y): # real signature unknown; restored from __doc__
 45         """ x.__cmp__(y) <==> cmp(x,y) """
 46         pass
 47 
 48     def __coerce__(self, y): # real signature unknown; restored from __doc__
 49         """ x.__coerce__(y) <==> coerce(x, y) """
 50         pass
 51 
 52     def __divmod__(self, y): # real signature unknown; restored from __doc__
 53         """ x.__divmod__(y) <==> divmod(x, y) """
 54         pass
 55 
 56     def __div__(self, y): # real signature unknown; restored from __doc__
 57         """ x.__div__(y) <==> x/y """
 58         pass
 59 
 60     def __float__(self): # real signature unknown; restored from __doc__
 61         """ x.__float__() <==> float(x) """
 62         pass
 63 
 64     def __floordiv__(self, y): # real signature unknown; restored from __doc__
 65         """ x.__floordiv__(y) <==> x//y """
 66         pass
 67 
 68     def __format__(self, *args, **kwargs): # real signature unknown
 69         pass
 70 
 71     def __getattribute__(self, name): # real signature unknown; restored from __doc__
 72         """ x.__getattribute__('name') <==> x.name """
 73         pass
 74 
 75     def __getnewargs__(self, *args, **kwargs): # real signature unknown
 76         pass
 77 
 78     def __hash__(self): # real signature unknown; restored from __doc__
 79         """ x.__hash__() <==> hash(x) """
 80         pass
 81 
 82     def __hex__(self): # real signature unknown; restored from __doc__
 83         """ x.__hex__() <==> hex(x) """
 84         pass
 85 
 86     def __index__(self): # real signature unknown; restored from __doc__
 87         """ x[y:z] <==> x[y.__index__():z.__index__()] """
 88         pass
 89 
 90     def __init__(self, x=0): # real signature unknown; restored from __doc__
 91         pass
 92 
 93     def __int__(self): # real signature unknown; restored from __doc__
 94         """ x.__int__() <==> int(x) """
 95         pass
 96 
 97     def __invert__(self): # real signature unknown; restored from __doc__
 98         """ x.__invert__() <==> ~x """
 99         pass
100 
101     def __long__(self): # real signature unknown; restored from __doc__
102         """ x.__long__() <==> long(x) """
103         pass
104 
105     def __lshift__(self, y): # real signature unknown; restored from __doc__
106         """ x.__lshift__(y) <==> x<<y """
107         pass
108 
109     def __mod__(self, y): # real signature unknown; restored from __doc__
110         """ x.__mod__(y) <==> x%y """
111         pass
112 
113     def __mul__(self, y): # real signature unknown; restored from __doc__
114         """ x.__mul__(y) <==> x*y """
115         pass
116 
117     def __neg__(self): # real signature unknown; restored from __doc__
118         """ x.__neg__() <==> -x """
119         pass
120 
121     @staticmethod # known case of __new__
122     def __new__(S, *more): # real signature unknown; restored from __doc__
123         """ T.__new__(S, ...) -> a new object with type S, a subtype of T """
124         pass
125 
126     def __nonzero__(self): # real signature unknown; restored from __doc__
127         """ x.__nonzero__() <==> x != 0 """
128         pass
129 
130     def __oct__(self): # real signature unknown; restored from __doc__
131         """ x.__oct__() <==> oct(x) """
132         pass
133 
134     def __or__(self, y): # real signature unknown; restored from __doc__
135         """ x.__or__(y) <==> x|y """
136         pass
137 
138     def __pos__(self): # real signature unknown; restored from __doc__
139         """ x.__pos__() <==> +x """
140         pass
141 
142     def __pow__(self, y, z=None): # real signature unknown; restored from __doc__
143         """ x.__pow__(y[, z]) <==> pow(x, y[, z]) """
144         pass
145 
146     def __radd__(self, y): # real signature unknown; restored from __doc__
147         """ x.__radd__(y) <==> y+x """
148         pass
149 
150     def __rand__(self, y): # real signature unknown; restored from __doc__
151         """ x.__rand__(y) <==> y&x """
152         pass
153 
154     def __rdivmod__(self, y): # real signature unknown; restored from __doc__
155         """ x.__rdivmod__(y) <==> divmod(y, x) """
156         pass
157 
158     def __rdiv__(self, y): # real signature unknown; restored from __doc__
159         """ x.__rdiv__(y) <==> y/x """
160         pass
161 
162     def __repr__(self): # real signature unknown; restored from __doc__
163         """ x.__repr__() <==> repr(x) """
164         pass
165 
166     def __rfloordiv__(self, y): # real signature unknown; restored from __doc__
167         """ x.__rfloordiv__(y) <==> y//x """
168         pass
169 
170     def __rlshift__(self, y): # real signature unknown; restored from __doc__
171         """ x.__rlshift__(y) <==> y<<x """
172         pass
173 
174     def __rmod__(self, y): # real signature unknown; restored from __doc__
175         """ x.__rmod__(y) <==> y%x """
176         pass
177 
178     def __rmul__(self, y): # real signature unknown; restored from __doc__
179         """ x.__rmul__(y) <==> y*x """
180         pass
181 
182     def __ror__(self, y): # real signature unknown; restored from __doc__
183         """ x.__ror__(y) <==> y|x """
184         pass
185 
186     def __rpow__(self, x, z=None): # real signature unknown; restored from __doc__
187         """ y.__rpow__(x[, z]) <==> pow(x, y[, z]) """
188         pass
189 
190     def __rrshift__(self, y): # real signature unknown; restored from __doc__
191         """ x.__rrshift__(y) <==> y>>x """
192         pass
193 
194     def __rshift__(self, y): # real signature unknown; restored from __doc__
195         """ x.__rshift__(y) <==> x>>y """
196         pass
197 
198     def __rsub__(self, y): # real signature unknown; restored from __doc__
199         """ x.__rsub__(y) <==> y-x """
200         pass
201 
202     def __rtruediv__(self, y): # real signature unknown; restored from __doc__
203         """ x.__rtruediv__(y) <==> y/x """
204         pass
205 
206     def __rxor__(self, y): # real signature unknown; restored from __doc__
207         """ x.__rxor__(y) <==> y^x """
208         pass
209 
210     def __sizeof__(self, *args, **kwargs): # real signature unknown
211         """ Returns size in memory, in bytes """
212         pass
213 
214     def __str__(self): # real signature unknown; restored from __doc__
215         """ x.__str__() <==> str(x) """
216         pass
217 
218     def __sub__(self, y): # real signature unknown; restored from __doc__
219         """ x.__sub__(y) <==> x-y """
220         pass
221 
222     def __truediv__(self, y): # real signature unknown; restored from __doc__
223         """ x.__truediv__(y) <==> x/y """
224         pass
225 
226     def __trunc__(self, *args, **kwargs): # real signature unknown
227         """ Truncating an Integral returns itself. """
228         pass
229 
230     def __xor__(self, y): # real signature unknown; restored from __doc__
231         """ x.__xor__(y) <==> x^y """
232         pass
233 
234     denominator = property(lambda self: object(), lambda self, v: None, lambda self: None)  # default
235     """the denominator of a rational number in lowest terms"""
236 
237     imag = property(lambda self: object(), lambda self, v: None, lambda self: None)  # default
238     """the imaginary part of a complex number"""
239 
240     numerator = property(lambda self: object(), lambda self, v: None, lambda self: None)  # default
241     """the numerator of a rational number in lowest terms"""
242 
243     real = property(lambda self: object(), lambda self, v: None, lambda self: None)  # default
244     """the real part of a complex number"""

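Note: unlike C, Python's long integers have no fixed bit width. Python places no limit on their magnitude, although in practice the values we can use are bounded by the machine's available memory. Since Python 2.2, an integer that overflows is automatically converted to a long integer, so omitting the letter L after a long literal no longer has serious consequences. In Python 3 the two types are merged into int, and the L suffix is a syntax error.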
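In Python 3 the absence of a width limit is easy to observe. A minimal sketch:

```python
import sys

big = 2 ** 100                       # far beyond 64 bits, still an int
beyond_maxsize = sys.maxsize + 1     # no overflow and no separate long type

print(type(big).__name__, big, big.bit_length(), beyond_maxsize)
```

`sys.maxsize` is the largest index a container can have on the platform; it is not a limit on integer values.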

III. Floating Point
For example: 3.14, 2.88

Commonly used operations:

# float supports essentially the same operations as int


More operations (Python 2 class stub):

  1 class float(object):
  2     """
  3     float(x) -> floating point number
  4     
  5     Convert a string or number to a floating point number, if possible.
  6     """
  7     def as_integer_ratio(self):   
  8         """ Return this value as an exact ratio of two integers in lowest terms """
  9         """
 10         float.as_integer_ratio() -> (int, int)
 11 
 12         Return a pair of integers, whose ratio is exactly equal to the original
 13         float and with a positive denominator.
 14         Raise OverflowError on infinities and a ValueError on NaNs.
 15 
 16         >>> (10.0).as_integer_ratio()
 17         (10, 1)
 18         >>> (0.0).as_integer_ratio()
 19         (0, 1)
 20         >>> (-.25).as_integer_ratio()
 21         (-1, 4)
 22         """
 23         pass
 24 
 25     def conjugate(self, *args, **kwargs): # real signature unknown
 26         """ Return self, the complex conjugate of any float. """
 27         pass
 28 
 29     def fromhex(self, string):   
 30         """ Convert a hexadecimal string to a float """
 31         """
 32         float.fromhex(string) -> float
 33         
 34         Create a floating-point number from a hexadecimal string.
 35         >>> float.fromhex('0x1.ffffp10')
 36         2047.984375
 37         >>> float.fromhex('-0x1p-1074')
 38         -4.9406564584124654e-324
 39         """
 40         return 0.0
 41 
 42     def hex(self):   
 43         """ Return the hexadecimal representation of the current value """
 44         """
 45         float.hex() -> string
 46         
 47         Return a hexadecimal representation of a floating-point number.
 48         >>> (-0.1).hex()
 49         '-0x1.999999999999ap-4'
 50         >>> 3.14159.hex()
 51         '0x1.921f9f01b866ep+1'
 52         """
 53         return ""
 54 
 55     def is_integer(self, *args, **kwargs): # real signature unknown
 56         """ Return True if the float is an integer. """
 57         pass
 58 
 59     def __abs__(self):   
 60         """ x.__abs__() <==> abs(x) """
 61         pass
 62 
 63     def __add__(self, y):   
 64         """ x.__add__(y) <==> x+y """
 65         pass
 66 
 67     def __coerce__(self, y):   
 68         """ x.__coerce__(y) <==> coerce(x, y) """
 69         pass
 70 
 71     def __divmod__(self, y):   
 72         """ x.__divmod__(y) <==> divmod(x, y) """
 73         pass
 74 
 75     def __div__(self, y):   
 76         """ x.__div__(y) <==> x/y """
 77         pass
 78 
 79     def __eq__(self, y):   
 80         """ x.__eq__(y) <==> x==y """
 81         pass
 82 
 83     def __float__(self):   
 84         """ x.__float__() <==> float(x) """
 85         pass
 86 
 87     def __floordiv__(self, y):   
 88         """ x.__floordiv__(y) <==> x//y """
 89         pass
 90 
 91     def __format__(self, format_spec):   
 92         """
 93         float.__format__(format_spec) -> string
 94         
 95         Formats the float according to format_spec.
 96         """
 97         return ""
 98 
 99     def __getattribute__(self, name):   
100         """ x.__getattribute__('name') <==> x.name """
101         pass
102 
103     def __getformat__(self, typestr):   
104         """
105         float.__getformat__(typestr) -> string
106         
107         You probably don't want to use this function.  It exists mainly to be
108         used in Python's test suite.
109         
110         typestr must be 'double' or 'float'.  This function returns whichever of
111         'unknown', 'IEEE, big-endian' or 'IEEE, little-endian' best describes the
112         format of floating point numbers used by the C type named by typestr.
113         """
114         return ""
115 
116     def __getnewargs__(self, *args, **kwargs): # real signature unknown
117         pass
118 
119     def __ge__(self, y):   
120         """ x.__ge__(y) <==> x>=y """
121         pass
122 
123     def __gt__(self, y):   
124         """ x.__gt__(y) <==> x>y """
125         pass
126 
127     def __hash__(self):   
128         """ x.__hash__() <==> hash(x) """
129         pass
130 
131     def __init__(self, x):   
132         pass
133 
134     def __int__(self):   
135         """ x.__int__() <==> int(x) """
136         pass
137 
138     def __le__(self, y):   
139         """ x.__le__(y) <==> x<=y """
140         pass
141 
142     def __long__(self):   
143         """ x.__long__() <==> long(x) """
144         pass
145 
146     def __lt__(self, y):   
147         """ x.__lt__(y) <==> x<y """
148         pass
149 
150     def __mod__(self, y):   
151         """ x.__mod__(y) <==> x%y """
152         pass
153 
154     def __mul__(self, y):   
155         """ x.__mul__(y) <==> x*y """
156         pass
157 
158     def __neg__(self):   
159         """ x.__neg__() <==> -x """
160         pass
161 
162     @staticmethod # known case of __new__
163     def __new__(S, *more):   
164         """ T.__new__(S, ...) -> a new object with type S, a subtype of T """
165         pass
166 
167     def __ne__(self, y):   
168         """ x.__ne__(y) <==> x!=y """
169         pass
170 
171     def __nonzero__(self):   
172         """ x.__nonzero__() <==> x != 0 """
173         pass
174 
175     def __pos__(self):   
176         """ x.__pos__() <==> +x """
177         pass
178 
179     def __pow__(self, y, z=None):   
180         """ x.__pow__(y[, z]) <==> pow(x, y[, z]) """
181         pass
182 
183     def __radd__(self, y):   
184         """ x.__radd__(y) <==> y+x """
185         pass
186 
187     def __rdivmod__(self, y):   
188         """ x.__rdivmod__(y) <==> divmod(y, x) """
189         pass
190 
191     def __rdiv__(self, y):   
192         """ x.__rdiv__(y) <==> y/x """
193         pass
194 
195     def __repr__(self):   
196         """ x.__repr__() <==> repr(x) """
197         pass
198 
199     def __rfloordiv__(self, y):   
200         """ x.__rfloordiv__(y) <==> y//x """
201         pass
202 
203     def __rmod__(self, y):   
204         """ x.__rmod__(y) <==> y%x """
205         pass
206 
207     def __rmul__(self, y):   
208         """ x.__rmul__(y) <==> y*x """
209         pass
210 
211     def __rpow__(self, x, z=None):   
212         """ y.__rpow__(x[, z]) <==> pow(x, y[, z]) """
213         pass
214 
215     def __rsub__(self, y):   
216         """ x.__rsub__(y) <==> y-x """
217         pass
218 
219     def __rtruediv__(self, y):   
220         """ x.__rtruediv__(y) <==> y/x """
221         pass
222 
223     def __setformat__(self, typestr, fmt):   
224         """
225         float.__setformat__(typestr, fmt) -> None
226         
227         You probably don't want to use this function.  It exists mainly to be
228         used in Python's test suite.
229         
230         typestr must be 'double' or 'float'.  fmt must be one of 'unknown',
231         'IEEE, big-endian' or 'IEEE, little-endian', and in addition can only be
232         one of the latter two if it appears to match the underlying C reality.
233         
234         Override the automatic determination of C-level floating point type.
235         This affects how floats are converted to and from binary strings.
236         """
237         pass
238 
239     def __str__(self):   
240         """ x.__str__() <==> str(x) """
241         pass
242 
243     def __sub__(self, y):   
244         """ x.__sub__(y) <==> x-y """
245         pass
246 
247     def __truediv__(self, y):   
248         """ x.__truediv__(y) <==> x/y """
249         pass
250 
251     def __trunc__(self, *args, **kwargs): # real signature unknown
252         """ Return the Integral closest to x between 0 and x. """
253         pass
254 
255     imag = property(lambda self: object(), lambda self, v: None, lambda self: None)  # default
256     """the imaginary part of a complex number"""
257 
258     real = property(lambda self: object(), lambda self, v: None, lambda self: None)  # default
259     """the real part of a complex number"""
260 

IV. Strings
For example: 'wupeiqi', 'alex', 'solo'

Commonly used string methods:

name = "my name is solo"
print(name.capitalize())            # capitalize the first letter
# My name is solo
print(name.count("l"))              # count how many times a substring occurs
# 1
print(name.center(30, "-"))         # center within 30 characters, padding with "-"
# -------my name is solo--------
print(name.ljust(30, "-"))          # left-justify within 30 characters, padding with "-" on the right
# my name is solo---------------
print(name.endswith("solo"))        # check whether the string ends with "solo"
# True
print(name[name.find("na"):])       # find returns the index of "na"; strings can be sliced too
# name is solo
print("5.3".isdigit())              # True only if every character is a digit, so "." fails
# False
print("a_1A".isidentifier())        # check whether the string is a valid identifier (variable name)
# True
print("+".join(["1", "2", "3"]))    # join the list items into one string, with "+" as the separator
# 1+2+3
print("\nsolo".strip())             # strip leading and trailing whitespace, including newlines
# solo
print("1+2+3+4".split("+"))         # split on "+" into a list; with no argument, split on whitespace
# ['1', '2', '3', '4']
name = "my name is {name} and I am {year} years old"
print(name.format(name="solo", year=20))
# my name is solo and I am 20 years old
print(name.format_map({"name": "solo", "year": 20}))   # same, but takes a mapping; rarely used
# my name is solo and I am 20 years old
p = str.maketrans("abcdefli", "12345678")   # translation table: characters map one-to-one
print("lianzhilei".translate(p))
# 781nzh8758
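
Two behaviours of the methods above are easy to trip over: find returns -1 for a missing substring while index raises an error, and isdigit rejects anything that is not purely digits. The following is a minimal sketch to check both; the variable names (missing, index_failed, checks) are illustrative and not from the original notes.

```python
text = "my name is solo"

# find returns -1 when the substring is absent...
missing = text.find("xyz")

# ...while index raises ValueError in the same situation.
try:
    text.index("xyz")
    index_failed = False
except ValueError:
    index_failed = True

# isdigit is True only when every character is a digit,
# so a sign, a decimal point, or an empty string all fail.
checks = {s: s.isdigit() for s in ("42", "-42", "5.3", "")}

print(missing, index_failed, checks)
```

Use find when a missing substring is a normal case to test for, and index when it should be treated as an error.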

More methods:

  1 class str(basestring):
  2     """
  3     str(object='') -> string
  4     
  5     Return a nice string representation of the object.
  6     If the argument is a string, the return value is the same object.
  7     """
  8     def capitalize(self):
  9         """ Capitalize the first letter """
 10         """
 11         S.capitalize() -> string
 12         
 13         Return a copy of the string S with only its first character
 14         capitalized.
 15         """
 16         return ""
 17 
 18     def center(self, width, fillchar=None):
 19         """ Center the content; width: total length; fillchar: padding character (defaults to a space) """
 20         """
 21         S.center(width[, fillchar]) -> string
 22         
 23         Return S centered in a string of length width. Padding is
 24         done using the specified fill character (default is a space)
 25         """
 26         return ""
 27 
 28     def count(self, sub, start=None, end=None):
 29         """ Number of occurrences of the subsequence """
 30         """
 31         S.count(sub[, start[, end]]) -> int
 32         
 33         Return the number of non-overlapping occurrences of substring sub in
 34         string S[start:end].  Optional arguments start and end are interpreted
 35         as in slice notation.
 36         """
 37         return 0
 38 
 39     def decode(self, encoding=None, errors=None):
 40         """ Decode """
 41         """
 42         S.decode([encoding[,errors]]) -> object
 43         
 44         Decodes S using the codec registered for encoding. encoding defaults
 45         to the default encoding. errors may be given to set a different error
 46         handling scheme. Default is 'strict' meaning that encoding errors raise
 47         a UnicodeDecodeError. Other possible values are 'ignore' and 'replace'
 48         as well as any other name registered with codecs.register_error that is
 49         able to handle UnicodeDecodeErrors.
 50         """
 51         return object()
 52 
 53     def encode(self, encoding=None, errors=None):
 54         """ Encode (applies to unicode) """
 55         """
 56         S.encode([encoding[,errors]]) -> object
 57         
 58         Encodes S using the codec registered for encoding. encoding defaults
 59         to the default encoding. errors may be given to set a different error
 60         handling scheme. Default is 'strict' meaning that encoding errors raise
 61         a UnicodeEncodeError. Other possible values are 'ignore', 'replace' and
 62         'xmlcharrefreplace' as well as any other name registered with
 63         codecs.register_error that is able to handle UnicodeEncodeErrors.
 64         """
 65         return object()
 66 
 67     def endswith(self, suffix, start=None, end=None):
 68         """ Whether the string ends with the given suffix """
 69         """
 70         S.endswith(suffix[, start[, end]]) -> bool
 71         
 72         Return True if S ends with the specified suffix, False otherwise.
 73         With optional start, test S beginning at that position.
 74         With optional end, stop comparing S at that position.
 75         suffix can also be a tuple of strings to try.
 76         """
 77         return False
 78 
 79     def expandtabs(self, tabsize=None):
 80         """ Convert tabs to spaces; by default a tab stop falls every 8 columns """
 81         """
 82         S.expandtabs([tabsize]) -> string
 83         
 84         Return a copy of S where all tab characters are expanded using spaces.
 85         If tabsize is not given, a tab size of 8 characters is assumed.
 86         """
 87         return ""
 88 
 89     def find(self, sub, start=None, end=None):
 90         """ Find the position of the subsequence; return -1 if it is not found """
 91         """
 92         S.find(sub [,start [,end]]) -> int
 93         
 94         Return the lowest index in S where substring sub is found,
 95         such that sub is contained within S[start:end].  Optional
 96         arguments start and end are interpreted as in slice notation.
 97         
 98         Return -1 on failure.
 99         """
100         return 0
101 
102     def format(*args, **kwargs): # known special case of str.format
103         """ String formatting with variable arguments; covered in detail under functional programming """
104         """
105         S.format(*args, **kwargs) -> string
106         
107         Return a formatted version of S, using substitutions from args and kwargs.
108         The substitutions are identified by braces ('{' and '}').
109         """
110         pass
111 
112     def index(self, sub, start=None, end=None):
113         """ Position of the subsequence; raises an error if it is not found.
114         S.index(sub [,start [,end]]) -> int
115         
116         Like S.find() but raise ValueError when the substring is not found.
117         """
118         return 0
119 
120     def isalnum(self):
121         """ Whether all characters are letters or digits """
122         """
123         S.isalnum() -> bool
124         
125         Return True if all characters in S are alphanumeric
126         and there is at least one character in S, False otherwise.
127         """
128         return False
129 
130     def isalpha(self):
131         """ Whether all characters are letters """
132         """
133         S.isalpha() -> bool
134         
135         Return True if all characters in S are alphabetic
136         and there is at least one character in S, False otherwise.
137         """
138         return False
139 
140     def isdigit(self):
141         """ Whether all characters are digits """
142         """
143         S.isdigit() -> bool
144         
145         Return True if all characters in S are digits
146         and there is at least one character in S, False otherwise.
147         """
148         return False
149 
150     def islower(self):
151         """ Whether the string is lowercase """
152         """
153         S.islower() -> bool
154         
155         Return True if all cased characters in S are lowercase and there is
156         at least one cased character in S, False otherwise.
157         """
158         return False
159 
160     def isspace(self):  
161         """
162         S.isspace() -> bool
163         
164         Return True if all characters in S are whitespace
165         and there is at least one character in S, False otherwise.
166         """
167         return False
168 
169     def istitle(self):  
170         """
171         S.istitle() -> bool
172         
173         Return True if S is a titlecased string and there is at least one
174         character in S, i.e. uppercase characters may only follow uncased
175         characters and lowercase characters only cased ones. Return False
176         otherwise.
177         """
178         return False
179 
180     def isupper(self):  
181         """
182         S.isupper() -> bool
183         
184         Return True if all cased characters in S are uppercase and there is
185         at least one cased character in S, False otherwise.
186         """
187         return False
188 
189     def join(self, iterable):
190         """ Join """
191         """
192         S.join(iterable) -> string
193         
194         Return a string which is the concatenation of the strings in the
195         iterable.  The separator between elements is S.
196         """
197         return ""
198 
199     def ljust(self, width, fillchar=None):
200         """ Left-justify the content and pad on the right """
201         """
202         S.ljust(width[, fillchar]) -> string
203         
204         Return S left-justified in a string of length width. Padding is
205         done using the specified fill character (default is a space).
206         """
207         return ""
208 
209     def lower(self):
210         """ Convert to lowercase """
211         """
212         S.lower() -> string
213         
214         Return a copy of the string S converted to lowercase.
215         """
216         return ""
217 
218     def lstrip(self, chars=None):
219         """ Remove leading whitespace """
220         """
221         S.lstrip([chars]) -> string or unicode
222         
223         Return a copy of the string S with leading whitespace removed.
224         If chars is given and not None, remove characters in chars instead.
225         If chars is unicode, S will be converted to unicode before stripping
226         """
227         return ""
228 
229     def partition(self, sep):
230         """ Split into three parts: before, separator, after """
231         """
232         S.partition(sep) -> (head, sep, tail)
233         
234         Search for the separator sep in S, and return the part before it,
235         the separator itself, and the part after it.  If the separator is not
236         found, return S and two empty strings.
237         """
238         pass
239 
240     def replace(self, old, new, count=None):
241         """ Replace """
242         """
243         S.replace(old, new[, count]) -> string
244         
245         Return a copy of string S with all occurrences of substring
246         old replaced by new.  If the optional argument count is
247         given, only the first count occurrences are replaced.
248         """
249         return ""
250 
251     def rfind(self, sub, start=None, end=None):  
252         """
253         S.rfind(sub [,start [,end]]) -> int
254         
255         Return the highest index in S where substring sub is found,
256         such that sub is contained within S[start:end].  Optional
257         arguments start and end are interpreted as in slice notation.
258         
259         Return -1 on failure.
260         """
261         return 0
262 
263     def rindex(self, sub, start=None, end=None):  
264         """
265         S.rindex(sub [,start [,end]]) -> int
266         
267         Like S.rfind() but raise ValueError when the substring is not found.
268         """
269         return 0
270 
271     def rjust(self, width, fillchar=None):  
272         """
273         S.rjust(width[, fillchar]) -> string
274         
275         Return S right-justified in a string of length width. Padding is
276         done using the specified fill character (default is a space)
277         """
278         return ""
279 
280     def rpartition(self, sep):  
281         """
282         S.rpartition(sep) -> (head, sep, tail)
283         
284         Search for the separator sep in S, starting at the end of S, and return
285         the part before it, the separator itself, and the part after it.  If the
286         separator is not found, return two empty strings and S.
287         """
288         pass
289 
290     def rsplit(self, sep=None, maxsplit=None):  
291         """
292         S.rsplit([sep [,maxsplit]]) -> list of strings
293         
294         Return a list of the words in the string S, using sep as the
295         delimiter string, starting at the end of the string and working
296         to the front.  If maxsplit is given, at most maxsplit splits are
297         done. If sep is not specified or is None, any whitespace string
298         is a separator.
299         """
300         return []
301 
302     def rstrip(self, chars=None):  
303         """
304         S.rstrip([chars]) -> string or unicode
305         
306         Return a copy of the string S with trailing whitespace removed.
307         If chars is given and not None, remove characters in chars instead.
308         If chars is unicode, S will be converted to unicode before stripping
309         """
310         return ""
311 
312     def split(self, sep=None, maxsplit=None):
313         """ Split; maxsplit is the maximum number of splits """
314         """
315         S.split([sep [,maxsplit]]) -> list of strings
316         
317         Return a list of the words in the string S, using sep as the
318         delimiter string.  If maxsplit is given, at most maxsplit
319         splits are done. If sep is not specified or is None, any
320         whitespace string is a separator and empty strings are removed
321         from the result.
322         """
323         return []
324 
325     def splitlines(self, keepends=False):
326         """ Split at line breaks """
327         """
328         S.splitlines(keepends=False) -> list of strings
329         
330         Return a list of the lines in S, breaking at line boundaries.
331         Line breaks are not included in the resulting list unless keepends
332         is given and true.
333         """
334         return []
335 
336     def startswith(self, prefix, start=None, end=None):
337         """ Whether the string starts with the given prefix """
338         """
339         S.startswith(prefix[, start[, end]]) -> bool
340         
341         Return True if S starts with the specified prefix, False otherwise.
342         With optional start, test S beginning at that position.
343         With optional end, stop comparing S at that position.
344         prefix can also be a tuple of strings to try.
345         """
346         return False
347 
348     def strip(self, chars=None):
349         """ Remove leading and trailing whitespace """
350         """
351         S.strip([chars]) -> string or unicode
352         
353         Return a copy of the string S with leading and trailing
354         whitespace removed.
355         If chars is given and not None, remove characters in chars instead.
356         If chars is unicode, S will be converted to unicode before stripping
357         """
358         return ""
359 
360     def swapcase(self):
361         """ Convert uppercase to lowercase and lowercase to uppercase """
362         """
363         S.swapcase() -> string
364         
365         Return a copy of the string S with uppercase characters
366         converted to lowercase and vice versa.
367         """
368         return ""
369 
370     def title(self):  
371         """
372         S.title() -> string
373         
374         Return a titlecased version of S, i.e. words start with uppercase
375         characters, all remaining cased characters have lowercase.
376         """
377         return ""
378 
379     def translate(self, table, deletechars=None):
380         """
381         Translate; build a mapping table first. The last argument is the set of characters to delete.
382         intab = "aeiou"
383         outtab = "12345"
384         trantab = maketrans(intab, outtab)
385         str = "this is string example....wow!!!"
386         print str.translate(trantab, 'xm')
387         """
388 
389         """
390         S.translate(table [,deletechars]) -> string
391         
392         Return a copy of the string S, where all characters occurring
393         in the optional argument deletechars are removed, and the
394         remaining characters have been mapped through the given
395         translation table, which must be a string of length 256 or None.
396         If the table argument is None, no translation is applied and
397         the operation simply removes the characters in deletechars.
398         """
399         return ""
400 
401     def upper(self):  
402         """
403         S.upper() -> string
404         
405         Return a copy of the string S converted to uppercase.
406         """
407         return ""
408 
409     def zfill(self, width):
410         """ Return a string of the given width: the original is right-justified and padded with 0 on the left. """
411         """
412         S.zfill(width) -> string
413         
414         Pad a numeric string S with zeros on the left, to fill a field
415         of the specified width.  The string S is never truncated.
416         """
417         return ""
418 
419     def _formatter_field_name_split(self, *args, **kwargs): # real signature unknown
420         pass
421 
422     def _formatter_parser(self, *args, **kwargs): # real signature unknown
423         pass
424 
425     def __add__(self, y):  
426         """ x.__add__(y) <==> x+y """
427         pass
428 
429     def __contains__(self, y):  
430         """ x.__contains__(y) <==> y in x """
431         pass
432 
433     def __eq__(self, y):  
434         """ x.__eq__(y) <==> x==y """
435         pass
436 
437     def __format__(self, format_spec):  
438         """
439         S.__format__(format_spec) -> string
440         
441         Return a formatted version of S as described by format_spec.
442         """
443         return ""
444 
445     def __getattribute__(self, name):  
446         """ x.__getattribute__('name') <==> x.name """
447         pass
448 
449     def __getitem__(self, y):  
450         """ x.__getitem__(y) <==> x[y] """
451         pass
452 
453     def __getnewargs__(self, *args, **kwargs): # real signature unknown
454         pass
455 
456     def __getslice__(self, i, j):  
457         """
458         x.__getslice__(i, j) <==> x[i:j]
459                    
460                    Use of negative indices is not supported.
461         """
462         pass
463 
464     def __ge__(self, y):  
465         """ x.__ge__(y) <==> x>=y """
466         pass
467 
468     def __gt__(self, y):  
469         """ x.__gt__(y) <==> x>y """
470         pass
471 
472     def __hash__(self):  
473         """ x.__hash__() <==> hash(x) """
474         pass
475 
476     def __init__(self, string=''): # known special case of str.__init__
477         """
478         str(object='') -> string
479         
480         Return a nice string representation of the object.
481         If the argument is a string, the return value is the same object.
482         # (copied from class doc)
483         """
484         pass
485 
486     def __len__(self):  
487         """ x.__len__() <==> len(x) """
488         pass
489 
490     def __le__(self, y):  
491         """ x.__le__(y) <==> x<=y """
492         pass
493 
494     def __lt__(self, y):  
495         """ x.__lt__(y) <==> x<y """
496         pass
497 
498     def __mod__(self, y):  
499         """ x.__mod__(y) <==> x%y """
500         pass
501 
502     def __mul__(self, n):  
503         """ x.__mul__(n) <==> x*n """
504         pass
505 
506     @staticmethod # known case of __new__
507     def __new__(S, *more):  
508         """ T.__new__(S, ...) -> a new object with type S, a subtype of T """
509         pass
510 
511     def __ne__(self, y):  
512         """ x.__ne__(y) <==> x!=y """
513         pass
514 
515     def __repr__(self):  
516         """ x.__repr__() <==> repr(x) """
517         pass
518 
519     def __rmod__(self, y):  
520         """ x.__rmod__(y) <==> y%x """
521         pass
522 
523     def __rmul__(self, n):  
524         """ x.__rmul__(n) <==> n*x """
525         pass
526 
527     def __sizeof__(self):  
528         """ S.__sizeof__() -> size of S in memory, in bytes """
529         pass
530 
531     def __str__(self):  
532         """ x.__str__() <==> str(x) """
533         pass
534 

V. Lists
For example: [11,22,33,44,55], ['wupeiqi', 'alex', 'solo']
1. Creating a list:

# Two ways to create a list
name_list = ['alex', 'seven', 'eric']

name_list = list(['alex', 'seven', 'eric'])

2. Commonly used list operations:
① Slicing

name_list = ["Alex", "Tenglan", "Eric", "Rain", "Tom", "Amy"]
print(name_list[0:3])       # items from index 0 up to index 3: includes 0, excludes 3
# ['Alex', 'Tenglan', 'Eric']
print(name_list[:3])        # nothing before the colon means start at 0; same result as above
# ['Alex', 'Tenglan', 'Eric']
print(name_list[3:])        # nothing after the colon means go through the end
# ['Rain', 'Tom', 'Amy']
print(name_list[:])         # nothing on either side means take everything
# ['Alex', 'Tenglan', 'Eric', 'Rain', 'Tom', 'Amy']
print(name_list[-3:-1])     # from -3 up to -1: includes -3, excludes -1
# ['Rain', 'Tom']
print(name_list[1:-1])      # positive and negative indices can be mixed, as long as the start lies left of the stop
# ['Tenglan', 'Eric', 'Rain', 'Tom']
print(name_list[::2])       # a step of 2 takes every second item
# ['Alex', 'Eric', 'Tom']
# Note: [-1:0], [0:0] and [-1:2] are all empty
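
A few further slicing behaviours follow from the same rules: a negative step walks backwards, out-of-range bounds are clipped rather than raising an error, and a full slice produces a new list. A minimal sketch, with illustrative variable names that are not part of the original notes:

```python
name_list = ["Alex", "Tenglan", "Eric", "Rain", "Tom", "Amy"]

reversed_list = name_list[::-1]     # a negative step walks backwards
past_the_end = name_list[4:100]     # out-of-range bounds are clipped, no IndexError
empty = name_list[-1:2]             # start lies to the right of stop, so nothing is selected
whole_copy = name_list[:]           # a new list holding the same items

print(reversed_list)
print(past_the_end, empty, whole_copy is name_list)
```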

② Appending

name_list = ["Alex", "Tenglan", "Eric", "Rain", "Tom", "Amy"]
name_list.append("new")          # append adds a single item to the end
print(name_list)
# ['Alex', 'Tenglan', 'Eric', 'Rain', 'Tom', 'Amy', 'new']

③ Inserting

name_list = ["Alex", "Tenglan", "Eric", "Rain", "Tom", "Amy"]
name_list.insert(3, "new")       # insert places "new" at index 3
print(name_list)
# ['Alex', 'Tenglan', 'Eric', 'new', 'Rain', 'Tom', 'Amy']

④ Modifying

name_list = ["Alex", "Tenglan", "Eric", "Rain", "Tom", "Amy"]
name_list[2] = "solo"            # replace the string at index 2 with "solo"
print(name_list)
# ['Alex', 'Tenglan', 'solo', 'Rain', 'Tom', 'Amy']

⑤ Deleting

# Three ways to delete
name_list = ["Alex", "Tenglan", "Eric", "Rain", "Tom", "Amy"]
del name_list[3]                      # del removes the item at the given index
print(name_list)
# ['Alex', 'Tenglan', 'Eric', 'Tom', 'Amy']
name_list.remove("Tenglan")           # remove deletes the first item equal to the given value
print(name_list)
# ['Alex', 'Eric', 'Tom', 'Amy']
name_list.pop()                       # pop removes (and returns) the last item
print(name_list)
# ['Alex', 'Eric', 'Tom']
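
The deletion methods differ in what they return and how they fail: pop hands back the removed item and accepts an optional index, while remove raises ValueError when the value is absent. A short sketch under those assumptions, with illustrative names (last, first, remove_failed) and a made-up missing value "Nobody":

```python
name_list = ["Alex", "Tenglan", "Eric", "Rain"]

last = name_list.pop()       # returns the removed item
first = name_list.pop(0)     # an index may be given

# remove raises ValueError if the value is not in the list.
try:
    name_list.remove("Nobody")
    remove_failed = False
except ValueError:
    remove_failed = True

print(last, first, name_list, remove_failed)
```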

⑥ Extending

name_list = ["Alex", "Tenglan", "Eric", "Rain", "Tom", "Amy"]
age_list = [11, 22, 33]
name_list.extend(age_list)               # extend appends every item of age_list to name_list
print(name_list)
# ['Alex', 'Tenglan', 'Eric', 'Rain', 'Tom', 'Amy', 11, 22, 33]

⑦ Copying

name_list = ["Alex", "Tenglan", "Eric", "Rain", "Tom", "Amy"]
copy_list = name_list.copy()             # copy returns a (shallow) copy of the list
print(copy_list)
# Note: the difference between shallow and deep copies is covered in detail later

⑧ Counting

name_list = ["Alex", "Tenglan", "Eric", "Amy", "Tom", "Amy"]
print(name_list.count("Amy"))            # count returns how many times "Amy" occurs in the list
# 2

⑨ Sorting and reversing

name_list = ["Alex", "Tenglan", "Eric", "Rain", "Tom", "Amy", "1", "2", "3"]
name_list.sort()                         # sort orders the list in place
print(name_list)
# ['1', '2', '3', 'Alex', 'Amy', 'Eric', 'Rain', 'Tenglan', 'Tom']
name_list.reverse()                      # reverse flips the list in place
print(name_list)
# ['Tom', 'Tenglan', 'Rain', 'Eric', 'Amy', 'Alex', '3', '2', '1']
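
Sorting strings compares character codes, so digits come before uppercase letters and uppercase before lowercase. The sketch below shows how a key function changes that, and that Python 3 refuses to sort unrelated types together. It uses the built-in sorted(), which returns a new list instead of sorting in place; the sample names are made up for illustration.

```python
names = ["Tom", "alex", "Eric", "amy"]

default_order = sorted(names)                         # uppercase letters sort before lowercase
ignoring_case = sorted(names, key=str.lower)          # compare lowercased versions instead
descending = sorted(names, key=str.lower, reverse=True)

# sorted() returns a new list and leaves the original untouched.
original_unchanged = names == ["Tom", "alex", "Eric", "amy"]

# Python 3 cannot order values of unrelated types such as int and str.
try:
    sorted([3, "1", 2])
    mixed_ok = True
except TypeError:
    mixed_ok = False

print(default_order, ignoring_case, descending, mixed_ok)
```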

⑩ Getting an index

name_list = ["Alex", "Tenglan", "Eric", "Rain", "Tom", "Amy"]
print(name_list.index("Tenglan"))        # index returns the position of the first matching item
# 1

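VI. Tuples

For example: (11,22,33,44,55), ('wupeiqi', 'alex', 'lzl')

1. Creating a tuple:

# Five ways to create a tuple
age = 11, 22, 33, 44, 55            # comma-separated values create a tuple by default; strings need quotes, e.g. 'solo'
# Output: (11, 22, 33, 44, 55)
age = (11, 22, 33, 44, 55)          # the usual form, with parentheses marking the tuple
# Output: (11, 22, 33, 44, 55)
age = tuple((11, 22, 33, 44, 55))   # calling the tuple class; the inner parentheses are required
# Output: (11, 22, 33, 44, 55)
age = tuple([11, 22, 33, 44, 55])   # from a list; same result as above, rarely used
# Output: (11, 22, 33, 44, 55)
age = tuple({11, 22, 33, 44, 55})   # from a set; the element order is arbitrary, so this is rarely useful
# Output: (33, 11, 44, 22, 55)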

2. Commonly used tuple methods:

## count: how many times a value occurs in the tuple
name = ('wupeiqi', 'alex', 'solo')
print(name.count('alex'))
# 1
## index: the position of a value in the tuple
name = ('wupeiqi', 'alex', 'solo')
print(name.index('solo'))
# 2
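
Tuples offer only count and index because they are immutable: item assignment raises TypeError. A related detail is that a one-element tuple needs a trailing comma. A minimal sketch of both points, with illustrative variable names:

```python
age = (11, 22, 33)

# Tuples cannot be modified after creation.
try:
    age[0] = 99
    assignment_ok = True
except TypeError:
    assignment_ok = False

single = (11,)       # the trailing comma makes this a tuple
not_a_tuple = (11)   # plain parentheses: this is just the int 11

print(assignment_ok, type(single).__name__, type(not_a_tuple).__name__)
```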

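VII. Dictionaries
For example: {'name': 'wupeiqi', 'age': 18}, {'host': 'solo.solo.solo.solo', 'port': 80}
Note: A dictionary is a key: value data type, also known as key-value pairs. Keys must be unique and cannot repeat. Before Python 3.7 the language did not guarantee any order for a dict; since Python 3.7 a dict preserves insertion order. Looping over a dict iterates over its keys by default.
1. Creating a dictionary

# Two ways to create a dictionary:
info_dic = {'stu1101': "TengLan Wu", 'stu1102': "LongZe Luola", 'stu1103': "XiaoZe Maliya",}
print(info_dic)
# {'stu1101': 'TengLan Wu', 'stu1102': 'LongZe Luola', 'stu1103': 'XiaoZe Maliya'}  (insertion order on Python 3.7+; older versions may print a different order)
info_dic = dict({'stu1101': "TengLan Wu", 'stu1102': "LongZe Luola", 'stu1103': "XiaoZe Maliya",})
print(info_dic)
# {'stu1101': 'TengLan Wu', 'stu1102': 'LongZe Luola', 'stu1103': 'XiaoZe Maliya'}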

2. Commonly used dictionary operations:

① Adding

info_dic = {'stu1101': "TengLan Wu", 'stu1102': "LongZe Luola", 'stu1103': "XiaoZe Maliya",}
info_dic['stu1104'] = "JingKong Cang"           # add a new key
print(info_dic)

② Modifying

info_dic = {'stu1101': "TengLan Wu", 'stu1102': "LongZe Luola", 'stu1103': "XiaoZe Maliya",}
info_dic["stu1101"] = "Jingkong Cang"           # modifies the value if the key exists, adds the key otherwise
print(info_dic)

③ Deleting

# Three ways to delete
info_dic = {'stu1101': "TengLan Wu", 'stu1102': "LongZe Luola", 'stu1103': "XiaoZe Maliya",}
info_dic.pop('stu1101')                       # pop removes the given key (and returns its value)
print(info_dic)
# {'stu1102': 'LongZe Luola', 'stu1103': 'XiaoZe Maliya'}
del info_dic['stu1102']                       # del removes the given key
print(info_dic)
# {'stu1103': 'XiaoZe Maliya'}
info_dic = {'stu1101': "TengLan Wu", 'stu1102': "LongZe Luola", 'stu1103': "XiaoZe Maliya",}
info_dic.popitem()                            # removes the last inserted pair on Python 3.7+ (an arbitrary pair on older versions); rarely needed
print(info_dic)
# {'stu1101': 'TengLan Wu', 'stu1102': 'LongZe Luola'}

④ Looking up a value

info_dic = {'stu1101': "TengLan Wu", 'stu1102': "LongZe Luola", 'stu1103': "XiaoZe Maliya",}
print(info_dic.get('stu1102'))     # get looks up the value for a key
# LongZe Luola
print(info_dic['stu1102'])         # direct lookup by key; raises KeyError if the key does not exist, whereas get does not
# LongZe Luola
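
The difference between the two lookups only shows up when the key is missing. A minimal sketch, using a made-up key 'stu9999' and illustrative variable names:

```python
info_dic = {'stu1101': "TengLan Wu", 'stu1102': "LongZe Luola"}

no_default = info_dic.get('stu9999')                 # missing key: get returns None
with_default = info_dic.get('stu9999', "unknown")    # or the default you supply

# Direct lookup raises KeyError for a missing key.
try:
    info_dic['stu9999']
    lookup_failed = False
except KeyError:
    lookup_failed = True

has_key = 'stu1101' in info_dic                      # membership test on the keys

print(no_default, with_default, lookup_failed, has_key)
```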

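⑤ Nested dictionaries

av_catalog = {
    "Europe/US": {
        "www.youporn.com": ["lots of free content, the largest in the world", "average quality"],
        "www.pornhub.com": ["lots of free content, also large", "slightly better quality than youporn"],
        "letmedothistoyou.com": ["mostly selfies, many high-quality pictures", "not much content, slow updates"],
        "x-art.com": ["very high quality", "entirely paid, not for those on a budget"]
    },
    "Japan/Korea": {
        "tokyo-hot": ["quality unknown, personally no longer into this style", "reportedly paid"]
    },
    "Mainland China": {
        "1024": ["all free, much appreciated", "servers are overseas, slow"]
    }
}

av_catalog["Mainland China"]["1024"][1] += ", can be fetched with a crawler"
print(av_catalog["Mainland China"]["1024"])
# ['all free, much appreciated', 'servers are overseas, slow, can be fetched with a crawler']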

⑥ Looping

info_dic = {'stu1101': "TengLan Wu", 'stu1102': "LongZe Luola", 'stu1103': "XiaoZe Maliya",}
for stu_nu in info_dic:
    print(stu_nu, info_dic[stu_nu])       # looping over a dict yields its keys
# stu1101 TengLan Wu
# stu1102 LongZe Luola
# stu1103 XiaoZe Maliya
for k, v in info_dic.items():             # in Python 3, items() returns a view and builds no list (Python 2's items() built one, which was slow on large dicts)
    print(k, v)
# stu1101 TengLan Wu
# stu1102 LongZe Luola
# stu1103 XiaoZe Maliya
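
In Python 3 the object returned by items() is a live view of the dictionary rather than a copied list, and iterating over sorted(info_dic) gives a stable key order regardless of interpreter version. A small sketch of both, with illustrative variable names:

```python
info_dic = {'stu1102': "LongZe Luola", 'stu1101': "TengLan Wu"}

pairs = info_dic.items()                 # a view object, not a list
info_dic['stu1103'] = "XiaoZe Maliya"    # the view reflects later changes
view_length = len(pairs)

# sorted() over a dict sorts its keys, giving a predictable loop order.
sorted_keys = [key for key in sorted(info_dic)]

print(view_length, sorted_keys)
```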

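VIII. Sets
For example: {'solo', 33, 'alex', 22, 'eric', 'wupeiqi', 11}
Note: A set is an unordered collection of unique items. It serves two purposes. Deduplication: converting a list to a set removes duplicates automatically. Relationship testing: computing the intersection, difference and union of two groups of data.
1. Creating a set

# Standard way to create a set
info_set = set(["alex", "wupeiqi", "eric", "solo", 11, 22, 33])
print(info_set, type(info_set))
# {33, 11, 'wupeiqi', 'solo', 'alex', 'eric', 22} <class 'set'>  (element order is arbitrary and may differ on your machine)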
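
The deduplication mentioned above can be sketched as follows. Because a set discards order, the sketch also shows dict.fromkeys as one way to deduplicate while keeping first-seen order; that relies on dicts preserving insertion order (Python 3.7+) and is an addition for illustration, not something from the original notes.

```python
raw = ["alex", "solo", "alex", 11, 22, 11]

unique = set(raw)                            # duplicates vanish, order is not kept
ordered_unique = list(dict.fromkeys(raw))    # duplicates vanish, first-seen order kept (Python 3.7+)

print(len(unique), ordered_unique)
```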

2. Commonly used set operations
① Adding

# Two ways to add
set_1 = set(["alex", "wupeiqi", "eric", "solo"])
set_1.add(11)                         # add inserts a single element
print(set_1)
# {'alex', 'solo', 'eric', 11, 'wupeiqi'}
set_1 = set(["alex", "wupeiqi", "eric", "solo"])
set_1.update([11, 22, 33])            # update inserts several elements at once
print(set_1)
# {33, 11, 'alex', 'wupeiqi', 'eric', 22, 'solo'}

② Deleting

# Three ways to delete
set_1 = set(["alex", "wupeiqi", "eric", "solo", 11, 22, 33])
set_1.remove("alex")                    # remove deletes the given element
print(set_1)
# {'eric', 33, 'solo', 11, 22, 'wupeiqi'}
set_1.pop()                             # pop deletes an arbitrary element, so your result may differ
print(set_1)
# {33, 'wupeiqi', 11, 22, 'solo'}
set_1.discard("solo")                   # discard deletes the given element; unlike remove, it raises no error if the element is absent
set_1.discard(55)
print(set_1)
# {33, 'wupeiqi', 11, 22}

3. Set relationship tests
① Intersection

set_1 = set(["alex", "wupeiqi", "eric", "solo", 11, 22, 33])
set_2 = set([11, 22, 33, 44, 55, 66])
print(set_1.intersection(set_2))            # intersection of the two sets; set_1 and set_2 can be swapped
# {33, 11, 22}

② Union

set_1 = set(["alex", "wupeiqi", "eric", "solo", 11, 22, 33])
set_2 = set([11, 22, 33, 44, 55, 66])
print(set_1.union(set_2))                   # union of the two sets; set_1 and set_2 can be swapped
# {33, 66, 11, 44, 'eric', 55, 'solo', 22, 'wupeiqi', 'alex'}

③ Difference

set_1 = set(["alex", "wupeiqi", "eric", "solo", 11, 22, 33])
set_2 = set([11, 22, 33, 44, 55, 66])
print(set_1.difference(set_2))              # difference: the elements in set_1 that are not in set_2
# {'solo', 'eric', 'wupeiqi', 'alex'}

④ Subset and superset

set_1 = set(["alex", "wupeiqi", "eric", "solo", 11, 22, 33])
set_2 = set([11, 22, 33, 44, 55, 66])
set_3 = set([11, 22, 33])
print(set_1.issubset(set_2))                      # issubset: is set_1 a subset of set_2?
# False
print(set_1.issuperset(set_3))                    # issuperset: is set_1 a superset of set_3?
# True

⑤ Symmetric difference

set_1 = set(["alex", "wupeiqi", "eric", "solo", 11, 22, 33])
set_2 = set([11, 22, 33, 44, 55, 66])
print(set_1.symmetric_difference(set_2))          # symmetric difference = the union of the two sets minus their intersection
# {66, 'solo', 'eric', 'alex', 55, 'wupeiqi', 44}

⑥ Relationship tests with operators

set_1 = set(["alex", "wupeiqi", "eric", "solo", 11, 22, 33])
set_2 = set([11, 22, 33, 44, 55, 66])
set_union = set_1 | set_2           # union
set_intersection = set_1 & set_2    # intersection
set_difference = set_1 - set_2      # difference
set_symmetric_difference = set_1 ^ set_2  # symmetric difference
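
The operators and the named methods are not fully interchangeable: the methods accept any iterable, while the operators need a set on both sides. Subset and superset tests also have operator forms. A minimal sketch with smaller sample sets than the ones above; the variable names are illustrative:

```python
set_1 = {"alex", "solo", 11, 22, 33}
set_3 = {11, 22, 33}
other = {11, 22, 33, 44}

# Methods accept any iterable as their argument...
via_method = set_1.union([44, 55])

# ...but the operators require a set on both sides.
try:
    set_1 | [44, 55]
    operator_accepts_list = True
except TypeError:
    operator_accepts_list = False

is_subset = set_3 <= set_1       # same as set_3.issubset(set_1)
is_superset = set_1 >= set_3     # same as set_1.issuperset(set_3)

# Symmetric difference equals the union minus the intersection.
same = (set_1 ^ other) == (set_1 | other) - (set_1 & other)

print(operator_accepts_list, is_subset, is_superset, same)
```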

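VI. A First Look at Modules

Python has a large number of modules, which keeps Python programs concise. Libraries come in three kinds:
① Modules that ship with Python
② Open-source modules from the community
③ Modules written by the programmer: do not give your script the same name as a module it imports

1. The sys module (built in)
① sys.argv captures the arguments passed in when the Python script is run
② sys.stdin is the standard input stream
③ sys.stdout is the standard output stream
④ sys.stdout.flush forces the standard output buffer to be flushed

import time
import sys
for i in range(5):
    print(i, end=" ")
    sys.stdout.flush()
    time.sleep(1)
# The intent is to print one number per second for five seconds. Without the flush,
# depending on your system's default buffering, you may see nothing until the loop ends
# and then 0 1 2 3 4 all at once, because the output is buffered.
# Calling sys.stdout.flush() after each print makes every number appear immediately.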
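
sys.argv is a plain list whose first item is the script name and whose remaining items are the command-line arguments. A minimal sketch; the helper describe_args and the script name demo.py are hypothetical, chosen only for illustration:

```python
import sys


def describe_args(argv):
    """Split an argv-style list into the script name and its parameters."""
    script, *params = argv
    return {"script": script, "params": params}


# When run as: python demo.py alpha beta
# sys.argv would be ['demo.py', 'alpha', 'beta'].
print(describe_args(sys.argv))
```

Passing the list in as a parameter, rather than reading sys.argv inside the function, keeps the helper easy to try out with any sample list.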

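2. The os module (interacting with the operating system)
① os.system and os.popen run commands of the current system

3. The platform module (identifies the system the program is running on)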
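
A rough sketch of both modules in use, assuming a shell in which the echo command is available (true of common Linux, macOS and Windows shells). The variable names are illustrative:

```python
import os
import platform

# platform.system() names the operating system, e.g. 'Linux', 'Darwin' or 'Windows'.
system_name = platform.system()

# os.popen runs a command and returns a file-like object for reading its output.
with os.popen("echo hello") as pipe:
    output = pipe.read().strip()

print(system_name, output)
```

os.system runs a command as well, but returns only its exit status, so os.popen is the one to use when the command's output is needed.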

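VII. Operators
1. Arithmetic operators:

2. Comparison operators:

3. Assignment operators:

4. Logical operators:

5. Membership operators:

6. Identity operators:

7. Bitwise operators:

8. Operator precedence: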
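
Each of the categories above can be illustrated with a line or two of Python. This is a minimal sketch with arbitrarily chosen values (a = 7, b = 2); the variable names are illustrative and the selection of operators is not exhaustive.

```python
a, b = 7, 2

# 1. Arithmetic: + - * / (true division) // (floor division) % (remainder) ** (power)
arithmetic = (a + b, a - b, a * b, a / b, a // b, a % b, a ** b)

# 2. Comparison
comparison = (a == b, a != b, a > b, a <= b)

# 3. Assignment, including the augmented forms
n = 10
n += 5      # n is now 15
n //= 2     # n is now 7

# 4. Logical
logical = (a > 5 and b > 5, a > 5 or b > 5, not a > 5)

# 5. Membership
membership = (2 in [1, 2, 3], 5 not in [1, 2, 3])

# 6. Identity: "is" compares objects, "==" compares values
x = [1]
y = x
z = [1]
identity = (x is y, x is z, x == z)

# 7. Bitwise (7 is 0b111, 2 is 0b010)
bitwise = (a & b, a | b, a ^ b, a << 1, a >> 1, ~a)

# 8. Precedence: * binds tighter than +, and ** binds tighter than unary minus
precedence = (2 + 3 * 4, (2 + 3) * 4, -2 ** 2)

print(arithmetic, comparison, n, logical, membership, identity, bitwise, precedence)
```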

VIII. Shallow and Deep Copies in Depth
1. Object assignment (create a list variable Alex that contains a nested list, assign Alex to the variable solo, then modify elements of Alex; what happens to solo?)

import copy                     # import the copy module

Alex = ["Alex", 28, ["Python", "C#", "JavaScript"]]
solo = Alex                       # direct assignment

#   Print before modifying
print(id(Alex))
print(Alex)
print([id(adr) for adr in Alex])
# Output: 7316664
#         ['Alex', 28, ['Python', 'C#', 'JavaScript']]
#         [2775776, 1398430400, 7318024]
print(id(solo))
print(solo)
print([id(adr) for adr in solo])
# Output: 7316664
#         ['Alex', 28, ['Python', 'C#', 'JavaScript']]
#         [2775776, 1398430400, 7318024]

#    Modify the variable
Alex[0] = 'Mr.Wu'
Alex[2].append('CSS')
print(id(Alex))
print(Alex)
print([id(adr) for adr in Alex])
# Output: 7316664
#         ['Mr.Wu', 28, ['Python', 'C#', 'JavaScript', 'CSS']]
#         [5170528, 1398430400, 7318024]
print(id(solo))
print(solo)
print([id(adr) for adr in solo])
# Output: 7316664
#         ['Mr.Wu', 28, ['Python', 'C#', 'JavaScript', 'CSS']]
#         [5170528, 1398430400, 7318024]

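Initial state: Alex = ["Alex", 28, ["Python", "C#", "JavaScript"]]
Assignment: solo = Alex  # plain assignment
Result: solo = ["Alex", 28, ["Python", "C#", "JavaScript"]]
Assignment only passes along the object reference (memory address). No new memory is allocated for solo; both names refer to the same list object.

After the modification: solo = ['Mr.Wu', 28, ['Python', 'C#', 'JavaScript', 'CSS']]
str is immutable, so replacing the element 'Alex' with 'Mr.Wu' makes slot 0 refer to a different object and its address changes. list is mutable, so the nested list ['Python', 'C#', 'JavaScript', 'CSS'] is modified in place and its address stays the same.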

2、Shallow copy (create a list variable Alex that contains a nested list, copy it with copy() from the copy module, then modify Alex. What happens to solo?)

import copy                     # import the copy module
Alex = ["Alex", 28, ["Python", "C#", "JavaScript"]]
solo = copy.copy(Alex)                       # shallow copy with copy() from the copy module
# Print before modifying
print(id(Alex))
print(Alex)
print([id(adr) for adr in Alex])
# Output:  10462472
#          ['Alex', 28, ['Python', 'C#', 'JavaScript']]
#          [5462752, 1359960768, 10463232]
print(id(solo))
print(solo)
print([id(adr) for adr in solo])
# Output:  10201848
#          ['Alex', 28, ['Python', 'C#', 'JavaScript']]
#          [5462752, 1359960768, 10463232]
# Modify the variable
Alex[0]='Mr.Wu'
Alex[2].append('CSS')
print(id(Alex))
print(Alex)
print([id(adr) for adr in Alex])
# Output:  10462472
#          ['Mr.Wu', 28, ['Python', 'C#', 'JavaScript', 'CSS']]
#          [10151264, 1359960768, 10463232]
print(id(solo))
print(solo)
print([id(adr) for adr in solo])
# Output:  10201848
#          ['Alex', 28, ['Python', 'C#', 'JavaScript', 'CSS']]
#          [5462752, 1359960768, 10463232]

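Initial state: Alex = ["Alex", 28, ["Python", "C#", "JavaScript"]]
Shallow copy: solo = copy.copy(Alex)  # copy() from the copy module
Result: solo = ["Alex", 28, ["Python", "C#", "JavaScript"]]
A shallow copy gives solo a new outer list (10201848) that records the addresses of the elements. The elements themselves are not copied, so both lists hold references to the same element objects.

After the modification: solo = ['Alex', 28, ['Python', 'C#', 'JavaScript', 'CSS']]
str is immutable, so Alex[0] = 'Mr.Wu' rebinds slot 0 of Alex to a new object and leaves solo[0] pointing at 'Alex'. list is mutable, so the append changes the shared nested list in place; its address is unchanged and the new item 'CSS' is visible through solo as well.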

3、Deep copy (create a list variable Alex that contains a nested list, copy it with deepcopy() from the copy module, then modify Alex. What happens to solo?)

#!/usr/bin/env python
# -*- coding:utf-8 -*-
#-Author-Lian
# Deep copy
import copy                     # import the copy module

Alex = ["Alex", 28, ["Python", "C#", "JavaScript"]]
solo = copy.deepcopy(Alex)                       # deep copy with deepcopy() from the copy module

# Print before modifying
print(id(Alex))
print(Alex)
print([id(adr) for adr in Alex])
# Output:  6202712
#          ['Alex', 28, ['Python', 'C#', 'JavaScript']]
#          [4086496, 1363237568, 6203472]
print(id(solo))
print(solo)
print([id(adr) for adr in solo])
# Output:  6203032
#          ['Alex', 28, ['Python', 'C#', 'JavaScript']]
#          [4086496, 1363237568, 6203512]

# Modify the variable
Alex[0]='Mr.Wu'
Alex[2].append('CSS')
print(id(Alex))
print(Alex)
print([id(adr) for adr in Alex])
# Output:  6202712
#          ['Mr.Wu', 28, ['Python', 'C#', 'JavaScript', 'CSS']]
#          [5236064, 1363237568, 6203472]
print(id(solo))
print(solo)
print([id(adr) for adr in solo])
# Output:  6203032
#          ['Alex', 28, ['Python', 'C#', 'JavaScript']]
#          [4086496, 1363237568, 6203512]

Initial state: Alex = ["Alex", 28, ["Python", "C#", "JavaScript"]]
Deep copy: solo = copy.deepcopy(Alex)  # deepcopy() from the copy module
Result: solo = ["Alex", 28, ["Python", "C#", "JavaScript"]]
A deep copy gives solo a new outer list (6203032) and also creates a new object for the nested third element ['Python', 'C#', 'JavaScript'] (6203512), so the third elements of the two variables refer to different objects.

After the modification: solo = ['Alex', 28, ['Python', 'C#', 'JavaScript']]
str is immutable, so replacing 'Alex' with 'Mr.Wu' rebinds slot 0 of Alex to a new object. list is mutable, so the nested list of Alex is changed in place and keeps its address, but Alex and solo never shared that nested list, so solo is unaffected.

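4、Special cases for copying
(1) Non-container types (numbers, strings and other "atomic" objects) are not really copied.
(2) In other words, for these types "obj is copy.copy(obj)" and "obj is copy.deepcopy(obj)" both hold.
(3) If a tuple contains only atomic objects, deepcopy does not create a new object either; it returns the same tuple.
① Why copy at all?
A: To keep both the original data and the modified data when making a change.
② How do numbers and strings differ from collections when modified? (the root cause of the difference between shallow and deep copies)
A: When the data is modified:
Numbers and strings: a new object is created in memory.
Collections (list, dict, set): the same object in memory is modified in place.
③ For a collection, how can both the before and after states be kept?
A: Make a copy in memory.
④ For a collection, how can all n levels of nested elements be copied at once?
A: Use a deep copy.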
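The special cases can be checked directly. The sketch below assumes CPython, where copy.copy and copy.deepcopy return atomic objects (and tuples made only of them) unchanged; the variable names are illustrative.

```python
import copy

number = 12345
text = "solo"
atoms = (1, "a", 2.5)          # tuple holding only immutable, atomic values
nested = (1, [2, 3])           # tuple holding a mutable list

# Atomic objects are returned unchanged by both copy functions.
atomic_same = (number is copy.copy(number), text is copy.deepcopy(text))

# A tuple of atoms is also returned as-is, even by deepcopy.
tuple_same = atoms is copy.deepcopy(atoms)

# A tuple with a mutable element gets a new tuple and a new inner list.
nested_copy = copy.deepcopy(nested)
nested_same = (nested_copy is nested, nested_copy[1] is nested[1])

print(atomic_same, tuple_same, nested_same)
```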

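IX. File operations

(1) Opening a file: file_handle = open('file path', 'mode')
Python 2 offered two ways to open a file, open(...) and file(...); the former called the latter internally. Python 3 has only open, so always use open.
1、File modes:
    r: read-only [default]
    w: write-only [not readable; created if missing; existing content is erased]
    a: append [not readable; created if missing; writes are added at the end]
2、"+" allows reading and writing the same file:
    r+: read and write [readable; writable; appendable]
    w+: write and read
    a+: append and read

Summary 1: In r+ mode, if the file has been read (for example with print(f.read())) before .write(), the cursor is already at the end of the file, so the new content is written there, which behaves like read plus append. If seek() moves the cursor before .write(), writing starts at that position and overwrites what was there; the existing content is not shifted back. If there is neither a read nor a seek() before .write(), the cursor is still at the start of the file, where r+ puts it by default, so writing begins at the start of the file, the same as after seek(0). So r+ can write first and read afterwards. For contrast on this last point, see a+ mode.
Summary 2: Must read/write mode always write first and read second? Can it read first? With w+ the file is emptied as soon as it is opened, so reading first returns nothing. w+ is rarely used in practice (as is a+), so it is enough to know that it exists. In w+ mode the cursor follows the written data to the end of the file, so without a seek() back, a read returns nothing.
Note: in w+ mode, .write() interacts with seek() and with reads in the same way as in r+ mode. The file is emptied on open and then written from the start; if seek() is called before .write(), writing overwrites from the cursor position.
Summary 3: In a+ mode the cursor starts at the end of the file, so reading requires a seek() first. Unlike r+ and w+, .write() ignores seek(): content can only be written at the end of the file, always in append mode.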
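The three summaries can be reproduced with a scratch file. This is a minimal sketch; the file name and contents are made up for illustration and the file lives in a temporary directory.

```python
import os
import tempfile

# Hypothetical scratch file, used only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "demo.txt")
with open(path, "w", encoding="utf-8") as f:
    f.write("0123456789")

# r+ : the cursor starts at the beginning, so writing overwrites in place.
with open(path, "r+", encoding="utf-8") as f:
    f.write("AB")
with open(path, encoding="utf-8") as f:
    after_r_plus = f.read()

# a+ : writes always land at the end, even after seek(0).
with open(path, "a+", encoding="utf-8") as f:
    f.seek(0)
    f.write("XY")
    f.seek(0)                        # move back before reading
    after_a_plus = f.read()

# w+ : opening truncates the file; seek(0) is needed to read what was written.
with open(path, "w+", encoding="utf-8") as f:
    empty_at_open = f.read()
    f.write("new")
    f.seek(0)
    after_w_plus = f.read()

print(after_r_plus, after_a_plus, repr(empty_at_open), after_w_plus)
```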

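3、"U" converts \r, \n and \r\n to \n when reading (used together with r or r+). This is a Python 2 flag; Python 3 text mode performs this translation by default, and the flag was removed in Python 3.11.
    rU
    r+U
4、"b" opens the file in binary mode (for example when uploading an ISO image over FTP). In Python 2 the flag could be omitted on Linux but was required on Windows for binary files; in Python 3 it is needed on every platform whenever bytes rather than text are read or written.
    rb: binary read
    wb: binary write
    ab: binary append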
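(2) Common file operations:
1、read(), readline() and readlines()
    print(info_file.read())       # read: reads the entire file
    print(info_file.readline())   # readline: reads a single line
    print(info_file.readlines())  # readlines: splits the content on newlines into a list; not recommended for large files
2、seek and tell (the cursor)
    data = info_file.read()       # the cursor starts at the beginning; after read() it sits at the end of the file
    print(info_file.tell())       # tell: returns the current cursor position
    info_file.seek(0)             # seek: moves the cursor back to the start of the file
3、Looping over a file
    for index,line in enumerate(info_file.readlines()):  # builds a list of all lines first; unsuitable for large files
    for line in info_file:        # recommended: reads one line at a time, so memory use stays low
4、flush
    info_file.flush()             # flush: forces buffered data to be written to disk (sys.stdout.flush() does the same for standard output)
5、truncate
    truncate(n) keeps the first n bytes, counted from the start of the file regardless of the cursor position; truncate() with no argument cuts the file at the current cursor position; truncate(0) empties the file
6、with statement
    To avoid forgetting to close a file after opening it, use a context manager:
        with open('log','r') as f:
            ...
When the with block finishes, the file is closed and its resources are released automatically. From Python 2.7 on, with can also manage several files at once:
        with open('log1') as obj1, open('log2') as obj2:
            pass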
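(3) Ways to modify a file:
1、Read the file into memory, modify it there, then write the result back to the original file (the old content is erased)
2、Writing directly to the file on disk overwrites: nothing can be inserted on disk, and the existing content is not shifted back but simply overwritten
3、Read the file into memory, modify it there, then save the result as a new file (the old file is kept)
    ① save as a new file
    ② r+ mode
    ③ a+ mode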
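A minimal sketch of the "save as a new file" approach, streaming line by line so the whole file never has to sit in memory. The file names and contents are hypothetical and live in a temporary directory.

```python
import os
import tempfile

# Hypothetical file names and contents, for illustration only.
workdir = tempfile.mkdtemp()
old_path = os.path.join(workdir, "lyrics.txt")
new_path = os.path.join(workdir, "lyrics.new.txt")
with open(old_path, "w", encoding="utf-8") as f:
    f.write("line one\nold text\nline three\n")

# Stream the old file line by line and write the edited lines to a new file.
with open(old_path, encoding="utf-8") as src, \
        open(new_path, "w", encoding="utf-8") as dst:
    for line in src:
        dst.write(line.replace("old text", "new text"))

with open(new_path, encoding="utf-8") as f:
    edited = f.read()
with open(old_path, encoding="utf-8") as f:
    original = f.read()

print(edited)
```

If the original name must be kept, the new file can be renamed over the old one afterwards, for example with os.replace(new_path, old_path).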

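X. Functions

① Format

    def function_name(parameters):
            ....
            function body
            ....
            return return_value
    function_name()

② Formal parameter: def func(name):  // name is the formal parameter of func
③ Actual argument: func("solo")  // 'solo' is the actual argument passed to func
④ Default parameter: def stu_register(name,age,course,country="CN")  // country is a default parameter; name, age and course are positional parameters
⑤ Keyword argument: stu_register(age=22,name='lzl',course="python")  // keyword arguments must come after positional arguments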

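Formal parameters: the parameters inside the parentheses when a function is defined
Characteristic: a formal parameter is a variable name
# def foo(x,y): #x=1,y=2
#     print(x)
#     print(y)

Actual arguments: the values inside the parentheses when a function is called
Characteristic: an actual argument is a value
# foo(1,2)

Arguments (values) are bound to parameters (names) only during the call;
the binding is released when the call ends

Kinds of parameters
Positional parameters: parameters defined one after another from left to right
    Positional formal parameters: every one must receive a value, no more and no fewer
    Positional actual arguments: matched to the formal parameters by position

# def foo(x,y):
#     print(x)
#     print(y)
#
# foo('egon',1,2)  # TypeError: one argument too many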


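Keyword arguments: passed in the form name=value, naming the parameter that receives the value
# def foo(name,age):
#     print(name)
#     print(age)

# foo('egon',18)
# foo(age=18,name='egon')

Things to watch for with keyword arguments:
# def foo(name,age,sex):
#     print(name)
#     print(age)
#     print(sex)

# foo('egon',18,'male')
# print('======>')
# foo(sex='male',age=18,name='egon')
# foo('egon',sex='male',age=18)

Issue 1: the syntax requires positional arguments to come before keyword arguments
# foo('egon',sex='male',age=18)

Issue 2: never pass a value to the same parameter more than once
# foo('egon',sex='male',age=18,name='egon1')  # TypeError: multiple values for name

# foo('male',age=18,name='egon1')  # TypeError: multiple values for name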



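Default parameters: the value is assigned when the function is defined, so the argument may be omitted in the call
# def foo(x,y=1111111):
#     print(x)
#     print(y)
#
#
# foo(1,'a')
#
# def register(name,age,sex='male'):
#     print(name,age,sex)
#
#
# register('asb',73)
# register('wsb',38)
# register('ysb',84)
# register('yaya',28,'female')


Things to watch for with default parameters
Issue 1: default parameters must come after positional parameters
# def foo(y=1,x):  # SyntaxError
#     print(x,y)

Issue 2: a default value is evaluated once, and only once, when the function is defined
# x=100
# def foo(a,b=x):
#     print(a,b)
#
# x=111111111111111111111111111111
# foo('egon')  # prints: egon 100

Issue 3: default values should be immutable objects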
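Issue 3 follows from Issue 2: because the default is created once at definition time, a mutable default is shared by every call. A minimal sketch with two hypothetical helper functions:

```python
def append_bad(item, bucket=[]):      # the same list object is reused on every call
    bucket.append(item)
    return bucket

def append_good(item, bucket=None):   # None is immutable; build a fresh list per call
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket

first_bad = append_bad(1)
second_bad = append_bad(2)            # also contains 1: both names refer to one list
first_good = append_good(1)
second_good = append_good(2)

print(second_bad, first_bad is second_bad)
print(first_good, second_good)
```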



# Variable-length arguments: the call passes more arguments than there are named parameters
# Arguments are either positional or keyword

# So parameters need two mechanisms: * collects surplus positional arguments,
# and ** collects surplus keyword arguments

# def foo(x,y,*args): #args=(3,4,5,6,7)
#     print(x)
#     print(y)
#     print(args)

# foo(1,2,3,4,5,6,7) #*
# foo(1,2) #*



# Unpacking with *
# def foo(x,y,*args): #*args=*(3,4,5,6,7)
#     print(x)
#     print(y)
#     print(args)
#
# # foo(1,2,3,4,5,6,7) #*
#
#
# foo(1,2,*(3,4,5,6,7)) #foo(1,2,3,4,5,6,7)


# def foo(x,y=1,*args): #
#     print(x)
#     print(y)
#     print(args)
#
# # foo('a','b',*(1,2,3,4,5,6,7)) #foo('a','b',1,2,3,4,5,6,7)
# # foo('egon',10,2,3,4,5,6,9,y=2) # TypeError: multiple values for y
# foo('egon',10,2,3,4,5,6,9)








# def foo(x,y,**kwargs): #kwargs={'z':3,'b':2,'a':1}
#     print(x)
#     print(y)
#     print(kwargs)
# foo(1,2,z=3,a=1,b=2) #**


# def foo(x,y,**kwargs): #kwargs={'z':3,'b':2,'a':1}
#     print(x)
#     print(y)
#     print(kwargs)
#
# foo(1,2,**{'z':3,'b':2,'a':1}) #foo(1,2,a=1,z=3,b=2)


# def foo(x, y):  #
#     print(x)
#     print(y)
#
# foo(**{'y':1,'x':2})  # foo(y=1,x=2)







# def foo(x,*args,**kwargs):#args=(2,3,4,5) kwargs={'b':1,'a':2}
#     print(x)
#     print(args)
#     print(kwargs)
#
#
# foo(1,2,3,4,5,b=1,a=2)


# What are *args and **kwargs actually good for?
def register(name,age,sex='male'):
    print(name)
    print(age)
    print(sex)


# def wrapper(*args,**kwargs): #args=(1,2,3) kwargs={'a':1,'b':2}
#     # print(args)
#     # print(kwargs)
#     register(*args,**kwargs)
#     # register(*(1, 2, 3),**{'a': 1, 'b': 2})
#     # register(1, 2, 3,a=1,b=2)
#
#
# wrapper(1,2,3,a=1,b=2)

import time

# def register(name,age,sex='male'):
#     # start_time=time.time()
#     print(name)
#     print(age)
#     print(sex)
#     time.sleep(3)
    # stop_time=time.time()
    # print('run time is %s' %(stop_time-start_time))

# def wrapper(*args, **kwargs): #args=('egon',) kwargs={'age':18}
#     start_time=time.time()
#     register(*args, **kwargs)
#     stop_time=time.time()
#     print('run time is %s' %(stop_time-start_time))
#
#
# wrapper('egon',age=18)

# register('egon',18)










# Keyword-only parameters: parameters defined after * are keyword-only and must be passed as keyword arguments
# def foo(name,age,*args,sex='male',group):
#     print(name)
#     print(age)
#     print(args)
#     print(sex)
#     print(group)
#
# foo('alex',18,19,20,300,group='group1')





def foo(name,age=18,*args,sex='male',group,**kwargs):
    pass
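To see how each kind of parameter in a signature like the one above is bound, the sketch below uses a hypothetical function enroll with the same parameter layout and simply returns what it received.

```python
def enroll(name, age=18, *args, sex="male", group, **kwargs):
    """Return every parameter so the binding is easy to inspect."""
    return name, age, args, sex, group, kwargs

# Positional values fill name and age, the rest go to args;
# group must be named, and unknown keywords land in kwargs.
result = enroll("alex", 20, 1, 2, group="group1", city="Beijing")

try:
    enroll("alex", 20)          # group has no default and was not passed by keyword
    missing_group_error = None
except TypeError as exc:
    missing_group_error = exc

print(result)
print(missing_group_error)
```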

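Another useful way to look at parameters:

⑥ Dynamic (non-fixed) parameters, *args and **kwargs:

(1) *args: *args packs the surplus positional arguments into a tuple. A list passed as an argument is not spread out; it becomes one element of that tuple. When a function has *args alongside other positional parameters, *args must come after them, otherwise every positional argument is swallowed by *args. If there are no surplus arguments, *args is simply empty and no error is raised.
(2) **kwargs: **kwargs packs surplus arguments of the form a=b into a dict (unlike ordinary keyword arguments, these have no matching formal parameter). If a list and a dict are passed as plain arguments, both are treated as surplus positional arguments and land in *args, so **kwargs stays empty. To spread them into *args and **kwargs, pass them as *info_list and **info_dict: stu_register("lzl",*info_list,**info_dict)
Summary: *args must come before **kwargs (required by the syntax); positional arguments must come before keyword arguments (required by the syntax); default parameters can be combined with *args and **kwargs, but a default parameter placed before *args is filled by positional arguments first.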

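⑦ return: if no return statement is executed, the function returns None by default; when execution reaches return, the function ends
⑧ Local variables:

name = "Alex Li"                                    # define the variable name
def change_name(name):
    name = "Golden Horn King, a man with a Tesla"   # rebind the variable inside the function

A change made to a variable inside a function only takes effect inside that function and does not affect the outer variable; such a variable is called a local variable. A function can make a change take effect globally by declaring the name with the global keyword, but this is rarely used.

⑨ Recursive functions: a function that calls itself is a recursive function
Properties: there must be a termination condition; each deeper level of recursion should work on a smaller problem than the previous one; recursion is not efficient, and too many levels cause a stack overflow

A recursive example:
def func(n1,n2):        # print the Fibonacci numbers up to 100
    if n1 > 100:
        return
    print(n1)
    n3 = n1 + n2
    func(n2,n3)
func(0,1)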

⑩ Anonymous functions: the function does not need to be given an explicit name

# regular function
def calc(n):
    return n**n
print(calc(10))

# the same as an anonymous function
calc = lambda n:n**n
print(calc(10))

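⑪ Higher-order functions: a variable can refer to a function, and a function's parameters can receive variables, so one function can take another function as an argument. Such a function is called a higher-order function.

def add(x,y,f):
    return f(x) + f(y)
res = add(3,-6,abs)
print(res)              # 9

⑫ Built-in functions

⑬ Order of definition and call: a function must be defined before the call that uses it is executed

# Wrong order
def func():                     # define func()
    print("in the func")
    foo()                       # call foo()
func()                          # run func(): NameError, foo is not defined yet
def foo():                      # define foo()
    print("in the foo")

# Correct order
def func():                     # define func()
    print("in the func")
    foo()                       # call foo()
def foo():                      # define foo()
    print("in the foo")
func()                          # run func()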

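⑭ Higher-order functions: 1. a function is passed as an argument to another function; 2. a function's return value contains one or more functions

⑮ Nested functions: a function created inside the body of another function (the inner function cannot be called directly from the global scope)

⑯ Decorators: a decorator is itself a function (one that decorates other functions) and adds extra behaviour to them.
Principles: 1. do not modify the source code of the decorated function; 2. do not change the way the decorated function is called
Composition: a decorator is built from a higher-order function plus a nested function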
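A minimal sketch of a decorator that follows both principles. The names timer and register are illustrative; functools.wraps is an optional extra that preserves the decorated function's name.

```python
import functools
import time

def timer(func):
    """Decorator: report how long each call to func takes."""
    @functools.wraps(func)               # keep func's name and docstring
    def wrapper(*args, **kwargs):        # nested function that accepts any arguments
        start = time.time()
        result = func(*args, **kwargs)   # call the original, unmodified function
        print("%s ran in %.4f s" % (func.__name__, time.time() - start))
        return result
    return wrapper                       # higher-order: a function is returned

@timer
def register(name, age, sex="male"):
    return "%s/%s/%s" % (name, age, sex)

record = register("egon", age=18)        # called exactly as before decoration
print(record)
```

The source of register is untouched and callers still write register(...), while the timing behaviour is added from outside.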

⑰ Generators: a mechanism that produces each value only when it is asked for is called a generator

Use: yield makes it possible to get the effect of concurrent execution within a single thread (coroutines)

#!/usr/bin/env python
# -*- coding:utf-8 -*-
#-Author-solo
import time
def consumer(name):
    print("%s is ready to eat baozi!" %name)
    while True:
        baozi = yield            # yield: save the current state and hand control back

        print("Baozi [%s] arrived and was eaten by [%s]!" %(baozi,name))
def producer(name):
    c = consumer('A')
    c2 = consumer('B')
    c.__next__()                # c.__next__() is the same as next(c)
    c2.__next__()               # next(): advance to the yield without sending it a value
    print("Starting to make baozi!")
    for i in range(10):
        time.sleep(1)
        print("%s made 2 baozi!"%(name))
        c.send(i)                # send(): resume at the yield and pass it a value
        c2.send(i)
producer("solo")

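⑱ Iterators

Iterable: an object that can be used directly in a for loop
Data types that can be used directly in a for loop: 1. collection types such as list, tuple, dict, set and str; 2. generators, including generator expressions and generator functions that use yield
isinstance() can test whether an object is Iterable

from collections.abc import Iterable
print(isinstance([], Iterable))
# True

Iterator: an object that can be passed to next() and keeps returning the next value is called an iterator.
isinstance() can test whether an object is an Iterator

from collections.abc import Iterator
print(isinstance([], Iterator))
# False
print(isinstance(iter([]), Iterator))
# True

Summary:
1、Every object that works in a for loop is an Iterable;
2、Every object that works with next() is an Iterator; it represents a lazily evaluated sequence;
3、Collection types such as list, dict and str are Iterable but not Iterator; iter() returns an Iterator for them;
4、Python's for loop is essentially implemented by calling next() repeatedly. The loop

for x in [1, 2, 3, 4, 5]:
    pass

is equivalent to:

# First obtain an Iterator:
it = iter([1, 2, 3, 4, 5])
# Loop:
while True:
    try:
        # get the next value:
        x = next(it)
    except StopIteration:
        # leave the loop on StopIteration
        break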

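XI. Common modules

(I) Importing modules: importing a module essentially means running the Python file once; importing a package essentially means running the __init__.py file in the package directory once

(II) Common modules:

(1) time and datetime modules
    Time-related operations. Time is represented in three ways:
1、Timestamp: seconds elapsed since 1 January 1970, i.e. time.time()
2、Formatted string such as 2014-11-11 11:11, i.e. time.strftime('%Y-%m-%d')
3、Structured time: a time.struct_time tuple holding the year, day, weekday and so on, i.e. time.localtime()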

import time
print(time.time())              # timestamp
#1472037866.0750718
print(time.localtime())        # structured time (local)
#time.struct_time(tm_year=2016, tm_mon=8, tm_mday=25, tm_hour=8, tm_min=44, tm_sec=46, tm_wday=3, tm_yday=238, tm_isdst=0)
print(time.strftime('%Y-%m-%d'))    # formatted string
#2016-08-25
print(time.strftime('%Y-%m-%d',time.localtime()))
#2016-08-25
print(time.gmtime())            # structured time (UTC)
#time.struct_time(tm_year=2016, tm_mon=8, tm_mday=25, tm_hour=3, tm_min=8, tm_sec=48, tm_wday=3, tm_yday=238, tm_isdst=0)
print(time.strptime('2014-11-11', '%Y-%m-%d'))  # parse a string into structured time
#time.struct_time(tm_year=2014, tm_mon=11, tm_mday=11, tm_hour=0, tm_min=0, tm_sec=0, tm_wday=1, tm_yday=315, tm_isdst=-1)
print(time.asctime())
#Thu Aug 25 11:15:10 2016
print(time.asctime(time.localtime()))
#Thu Aug 25 11:15:10 2016
print(time.ctime(time.time()))
#Thu Aug 25 11:15:10 2016

import datetime
print(datetime.date)    # class for dates; common attributes are year, month, day
#<class 'datetime.date'>
print(datetime.time)    # class for times; common attributes are hour, minute, second, microsecond
#<class 'datetime.time'>
print(datetime.datetime)        # class for a date together with a time
#<class 'datetime.datetime'>
print(datetime.timedelta)       # class for a time interval, i.e. the span between two points in time
#<class 'datetime.timedelta'>
print(datetime.datetime.now())
#2016-08-25 14:21:07.722285
print(datetime.datetime.now() - datetime.timedelta(days=5))
#2016-08-20 14:21:28.275460

Comparing times:

import time

start_str = '2017-03-26 3:12'
end_str = '2017-05-26 13:12'
date1 = time.strptime(start_str, '%Y-%m-%d %H:%M')
date2 = time.strptime(end_str, '%Y-%m-%d %H:%M')
# true when the current time lies between the two dates
if time.mktime(date1) <= time.time() <= time.mktime(date2):
    print('cccccccc')


import datetime

start_str = '2017-03-26 3:12'
end_str = '2017-05-26 13:12'
date1 = datetime.datetime.strptime(start_str,'%Y-%m-%d %H:%M')
date2 = datetime.datetime.strptime(end_str,'%Y-%m-%d %H:%M')
datenow = datetime.datetime.now()
if datenow < date1:
    print('dddddd')

(2) random module: generating random numbers (and verification codes)

# the random module
import random
print(random.random())      # random float in the range [0, 1)
#0.7308387398872364
print(random.randint(1,3))  # random integer from 1 to 3, both ends included
#3
print(random.randrange(1,3)) # random integer 1 or 2; 3 is excluded
#2
print(random.choice("hello"))  # pick one random character from the string
#e
print(random.sample("hello",2))     # pick the given number of distinct positions at random
#['l', 'h']
items = [1,2,3,4,5,6,7]
random.shuffle(items)       # shuffle the list in place
print(items)
#[2, 3, 1, 6, 4, 7, 5]

Generating a verification code:

import random
checkcode = ''
for i in range(4):
    current = random.randrange(0,4)
    if current != i:
        temp = chr(random.randint(65,90))   # a random uppercase letter A-Z
    else:
        temp = random.randint(0,9)          # a random digit
    checkcode += str(temp)
print(checkcode)
#51T6

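(3) os module: system-level operations (directories, paths and so on)

import os
os.getcwd()                          # current working directory, i.e. the directory the script is running in
os.chdir("dirname")                  # change the working directory; like cd in a shell
os.curdir                            # the current directory: '.'
os.pardir                            # the parent directory string: '..'
os.makedirs('dirname1/dirname2')     # create nested directories recursively
os.removedirs('dirname1')            # remove the directory if empty, then move up and remove the parent if it is empty too, and so on
os.mkdir('dirname')                  # create a single directory; like mkdir dirname in a shell
os.rmdir('dirname')                  # remove a single empty directory; raises an error if it is not empty; like rmdir dirname
os.listdir('dirname')                # return a list of all files and subdirectories in the directory, including hidden files
os.remove('path/filename')           # delete a file
os.rename("oldname","newname")       # rename a file or directory
os.stat('path/filename')             # get file or directory information
os.sep                               # path separator of the OS: "\\" on Windows, "/" on Linux
os.linesep                           # line terminator of the platform: "\r\n" on Windows, "\n" on Linux
os.pathsep                           # string used to separate the entries of a search path
os.name                              # name of the current platform: 'nt' on Windows, 'posix' on Linux
os.system("bash command")            # run a shell command; its output is shown directly
os.environ                           # the environment variables
os.path.abspath(path)                # normalised absolute version of path
os.path.split(path)                  # split path into a (directory, file name) tuple
os.path.dirname(path)                # directory part of path, i.e. the first element of os.path.split(path)
os.path.basename(path)               # final file name in path, i.e. the second element of os.path.split(path); empty if path ends with / or \
os.path.exists(path)                 # True if path exists, False otherwise
os.path.isabs(path)                  # True if path is absolute
os.path.isfile(path)                 # True if path is an existing file, False otherwise
os.path.isdir(path)                  # True if path is an existing directory, False otherwise
os.path.join(path1[, path2[, ...]])  # join several path components; if a component is absolute, everything before it is discarded
os.path.getatime(path)               # time of last access of the file or directory at path
os.path.getmtime(path)               # time of last modification of the file or directory at path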

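(4) sys module: operations related to the interpreter (exiting the program, version information and so on)

import sys
sys.argv           # list of command-line arguments; the first element is the path of the program itself
sys.exit(n)        # exit the program; use exit(0) for a normal exit
sys.version        # version information of the Python interpreter
sys.maxsize        # largest container size or index (Python 2 had sys.maxint for the largest int; it does not exist in Python 3)
sys.path           # module search path, initialised from the PYTHONPATH environment variable
sys.platform       # name of the operating-system platform
sys.stdout.write('please:')
val = sys.stdin.readline()[:-1]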

(5) shutil module: high-level handling of files, directories and archives (copying, compressing and so on). The function sources quoted below use Python 2 syntax (for example "except OSError, why").

① shutil.copyfileobj copies the contents of one file object into another; a partial copy is possible

def copyfileobj(fsrc, fdst, length=16*1024):
    """copy data from file-like object fsrc to file-like object fdst"""
    while 1:
        buf = fsrc.read(length)
        if not buf:
            break
        fdst.write(buf)

import shutil
with open("fsrc",encoding="utf-8") as f1, open("fdst","w",encoding="utf-8") as f2:
    shutil.copyfileobj(f1,f2)
# copies the contents of file object f1 into f2

② shutil.copyfile copies a file

def copyfile(src, dst):
    """Copy data from src to dst"""
    if _samefile(src, dst):
        raise Error("`%s` and `%s` are the same file" % (src, dst))
    for fn in [src, dst]:
        try:
            st = os.stat(fn)
        except OSError:
            # File most likely does not exist
            pass
        else:
            # XXX What about other special files? (sockets, devices...)
            if stat.S_ISFIFO(st.st_mode):
                raise SpecialFileError("`%s` is a named pipe" % fn)
    with open(src, 'rb') as fsrc:
        with open(dst, 'wb') as fdst:
            copyfileobj(fsrc, fdst)

import shutil
shutil.copyfile("f1","f2")
# copies the contents of file f1 into f2

③ shutil.copymode(src, dst) copies the permission bits only. Content, group and owner are left unchanged

def copymode(src, dst):
    """Copy mode bits from src to dst"""
    if hasattr(os, 'chmod'):
        st = os.stat(src)
        mode = stat.S_IMODE(st.st_mode)
        os.chmod(dst, mode)

④ shutil.copystat(src, dst) copies the status information: mode bits, atime, mtime, flags

def copystat(src, dst):
    """Copy all stat info (mode bits, atime, mtime, flags) from src to dst"""
    st = os.stat(src)
    mode = stat.S_IMODE(st.st_mode)
    if hasattr(os, 'utime'):
        os.utime(dst, (st.st_atime, st.st_mtime))
    if hasattr(os, 'chmod'):
        os.chmod(dst, mode)
    if hasattr(os, 'chflags') and hasattr(st, 'st_flags'):
        try:
            os.chflags(dst, st.st_flags)
        except OSError, why:
            for err in 'EOPNOTSUPP', 'ENOTSUP':
                if hasattr(errno, err) and why.errno == getattr(errno, err):
                    break
            else:
                raise

⑤ shutil.copy(src, dst) copies the file and its permission bits

def copy(src, dst):
    """Copy data and mode bits ("cp src dst").

    The destination may be a directory.

    """
    if os.path.isdir(dst):
        dst = os.path.join(dst, os.path.basename(src))
    copyfile(src, dst)
    copymode(src, dst)

⑥ shutil.copy2(src, dst) copies the file and its status information

def copy2(src, dst):
    """Copy data and all stat info ("cp -p src dst").

    The destination may be a directory.

    """
    if os.path.isdir(dst):
        dst = os.path.join(dst, os.path.basename(src))
    copyfile(src, dst)
    copystat(src, dst)

⑦ shutil.copytree(src, dst, symlinks=False, ignore=None) recursively copies a directory tree, including nested directories

def ignore_patterns(*patterns):
    """Function that can be used as copytree() ignore parameter.

    Patterns is a sequence of glob-style patterns
    that are used to exclude files"""
    def _ignore_patterns(path, names):
        ignored_names = []
        for pattern in patterns:
            ignored_names.extend(fnmatch.filter(names, pattern))
        return set(ignored_names)
    return _ignore_patterns

def copytree(src, dst, symlinks=False, ignore=None):
    """Recursively copy a directory tree using copy2().

    The destination directory must not already exist.
    If exception(s) occur, an Error is raised with a list of reasons.

    If the optional symlinks flag is true, symbolic links in the
    source tree result in symbolic links in the destination tree; if
    it is false, the contents of the files pointed to by symbolic
    links are copied.

    The optional ignore argument is a callable. If given, it
    is called with the `src` parameter, which is the directory
    being visited by copytree(), and `names` which is the list of
    `src` contents, as returned by os.listdir():

        callable(src, names) -> ignored_names

    Since copytree() is called recursively, the callable will be
    called once for each directory that is copied. It returns a
    list of names relative to the `src` directory that should
    not be copied.

    XXX Consider this example code rather than the ultimate tool.

    """
    names = os.listdir(src)
    if ignore is not None:
        ignored_names = ignore(src, names)
    else:
        ignored_names = set()

    os.makedirs(dst)
    errors = []
    for name in names:
        if name in ignored_names:
            continue
        srcname = os.path.join(src, name)
        dstname = os.path.join(dst, name)
        try:
            if symlinks and os.path.islink(srcname):
                linkto = os.readlink(srcname)
                os.symlink(linkto, dstname)
            elif os.path.isdir(srcname):
                copytree(srcname, dstname, symlinks, ignore)
            else:
                # Will raise a SpecialFileError for unsupported file types
                copy2(srcname, dstname)
        # catch the Error from the recursive copytree so that we can
        # continue with other files
        except Error, err:
            errors.extend(err.args[0])
        except EnvironmentError, why:
            errors.append((srcname, dstname, str(why)))
    try:
        copystat(src, dst)
    except OSError, why:
        if WindowsError is not None and isinstance(why, WindowsError):
            # Copying file access times may fail on Windows
            pass
        else:
            errors.append((src, dst, str(why)))
    if errors:
        raise Error, errors

⑧ shutil.rmtree(path[, ignore_errors[, onerror]]) recursively deletes a directory tree

def rmtree(path, ignore_errors=False, onerror=None):
    """Recursively delete a directory tree.

    If ignore_errors is set, errors are ignored; otherwise, if onerror
    is set, it is called to handle the error with arguments (func,
    path, exc_info) where func is os.listdir, os.remove, or os.rmdir;
    path is the argument to that function that caused it to fail; and
    exc_info is a tuple returned by sys.exc_info().  If ignore_errors
    is false and onerror is None, an exception is raised.

    """
    if ignore_errors:
        def onerror(*args):
            pass
    elif onerror is None:
        def onerror(*args):
            raise
    try:
        if os.path.islink(path):
            # symlinks to directories are forbidden, see bug #1669
            raise OSError("Cannot call rmtree on a symbolic link")
    except OSError:
        onerror(os.path.islink, path, sys.exc_info())
        # can't continue even if onerror hook returns
        return
    names = []
    try:
        names = os.listdir(path)
    except os.error, err:
        onerror(os.listdir, path, sys.exc_info())
    for name in names:
        fullname = os.path.join(path, name)
        try:
            mode = os.lstat(fullname).st_mode
        except os.error:
            mode = 0
        if stat.S_ISDIR(mode):
            rmtree(fullname, ignore_errors, onerror)
        else:
            try:
                os.remove(fullname)
            except os.error, err:
                onerror(os.remove, fullname, sys.exc_info())
    try:
        os.rmdir(path)
    except os.error:
        onerror(os.rmdir, path, sys.exc_info())

⑨ shutil.move(src, dst) recursively moves a file or directory

def move(src, dst):
    """Recursively move a file or directory to another location. This is
    similar to the Unix "mv" command.

    If the destination is a directory or a symlink to a directory, the source
    is moved inside the directory. The destination path must not already
    exist.

    If the destination already exists but is not a directory, it may be
    overwritten depending on os.rename() semantics.

    If the destination is on our current filesystem, then rename() is used.
    Otherwise, src is copied to the destination and then removed.
    A lot more could be done here...  A look at a mv.c shows a lot of
    the issues this implementation glosses over.

    """
    real_dst = dst
    if os.path.isdir(dst):
        if _samefile(src, dst):
            # We might be on a case insensitive filesystem,
            # perform the rename anyway.
            os.rename(src, dst)
            return

        real_dst = os.path.join(dst, _basename(src))
        if os.path.exists(real_dst):
            raise Error, "Destination path '%s' already exists" % real_dst
    try:
        os.rename(src, real_dst)
    except OSError:
        if os.path.isdir(src):
            if _destinsrc(src, dst):
                raise Error, "Cannot move a directory '%s' into itself '%s'." % (src, dst)
            copytree(src, real_dst, symlinks=True)
            rmtree(src)
        else:
            copy2(src, real_dst)
            os.unlink(src)

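⑩ shutil.make_archive(base_name, format,...) creates an archive (for example zip or tar) and returns its file path
base_name: file name of the archive, optionally with a path. With just a file name the archive is saved in the current directory; otherwise it is saved at the given path,
    e.g. www => saved in the current directory
    e.g. /Users/wupeiqi/www => saved in /Users/wupeiqi/
format: archive type: "zip", "tar", "bztar" or "gztar"
root_dir: path of the directory to archive (defaults to the current directory)
owner: user, defaults to the current user
group: group, defaults to the current group
logger: used for logging, usually a logging.Logger object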
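A minimal sketch of calling shutil.make_archive with the parameters described above. The directory layout and file names are hypothetical and are created in a temporary folder so the example is self-contained.

```python
import os
import shutil
import tempfile
import zipfile

# Hypothetical directory layout, created in a temp folder for illustration.
workdir = tempfile.mkdtemp()
source_dir = os.path.join(workdir, "www")
os.makedirs(source_dir)
with open(os.path.join(source_dir, "index.html"), "w", encoding="utf-8") as f:
    f.write("<h1>hello</h1>")

# base_name carries no extension; make_archive appends ".zip"
# and returns the path of the archive it created.
archive_path = shutil.make_archive(
    base_name=os.path.join(workdir, "www_backup"),
    format="zip",
    root_dir=source_dir,
)

# List the archive's contents to confirm what was packed.
with zipfile.ZipFile(archive_path) as z:
    names = z.namelist()

print(archive_path, names)
```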

Source:

def make_archive(base_name, format, root_dir=None, base_dir=None, verbose=0,
                 dry_run=0, owner=None, group=None, logger=None):
    """Create an archive file (eg. zip or tar).

    'base_name' is the name of the file to create, minus any format-specific
    extension; 'format' is the archive format: one of "zip", "tar", "bztar"
    or "gztar".

    'root_dir' is a directory that will be the root directory of the
    archive; ie. we typically chdir into 'root_dir' before creating the
    archive.  'base_dir' is the directory where we start archiving from;
    ie. 'base_dir' will be the common prefix of all files and
    directories in the archive.  'root_dir' and 'base_dir' both default
    to the current directory.  Returns the name of the archive file.

    'owner' and 'group' are used when creating a tar archive. By default,
    uses the current owner and group.
    """
    save_cwd = os.getcwd()
    if root_dir is not None:
        if logger is not None:
            logger.debug("changing into '%s'", root_dir)
        base_name = os.path.abspath(base_name)
        if not dry_run:
            os.chdir(root_dir)

    if base_dir is None:
        base_dir = os.curdir

    kwargs = {'dry_run': dry_run, 'logger': logger}

    try:
        format_info = _ARCHIVE_FORMATS[format]
    except KeyError:
        raise ValueError, "unknown archive format '%s'" % format

    func = format_info[0]
    for arg, val in format_info[1]:
        kwargs[arg] = val

    if format != 'zip':
        kwargs['owner'] = owner
        kwargs['group'] = group

    try:
        filename = func(base_name, base_dir, **kwargs)
    finally:
        if root_dir is not None:
            if logger is not None:
                logger.debug("changing back to '%s'", save_cwd)
            os.chdir(save_cwd)

    return filename

shutil handles archives by calling the zipfile and tarfile modules. In detail:

zipfile, compressing and extracting:

import zipfile

# compress
z = zipfile.ZipFile('laxi.zip', 'w')
z.write('a.log')
z.write('data.data')
z.close()

# extract
z = zipfile.ZipFile('laxi.zip', 'r')
z.extractall()
z.close()

tarfile, compressing and extracting:

import tarfile

# compress
tar = tarfile.open('your.tar','w')
tar.add('/Users/wupeiqi/PycharmProjects/bbs2.zip', arcname='bbs2.zip')
tar.add('/Users/wupeiqi/PycharmProjects/cmdb.zip', arcname='cmdb.zip')
tar.close()

# extract
tar = tarfile.open('your.tar','r')
tar.extractall()  # a target directory can be given
tar.close()

class ZipFile(object):
    """ Class with methods to open, read, write, close, list zip files.

    z = ZipFile(file, mode="r", compression=ZIP_STORED, allowZip64=False)

    file: Either the path to the file, or a file-like object.
          If it is a path, the file will be opened and closed by ZipFile.
    mode: The mode can be either read "r", write "w" or append "a".
    compression: ZIP_STORED (no compression) or ZIP_DEFLATED (requires zlib).
    allowZip64: if True ZipFile will create files with ZIP64 extensions when
                needed, otherwise it will raise an exception when this would
                be necessary.

    """

    fp = None                   # Set here since __del__ checks it

    def __init__(self, file, mode="r", compression=ZIP_STORED, allowZip64=False):
        """Open the ZIP file with mode read "r", write "w" or append "a"."""
        if mode not in ("r", "w", "a"):
            raise RuntimeError('ZipFile() requires mode "r", "w", or "a"')

        if compression == ZIP_STORED:
            pass
        elif compression == ZIP_DEFLATED:
            if not zlib:
                raise RuntimeError,\
                      "Compression requires the (missing) zlib module"
        else:
            raise RuntimeError, "That compression method is not supported"

        self._allowZip64 = allowZip64
        self._didModify = False
        self.debug = 0  # Level of printing: 0 through 3
        self.NameToInfo = {}    # Find file info given name
        self.filelist = []      # List of ZipInfo instances for archive
        self.compression = compression  # Method of compression
        self.mode = key = mode.replace('b', '')[0]
        self.pwd = None
        self._comment = ''

        # Check if we were passed a file-like object
        if isinstance(file, basestring):
            self._filePassed = 0
            self.filename = file
            modeDict = {'r' : 'rb', 'w': 'wb', 'a' : 'r+b'}
            try:
                self.fp = open(file, modeDict[mode])
            except IOError:
                if mode == 'a':
                    mode = key = 'w'
                    self.fp = open(file, modeDict[mode])
                else:
                    raise
        else:
            self._filePassed = 1
            self.fp = file
            self.filename = getattr(file, 'name', None)

        try:
            if key == 'r':
                self._RealGetContents()
            elif key == 'w':
                # set the modified flag so central directory gets written
                # even if no files are added to the archive
                self._didModify = True
            elif key == 'a':
                try:
                    # See if file is a zip file
                    self._RealGetContents()
                    # seek to start of directory and overwrite
                    self.fp.seek(self.start_dir, 0)
                except BadZipfile:
                    # file is not a zip file, just append
                    self.fp.seek(0, 2)

                    # set the modified flag so central directory gets written
                    # even if no files are added to the archive
                    self._didModify = True
            else:
                raise RuntimeError('Mode must be "r", "w" or "a"')
        except:
            fp = self.fp
            self.fp = None
            if not self._filePassed:
                fp.close()
            raise

    def __enter__(self):
        return self

    def __exit__(self, type, value, traceback):
        self.close()

    def _RealGetContents(self):
        """Read in the table of contents for the ZIP file."""
        fp = self.fp
        try:
            endrec = _EndRecData(fp)
        except IOError:
            raise BadZipfile("File is not a zip file")
        if not endrec:
            raise BadZipfile, "File is not a zip file"
        if self.debug > 1:
            print endrec
        size_cd = endrec[_ECD_SIZE]             # bytes in central directory
        offset_cd = endrec[_ECD_OFFSET]         # offset of central directory
        self._comment = endrec[_ECD_COMMENT]    # archive comment

        # "concat" is zero, unless zip was concatenated to another file
        concat = endrec[_ECD_LOCATION] - size_cd - offset_cd
        if endrec[_ECD_SIGNATURE] == stringEndArchive64:
            # If Zip64 extension structures are present, account for them
            concat -= (sizeEndCentDir64 + sizeEndCentDir64Locator)

        if self.debug > 2:
            inferred = concat + offset_cd
            print "given, inferred, offset", offset_cd, inferred, concat
        # self.start_dir:  Position of start of central directory
        self.start_dir = offset_cd + concat
        fp.seek(self.start_dir, 0)
        data = fp.read(size_cd)
        fp = cStringIO.StringIO(data)
        total = 0
        while total < size_cd:
            centdir = fp.read(sizeCentralDir)
            if len(centdir) != sizeCentralDir:
                raise BadZipfile("Truncated central directory")
            centdir = struct.unpack(structCentralDir, centdir)
            if centdir[_CD_SIGNATURE] != stringCentralDir:
                raise BadZipfile("Bad magic number for central directory")
            if self.debug > 2:
                print centdir
            filename = fp.read(centdir[_CD_FILENAME_LENGTH])
            # Create ZipInfo instance to store file information
            x = ZipInfo(filename)
            x.extra = fp.read(centdir[_CD_EXTRA_FIELD_LENGTH])
            x.comment = fp.read(centdir[_CD_COMMENT_LENGTH])
            x.header_offset = centdir[_CD_LOCAL_HEADER_OFFSET]
            (x.create_version, x.create_system, x.extract_version, x.reserved,
                x.flag_bits, x.compress_type, t, d,
                x.CRC, x.compress_size, x.file_size) = centdir[1:12]
            x.volume, x.internal_attr, x.external_attr = centdir[15:18]
            # Convert date/time code to (year, month, day, hour, min, sec)
            x._raw_time = t
            x.date_time = ( (d>>9)+1980, (d>>5)&0xF, d&0x1F,
                                     t>>11, (t>>5)&0x3F, (t&0x1F) * 2 )

            x._decodeExtra()
            x.header_offset = x.header_offset + concat
            x.filename = x._decodeFilename()
            self.filelist.append(x)
            self.NameToInfo[x.filename] = x

            # update total bytes read from central directory
            total = (total + sizeCentralDir + centdir[_CD_FILENAME_LENGTH]
                     + centdir[_CD_EXTRA_FIELD_LENGTH]
                     + centdir[_CD_COMMENT_LENGTH])

            if self.debug > 2:
                print "total", total


    def namelist(self):
        """Return a list of file names in the archive."""
        l = []
        for data in self.filelist:
            l.append(data.filename)
        return l

    def infolist(self):
        """Return a list of class ZipInfo instances for files in the
        archive."""
        return self.filelist

    def printdir(self):
        """Print a table of contents for the zip file."""
        print "%-46s %19s %12s" % ("File Name", "Modified    ", "Size")
        for zinfo in self.filelist:
            date = "%d-%02d-%02d %02d:%02d:%02d" % zinfo.date_time[:6]
            print "%-46s %s %12d" % (zinfo.filename, date, zinfo.file_size)

    def testzip(self):
        """Read all the files and check the CRC."""
        chunk_size = 2 ** 20
        for zinfo in self.filelist:
            try:
                # Read by chunks, to avoid an OverflowError or a
                # MemoryError with very large embedded files.
                with self.open(zinfo.filename, "r") as f:
                    while f.read(chunk_size):     # Check CRC-32
                        pass
            except BadZipfile:
                return zinfo.filename

    def getinfo(self, name):
        """Return the instance of ZipInfo given 'name'."""
        info = self.NameToInfo.get(name)
        if info is None:
            raise KeyError(
                'There is no item named %r in the archive' % name)

        return info

    def setpassword(self, pwd):
        """Set default password for encrypted files."""
        self.pwd = pwd

    @property
    def comment(self):
        """The comment text associated with the ZIP file."""
        return self._comment

    @comment.setter
    def comment(self, comment):
        # check for valid comment length
        if len(comment) > ZIP_MAX_COMMENT:
            import warnings
            warnings.warn('Archive comment is too long; truncating to %d bytes'
                          % ZIP_MAX_COMMENT, stacklevel=2)
            comment = comment[:ZIP_MAX_COMMENT]
        self._comment = comment
        self._didModify = True

    def read(self, name, pwd=None):
        """Return file bytes (as a string) for name."""
        return self.open(name, "r", pwd).read()

    def open(self, name, mode="r", pwd=None):
        """Return file-like object for 'name'."""
        if mode not in ("r", "U", "rU"):
            raise RuntimeError, 'open() requires mode "r", "U", or "rU"'
        if not self.fp:
            raise RuntimeError, \
                  "Attempt to read ZIP archive that was already closed"

        # Only open a new file for instances where we were not
        # given a file object in the constructor
        if self._filePassed:
            zef_file = self.fp
            should_close = False
        else:
            zef_file = open(self.filename, 'rb')
            should_close = True

        try:
            # Make sure we have an info object
            if isinstance(name, ZipInfo):
                # 'name' is already an info object
                zinfo = name
            else:
                # Get info object for name
                zinfo = self.getinfo(name)

            zef_file.seek(zinfo.header_offset, 0)

            # Skip the file header:
            fheader = zef_file.read(sizeFileHeader)
            if len(fheader) != sizeFileHeader:
                raise BadZipfile("Truncated file header")
            fheader = struct.unpack(structFileHeader, fheader)
            if fheader[_FH_SIGNATURE] != stringFileHeader:
                raise BadZipfile("Bad magic number for file header")

            fname = zef_file.read(fheader[_FH_FILENAME_LENGTH])
            if fheader[_FH_EXTRA_FIELD_LENGTH]:
                zef_file.read(fheader[_FH_EXTRA_FIELD_LENGTH])

            if fname != zinfo.orig_filename:
                raise BadZipfile, \
                        'File name in directory "%s" and header "%s" differ.' % (
                            zinfo.orig_filename, fname)

            # check for encrypted flag & handle password
            is_encrypted = zinfo.flag_bits & 0x1
            zd = None
            if is_encrypted:
                if not pwd:
                    pwd = self.pwd
                if not pwd:
                    raise RuntimeError, "File %s is encrypted, " \
                        "password required for extraction" % name

                zd = _ZipDecrypter(pwd)
                # The first 12 bytes in the cypher stream is an encryption header
                #  used to strengthen the algorithm. The first 11 bytes are
                #  completely random, while the 12th contains the MSB of the CRC,
                #  or the MSB of the file time depending on the header type
                #  and is used to check the correctness of the password.
                bytes = zef_file.read(12)
                h = map(zd, bytes[0:12])
                if zinfo.flag_bits & 0x8:
                    # compare against the file type from extended local headers
                    check_byte = (zinfo._raw_time >> 8) & 0xff
                else:
                    # compare against the CRC otherwise
                    check_byte = (zinfo.CRC >> 24) & 0xff
                if ord(h[11]) != check_byte:
                    raise RuntimeError("Bad password for file", name)

            return ZipExtFile(zef_file, mode, zinfo, zd,
                    close_fileobj=should_close)
        except:
            if should_close:
                zef_file.close()
            raise

    def extract(self, member, path=None, pwd=None):
        """Extract a member from the archive to the current working directory,
           using its full name. Its file information is extracted as accurately
           as possible. `member' may be a filename or a ZipInfo object. You can
           specify a different directory using `path'.
        """
        if not isinstance(member, ZipInfo):
            member = self.getinfo(member)

        if path is None:
            path = os.getcwd()

        return self._extract_member(member, path, pwd)

    def extractall(self, path=None, members=None, pwd=None):
        """Extract all members from the archive to the current working
           directory. `path' specifies a different directory to extract to.
           `members' is optional and must be a subset of the list returned
           by namelist().
        """
        if members is None:
            members = self.namelist()

        for zipinfo in members:
            self.extract(zipinfo, path, pwd)

    def _extract_member(self, member, targetpath, pwd):
        """Extract the ZipInfo object 'member' to a physical
           file on the path targetpath.
        """
        # build the destination pathname, replacing
        # forward slashes to platform specific separators.
        arcname = member.filename.replace('/', os.path.sep)

        if os.path.altsep:
            arcname = arcname.replace(os.path.altsep, os.path.sep)
        # interpret absolute pathname as relative, remove drive letter or
        # UNC path, redundant separators, "." and ".." components.
        arcname = os.path.splitdrive(arcname)[1]
        arcname = os.path.sep.join(x for x in arcname.split(os.path.sep)
                    if x not in ('', os.path.curdir, os.path.pardir))
        if os.path.sep == '\\':
            # filter illegal characters on Windows
            illegal = ':<>|"?*'
            if isinstance(arcname, unicode):
                table = {ord(c): ord('_') for c in illegal}
            else:
                table = string.maketrans(illegal, '_' * len(illegal))
            arcname = arcname.translate(table)
            # remove trailing dots
            arcname = (x.rstrip('.') for x in arcname.split(os.path.sep))
            arcname = os.path.sep.join(x for x in arcname if x)

        targetpath = os.path.join(targetpath, arcname)
        targetpath = os.path.normpath(targetpath)

        # Create all upper directories if necessary.
        upperdirs = os.path.dirname(targetpath)
        if upperdirs and not os.path.exists(upperdirs):
            os.makedirs(upperdirs)

        if member.filename[-1] == '/':
            if not os.path.isdir(targetpath):
                os.mkdir(targetpath)
            return targetpath

        with self.open(member, pwd=pwd) as source, \
             file(targetpath, "wb") as target:
            shutil.copyfileobj(source, target)

        return targetpath

    def _writecheck(self, zinfo):
        """Check for errors before writing a file to the archive."""
        if zinfo.filename in self.NameToInfo:
            import warnings
            warnings.warn('Duplicate name: %r' % zinfo.filename, stacklevel=3)
        if self.mode not in ("w", "a"):
            raise RuntimeError, 'write() requires mode "w" or "a"'
        if not self.fp:
            raise RuntimeError, \
                  "Attempt to write ZIP archive that was already closed"
        if zinfo.compress_type == ZIP_DEFLATED and not zlib:
            raise RuntimeError, \
                  "Compression requires the (missing) zlib module"
        if zinfo.compress_type not in (ZIP_STORED, ZIP_DEFLATED):
            raise RuntimeError, \
                  "That compression method is not supported"
        if not self._allowZip64:
            requires_zip64 = None
            if len(self.filelist) >= ZIP_FILECOUNT_LIMIT:
                requires_zip64 = "Files count"
            elif zinfo.file_size > ZIP64_LIMIT:
                requires_zip64 = "Filesize"
            elif zinfo.header_offset > ZIP64_LIMIT:
                requires_zip64 = "Zipfile size"
            if requires_zip64:
                raise LargeZipFile(requires_zip64 +
                                   " would require ZIP64 extensions")

    def write(self, filename, arcname=None, compress_type=None):
        """Put the bytes from filename into the archive under the name
        arcname."""
        if not self.fp:
            raise RuntimeError(
                  "Attempt to write to ZIP archive that was already closed")

        st = os.stat(filename)
        isdir = stat.S_ISDIR(st.st_mode)
        mtime = time.localtime(st.st_mtime)
        date_time = mtime[0:6]
        # Create ZipInfo instance to store file information
        if arcname is None:
            arcname = filename
        arcname = os.path.normpath(os.path.splitdrive(arcname)[1])
        while arcname[0] in (os.sep, os.altsep):
            arcname = arcname[1:]
        if isdir:
            arcname += '/'
        zinfo = ZipInfo(arcname, date_time)
        zinfo.external_attr = (st[0] & 0xFFFF) << 16L      # Unix attributes
        if compress_type is None:
            zinfo.compress_type = self.compression
        else:
            zinfo.compress_type = compress_type

        zinfo.file_size = st.st_size
        zinfo.flag_bits = 0x00
        zinfo.header_offset = self.fp.tell()    # Start of header bytes

        self._writecheck(zinfo)
        self._didModify = True

        if isdir:
            zinfo.file_size = 0
            zinfo.compress_size = 0
            zinfo.CRC = 0
            zinfo.external_attr |= 0x10  # MS-DOS directory flag
            self.filelist.append(zinfo)
            self.NameToInfo[zinfo.filename] = zinfo
            self.fp.write(zinfo.FileHeader(False))
            return

        with open(filename, "rb") as fp:
            # Must overwrite CRC and sizes with correct data later
            zinfo.CRC = CRC = 0
            zinfo.compress_size = compress_size = 0
            # Compressed size can be larger than uncompressed size
            zip64 = self._allowZip64 and \
                    zinfo.file_size * 1.05 > ZIP64_LIMIT
            self.fp.write(zinfo.FileHeader(zip64))
            if zinfo.compress_type == ZIP_DEFLATED:
                cmpr = zlib.compressobj(zlib.Z_DEFAULT_COMPRESSION,
                     zlib.DEFLATED, -15)
            else:
                cmpr = None
            file_size = 0
            while 1:
                buf = fp.read(1024 * 8)
                if not buf:
                    break
                file_size = file_size + len(buf)
                CRC = crc32(buf, CRC) & 0xffffffff
                if cmpr:
                    buf = cmpr.compress(buf)
                    compress_size = compress_size + len(buf)
                self.fp.write(buf)
        if cmpr:
            buf = cmpr.flush()
            compress_size = compress_size + len(buf)
            self.fp.write(buf)
            zinfo.compress_size = compress_size
        else:
            zinfo.compress_size = file_size
        zinfo.CRC = CRC
        zinfo.file_size = file_size
        if not zip64 and self._allowZip64:
            if file_size > ZIP64_LIMIT:
                raise RuntimeError('File size has increased during compressing')
            if compress_size > ZIP64_LIMIT:
                raise RuntimeError('Compressed size larger than uncompressed size')
        # Seek backwards and write file header (which will now include
        # correct CRC and file sizes)
        position = self.fp.tell()       # Preserve current position in file
        self.fp.seek(zinfo.header_offset, 0)
        self.fp.write(zinfo.FileHeader(zip64))
        self.fp.seek(position, 0)
        self.filelist.append(zinfo)
        self.NameToInfo[zinfo.filename] = zinfo

    def writestr(self, zinfo_or_arcname, bytes, compress_type=None):
        """Write a file into the archive.  The contents is the string
        'bytes'.  'zinfo_or_arcname' is either a ZipInfo instance or
        the name of the file in the archive."""
        if not isinstance(zinfo_or_arcname, ZipInfo):
            zinfo = ZipInfo(filename=zinfo_or_arcname,
                            date_time=time.localtime(time.time())[:6])

            zinfo.compress_type = self.compression
            if zinfo.filename[-1] == '/':
                zinfo.external_attr = 0o40775 << 16   # drwxrwxr-x
                zinfo.external_attr |= 0x10           # MS-DOS directory flag
            else:
                zinfo.external_attr = 0o600 << 16     # ?rw-------
        else:
            zinfo = zinfo_or_arcname

        if not self.fp:
            raise RuntimeError(
                  "Attempt to write to ZIP archive that was already closed")

        if compress_type is not None:
            zinfo.compress_type = compress_type

        zinfo.file_size = len(bytes)            # Uncompressed size
        zinfo.header_offset = self.fp.tell()    # Start of header bytes
        self._writecheck(zinfo)
        self._didModify = True
        zinfo.CRC = crc32(bytes) & 0xffffffff       # CRC-32 checksum
        if zinfo.compress_type == ZIP_DEFLATED:
            co = zlib.compressobj(zlib.Z_DEFAULT_COMPRESSION,
                 zlib.DEFLATED, -15)
            bytes = co.compress(bytes) + co.flush()
            zinfo.compress_size = len(bytes)    # Compressed size
        else:
            zinfo.compress_size = zinfo.file_size
        zip64 = zinfo.file_size > ZIP64_LIMIT or \
                zinfo.compress_size > ZIP64_LIMIT
        if zip64 and not self._allowZip64:
            raise LargeZipFile("Filesize would require ZIP64 extensions")
        self.fp.write(zinfo.FileHeader(zip64))
        self.fp.write(bytes)
        if zinfo.flag_bits & 0x08:
            # Write CRC and file sizes after the file data
            fmt = '<LQQ' if zip64 else '<LLL'
            self.fp.write(struct.pack(fmt, zinfo.CRC, zinfo.compress_size,
                  zinfo.file_size))
        self.fp.flush()
        self.filelist.append(zinfo)
        self.NameToInfo[zinfo.filename] = zinfo

    def __del__(self):
        """Call the "close()" method in case the user forgot."""
        self.close()

    def close(self):
        """Close the file, and for mode "w" and "a" write the ending
        records."""
        if self.fp is None:
            return

        try:
            if self.mode in ("w", "a") and self._didModify: # write ending records
                pos1 = self.fp.tell()
                for zinfo in self.filelist:         # write central directory
                    dt = zinfo.date_time
                    dosdate = (dt[0] - 1980) << 9 | dt[1] << 5 | dt[2]
                    dostime = dt[3] << 11 | dt[4] << 5 | (dt[5] // 2)
                    extra = []
                    if zinfo.file_size > ZIP64_LIMIT \
                            or zinfo.compress_size > ZIP64_LIMIT:
                        extra.append(zinfo.file_size)
                        extra.append(zinfo.compress_size)
                        file_size = 0xffffffff
                        compress_size = 0xffffffff
                    else:
                        file_size = zinfo.file_size
                        compress_size = zinfo.compress_size

                    if zinfo.header_offset > ZIP64_LIMIT:
                        extra.append(zinfo.header_offset)
                        header_offset = 0xffffffffL
                    else:
                        header_offset = zinfo.header_offset

                    extra_data = zinfo.extra
                    if extra:
                        # Append a ZIP64 field to the extra's
                        extra_data = struct.pack(
                                '<HH' + 'Q'*len(extra),
                                1, 8*len(extra), *extra) + extra_data

                        extract_version = max(45, zinfo.extract_version)
                        create_version = max(45, zinfo.create_version)
                    else:
                        extract_version = zinfo.extract_version
                        create_version = zinfo.create_version

                    try:
                        filename, flag_bits = zinfo._encodeFilenameFlags()
                        centdir = struct.pack(structCentralDir,
                        stringCentralDir, create_version,
                        zinfo.create_system, extract_version, zinfo.reserved,
                        flag_bits, zinfo.compress_type, dostime, dosdate,
                        zinfo.CRC, compress_size, file_size,
                        len(filename), len(extra_data), len(zinfo.comment),
                        0, zinfo.internal_attr, zinfo.external_attr,
                        header_offset)
                    except DeprecationWarning:
                        print >>sys.stderr, (structCentralDir,
                        stringCentralDir, create_version,
                        zinfo.create_system, extract_version, zinfo.reserved,
                        zinfo.flag_bits, zinfo.compress_type, dostime, dosdate,
                        zinfo.CRC, compress_size, file_size,
                        len(zinfo.filename), len(extra_data), len(zinfo.comment),
                        0, zinfo.internal_attr, zinfo.external_attr,
                        header_offset)
                        raise
                    self.fp.write(centdir)
                    self.fp.write(filename)
                    self.fp.write(extra_data)
                    self.fp.write(zinfo.comment)

                pos2 = self.fp.tell()
                # Write end-of-zip-archive record
                centDirCount = len(self.filelist)
                centDirSize = pos2 - pos1
                centDirOffset = pos1
                requires_zip64 = None
                if centDirCount > ZIP_FILECOUNT_LIMIT:
                    requires_zip64 = "Files count"
                elif centDirOffset > ZIP64_LIMIT:
                    requires_zip64 = "Central directory offset"
                elif centDirSize > ZIP64_LIMIT:
                    requires_zip64 = "Central directory size"
                if requires_zip64:
                    # Need to write the ZIP64 end-of-archive records
                    if not self._allowZip64:
                        raise LargeZipFile(requires_zip64 +
                                           " would require ZIP64 extensions")
                    zip64endrec = struct.pack(
                            structEndArchive64, stringEndArchive64,
                            44, 45, 45, 0, 0, centDirCount, centDirCount,
                            centDirSize, centDirOffset)
                    self.fp.write(zip64endrec)

                    zip64locrec = struct.pack(
                            structEndArchive64Locator,
                            stringEndArchive64Locator, 0, pos2, 1)
                    self.fp.write(zip64locrec)
                    centDirCount = min(centDirCount, 0xFFFF)
                    centDirSize = min(centDirSize, 0xFFFFFFFF)
                    centDirOffset = min(centDirOffset, 0xFFFFFFFF)

                endrec = struct.pack(structEndArchive, stringEndArchive,
                                    0, 0, centDirCount, centDirCount,
                                    centDirSize, centDirOffset, len(self._comment))
                self.fp.write(endrec)
                self.fp.write(self._comment)
                self.fp.flush()
        finally:
            fp = self.fp
            self.fp = None
            if not self._filePassed:
                fp.close()

ZipFile source (Python 2)
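Several of the public methods in the listing (`writestr`, `namelist`, `getinfo`, `read`, `testzip`) can be tried without touching the disk, because the constructor also accepts a file-like object. The following is a small sketch for Python 3, where `read()` returns `bytes`. The member names are invented for the example.

```python
import io
import zipfile

buf = io.BytesIO()                            # file-like object instead of a path

with zipfile.ZipFile(buf, "w") as z:
    z.writestr("notes/readme.txt", "hello zip")   # write a member from memory
    z.writestr("empty.txt", "")

with zipfile.ZipFile(buf, "r") as z:
    names = z.namelist()                      # member names, in archive order
    size = z.getinfo("notes/readme.txt").file_size
    text = z.read("notes/readme.txt").decode("utf-8")
    bad = z.testzip()                         # None when every CRC checks out
    try:
        z.getinfo("missing.txt")              # unknown names raise KeyError
        missing_raises = False
    except KeyError:
        missing_raises = True

print(names, size, text, bad, missing_raises)
```

Since the archive was given as a file object, closing the `ZipFile` leaves `buf` open, which matches the `_filePassed` check in `close()` above.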

  1 class TarFile(object):
  2     """The TarFile Class provides an interface to tar archives.
  3     """
  4 
  5     debug = 0                   # May be set from 0 (no msgs) to 3 (all msgs)
  6 
  7     dereference = False         # If true, add content of linked file to the
  8                                 # tar file, else the link.
  9 
 10     ignore_zeros = False        # If true, skips empty or invalid blocks and
 11                                 # continues processing.
 12 
 13     errorlevel = 1              # If 0, fatal errors only appear in debug
 14                                 # messages (if debug >= 0). If > 0, errors
 15                                 # are passed to the caller as exceptions.
 16 
 17     format = DEFAULT_FORMAT     # The format to use when creating an archive.
 18 
 19     encoding = ENCODING         # Encoding for 8-bit character strings.
 20 
 21     errors = None               # Error handler for unicode conversion.
 22 
 23     tarinfo = TarInfo           # The default TarInfo class to use.
 24 
 25     fileobject = ExFileObject   # The default ExFileObject class to use.
 26 
    def __init__(self, name=None, mode="r", fileobj=None, format=None,
            tarinfo=None, dereference=None, ignore_zeros=None, encoding=None,
            errors=None, pax_headers=None, debug=None, errorlevel=None):
        """Open an (uncompressed) tar archive `name'. `mode' is either 'r' to
           read from an existing archive, 'a' to append data to an existing
           file or 'w' to create a new file overwriting an existing one. `mode'
           defaults to 'r'.
           If `fileobj' is given, it is used for reading or writing data. If it
           can be determined, `mode' is overridden by `fileobj's mode.
           `fileobj' is not closed, when TarFile is closed.
        """
        modes = {"r": "rb", "a": "r+b", "w": "wb"}
        if mode not in modes:
            raise ValueError("mode must be 'r', 'a' or 'w'")
        self.mode = mode
        self._mode = modes[mode]

        if not fileobj:
            if self.mode == "a" and not os.path.exists(name):
                # Create nonexistent files in append mode.
                self.mode = "w"
                self._mode = "wb"
            fileobj = bltn_open(name, self._mode)
            self._extfileobj = False
        else:
            if name is None and hasattr(fileobj, "name"):
                name = fileobj.name
            if hasattr(fileobj, "mode"):
                self._mode = fileobj.mode
            self._extfileobj = True
        self.name = os.path.abspath(name) if name else None
        self.fileobj = fileobj

        # Init attributes.
        if format is not None:
            self.format = format
        if tarinfo is not None:
            self.tarinfo = tarinfo
        if dereference is not None:
            self.dereference = dereference
        if ignore_zeros is not None:
            self.ignore_zeros = ignore_zeros
        if encoding is not None:
            self.encoding = encoding

        if errors is not None:
            self.errors = errors
        elif mode == "r":
            self.errors = "utf-8"
        else:
            self.errors = "strict"

        if pax_headers is not None and self.format == PAX_FORMAT:
            self.pax_headers = pax_headers
        else:
            self.pax_headers = {}

        if debug is not None:
            self.debug = debug
        if errorlevel is not None:
            self.errorlevel = errorlevel

        # Init datastructures.
        self.closed = False
        self.members = []       # list of members as TarInfo objects
        self._loaded = False    # flag if all members have been read
        self.offset = self.fileobj.tell()
                                # current position in the archive file
        self.inodes = {}        # dictionary caching the inodes of
                                # archive members already added

        try:
            if self.mode == "r":
                self.firstmember = None
                self.firstmember = self.next()

            if self.mode == "a":
                # Move to the end of the archive,
                # before the first empty block.
                while True:
                    self.fileobj.seek(self.offset)
                    try:
                        tarinfo = self.tarinfo.fromtarfile(self)
                        self.members.append(tarinfo)
                    except EOFHeaderError:
                        self.fileobj.seek(self.offset)
                        break
                    except HeaderError, e:
                        raise ReadError(str(e))

            if self.mode in "aw":
                self._loaded = True

                if self.pax_headers:
                    buf = self.tarinfo.create_pax_global_header(self.pax_headers.copy())
                    self.fileobj.write(buf)
                    self.offset += len(buf)
        except:
            if not self._extfileobj:
                self.fileobj.close()
            self.closed = True
            raise

    def _getposix(self):
        return self.format == USTAR_FORMAT
    def _setposix(self, value):
        import warnings
        warnings.warn("use the format attribute instead", DeprecationWarning,
                      2)
        if value:
            self.format = USTAR_FORMAT
        else:
            self.format = GNU_FORMAT
    posix = property(_getposix, _setposix)

    #--------------------------------------------------------------------------
    # Below are the classmethods which act as alternate constructors to the
    # TarFile class. The open() method is the only one that is needed for
    # public use; it is the "super"-constructor and is able to select an
    # adequate "sub"-constructor for a particular compression using the mapping
    # from OPEN_METH.
    #
    # This concept allows one to subclass TarFile without losing the comfort of
    # the super-constructor. A sub-constructor is registered and made available
    # by adding it to the mapping in OPEN_METH.

    @classmethod
    def open(cls, name=None, mode="r", fileobj=None, bufsize=RECORDSIZE, **kwargs):
        """Open a tar archive for reading, writing or appending. Return
           an appropriate TarFile class.

           mode:
           'r' or 'r:*' open for reading with transparent compression
           'r:'         open for reading exclusively uncompressed
           'r:gz'       open for reading with gzip compression
           'r:bz2'      open for reading with bzip2 compression
           'a' or 'a:'  open for appending, creating the file if necessary
           'w' or 'w:'  open for writing without compression
           'w:gz'       open for writing with gzip compression
           'w:bz2'      open for writing with bzip2 compression

           'r|*'        open a stream of tar blocks with transparent compression
           'r|'         open an uncompressed stream of tar blocks for reading
           'r|gz'       open a gzip compressed stream of tar blocks
           'r|bz2'      open a bzip2 compressed stream of tar blocks
           'w|'         open an uncompressed stream for writing
           'w|gz'       open a gzip compressed stream for writing
           'w|bz2'      open a bzip2 compressed stream for writing
        """

        if not name and not fileobj:
            raise ValueError("nothing to open")

        if mode in ("r", "r:*"):
            # Find out which *open() is appropriate for opening the file.
            for comptype in cls.OPEN_METH:
                func = getattr(cls, cls.OPEN_METH[comptype])
                if fileobj is not None:
                    saved_pos = fileobj.tell()
                try:
                    return func(name, "r", fileobj, **kwargs)
                except (ReadError, CompressionError), e:
                    if fileobj is not None:
                        fileobj.seek(saved_pos)
                    continue
            raise ReadError("file could not be opened successfully")

        elif ":" in mode:
            filemode, comptype = mode.split(":", 1)
            filemode = filemode or "r"
            comptype = comptype or "tar"

            # Select the *open() function according to
            # given compression.
            if comptype in cls.OPEN_METH:
                func = getattr(cls, cls.OPEN_METH[comptype])
            else:
                raise CompressionError("unknown compression type %r" % comptype)
            return func(name, filemode, fileobj, **kwargs)

        elif "|" in mode:
            filemode, comptype = mode.split("|", 1)
            filemode = filemode or "r"
            comptype = comptype or "tar"

            if filemode not in ("r", "w"):
                raise ValueError("mode must be 'r' or 'w'")

            stream = _Stream(name, filemode, comptype, fileobj, bufsize)
            try:
                t = cls(name, filemode, stream, **kwargs)
            except:
                stream.close()
                raise
            t._extfileobj = False
            return t

        elif mode in ("a", "w"):
            return cls.taropen(name, mode, fileobj, **kwargs)

        raise ValueError("undiscernible mode")

    @classmethod
    def taropen(cls, name, mode="r", fileobj=None, **kwargs):
        """Open uncompressed tar archive name for reading or writing.
        """
        if mode not in ("r", "a", "w"):
            raise ValueError("mode must be 'r', 'a' or 'w'")
        return cls(name, mode, fileobj, **kwargs)

    @classmethod
    def gzopen(cls, name, mode="r", fileobj=None, compresslevel=9, **kwargs):
        """Open gzip compressed tar archive name for reading or writing.
           Appending is not allowed.
        """
        if mode not in ("r", "w"):
            raise ValueError("mode must be 'r' or 'w'")

        try:
            import gzip
            gzip.GzipFile
        except (ImportError, AttributeError):
            raise CompressionError("gzip module is not available")

        try:
            fileobj = gzip.GzipFile(name, mode, compresslevel, fileobj)
        except OSError:
            if fileobj is not None and mode == 'r':
                raise ReadError("not a gzip file")
            raise

        try:
            t = cls.taropen(name, mode, fileobj, **kwargs)
        except IOError:
            fileobj.close()
            if mode == 'r':
                raise ReadError("not a gzip file")
            raise
        except:
            fileobj.close()
            raise
        t._extfileobj = False
        return t

    @classmethod
    def bz2open(cls, name, mode="r", fileobj=None, compresslevel=9, **kwargs):
        """Open bzip2 compressed tar archive name for reading or writing.
           Appending is not allowed.
        """
        if mode not in ("r", "w"):
            raise ValueError("mode must be 'r' or 'w'.")

        try:
            import bz2
        except ImportError:
            raise CompressionError("bz2 module is not available")

        if fileobj is not None:
            fileobj = _BZ2Proxy(fileobj, mode)
        else:
            fileobj = bz2.BZ2File(name, mode, compresslevel=compresslevel)

        try:
            t = cls.taropen(name, mode, fileobj, **kwargs)
        except (IOError, EOFError):
            fileobj.close()
            if mode == 'r':
                raise ReadError("not a bzip2 file")
            raise
        except:
            fileobj.close()
            raise
        t._extfileobj = False
        return t

    # All *open() methods are registered here.
    OPEN_METH = {
        "tar": "taropen",   # uncompressed tar
        "gz":  "gzopen",    # gzip compressed tar
        "bz2": "bz2open"    # bzip2 compressed tar
    }

    #--------------------------------------------------------------------------
    # The public methods which TarFile provides:

    def close(self):
        """Close the TarFile. In write-mode, two finishing zero blocks are
           appended to the archive.
        """
        if self.closed:
            return

        if self.mode in "aw":
            self.fileobj.write(NUL * (BLOCKSIZE * 2))
            self.offset += (BLOCKSIZE * 2)
            # fill up the end with zero-blocks
            # (like option -b20 for tar does)
            blocks, remainder = divmod(self.offset, RECORDSIZE)
            if remainder > 0:
                self.fileobj.write(NUL * (RECORDSIZE - remainder))

        if not self._extfileobj:
            self.fileobj.close()
        self.closed = True

    def getmember(self, name):
        """Return a TarInfo object for member `name'. If `name' can not be
           found in the archive, KeyError is raised. If a member occurs more
           than once in the archive, its last occurrence is assumed to be the
           most up-to-date version.
        """
        tarinfo = self._getmember(name)
        if tarinfo is None:
            raise KeyError("filename %r not found" % name)
        return tarinfo

    def getmembers(self):
        """Return the members of the archive as a list of TarInfo objects. The
           list has the same order as the members in the archive.
        """
        self._check()
        if not self._loaded:    # if we want to obtain a list of
            self._load()        # all members, we first have to
                                # scan the whole archive.
        return self.members

    def getnames(self):
        """Return the members of the archive as a list of their names. It has
           the same order as the list returned by getmembers().
        """
        return [tarinfo.name for tarinfo in self.getmembers()]

    def gettarinfo(self, name=None, arcname=None, fileobj=None):
        """Create a TarInfo object for either the file `name' or the file
           object `fileobj' (using os.fstat on its file descriptor). You can
           modify some of the TarInfo's attributes before you add it using
           addfile(). If given, `arcname' specifies an alternative name for the
           file in the archive.
        """
        self._check("aw")

        # When fileobj is given, replace name by
        # fileobj's real name.
        if fileobj is not None:
            name = fileobj.name

        # Building the name of the member in the archive.
        # Backward slashes are converted to forward slashes,
        # Absolute paths are turned to relative paths.
        if arcname is None:
            arcname = name
        drv, arcname = os.path.splitdrive(arcname)
        arcname = arcname.replace(os.sep, "/")
        arcname = arcname.lstrip("/")

        # Now, fill the TarInfo object with
        # information specific for the file.
        tarinfo = self.tarinfo()
        tarinfo.tarfile = self

        # Use os.stat or os.lstat, depending on platform
        # and if symlinks shall be resolved.
        if fileobj is None:
            if hasattr(os, "lstat") and not self.dereference:
                statres = os.lstat(name)
            else:
                statres = os.stat(name)
        else:
            statres = os.fstat(fileobj.fileno())
        linkname = ""

        stmd = statres.st_mode
        if stat.S_ISREG(stmd):
            inode = (statres.st_ino, statres.st_dev)
            if not self.dereference and statres.st_nlink > 1 and \
                    inode in self.inodes and arcname != self.inodes[inode]:
                # Is it a hardlink to an already
                # archived file?
                type = LNKTYPE
                linkname = self.inodes[inode]
            else:
                # The inode is added only if its valid.
                # For win32 it is always 0.
                type = REGTYPE
                if inode[0]:
                    self.inodes[inode] = arcname
        elif stat.S_ISDIR(stmd):
            type = DIRTYPE
        elif stat.S_ISFIFO(stmd):
            type = FIFOTYPE
        elif stat.S_ISLNK(stmd):
            type = SYMTYPE
            linkname = os.readlink(name)
        elif stat.S_ISCHR(stmd):
            type = CHRTYPE
        elif stat.S_ISBLK(stmd):
            type = BLKTYPE
        else:
            return None

        # Fill the TarInfo object with all
        # information we can get.
        tarinfo.name = arcname
        tarinfo.mode = stmd
        tarinfo.uid = statres.st_uid
        tarinfo.gid = statres.st_gid
        if type == REGTYPE:
            tarinfo.size = statres.st_size
        else:
            tarinfo.size = 0L
        tarinfo.mtime = statres.st_mtime
        tarinfo.type = type
        tarinfo.linkname = linkname
        if pwd:
            try:
                tarinfo.uname = pwd.getpwuid(tarinfo.uid)[0]
            except KeyError:
                pass
        if grp:
            try:
                tarinfo.gname = grp.getgrgid(tarinfo.gid)[0]
            except KeyError:
                pass

        if type in (CHRTYPE, BLKTYPE):
            if hasattr(os, "major") and hasattr(os, "minor"):
                tarinfo.devmajor = os.major(statres.st_rdev)
                tarinfo.devminor = os.minor(statres.st_rdev)
        return tarinfo

    def list(self, verbose=True):
        """Print a table of contents to sys.stdout. If `verbose' is False, only
           the names of the members are printed. If it is True, an `ls -l'-like
           output is produced.
        """
        self._check()

        for tarinfo in self:
            if verbose:
                print filemode(tarinfo.mode),
                print "%s/%s" % (tarinfo.uname or tarinfo.uid,
                                 tarinfo.gname or tarinfo.gid),
                if tarinfo.ischr() or tarinfo.isblk():
                    print "%10s" % ("%d,%d" \
                                    % (tarinfo.devmajor, tarinfo.devminor)),
                else:
                    print "%10d" % tarinfo.size,
                print "%d-%02d-%02d %02d:%02d:%02d" \
                      % time.localtime(tarinfo.mtime)[:6],

            print tarinfo.name + ("/" if tarinfo.isdir() else ""),

            if verbose:
                if tarinfo.issym():
                    print "->", tarinfo.linkname,
                if tarinfo.islnk():
                    print "link to", tarinfo.linkname,
            print

    def add(self, name, arcname=None, recursive=True, exclude=None, filter=None):
        """Add the file `name' to the archive. `name' may be any type of file
           (directory, fifo, symbolic link, etc.). If given, `arcname'
           specifies an alternative name for the file in the archive.
           Directories are added recursively by default. This can be avoided by
           setting `recursive' to False. `exclude' is a function that should
           return True for each filename to be excluded. `filter' is a function
           that expects a TarInfo object argument and returns the changed
           TarInfo object, if it returns None the TarInfo object will be
           excluded from the archive.
        """
        self._check("aw")

        if arcname is None:
            arcname = name

        # Exclude pathnames.
        if exclude is not None:
            import warnings
            warnings.warn("use the filter argument instead",
                    DeprecationWarning, 2)
            if exclude(name):
                self._dbg(2, "tarfile: Excluded %r" % name)
                return

        # Skip if somebody tries to archive the archive...
        if self.name is not None and os.path.abspath(name) == self.name:
            self._dbg(2, "tarfile: Skipped %r" % name)
            return

        self._dbg(1, name)

        # Create a TarInfo object from the file.
        tarinfo = self.gettarinfo(name, arcname)

        if tarinfo is None:
            self._dbg(1, "tarfile: Unsupported type %r" % name)
            return

        # Change or exclude the TarInfo object.
        if filter is not None:
            tarinfo = filter(tarinfo)
            if tarinfo is None:
                self._dbg(2, "tarfile: Excluded %r" % name)
                return

        # Append the tar header and data to the archive.
        if tarinfo.isreg():
            with bltn_open(name, "rb") as f:
                self.addfile(tarinfo, f)

        elif tarinfo.isdir():
            self.addfile(tarinfo)
            if recursive:
                for f in os.listdir(name):
                    self.add(os.path.join(name, f), os.path.join(arcname, f),
                            recursive, exclude, filter)

        else:
            self.addfile(tarinfo)

    def addfile(self, tarinfo, fileobj=None):
        """Add the TarInfo object `tarinfo' to the archive. If `fileobj' is
           given, tarinfo.size bytes are read from it and added to the archive.
           You can create TarInfo objects using gettarinfo().
           On Windows platforms, `fileobj' should always be opened with mode
           'rb' to avoid irritation about the file size.
        """
        self._check("aw")

        tarinfo = copy.copy(tarinfo)

        buf = tarinfo.tobuf(self.format, self.encoding, self.errors)
        self.fileobj.write(buf)
        self.offset += len(buf)

        # If there's data to follow, append it.
        if fileobj is not None:
            copyfileobj(fileobj, self.fileobj, tarinfo.size)
            blocks, remainder = divmod(tarinfo.size, BLOCKSIZE)
            if remainder > 0:
                self.fileobj.write(NUL * (BLOCKSIZE - remainder))
                blocks += 1
            self.offset += blocks * BLOCKSIZE

        self.members.append(tarinfo)

    def extractall(self, path=".", members=None):
        """Extract all members from the archive to the current working
           directory and set owner, modification time and permissions on
           directories afterwards. `path' specifies a different directory
           to extract to. `members' is optional and must be a subset of the
           list returned by getmembers().
        """
        directories = []

        if members is None:
            members = self

        for tarinfo in members:
            if tarinfo.isdir():
                # Extract directories with a safe mode.
                directories.append(tarinfo)
                tarinfo = copy.copy(tarinfo)
                tarinfo.mode = 0700
            self.extract(tarinfo, path)

        # Reverse sort directories.
        directories.sort(key=operator.attrgetter('name'))
        directories.reverse()

        # Set correct owner, mtime and filemode on directories.
        for tarinfo in directories:
            dirpath = os.path.join(path, tarinfo.name)
            try:
                self.chown(tarinfo, dirpath)
                self.utime(tarinfo, dirpath)
                self.chmod(tarinfo, dirpath)
            except ExtractError, e:
                if self.errorlevel > 1:
                    raise
                else:
                    self._dbg(1, "tarfile: %s" % e)

    def extract(self, member, path=""):
        """Extract a member from the archive to the current working directory,
           using its full name. Its file information is extracted as accurately
           as possible. `member' may be a filename or a TarInfo object. You can
           specify a different directory using `path'.
        """
        self._check("r")

        if isinstance(member, basestring):
            tarinfo = self.getmember(member)
        else:
            tarinfo = member

        # Prepare the link target for makelink().
        if tarinfo.islnk():
            tarinfo._link_target = os.path.join(path, tarinfo.linkname)

        try:
            self._extract_member(tarinfo, os.path.join(path, tarinfo.name))
        except EnvironmentError, e:
            if self.errorlevel > 0:
                raise
            else:
                if e.filename is None:
                    self._dbg(1, "tarfile: %s" % e.strerror)
                else:
                    self._dbg(1, "tarfile: %s %r" % (e.strerror, e.filename))
        except ExtractError, e:
            if self.errorlevel > 1:
                raise
            else:
                self._dbg(1, "tarfile: %s" % e)

    def extractfile(self, member):
        """Extract a member from the archive as a file object. `member' may be
           a filename or a TarInfo object. If `member' is a regular file, a
           file-like object is returned. If `member' is a link, a file-like
           object is constructed from the link's target. If `member' is none of
           the above, None is returned.
           The file-like object is read-only and provides the following
           methods: read(), readline(), readlines(), seek() and tell()
        """
        self._check("r")

        if isinstance(member, basestring):
            tarinfo = self.getmember(member)
        else:
            tarinfo = member

        if tarinfo.isreg():
            return self.fileobject(self, tarinfo)

        elif tarinfo.type not in SUPPORTED_TYPES:
            # If a member's type is unknown, it is treated as a
            # regular file.
            return self.fileobject(self, tarinfo)

        elif tarinfo.islnk() or tarinfo.issym():
            if isinstance(self.fileobj, _Stream):
                # A small but ugly workaround for the case that someone tries
                # to extract a (sym)link as a file-object from a non-seekable
                # stream of tar blocks.
                raise StreamError("cannot extract (sym)link as file object")
            else:
                # A (sym)link's file object is its target's file object.
                return self.extractfile(self._find_link_target(tarinfo))
        else:
            # If there's no data associated with the member (directory, chrdev,
            # blkdev, etc.), return None instead of a file object.
            return None

    def _extract_member(self, tarinfo, targetpath):
        """Extract the TarInfo object tarinfo to a physical
           file called targetpath.
        """
        # Fetch the TarInfo object for the given name
        # and build the destination pathname, replacing
        # forward slashes to platform specific separators.
        targetpath = targetpath.rstrip("/")
        targetpath = targetpath.replace("/", os.sep)

        # Create all upper directories.
        upperdirs = os.path.dirname(targetpath)
        if upperdirs and not os.path.exists(upperdirs):
            # Create directories that are not part of the archive with
            # default permissions.
            os.makedirs(upperdirs)

        if tarinfo.islnk() or tarinfo.issym():
            self._dbg(1, "%s -> %s" % (tarinfo.name, tarinfo.linkname))
        else:
            self._dbg(1, tarinfo.name)

        if tarinfo.isreg():
            self.makefile(tarinfo, targetpath)
        elif tarinfo.isdir():
            self.makedir(tarinfo, targetpath)
        elif tarinfo.isfifo():
            self.makefifo(tarinfo, targetpath)
        elif tarinfo.ischr() or tarinfo.isblk():
            self.makedev(tarinfo, targetpath)
        elif tarinfo.islnk() or tarinfo.issym():
            self.makelink(tarinfo, targetpath)
        elif tarinfo.type not in SUPPORTED_TYPES:
            self.makeunknown(tarinfo, targetpath)
        else:
            self.makefile(tarinfo, targetpath)

        self.chown(tarinfo, targetpath)
        if not tarinfo.issym():
            self.chmod(tarinfo, targetpath)
            self.utime(tarinfo, targetpath)

    #--------------------------------------------------------------------------
    # Below are the different file methods. They are called via
    # _extract_member() when extract() is called. They can be replaced in a
    # subclass to implement other functionality.

    def makedir(self, tarinfo, targetpath):
        """Make a directory called targetpath.
        """
        try:
            # Use a safe mode for the directory, the real mode is set
            # later in _extract_member().
            os.mkdir(targetpath, 0700)
        except EnvironmentError, e:
            if e.errno != errno.EEXIST:
                raise

    def makefile(self, tarinfo, targetpath):
        """Make a file called targetpath.
        """
        source = self.extractfile(tarinfo)
        try:
            with bltn_open(targetpath, "wb") as target:
                copyfileobj(source, target)
        finally:
            source.close()

    def makeunknown(self, tarinfo, targetpath):
        """Make a file from a TarInfo object with an unknown type
           at targetpath.
        """
        self.makefile(tarinfo, targetpath)
        self._dbg(1, "tarfile: Unknown file type %r, " \
                     "extracted as regular file." % tarinfo.type)

    def makefifo(self, tarinfo, targetpath):
        """Make a fifo called targetpath.
        """
        if hasattr(os, "mkfifo"):
            os.mkfifo(targetpath)
        else:
            raise ExtractError("fifo not supported by system")

    def makedev(self, tarinfo, targetpath):
        """Make a character or block device called targetpath.
        """
        if not hasattr(os, "mknod") or not hasattr(os, "makedev"):
            raise ExtractError("special devices not supported by system")

        mode = tarinfo.mode
        if tarinfo.isblk():
            mode |= stat.S_IFBLK
        else:
            mode |= stat.S_IFCHR

        os.mknod(targetpath, mode,
                 os.makedev(tarinfo.devmajor, tarinfo.devminor))

    def makelink(self, tarinfo, targetpath):
        """Make a (symbolic) link called targetpath. If it cannot be created
          (platform limitation), we try to make a copy of the referenced file
          instead of a link.
        """
        if hasattr(os, "symlink") and hasattr(os, "link"):
            # For systems that support symbolic and hard links.
            if tarinfo.issym():
                if os.path.lexists(targetpath):
                    os.unlink(targetpath)
                os.symlink(tarinfo.linkname, targetpath)
            else:
                # See extract().
                if os.path.exists(tarinfo._link_target):
                    if os.path.lexists(targetpath):
                        os.unlink(targetpath)
                    os.link(tarinfo._link_target, targetpath)
                else:
                    self._extract_member(self._find_link_target(tarinfo), targetpath)
        else:
            try:
                self._extract_member(self._find_link_target(tarinfo), targetpath)
            except KeyError:
                raise ExtractError("unable to resolve link inside archive")

    def chown(self, tarinfo, targetpath):
        """Set owner of targetpath according to tarinfo.
        """
        if pwd and hasattr(os, "geteuid") and os.geteuid() == 0:
            # We have to be root to do so.
            try:
                g = grp.getgrnam(tarinfo.gname)[2]
            except KeyError:
                g = tarinfo.gid
            try:
                u = pwd.getpwnam(tarinfo.uname)[2]
            except KeyError:
                u = tarinfo.uid
            try:
                if tarinfo.issym() and hasattr(os, "lchown"):
                    os.lchown(targetpath, u, g)
                else:
                    if sys.platform != "os2emx":
                        os.chown(targetpath, u, g)
            except EnvironmentError, e:
                raise ExtractError("could not change owner")

    def chmod(self, tarinfo, targetpath):
        """Set file permissions of targetpath according to tarinfo.
        """
        if hasattr(os, 'chmod'):
            try:
                os.chmod(targetpath, tarinfo.mode)
            except EnvironmentError, e:
                raise ExtractError("could not change mode")

    def utime(self, tarinfo, targetpath):
        """Set modification time of targetpath according to tarinfo.
        """
        if not hasattr(os, 'utime'):
            return
        try:
            os.utime(targetpath, (tarinfo.mtime, tarinfo.mtime))
        except EnvironmentError, e:
            raise ExtractError("could not change modification time")

846     #--------------------------------------------------------------------------
847     def next(self):
848         """Return the next member of the archive as a TarInfo object, when
849            TarFile is opened for reading. Return None if there is no more
850            available.
851         """
852         self._check("ra")
853         if self.firstmember is not None:
854             m = self.firstmember
855             self.firstmember = None
856             return m
857 
858         # Read the next block.
859         self.fileobj.seek(self.offset)
860         tarinfo = None
861         while True:
862             try:
863                 tarinfo = self.tarinfo.fromtarfile(self)
864             except EOFHeaderError, e:
865                 if self.ignore_zeros:
866                     self._dbg(2, "0x%X: %s" % (self.offset, e))
867                     self.offset += BLOCKSIZE
868                     continue
869             except InvalidHeaderError, e:
870                 if self.ignore_zeros:
871                     self._dbg(2, "0x%X: %s" % (self.offset, e))
872                     self.offset += BLOCKSIZE
873                     continue
874                 elif self.offset == 0:
875                     raise ReadError(str(e))
876             except EmptyHeaderError:
877                 if self.offset == 0:
878                     raise ReadError("empty file")
879             except TruncatedHeaderError, e:
880                 if self.offset == 0:
881                     raise ReadError(str(e))
882             except SubsequentHeaderError, e:
883                 raise ReadError(str(e))
884             break
885 
886         if tarinfo is not None:
887             self.members.append(tarinfo)
888         else:
889             self._loaded = True
890 
891         return tarinfo
892 
893     #--------------------------------------------------------------------------
894     # Little helper methods:
895 
896     def _getmember(self, name, tarinfo=None, normalize=False):
897         """Find an archive member by name from bottom to top.
898            If tarinfo is given, it is used as the starting point.
899         """
900         # Ensure that all members have been loaded.
901         members = self.getmembers()
902 
903         # Limit the member search list up to tarinfo.
904         if tarinfo is not None:
905             members = members[:members.index(tarinfo)]
906 
907         if normalize:
908             name = os.path.normpath(name)
909 
910         for member in reversed(members):
911             if normalize:
912                 member_name = os.path.normpath(member.name)
913             else:
914                 member_name = member.name
915 
916             if name == member_name:
917                 return member
918 
919     def _load(self):
920         """Read through the entire archive file and look for readable
921            members.
922         """
923         while True:
924             tarinfo = self.next()
925             if tarinfo is None:
926                 break
927         self._loaded = True
928 
929     def _check(self, mode=None):
930         """Check if TarFile is still open, and if the operation's mode
931            corresponds to TarFile's mode.
932         """
933         if self.closed:
934             raise IOError("%s is closed" % self.__class__.__name__)
935         if mode is not None and self.mode not in mode:
936             raise IOError("bad operation for mode %r" % self.mode)
937 
938     def _find_link_target(self, tarinfo):
939         """Find the target member of a symlink or hardlink member in the
940            archive.
941         """
942         if tarinfo.issym():
943             # Always search the entire archive.
944             linkname = "/".join(filter(None, (os.path.dirname(tarinfo.name), tarinfo.linkname)))
945             limit = None
946         else:
947             # Search the archive before the link, because a hard link is
948             # just a reference to an already archived file.
949             linkname = tarinfo.linkname
950             limit = tarinfo
951 
952         member = self._getmember(linkname, tarinfo=limit, normalize=True)
953         if member is None:
954             raise KeyError("linkname %r not found" % linkname)
955         return member
956 
957     def __iter__(self):
958         """Provide an iterator object.
959         """
960         if self._loaded:
961             return iter(self.members)
962         else:
963             return TarIter(self)
964 
965     def _dbg(self, level, msg):
966         """Write debugging output to sys.stderr.
967         """
968         if level <= self.debug:
969             print >> sys.stderr, msg
970 
971     def __enter__(self):
972         self._check()
973         return self
974 
975     def __exit__(self, type, value, traceback):
976         if type is None:
977             self.close()
978         else:
979             # An exception occurred. We must not call close() because
980             # it would try to write end-of-archive blocks and padding.
981             if not self._extfileobj:
982                 self.fileobj.close()
983             self.closed = True
984 # class TarFile

TarFile source code

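(6) json and pickle modules: a file can hold only bytes or text, not arbitrary Python objects, so these two serialization modules are used to convert objects to and from a storable form.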
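
As a rough sketch of the difference between the two modules (the sample data below is made up for illustration): json produces text limited to basic types, while pickle produces bytes and can round-trip most Python objects.

```python
import json
import pickle

# Sample data (hypothetical) to serialize.
info = {"name": "solo", "age": 18, "tags": ["a", "b"]}

# json: text-based, readable from other languages, limited to basic types.
json_text = json.dumps(info)
restored_from_json = json.loads(json_text)

# pickle: binary, Python-only, handles most Python objects (tuples, sets, ...).
pickled = pickle.dumps({"point": (1, 2), "seen": {1, 2, 3}})
restored_from_pickle = pickle.loads(pickled)

print(type(json_text).__name__, restored_from_json == info)    # str True
print(type(pickled).__name__, restored_from_pickle["point"])   # bytes (1, 2)
```

Only unpickle data from a source you trust, since loading a pickle can execute arbitrary code.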

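(7) shelve module: shelve wraps pickle internally. It is a simple key/value store that persists in-memory data to a file, and it can persist any Python object that pickle supports (you can store values, read them back, and reassign them).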

#!/usr/bin/env python
# -*- coding:utf-8 -*-
#-Author-solo
import shelve

# Store data as key/value pairs
s = shelve.open("shelve_test")  # open (or create) the shelf file
tuple_data = (1, 2, 3, 4)       # renamed so the built-ins tuple and list are not shadowed
list_data = ['a', 'b', 'c', 'd']
info = {"name": "lzl", "age": 18}
s["tuple"] = tuple_data         # persist the tuple
s["list"] = list_data
s["info"] = info
s.close()

# Read values back by key
d = shelve.open("shelve_test")
print(d["tuple"])
print(d.get("list"))
print(d.get("info"))
# (1, 2, 3, 4)
# ['a', 'b', 'c', 'd']
# {'name': 'lzl', 'age': 18}
d.close()

# Loop over the keys
s = shelve.open("shelve_test")
for k in s.keys():
    print(k)
# list
# tuple
# info
# (the key order depends on the underlying dbm backend)
s.close()

# Update the value stored under a key
s = shelve.open("shelve_test")
s.update({"list": [22, 33]})    # same as s["list"] = [22, 33]
print(s["list"])
# [22, 33]
s.close()
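
One pitfall worth a small sketch: mutating a value fetched from a shelf in place is not saved unless the shelf is opened with writeback=True. The file name and the temporary directory below are assumptions made for this illustration only.

```python
import os
import shelve
import tempfile

# A throwaway directory keeps the example from leaving files behind.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "shelve_demo")   # hypothetical file name

    with shelve.open(path) as db:
        db["scores"] = [1, 2]
        db["scores"].append(3)      # mutates a temporary copy; nothing is saved

    with shelve.open(path) as db:
        without_writeback = db["scores"]

    with shelve.open(path, writeback=True) as db:
        db["scores"].append(3)      # the cached entry is written back on close

    with shelve.open(path) as db:
        with_writeback = db["scores"]

print(without_writeback)   # [1, 2]
print(with_writeback)      # [1, 2, 3]
```

writeback=True keeps every accessed entry in memory until the shelf is closed, so for large shelves plain reassignment (db["scores"] = new_list) is often the cheaper choice.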

(8) xml module: XML is a format for exchanging data between different languages or programs. It plays much the same role as JSON, although JSON is simpler to use. XML expresses the data structure with <> element nodes.

<?xml version="1.0"?>
<data>
    <country name="Liechtenstein">
        <rank updated="yes">2</rank>
        <year>2008</year>
        <gdppc>141100</gdppc>
        <neighbor name="Austria" direction="E"/>
        <neighbor name="Switzerland" direction="W"/>
    </country>
    <country name="Singapore">
        <rank updated="yes">5</rank>
        <year>2011</year>
        <gdppc>59900</gdppc>
        <neighbor name="Malaysia" direction="N"/>
    </country>
    <country name="Panama">
        <rank updated="yes">69</rank>
        <year>2011</year>
        <gdppc>13600</gdppc>
        <neighbor name="Costa Rica" direction="W"/>
        <neighbor name="Colombia" direction="E"/>
    </country>
</data>

File: xmltest.xml

import xml.etree.ElementTree as ET

tree = ET.parse("xmltest.xml")
root = tree.getroot()
print(root.tag)

# Walk the whole XML document
for child in root:
    print(child.tag, child.attrib)
    for i in child:
        print(i.tag, i.text)

# Visit only the year nodes
for node in root.iter('year'):
    print(node.tag, node.text)

# Modify
for node in root.iter('year'):
    new_year = int(node.text) + 1
    node.text = str(new_year)
    node.set("updated", "yes")
tree.write("xmltest.xml")

# Delete nodes
for country in root.findall('country'):
    rank = int(country.find('rank').text)
    if rank > 50:
        root.remove(country)
tree.write('output.xml')

########### Build an XML document from scratch
new_xml = ET.Element("namelist")
name = ET.SubElement(new_xml, "name", attrib={"enrolled": "yes"})
age = ET.SubElement(name, "age", attrib={"checked": "no"})
sex = ET.SubElement(name, "sex")
sex.text = '33'
name2 = ET.SubElement(new_xml, "name", attrib={"enrolled": "no"})
age = ET.SubElement(name2, "age")
age.text = '19'
et = ET.ElementTree(new_xml)  # build the document object
et.write("test.xml", encoding="utf-8", xml_declaration=True)
ET.dump(new_xml)  # print the generated XML

Operations

(9) configparser module: used to generate and modify configuration files (in practice a program rarely modifies its own config file).
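
The notes give no example for configparser, so here is a minimal sketch. The section and option names are invented for illustration, and an in-memory buffer stands in for a real .ini file.

```python
import configparser
import io

# Build a configuration in memory (section and option names are hypothetical).
config = configparser.ConfigParser()
config["DEFAULT"] = {"compression": "yes"}
config["server"] = {"host": "127.0.0.1", "port": "8080"}

# Write it out; a StringIO buffer is used here instead of a file on disk.
buffer = io.StringIO()
config.write(buffer)
ini_text = buffer.getvalue()
print(ini_text)

# Read it back and convert values to the types we need.
loaded = configparser.ConfigParser()
loaded.read_string(ini_text)
port = loaded.getint("server", "port")
compression = loaded.getboolean("server", "compression")  # inherited from DEFAULT
print(loaded.sections(), port, compression)   # ['server'] 8080 True
```

With a real file, config.write(open_file) and loaded.read("some.ini") take the place of the buffer calls.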

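(10) hashlib module: hashing (message digest) operations used in security-related code. In 3.x it replaces the md5 and sha modules and mainly provides the SHA1, SHA224, SHA256, SHA384, SHA512 and MD5 algorithms.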

import hashlib

m = hashlib.md5()
m.update(b"Hello")
m.update(b"It's me")
print(m.digest())
m.update(b"It's been a long time since last time we ...")
print(m.digest())           # digest as raw bytes
print(len(m.hexdigest()))   # length of the hexadecimal digest (32 for MD5)
'''
def digest(self, *args, **kwargs): # real signature unknown
    """ Return the digest value as a string of binary data. """
    pass
def hexdigest(self, *args, **kwargs): # real signature unknown
    """ Return the digest value as a string of hexadecimal digits. """
    pass
'''
# In Python 3, update() requires bytes, so the input is written as b'admin'.
# ######## md5 ########
hash_obj = hashlib.md5()
hash_obj.update(b'admin')
print(hash_obj.hexdigest())
# ######## sha1 ########
hash_obj = hashlib.sha1()
hash_obj.update(b'admin')
print(hash_obj.hexdigest())
# ######## sha256 ########
hash_obj = hashlib.sha256()
hash_obj.update(b'admin')
print(hash_obj.hexdigest())
# ######## sha384 ########
hash_obj = hashlib.sha384()
hash_obj.update(b'admin')
print(hash_obj.hexdigest())
# ######## sha512 ########
hash_obj = hashlib.sha512()
hash_obj.update(b'admin')
print(hash_obj.hexdigest())

hashlib

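import hashlib
import hmac
# Python 3: key and message must be bytes, and digestmod is required (Python 2 defaulted to MD5).
h = hmac.new(b'wueiqi', digestmod=hashlib.md5)
h.update(b'hellowo')
print(h.hexdigest())

hmac module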

(11) re module: Python's regular expression operations. It performs flexible pattern matching; the key is writing the right pattern.

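'.'     Matches any single character except \n by default; with the DOTALL flag it matches any character, including newlines
'^'     Matches the start of the string; with flags=re.MULTILINE it also matches after each newline, e.g. re.search(r"^a","\nabc\neee",flags=re.MULTILINE)
'$'     Matches the end of the string; re.search("foo$","bfoo\nsdfsf",flags=re.MULTILINE).group() also matches
'*'     Matches the preceding character 0 or more times; re.findall("ab*","cabb3abcbbac") gives ['abb', 'ab', 'a']
'+'     Matches the preceding character 1 or more times; re.findall("ab+","ab+cd+abb+bba") gives ['ab', 'abb']
'?'     Matches the preceding character 0 or 1 time
'{m}'   Matches the preceding character exactly m times
'{n,m}' Matches the preceding character n to m times; re.findall("ab{1,3}","abb abc abbcbbb") gives ['abb', 'ab', 'abb']
'|'     Matches the pattern on the left or on the right of |; re.search("abc|ABC","ABCBabcCD").group() gives 'ABC'
'(...)' Grouping; re.search("(abc){2}a(123|456)c", "abcabca456c").group() gives 'abcabca456c'
'[a-z]' Matches any single character from a to z
'[^()]' Matches any single character other than ( and )
'\A'    Matches only at the start of the string; re.search(r"\Aabc","alexabc") finds nothing
'\Z'    Matches at the end of the string (like $, but never before a trailing newline)
'\d'    Matches a digit 0-9
'\D'    Matches a non-digit
'\w'    Matches [A-Za-z0-9_] (plus other Unicode word characters for str patterns in Python 3)
'\W'    Matches any character that \w does not match
'\s'    Matches whitespace such as space, \t, \n, \r; re.search(r"\s+","ab\tc1\n3").group() gives '\t'
'(?P<name>...)' Named group; re.search("(?P<province>[0-9]{4})(?P<city>[0-9]{2})(?P<birthday>[0-9]{4})","371481199306143242").groupdict()
        gives {'province': '3714', 'city': '81', 'birthday': '1993'}

Regular expression syntax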

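① match: matches only at the beginning of the string

# match
import re
obj = re.match(r'\d+', '123uua123sf')     # match one or more digits starting at the first character
print(obj)
# <re.Match object; span=(0, 3), match='123'>   (repr on Python 3.7+)
if obj:                                   # runs only if something matched; obj is None otherwise
    print(obj.group())                    # print the matched text
# 123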

② search: scans the string and returns the first match, which need not start at position 0

# search
import re
obj = re.search(r'\d+', 'a123uu234asf')   # find the first run of one or more digits
print(obj)
# <re.Match object; span=(1, 4), match='123'>
if obj:                                   # runs only if something matched; obj is None otherwise
    print(obj.group())                    # print the matched text
# 123
obj = re.search(r'\([^()]+\)', 'sdds(a1fwewe2(3uusfdsf2)34as)f')   # match the innermost pair of parentheses
print(obj)
# <re.Match object; span=(13, 24), match='(3uusfdsf2)'>
if obj:
    print(obj.group())
# (3uusfdsf2)

③ The difference between group and groups

# group vs. groups
import re
a = "123abc456"
b = re.search("([0-9]*)([a-z]*)([0-9]*)", a)
print(b)
# <re.Match object; span=(0, 9), match='123abc456'>
print(b.group())      # the whole match
# 123abc456
print(b.group(0))     # same as group()
# 123abc456
print(b.group(1))     # first parenthesized group
# 123
print(b.group(2))
# abc
print(b.group(3))
# 456
print(b.groups())     # all groups as a tuple
# ('123', 'abc', '456')

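④ findall: the two functions above return a single match, that is, only the first occurrence in the string. To get every non-overlapping match, use findall. It returns a list of strings, so there is no group() to call.

# findall
import re
obj = re.findall(r'\d+', 'a123uu234asf')  # match every occurrence
if obj:                                   # runs only if the list is not empty
    print(obj)                            # the result is a list
# ['123', '234']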
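
When the position of each hit matters, re.finditer is the usual complement to findall. The sketch below reuses the sample string from the findall example; compiling the pattern first is optional and shown only because the pattern is reused.

```python
import re

text = "a123uu234asf"           # same sample string as the findall example
pattern = re.compile(r"\d+")    # compile once when a pattern is reused

# findall returns plain strings; finditer yields match objects,
# so the span of each hit is available as well.
hits = [(m.group(), m.span()) for m in pattern.finditer(text)]
print(hits)   # [('123', (1, 4)), ('234', (6, 9))]
```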

⑤ sub: replaces the text that matches the pattern

# sub
import re
content = "123abc456"
new_content = re.sub(r'\d+', 'ABC', content)
print(new_content)
# ABCabcABC

⑥ split: splits the string wherever the pattern matches

# split
import re
content = "1 - 2 * ((60-30+1*(9-2*5/3+7/3*99/4*2998+10*568/14))-(-4*3)/(16-3*2) )"
new_content = re.split(r'\*', content)      # split on *, producing a list
print(new_content)
# ['1 - 2 ', ' ((60-30+1', '(9-2', '5/3+7/3', '99/4', '2998+10', '568/14))-(-4', '3)/(16-3', '2) )']
content = "'1 - 2 * ((60-30+1*(9-2*5/3+7/3*99/4*2998+10*568/14))-(-4*3)/(16-3*2) )'"
new_content = re.split(r'[\+\-\*\/]+', content)
# new_content = re.split(r'\*', content, 1)
print(new_content)
# ["'1 ", ' 2 ', ' ((60', '30', '1', '(9', '2', '5', '3', '7', '3', '99', '4', '2998', '10', '568', '14))',
#  '(', '4', '3)', '(16', '3', "2) )'"]
inpp = '1-2*((60-30 +(-40-5)*(9-2*5/3 + 7 /3*99/4*2998 +10 * 568/14 )) - (-4*3)/ (16-3*2))'
inpp = re.sub(r'\s+', '', inpp)             # strip all whitespace
print(inpp)
new_content = re.split(r'\(([\+\-\*\/]?\d+[\+\-\*\/]?\d+){1}\)', inpp, 1)
print(new_content)
# ['1-2*((60-30+', '-40-5', '*(9-2*5/3+7/3*99/4*2998+10*568/14))-(-4*3)/(16-3*2))']

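(12) urllib module: provides functions for working with URLs, so a program can issue all kinds of HTTP requests. To make a program behave like a browser, the request has to be disguised as one: first observe the requests a browser sends, then copy its request headers. The User-Agent header is the one that identifies the browser.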

#!/usr/bin/env python
# -*- coding:utf-8 -*-
#-Author-solo
import urllib.request

def getdata():
    url = "http://www.baidu.com"
    data = urllib.request.urlopen(url).read()
    data = data.decode("utf-8")
    print(data)

getdata()

### The file-like object returned by urlopen supports close, read, readline and readlines
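
The User-Agent idea described above can be sketched as follows. The header value is an illustrative string rather than one captured from a real browser, and the actual network call is left commented out so the snippet runs offline.

```python
import urllib.request

url = "http://www.baidu.com"    # same URL as the example above
# Illustrative User-Agent value (an assumption, not a real browser's string).
headers = {"User-Agent": "Mozilla/5.0 (X11; Linux x86_64) ExampleBrowser/1.0"}

# A Request object carries the URL together with custom headers.
req = urllib.request.Request(url, headers=headers)
print(req.get_method(), req.full_url)
print(req.get_header("User-agent"))   # Request stores header names capitalized

# Sending the request needs network access, so it is left commented out:
# with urllib.request.urlopen(req, timeout=10) as resp:
#     html = resp.read().decode("utf-8")
```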

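12. Object-Oriented Programming

Procedural programming: functionality is built by stacking statements on top of one another, which makes the code hard to iterate on and maintain.
Function-based programming: a piece of functionality is wrapped in a function, and callers simply call it.
Object-oriented programming: "classes" and "objects" are used to build models that describe the real world. One reason to use it is that it makes programs easier to maintain and extend and can greatly improve development efficiency. In addition, object-oriented code is easier for other people to follow, which makes team development smoother.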

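# Classic class (Python 2 only: a class that does not inherit from object)
class A():
    def __init__(self):
        print("A")
class B(A):
    pass
class C(A):
    def __init__(self):
        print("C")
class D(B,C):
    pass
obj = D()
# A   (Python 2: depth-first lookup D -> B -> A finds A.__init__ first)

# New-style class
class A(object):
    def __init__(self):
        print("A")
class B(A):
    pass
class C(A):
    def __init__(self):
        print("C")
class D(B,C):
    pass
obj = D()
# C   (lookup order D -> B -> C -> A)
# In Python 3 every class is new-style, so both snippets print C there.

Classic vs. new-style classes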
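
The lookup order of a new-style class can be inspected directly through its __mro__ attribute. A small sketch under Python 3 (the method name who is invented for this illustration):

```python
class A:
    def who(self):
        return "A"

class B(A):
    pass

class C(A):
    def who(self):
        return "C"

class D(B, C):
    pass

# __mro__ lists the classes in the order attribute lookup visits them.
mro_names = [cls.__name__ for cls in D.__mro__]
print(mro_names)    # ['D', 'B', 'C', 'A', 'object']
print(D().who())    # C
```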

# Property methods
class Flight(object):
    def __init__(self, name):
        self.flight_name = name
    def checking_status(self):
        print("checking flight %s status " % self.flight_name)
        return 1
    @property
    def flight_status(self):
        status = self.checking_status()
        if status == 0:
            print("flight got canceled...")
        elif status == 1:
            print("flight has arrived...")
        elif status == 2:
            print("flight has already departed...")
        else:
            print("cannot confirm the flight status, please check later")
    @flight_status.setter  # setter: runs when the attribute is assigned
    def flight_status(self, status):
        status_dic = {
            0: "canceled",
            1: "arrived",
            2: "departed"
        }
        print("\033[31;1mHas changed the flight status to \033[0m", status_dic.get(status))
    @flight_status.deleter  # deleter: runs on del
    def flight_status(self):
        print("status got removed...")
f = Flight("CA980")
f.flight_status = 0  # triggers @flight_status.setter; only the setter body runs
del f.flight_status  # triggers @flight_status.deleter; only the deleter body runs

# Each operation triggers the matching decorator. The original getter is not run; only the code
# under that decorator executes. Put whatever should happen on assignment or deletion in that body.
# Here the decorators merely fire: nothing is actually stored or removed.

Flight status lookup
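
Since the example above only prints messages, here is a hedged variant in which the property actually stores, validates and resets a value. The backing attribute _status and the "reset on delete" behavior are choices made for this sketch, not part of the original notes.

```python
class Flight:
    # Status codes mirror the ones used above.
    STATUS_NAMES = {0: "canceled", 1: "arrived", 2: "departed"}

    def __init__(self, name):
        self.flight_name = name
        self._status = 1            # backing attribute read by the property

    @property
    def flight_status(self):
        return self.STATUS_NAMES[self._status]

    @flight_status.setter
    def flight_status(self, status):
        # Validate before storing, which a plain attribute could not do.
        if status not in self.STATUS_NAMES:
            raise ValueError("unknown status code: %r" % (status,))
        self._status = status

    @flight_status.deleter
    def flight_status(self):
        self._status = 1            # "deleting" resets to the default


f = Flight("CA980")
print(f.flight_status)    # arrived
f.flight_status = 0
print(f.flight_status)    # canceled
del f.flight_status
print(f.flight_status)    # arrived
```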

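Special member methods of a class:
① __doc__: the class's description (its docstring)

# __doc__
class Foo:
    """ Class description: a magic tool for watching videos """
    def func(self):
        pass
print(Foo.__doc__)
#  Class description: a magic tool for watching videos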

② __module__ and __class__
__module__ tells which module the current object was defined in
__class__ tells which class the current object belongs to

# __module__ and __class__
class Foo:
    """ Class description: a magic tool for watching videos """
    def func(self):
        pass
A = Foo()
print(A.__module__)
print(A.__class__)
# __main__
# <class '__main__.Foo'>

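③ __init__: constructor; runs automatically when an object is created from the class

④ __del__: destructor; runs automatically when the object is released from memory

⑤ __call__: runs when parentheses are placed after an object
Note: __init__ is triggered by creating an object, i.e. obj = ClassName(); __call__ is triggered by putting parentheses after an object, i.e. obj() or ClassName()()

# __call__
class Foo:
    def __init__(self):
        pass
    def __call__(self, *args, **kwargs):
        print('__call__')
obj = Foo()  # runs __init__
obj()        # runs __call__
# __call__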

⑥ __dict__: shows all members of a class or an object

class Province:
    country = 'China'
    def __init__(self, name, count):
        self.name = name
        self.count = count
    def func(self, *args, **kwargs):
        print('func')
# Members of the class: class-level (static) fields and methods
print(Province.__dict__)
# Output: {'__init__': <function Province.__init__ at 0x0054D588>, '__dict__': <attribute '__dict__' of 'Province' objects>,
#  '__doc__': None, 'func': <function Province.func at 0x0054D4B0>, '__weakref__': <attribute '__weakref__' of 'Province' objects>,
#  'country': 'China', '__module__': '__main__'}
obj1 = Province('HeBei', 10000)
print(obj1.__dict__)
# Members of the object obj1
# Output: {'count': 10000, 'name': 'HeBei'}

⑦ __str__: if a class defines __str__, printing an object outputs that method's return value

# __str__
class Foo:
    def __str__(self):
        return 'solo'
obj = Foo()
print(obj)              # prints the return value of __str__ instead of the default repr with the memory address
# Output: solo

⑧ __getitem__, __setitem__, __delitem__
Used for index-style access, as with a dictionary. They handle getting, setting and deleting an item respectively.

# __getitem__, __setitem__, __delitem__
class Foo(object):
    def __getitem__(self, key):
        print('__getitem__', key)
    def __setitem__(self, key, value):
        print('__setitem__', key, value)
    def __delitem__(self, key):
        print('__delitem__', key)
obj = Foo()
result = obj['k1']  # triggers __getitem__
obj['k2'] = 'solo'  # triggers __setitem__
del obj['k1']       # triggers __delitem__
# __getitem__ k1
# __setitem__ k2 solo
# __delitem__ k1
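
The methods above only print. A minimal sketch of a container that really stores its items might look like this; the Registry class and its dict-backed storage are assumptions made for illustration.

```python
class Registry:
    """A tiny dict-backed container (hypothetical example class)."""

    def __init__(self):
        self._items = {}

    def __getitem__(self, key):
        return self._items[key]

    def __setitem__(self, key, value):
        self._items[key] = value

    def __delitem__(self, key):
        del self._items[key]

    def __contains__(self, key):
        # Lets "key in registry" work without falling back to __getitem__.
        return key in self._items


reg = Registry()
reg["k1"] = "solo"
print(reg["k1"])      # solo
del reg["k1"]
print("k1" in reg)    # False
```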

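⑨ __new__ and __metaclass__

print(type(f))    # prints <class '__main__.Foo'>: the object f was created by the class Foo
print(type(Foo))  # prints <class 'type'>: the class object Foo was created by the class type

The object f is an instance of the class Foo, and the class object Foo is an instance of the class type. In other words, the Foo class object is created through the constructor of type.

So the question is: classes are by default produced by instantiating type, so how does type create a class, and how does a class create objects?
Answer: a class records which metaclass instantiates it. In Python 2 this is the class attribute __metaclass__; in Python 3 it is the metaclass keyword in the class statement. By setting it to a subclass of type, we can watch the creation process.

class MyType(type):
    def __init__(self, what, bases=None, dict=None):
        print("--MyType init---")
        super(MyType, self).__init__(what, bases, dict)
    def __call__(self, *args, **kwargs):
        print("--MyType call---")
        obj = self.__new__(self, *args, **kwargs)
        self.__init__(obj, *args, **kwargs)
        return obj  # without this, Foo("solo") would evaluate to None
class Foo(object, metaclass=MyType):  # Python 3 syntax; Python 2 used __metaclass__ = MyType in the class body
    def __init__(self, name):
        self.name = name
        print("Foo ---init__")
    def __new__(cls, *args, **kwargs):
        print("Foo --new--")
        return object.__new__(cls)
# Stage 1: the interpreter runs top to bottom and creates the class Foo (prints --MyType init---)
# Stage 2: the class Foo creates the object obj
obj = Foo("solo")
# --MyType call---
# Foo --new--
# Foo ---init__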
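
To make "a class is an instance of type" concrete, the three-argument form type(name, bases, namespace) can build a class without a class statement. The names Person, init and greet below are invented for this sketch.

```python
def init(self, name):
    self.name = name

def greet(self):
    return "hello, %s" % self.name

# type(name, bases, namespace) builds the same kind of class object
# that a class statement would produce.
Person = type("Person", (object,), {"__init__": init, "greet": greet})

p = Person("solo")
print(type(p).__name__, type(Person).__name__)   # Person type
print(p.greet())                                 # hello, solo
```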

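Reflection: use strings to look up or modify a program's runtime state, attributes and methods. There are four functions:
① hasattr(obj,str): checks whether the object obj has an attribute or method named by the string str
② getattr(obj,str): looks up the attribute of obj named by the string and returns it (for a method, the bound method object)

class Foo(object):
    def __init__(self,name):
        self.name = name
    def func(self):
        print("func",self.name)
obj = Foo("alex")
attr_name = "func"                 # renamed from str so the built-in is not shadowed
print(hasattr(obj,attr_name))      # does obj have an attribute with this name?
if hasattr(obj,attr_name):
    getattr(obj,attr_name)()       # getattr(obj,"func") is the same as obj.func
# True
# func alex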

③ setattr(obj,'y',z): the same as obj.y = z; adds or sets an attribute by name

def bulk(self):
    print("%s is yelling"%self.name)
class Foo(object):
    def __init__(self,name):
        self.name = name
    def func(self):
        print("func",self.name)
obj = Foo("alex")
attr_name = "talk"
print(hasattr(obj,attr_name))      # does obj have an attribute with this name?
if hasattr(obj,attr_name):
    getattr(obj,attr_name)()       # getattr(obj,"talk") is the same as obj.talk
else:
    setattr(obj,attr_name,bulk)    # the same as obj.talk = bulk
    getattr(obj,attr_name)(obj)    # a function set on an instance is not bound, so obj is passed explicitly
# False
# alex is yelling

④ delattr(obj,str): deletes the attribute of obj named by the string, the same as del obj.attr

class Foo(object):
    def __init__(self,name):
        self.name = name
    def func(self):
        print("func",self.name)
obj = Foo("alex")
attr_name = "name"
if hasattr(obj,attr_name):
    delattr(obj,attr_name)      # delete the attribute obj.name
print(obj.name)
# Traceback (most recent call last):
#   File "C:/Users/L/PycharmProjects/s14/preview/Day7/main.py", line 40, in <module>
#     print(obj.name)
# AttributeError: 'Foo' object has no attribute 'name'
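
A common use of reflection is dispatching on a command name supplied as a string. The sketch below is one possible shape; the Commands class and the dispatch helper are hypothetical names introduced for illustration.

```python
class Commands:
    """Hypothetical command handler used only for this illustration."""

    def start(self):
        return "starting"

    def stop(self):
        return "stopping"


def dispatch(handler, command):
    # getattr's third argument is returned when the attribute is missing,
    # which avoids a separate hasattr check.
    method = getattr(handler, command, None)
    if not callable(method):
        return "unknown command: %s" % command
    return method()


handler = Commands()
print(dispatch(handler, "start"))     # starting
print(dispatch(handler, "restart"))   # unknown command: restart
```

If the command string comes from user input, it is worth restricting it to an explicit list of allowed names, since getattr will happily return any attribute of the object.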

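13. Python Garbage Collection

Overview: like many other high-level languages, Python uses a garbage collector to destroy objects that are no longer in use. Every object has a reference count, and when that count drops to 0 Python can safely destroy the object.

Problem: reference counting alone cannot reclaim objects that refer to each other in a cycle, because their counts never drop to 0.

Solution: weak references. They reduce reference cycles and the number of unnecessary objects kept alive in memory. An object that is only weakly referenced may be reclaimed at any moment. CPython also runs a cycle-detecting collector (the gc module) for the cycles that remain.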
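
A minimal sketch of the weak-reference idea, assuming a parent/child structure where the child points back at its parent. The Node class is hypothetical, and the immediate reclamation shown relies on CPython's reference counting.

```python
import gc
import weakref


class Node:
    """Hypothetical tree node: strong reference down, weak reference up."""

    def __init__(self, name):
        self.name = name
        self.children = []
        self._parent = None

    @property
    def parent(self):
        # Calling a weak reference returns the object, or None once it is gone.
        return self._parent() if self._parent is not None else None

    def add_child(self, child):
        self.children.append(child)
        child._parent = weakref.ref(self)   # no cycle of strong references


root = Node("root")
leaf = Node("leaf")
root.add_child(leaf)
print(leaf.parent.name)   # root

del root          # drop the only strong reference to the parent
gc.collect()      # not needed on CPython; included for other implementations
print(leaf.parent)        # None
```

Had the child stored a normal reference to its parent, the two objects would form a cycle and would be freed only when the cycle collector runs.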

Posted 2017-11-14 10:08 by 李小小小伟
