python - Python basics 5 (modules)

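I. Introduction to modules

Definition: a module is essentially a Python file ending in .py. It is a collection of code that implements some piece of functionality.

This is similar to functions in procedural programming: a function implements one piece of functionality and other code simply calls it, which improves code reuse and reduces coupling between pieces of code. A complex feature may need many functions (which can live in different .py files), and a collection of code made up of n .py files is also referred to as a module.

Importing a module:

In essence, import runs the Python file once. For example, import test executes all the code in test and binds the resulting module object to the name test, so anything defined there has to be accessed as test.xxx.

from test import * also executes all the code in test, but then copies its public names into the current namespace, so they can be called directly as xxx.

from...import... is largely the same as the import statement. The only difference: after import foo, every name in the module has to be prefixed with foo., whereas after from foo import x,get,change,Foo the names from module foo can be referenced directly in the current file, as follows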

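from foo import x,get,change #import x, get and change from module foo into the current namespace
a=x #assign foo's x to a directly
get() #call foo's get function directly
change() #even if the current file has its own x, this still modifies the x in the source module foo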

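The upside of dropping the prefix is more concise code; the downside is that imported names can easily clash with names in the current namespace. If the same name already exists there, whichever definition comes later overrides the earlier one.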
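The difference between the two import styles can be sketched as follows. This is a minimal illustration, not the author's code: the module name demo_foo is hypothetical and is written to a temporary directory only so that the example is self-contained.

```python
import importlib
import pathlib
import sys
import tempfile

# Write a throwaway module named demo_foo (hypothetical) to a temp directory.
module_dir = tempfile.mkdtemp()
pathlib.Path(module_dir, "demo_foo.py").write_text(
    "x = 1\n"
    "def get():\n"
    "    return x\n"
    "def change():\n"
    "    global x\n"
    "    x = 0\n"
)
sys.path.insert(0, module_dir)
importlib.invalidate_caches()

import demo_foo                       # names are reached through the prefix
from demo_foo import x, get, change   # names are copied into this namespace

x = 100        # rebinds only the local name x
change()       # rebinds x inside demo_foo, not the local x

print(x)           # the local name
print(get())       # the value seen by demo_foo's own functions
print(demo_foo.x)  # the same value, reached through the module prefix
```

The local x and demo_foo.x are two separate bindings, which is why rebinding one never affects the other.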

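The module search path can be inspected through sys.path. Its first entry is the directory of the running script, so apart from Python's own modules and installed packages, usually only modules in that same directory are found. To import a module from another directory, append the directory that contains it to sys.path. Example: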

import os, sys
p=os.path.dirname(os.path.dirname(os.path.abspath(__file__))) #parent of the directory that contains this file
sys.path.append(p)

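If what you import is a package, Python executes the __init__.py file inside that package.

To make a module inside a package available when the package is imported, write from . import xxx in the package's __init__.py; this imports the xxx module.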
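A small sketch of the package behavior described above. The package name demo_pkg and the function hello are hypothetical, and the files are generated in a temporary directory so the example can run anywhere.

```python
import importlib
import pathlib
import sys
import tempfile

# Build a throwaway package: demo_pkg/__init__.py and demo_pkg/xxx.py
root = pathlib.Path(tempfile.mkdtemp())
pkg = root / "demo_pkg"
pkg.mkdir()
(pkg / "xxx.py").write_text("def hello():\n    return 'hello from xxx'\n")
(pkg / "__init__.py").write_text("from . import xxx\n")

# The directory that contains the package has to be on sys.path.
sys.path.append(str(root))
importlib.invalidate_caches()

import demo_pkg   # runs demo_pkg/__init__.py, which in turn imports xxx

message = demo_pkg.xxx.hello()
print(message)
```

Without the from . import xxx line in __init__.py, a plain import demo_pkg would not make demo_pkg.xxx available.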

We can also give an imported module an alias in the current file

import foo as f #alias the imported module foo as f in the current file; use f from here on
f.x
f.get()

We can also give an alias to a single imported name

from foo import get as get_x
get_x()

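Aliases are usually used to shorten code when an imported name is long. Aliasing an imported name is also a good way to avoid clashes with names in the current file. Another important point is that it keeps the calling convention uniform. For example, the json and pickle modules both provide a load function that parses structured data from an open file, but they parse different formats; the code below loads one module or the other as needed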

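if data_format == 'json':
    import json as serialize #if the data format is json, import the json module under the name serialize
elif data_format == 'pickle':
    import pickle as serialize #if the data format is pickle, import the pickle module under the name serialize

data=serialize.load(fn) #the call is the same either way; fn is an already opened file object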
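A runnable version of the same idea, as a sketch only. The helper name load_data and the in-memory files are assumptions made for illustration; note that json reads from a text file while pickle needs a binary one.

```python
import io
import json
import pickle

def load_data(data_format, fileobj):
    """Pick a module by format, then call load() on it the same way."""
    if data_format == 'json':
        import json as serialize
    elif data_format == 'pickle':
        import pickle as serialize
    else:
        raise ValueError('unsupported format: %r' % data_format)
    return serialize.load(fileobj)

record = {'name': 'jehu', 'job': 'IT'}

json_file = io.StringIO(json.dumps(record))      # json reads text
pickle_file = io.BytesIO(pickle.dumps(record))   # pickle reads bytes

from_json = load_data('json', json_file)
from_pickle = load_data('pickle', pickle_file)
print(from_json)
print(from_pickle)
```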

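Module categories

There are three kinds of modules:

Custom (user-defined) modules
Built-in standard modules (also known as the standard library)
Open-source (third-party) modules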

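II. The time & datetime modules

The time module

In Python, time is usually represented in one of these ways: 1) a timestamp, 2) a formatted time string, 3) a tuple (struct_time) with nine elements. Because the time module is implemented mostly by calling the C library, behavior may differ between platforms.

1. Return the current local time as a tuple
>>> import time
>>> time.localtime()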
time.struct_time(tm_year=2017, tm_mon=5, tm_mday=8, tm_hour=16, tm_min=13, tm_sec=34, tm_wday=0, tm_yday=128, tm_isdst=0)

2. Return the current Greenwich (UTC) time as a tuple
>>> time.gmtime()
time.struct_time(tm_year=2017, tm_mon=5, tm_mday=8, tm_hour=8, tm_min=13, tm_sec=38, tm_wday=0, tm_yday=128, tm_isdst=0)

3. Convert a tuple time to a timestamp
>>> x = time.localtime()
>>> time.mktime(x)
1494232890.0

4. Convert a tuple time to a formatted string
>>> x = time.localtime()
>>> time.strftime('%Y-%m-%d %H:%M:%S',x)
'2017-05-08 16:57:38'

5. Convert a formatted string to a tuple time
>>> time.strptime('2017-05-08 17:03:12','%Y-%m-%d %H:%M:%S')
time.struct_time(tm_year=2017, tm_mon=5, tm_mday=8, tm_hour=17, tm_min=3, tm_sec=12, tm_wday=0, tm_yday=128, tm_isdst=-1)

6. Convert a tuple time to a fixed-format string (the current time when called without an argument)
>>> time.asctime()
'Tue May  9 15:23:21 2017'
>>> x = time.localtime()
>>> time.asctime(x)
'Tue May  9 15:23:39 2017'

7. Convert a timestamp to a fixed-format string (the current time when called without an argument)
>>> time.ctime()
'Tue May  9 16:07:24 2017'
>>> time.ctime(987867475)
'Sat Apr 21 23:37:55 2001'

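Format reference:

Directive	Meaning
%a	Locale's abbreviated weekday name
%A	Locale's full weekday name
%b	Locale's abbreviated month name
%B	Locale's full month name
%c	Locale's date and time representation
%d	Day of the month (01 - 31)
%H	Hour of the day (24-hour clock, 00 - 23)
%I	Hour (12-hour clock, 01 - 12)
%j	Day of the year (001 - 366)
%m	Month (01 - 12)
%M	Minute (00 - 59)
%p	Locale's equivalent of AM or PM
%S	Second (00 - 61)
%w	Day of the week (0 - 6, 0 is Sunday)
%U	Week number of the year (00 - 53, Sunday starts the week). All days before the first Sunday are in week 0
%W	Much the same as %U, except that %W treats Monday as the start of the week
%x	Locale's date representation
%X	Locale's time representation
%y	Year without century (00 - 99)
%Y	Year with century
%Z	Time zone name (empty string if there is none)
%%	A literal '%' character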
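The conversions above chain together into a round trip. The sketch below uses a fixed timestamp and UTC so that the result does not depend on the local time zone; calendar.timegm is used as the UTC counterpart of time.mktime, which is an addition not covered in the text above.

```python
import calendar
import time

ts = 1494232890                       # a fixed timestamp (seconds since the epoch)
utc_tuple = time.gmtime(ts)           # timestamp -> struct_time (UTC)
text = time.strftime('%Y-%m-%d %H:%M:%S', utc_tuple)   # struct_time -> string
parsed = time.strptime(text, '%Y-%m-%d %H:%M:%S')      # string -> struct_time
back = calendar.timegm(parsed)        # UTC struct_time -> timestamp

print(text)
print(back == ts)
```

time.mktime does the same job as calendar.timegm for a struct_time that expresses local time.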

The datetime module

import datetime
1. Return the current time
>>> datetime.datetime.now()
datetime.datetime(2017, 5, 9, 17, 7, 0, 514481)

2. Convert a timestamp to a date
>>> datetime.date.fromtimestamp(1178766678)
datetime.date(2007, 5, 10)

3. Current time + 3 days
>>> datetime.datetime.now() + datetime.timedelta(+3)
datetime.datetime(2017, 5, 12, 17, 12, 42, 124379)

4. Current time - 3 days
>>> datetime.datetime.now() + datetime.timedelta(-3)
datetime.datetime(2017, 5, 6, 17, 13, 18, 474406)

5. Current time + 3 hours
>>> datetime.datetime.now() + datetime.timedelta(hours=3)
datetime.datetime(2017, 5, 9, 20, 13, 55, 678310)

6. Current time + 30 minutes
>>> datetime.datetime.now() + datetime.timedelta(minutes=30)
datetime.datetime(2017, 5, 9, 17, 44, 40, 392370)
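The same arithmetic with a fixed starting point, so the results are reproducible. This is only a sketch; the variable names are chosen for illustration.

```python
import datetime

base = datetime.datetime(2017, 5, 9, 17, 7, 0)   # fixed start instead of now()

plus_3_days = base + datetime.timedelta(days=3)
minus_3_days = base - datetime.timedelta(days=3)
plus_3_hours = base + datetime.timedelta(hours=3)
plus_30_min = base + datetime.timedelta(minutes=30)

# Subtracting two datetimes gives a timedelta.
gap = plus_3_days - minus_3_days

print(plus_3_days.strftime('%Y-%m-%d %H:%M:%S'))
print(gap.days)
```

Passing a bare number to timedelta, as in timedelta(3), means days; naming the unit makes the intent easier to read.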

III. The random module

import random
print (random.random())  #0.6445010863311293
#random.random() returns a random float n with 0 <= n < 1.0
print (random.randint(1,7)) #a random integer between 1 and 7
#Signature: random.randint(a, b). Returns an integer in the given range,
# where a is the lower bound and b the upper bound: a <= n <= b
print (random.randrange(1,10)) #5
#Signature: random.randrange([start], stop[, step]).
# Returns a random element from the values described by start, stop and step. For example, random.randrange(10, 100, 2)
# picks a random number from the sequence [10, 12, 14, 16, ... 96, 98].
# random.randrange(10, 100, 2) gives the same kind of result as random.choice(range(10, 100, 2)).
print(random.choice('liukuni')) #i
#random.choice returns a random element from a sequence.
# Signature: random.choice(sequence), where sequence is any ordered type.
# Note that sequence is not one specific type in Python but a family of types:
# list, tuple and str are all sequences. See the data model chapter of the Python manual.
# Some examples of choice:
print(random.choice("学习Python"))#学
print(random.choice(["JGood","is","a","handsome","boy"]))  #picks from a list
print(random.choice(("Tuple","List","Dict")))   #List
print(random.sample([1,2,3,4,5],3))    #[1, 2, 5]
#Signature: random.sample(sequence, k). Returns k distinct randomly chosen elements; sample does not modify the original sequence.

Practical examples:

import random
import string
#random integer:
print( random.randint(0,99))  #70
 
#random even number between 0 and 100:
print(random.randrange(0, 101, 2)) #4
 
#random floats:
print( random.random()) #0.2746445568079129
print(random.uniform(1, 10)) #9.887001463194844
 
#random character:
print(random.choice('abcdefg&#%^*f')) #f
 
#pick a given number of characters from a string:
print(random.sample('abcdefghij',3)) #['f', 'h', 'd']
 
#pick a random string:
print( random.choice ( ['apple', 'pear', 'peach', 'orange', 'lemon'] )) #apple
#shuffle
items = [1,2,3,4,5,6,7]
print(items) #[1, 2, 3, 4, 5, 6, 7]
random.shuffle(items)
print(items) #[1, 4, 7, 2, 5, 3, 6]

Generating a random verification code:

import random
checkcode = ''
for i in range(4):
    current = random.randrange(0,4)
    if current != i:
        temp = chr(random.randint(65,90)) #an uppercase letter A-Z
    else:
        temp = random.randint(0,9) #a digit 0-9
    checkcode += str(temp)
print (checkcode)

Improved version (upper and lower case):

import random

checkcode=''
for i in range(6):
    tmp=random.randint(0,5)
    if tmp==i:
        chrnum=random.randint(65,122) #a code point between 'A' and 'z'
        if chrnum in range(91,97): #91-96 are punctuation, so fall back to an uppercase letter
            current=chr(random.randint(65,90))
        else:
            current=chr(chrnum)
    else:
        current=random.randint(0,9) #a digit 0-9
    checkcode+=str(current)
print(checkcode)
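A shorter variant, offered as one possible alternative rather than the author's method: it draws every character from a single alphabet built with the string module. The function name make_code is hypothetical. For anything security-sensitive, the standard library's secrets module is the better fit, since random is not designed for that purpose.

```python
import random
import string

def make_code(length=6):
    """Return a random code made of letters (both cases) and digits."""
    alphabet = string.ascii_letters + string.digits
    return ''.join(random.choices(alphabet, k=length))

code = make_code()
print(code)
```

Every position is equally likely to be any of the 62 characters, which the two loops above do not guarantee.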

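IV. The os module

Provides an interface for calling into the operating system

os.getcwd()  Return the current working directory, i.e. the directory the Python script is working in
os.chdir("dirname")  Change the script's current working directory; like cd in the shell
os.curdir  The string for the current directory: ('.')
os.pardir  The string for the parent directory: ('..')
os.makedirs('dirname1/dirname2')    Create nested directories recursively
os.removedirs('dirname1')    Remove the directory if it is empty, then move up to the parent and remove it too if it is empty, and so on
os.mkdir('dirname')    Create a single directory; like mkdir dirname in the shell
os.rmdir('dirname')    Remove a single empty directory; raises an error if the directory is not empty; like rmdir dirname in the shell
os.listdir('dirname')    Return a list of all files and subdirectories in the given directory, including hidden files
os.remove('path/filename')  Delete a file
os.rename("oldname","newname")  Rename a file or directory
os.stat('path/filename')  Get file or directory information
os.sep    The OS-specific path separator: "\\" on Windows, "/" on Linux
os.linesep    The line terminator of the current platform: "\r\n" on Windows, "\n" on Linux
os.pathsep    The string that separates entries in a search path: ";" on Windows, ":" on Linux
os.name    A string naming the current platform. Windows->'nt'; Linux->'posix'
os.system("bash command")  Run a shell command; its output goes straight to the terminal
os.environ  The system environment variables
os.path.abspath(path)  Return the normalized absolute version of path
os.path.split(path)  Split path into a (directory, file name) pair
os.path.dirname(path)  Return the directory part of path, i.e. the first element of os.path.split(path)
os.path.basename(path)  Return the final file name in path, i.e. the second element of os.path.split(path). If path ends with / or \, an empty string is returned
os.path.exists(path)  Return True if path exists, False otherwise
os.path.isabs(path)  Return True if path is an absolute path
os.path.isfile(path)  Return True if path is an existing file, False otherwise
os.path.isdir(path)  Return True if path is an existing directory, False otherwise
os.path.join(path1[, path2[, ...]])  Join several path components and return the result; components before the last absolute path are discarded
os.path.getatime(path)  Return the last access time of the file or directory at path
os.path.getmtime(path)  Return the last modification time of the file or directory at path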
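A few of these calls combined in a short sketch. Everything happens inside a temporary directory, and the file names are made up for the example.

```python
import os
import tempfile

base = tempfile.mkdtemp()                          # scratch directory
open(os.path.join(base, 'keep.txt'), 'w').close()  # keeps base non-empty

nested = os.path.join(base, 'dirname1', 'dirname2')
os.makedirs(nested)                                # create both levels at once

file_path = os.path.join(nested, 'note.txt')
with open(file_path, 'w') as f:
    f.write('hello')

directory, filename = os.path.split(file_path)
print(filename)
print(os.path.isfile(file_path))
print(os.listdir(nested))

os.remove(file_path)
os.removedirs(nested)   # removes dirname2, then dirname1, and stops at base (not empty)
```

Building paths with os.path.join rather than string concatenation lets the same code run on Windows and Linux.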

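V. The sys module

sys.argv           List of command-line arguments; the first element is the path of the program itself
sys.exit(n)        Exit the program; use exit(0) for a normal exit
sys.version        Version information of the Python interpreter
sys.maxsize        The largest value a container index can take (sys.maxint, the largest int, exists only in Python 2)
sys.path           The module search path; initialized from the script's directory, the PYTHONPATH environment variable and installation defaults
sys.platform       The name of the operating system platform
sys.stdout.write('please:')
val = sys.stdin.readline()[:-1]  #read one line and drop the trailing newline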

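VI. The shutil module

High-level module for working with files, directories and archives

shutil.copyfileobj(fsrc, fdst[, length])
Copy the contents of one file object to another; length is the buffer size used while copying

shutil.copyfile(src, dst)
Copy a file's contents

shutil.copymode(src, dst)
Copy only the permission bits. Contents, group and owner are unchanged

shutil.copystat(src, dst)
Copy status information, including: mode bits, atime, mtime, flags

shutil.copy(src, dst)
Copy a file together with its permission bits

shutil.copy2(src, dst)
Copy a file together with its status information

shutil.ignore_patterns(*patterns)
shutil.copytree(src, dst, symlinks=False, ignore=None)
Copy a directory tree recursively

Example: copytree(source, destination, ignore=ignore_patterns('*.pyc', 'tmp*'))

shutil.rmtree(path[, ignore_errors[, onerror]])
Delete a directory tree recursively

shutil.move(src, dst)
Move a file or directory recursively

shutil.make_archive(base_name, format,...)

Create an archive (for example zip or tar) and return its file path

base_name: the name of the archive file, without the extension; it may also include a path. With a bare file name the archive is saved in the current directory, otherwise at the given path,
e.g. www => saved in the current directory
e.g. /Users/wupeiqi/www => saved in /Users/wupeiqi/
format: the archive type: "zip", "tar", "bztar", "gztar"
root_dir: the directory to archive (defaults to the current directory)
owner: owner, defaults to the current user
group: group, defaults to the current group
logger: used for logging, usually a logging.Logger object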

#Archive the files under /Users/wupeiqi/Downloads/test into the program's current directory
 
import shutil
ret = shutil.make_archive("wwwwwwwwww", 'gztar', root_dir='/Users/wupeiqi/Downloads/test')
 
 
#Archive the files under /Users/wupeiqi/Downloads/test into the /Users/wupeiqi/ directory
import shutil
ret = shutil.make_archive("/Users/wupeiqi/wwwwwwwwww", 'gztar', root_dir='/Users/wupeiqi/Downloads/test')
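The paths above exist only on the author's machine. The sketch below runs the same calls inside a temporary directory; all file and directory names are invented for the example, and shutil.unpack_archive is added only to check the round trip.

```python
import os
import shutil
import tempfile

work = tempfile.mkdtemp()
src_dir = os.path.join(work, 'test')
os.makedirs(src_dir)
with open(os.path.join(src_dir, 'a.txt'), 'w') as f:
    f.write('data')
with open(os.path.join(src_dir, 'a.pyc'), 'w') as f:
    f.write('compiled')

# Copy the tree but skip *.pyc files.
copy_dir = os.path.join(work, 'copy')
shutil.copytree(src_dir, copy_dir, ignore=shutil.ignore_patterns('*.pyc'))
copied = sorted(os.listdir(copy_dir))

# base_name carries no extension; make_archive appends the one for 'gztar'.
archive = shutil.make_archive(os.path.join(work, 'backup'), 'gztar', root_dir=src_dir)
archive_name = os.path.basename(archive)

# Unpack the archive again to check its contents.
out_dir = os.path.join(work, 'unpacked')
shutil.unpack_archive(archive, out_dir)
unpacked = sorted(os.listdir(out_dir))

print(copied)
print(archive_name)
print(unpacked)

shutil.rmtree(work)   # remove the whole scratch tree
```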

VII. The shelve module

shelve is a simple module that persists in-memory data to a file as key/value pairs; it can store any Python data that pickle supports

Storing data:

import shelve
import time

d = shelve.open('shelve_test')  # open a file
info={"name":"jehu","job":"IT"}
name = ["jehu", "rain", "test"]
d["name"] = name  # persist a list
d["info"] = info  # persist a dict
d["time"] = time.ctime()

d.close()

Reading data:

import shelve
import time

d = shelve.open('shelve_test')  # open a file
print(d.get("name"))
print(d.get("info"))
print(d.get("time"))
print(d.items())
for i in d.items():
    print(i)
d.close()

Output:

['jehu', 'rain', 'test']
{'name': 'jehu', 'job': 'IT'}
Fri Feb 21 16:27:11 2020
ItemsView(<shelve.DbfilenameShelf object at 0x0000016A75BB4640>)
('name', ['jehu', 'rain', 'test'])
('info', {'name': 'jehu', 'job': 'IT'})
('time', 'Fri Feb 21 16:27:11 2020')
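A sketch of the same store and read cycle using with blocks, which close the shelf automatically. It also shows one behavior worth knowing: changing a stored list in place is not saved unless the shelf is opened with writeback=True. The temporary path is an assumption made to keep the example self-contained.

```python
import os
import shelve
import tempfile

db_path = os.path.join(tempfile.mkdtemp(), 'shelve_test')

with shelve.open(db_path) as d:
    d['name'] = ['jehu', 'rain', 'test']
    d['info'] = {'name': 'jehu', 'job': 'IT'}

with shelve.open(db_path) as d:
    d['name'].append('lost')      # changes a temporary copy; nothing is written back

with shelve.open(db_path, writeback=True) as d:
    d['name'].append('kept')      # cached, then written back when the shelf closes

with shelve.open(db_path) as d:
    names = d['name']
    info = d.get('info')
    missing = d.get('nope', 'default')

print(names)
print(info)
```

Without writeback, the usual pattern is to read the value, modify it, and assign it back to the same key.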

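VIII. The xml module

XML is a format for exchanging data between different languages or programs. It serves much the same purpose as JSON, but JSON is simpler to use

XML looks like this; the data structure is expressed through <> nodes: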

<?xml version="1.0"?>
<data>
    <country name="Liechtenstein">
        <rank updated="yes">2</rank>
        <year>2008</year>
        <gdppc>141100</gdppc>
        <neighbor name="Austria" direction="E"/>
        <neighbor name="Switzerland" direction="W"/>
    </country>
    <country name="Singapore">
        <rank updated="yes">5</rank>
        <year>2011</year>
        <gdppc>59900</gdppc>
        <neighbor name="Malaysia" direction="N"/>
    </country>
    <country name="Panama">
        <rank updated="yes">69</rank>
        <year>2011</year>
        <gdppc>13600</gdppc>
        <neighbor name="Costa Rica" direction="W"/>
        <neighbor name="Colombia" direction="E"/>
    </country>
</data>

XML is supported in virtually every language; in Python it can be handled with the following module

import xml.etree.ElementTree as ET

tree = ET.parse("xmltest.xml")
root = tree.getroot()
print(root.tag)

# walk the XML document
for child in root:
    print(child.tag, child.attrib)
    for i in child:
        print(i.tag, i.text,i.attrib)

# iterate over the year nodes only
for node in root.iter('year'):
    print(node.tag, node.text)

Modifying and deleting XML content

import xml.etree.ElementTree as ET
 
tree = ET.parse("xmltest.xml")
root = tree.getroot()
 
#modify
for node in root.iter('year'):
    new_year = int(node.text) + 1
    node.text = str(new_year)
    node.set("updated","yes")
 
tree.write("xmltest.xml")
 
 
#delete nodes
for country in root.findall('country'):
    rank = int(country.find('rank').text)
    if rank > 50:
        root.remove(country)
 
tree.write('output.xml')

Creating your own XML document

import xml.etree.ElementTree as ET
 
 
new_xml = ET.Element("namelist")
name = ET.SubElement(new_xml,"name",attrib={"enrolled":"yes"})
age = ET.SubElement(name,"age",attrib={"checked":"no"})
sex = ET.SubElement(name,"sex")
sex.text = '33'
name2 = ET.SubElement(new_xml,"name",attrib={"enrolled":"no"})
age = ET.SubElement(name2,"age")
age.text = '19'
 
et = ET.ElementTree(new_xml) #build the document object
et.write("test.xml", encoding="utf-8",xml_declaration=True)
 
ET.dump(new_xml) #print the generated XML
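The examples above read and write files on disk. As a sketch that runs without any file, the same modify and delete steps can be applied to a string with ET.fromstring; the shortened document below is an abbreviated copy of the sample data.

```python
import xml.etree.ElementTree as ET

xml_text = """<data>
    <country name="Liechtenstein"><rank>2</rank><year>2008</year></country>
    <country name="Singapore"><rank>5</rank><year>2011</year></country>
    <country name="Panama"><rank>69</rank><year>2011</year></country>
</data>"""

root = ET.fromstring(xml_text)

# Modify: add one to every year.
for year in root.iter('year'):
    year.text = str(int(year.text) + 1)

# Delete: findall returns a list, so removing while looping over it is safe.
for country in root.findall('country'):
    if int(country.find('rank').text) > 50:
        root.remove(country)

names = [c.get('name') for c in root.findall('country')]
years = [c.find('year').text for c in root.findall('country')]
print(names)
print(years)
```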

IX. The PyYAML module

Python can also handle YAML documents easily, but a third-party module has to be installed first. Reference documentation: http://pyyaml.org/wiki/PyYAMLDocumentation

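X. The ConfigParser module

Used to generate and modify common configuration files. The module was renamed to configparser in Python 3.x.

Here is a configuration file format that many programs use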

[DEFAULT]
ServerAliveInterval = 45
Compression = yes
CompressionLevel = 9
ForwardX11 = yes
 
[bitbucket.org]
User = hg
 
[topsecret.server.com]
Port = 50022
ForwardX11 = no

How would you generate a file like this with Python?

import configparser
 
config = configparser.ConfigParser()
config["DEFAULT"] = {'ServerAliveInterval': '45',
                      'Compression': 'yes',
                     'CompressionLevel': '9'}
 
config['bitbucket.org'] = {}
config['bitbucket.org']['User'] = 'hg'
config['topsecret.server.com'] = {}
topsecret = config['topsecret.server.com']
topsecret['Port'] = '50022'     # mutates the parser
topsecret['ForwardX11'] = 'no'  # same here
config['DEFAULT']['ForwardX11'] = 'yes'
with open('example.ini', 'w') as configfile:
   config.write(configfile)

Once written, the file can be read back.

>>> import configparser
>>> config = configparser.ConfigParser()
>>> config.sections()
[]
>>> config.read('example.ini')
['example.ini']
>>> config.sections()
['bitbucket.org', 'topsecret.server.com']
>>> 'bitbucket.org' in config
True
>>> 'bytebong.com' in config
False
>>> config['bitbucket.org']['User']
'hg'
>>> config['DEFAULT']['Compression']
'yes'
>>> topsecret = config['topsecret.server.com']
>>> topsecret['ForwardX11']
'no'
>>> topsecret['Port']
'50022'
>>> for key in config['bitbucket.org']: print(key)
...
user
compressionlevel
serveraliveinterval
compression
forwardx11
>>> config['bitbucket.org']['ForwardX11']
'yes'

configparser syntax for reading, adding, modifying and deleting

import configparser

config = configparser.ConfigParser()
config.read('i.cfg')

# ########## read ##########
#secs = config.sections()
#print(secs)
#options = config.options('group2')
#print(options)

#item_list = config.items('group2')
#print(item_list)

#val = config.get('group1','key')
#val = config.getint('group1','key')

# ########## modify ##########
#sec = config.remove_section('group1')
#config.write(open('i.cfg', "w"))

#sec = config.has_section('wupeiqi')
#sec = config.add_section('wupeiqi')
#config.write(open('i.cfg', "w"))


#config.set('group2','k1','11111')  #values must be strings in Python 3
#config.write(open('i.cfg', "w"))

#config.remove_option('group2','age')
#config.write(open('i.cfg', "w"))
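The commented-out calls above can be tried without an i.cfg file. The sketch below feeds an inline configuration to read_string and writes the result to an in-memory buffer; the section contents are invented to match the names used above.

```python
import configparser
import io

config = configparser.ConfigParser()
config.read_string("""
[group1]
key = 10

[group2]
k1 = v1
age = 18
""")

# read
sections = config.sections()
number = config.getint('group1', 'key')

# add, modify, delete
if not config.has_section('wupeiqi'):
    config.add_section('wupeiqi')
config.set('group2', 'k1', '11111')    # values have to be strings
config.remove_option('group2', 'age')
config.remove_section('group1')

# write the result to a buffer instead of a file
buffer = io.StringIO()
config.write(buffer)
print(buffer.getvalue())
```

When writing to a real file, a with open(...) block ensures the file is closed after config.write.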

XI. The hashlib module

Used for hashing (message digests). In 3.x it replaces the md5 and sha modules, and mainly provides the SHA1, SHA224, SHA256, SHA384, SHA512 and MD5 algorithms

import hashlib
 
m = hashlib.md5()
m.update(b"Hello")
m.update(b"It's me")
print(m.digest())
m.update(b"It's been a long time since last time we ...")
 
print(m.digest()) #the hash as raw bytes
print(len(m.hexdigest())) #length of the hexadecimal hash string (32 for MD5)

import hashlib
 
# ######## md5 ########
 
h = hashlib.md5()
h.update(b'admin')  # update() needs bytes in Python 3
print(h.hexdigest())
 
# ######## sha1 ########
 
h = hashlib.sha1()
h.update(b'admin')
print(h.hexdigest())
 
# ######## sha256 ########
 
h = hashlib.sha256()
h.update(b'admin')
print(h.hexdigest())
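Two points from the examples above, shown as a short sketch: repeated update() calls are equivalent to hashing the concatenated data in one go, and a str has to be encoded to bytes before it can be hashed. The variable names are illustrative.

```python
import hashlib

# Feeding data in pieces gives the same digest as feeding it all at once.
m = hashlib.md5()
m.update(b"Hello")
m.update(b"It's me")
incremental = m.hexdigest()
one_shot = hashlib.md5(b"HelloIt's me").hexdigest()
print(incremental == one_shot)

# A str must be encoded before hashing.
text = 'admin'
sha256_hex = hashlib.sha256(text.encode('utf-8')).hexdigest()
print(len(sha256_hex))
```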

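XII. The re module

Common regular expression symbols

'.'     Matches any single character except \n by default; with the DOTALL flag it matches any character, including newline
'^'     Matches the start of the string; with the MULTILINE flag it also matches at the start of each line, e.g. re.search(r"^a","\nabc\neee",flags=re.MULTILINE)
'$'     Matches the end of the string; with MULTILINE also the end of each line, e.g. re.search("foo$","bfoo\nsdfsf",flags=re.MULTILINE).group()
'*'     Matches the preceding character 0 or more times; re.findall("ab*","cabb3abcbbac") gives ['abb', 'ab', 'a']
'+'     Matches the preceding character 1 or more times; re.findall("ab+","ab+cd+abb+bba") gives ['ab', 'abb']
'?'     Matches the preceding character 0 or 1 times
'{m}'   Matches the preceding character exactly m times
'{n,m}' Matches the preceding character n to m times; re.findall("ab{1,3}","abb abc abbcbbb") gives ['abb', 'ab', 'abb']
'|'     Matches the expression on the left or on the right of |; re.search("abc|ABC","ABCBabcCD").group() gives 'ABC'
'(...)' Grouping; re.search("(abc){2}a(123|456)c", "abcabca456c").group() gives 'abcabca456c'
 
 
'\A'    Matches only at the start of the string; re.search(r"\Aabc","alexabc") finds no match
'\Z'    Matches only at the end of the string; similar to $, except that $ also matches just before a trailing newline
'\d'    Matches a digit 0-9
'\D'    Matches a non-digit
'\w'    Matches [A-Za-z0-9_]
'\W'    Matches anything not in [A-Za-z0-9_]
'\s'    Matches whitespace such as space, \t, \n, \r; re.search(r"\s+","ab\tc1\n3").group() gives '\t'
 
'(?P<name>...)' Named group; re.search("(?P<province>[0-9]{4})(?P<city>[0-9]{2})(?P<birthday>[0-9]{4})","371481199306143242").groupdict() gives {'province': '3714', 'city': '81', 'birthday': '1993'}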

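Most commonly used matching functions

re.match    match from the start of the string
re.search   find the first match anywhere in the string
re.findall  return all matches as a list
re.split    split the string, using the matches as separators
re.sub      find matches and replace them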
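The five functions side by side on one input, as a small sketch. The sample string is made up for the example.

```python
import re

text = 'order 66 shipped, order 7 pending'

at_start = re.match(r'\d+', text)          # None: the string does not start with a digit
first = re.search(r'\d+', text).group()    # the first number anywhere in the string
numbers = re.findall(r'\d+', text)         # every number, as a list of strings
parts = re.split(r',\s*', text)            # split on a comma plus optional whitespace
masked = re.sub(r'\d+', '#', text)         # replace every number

# Named groups give the captured pieces as a dict.
m = re.search(r'(?P<province>[0-9]{4})(?P<city>[0-9]{2})(?P<birthday>[0-9]{4})',
              '371481199306143242')
fields = m.groupdict()

print(at_start, first, numbers)
print(parts)
print(masked)
print(fields)
```

Writing patterns as raw strings (r'...') keeps backslashes such as \d and \s from being interpreted by Python first.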

posted @ 2020-02-21 02:38 jehuzzh
