Basic File Operations in Python - 木流

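Opening a file:

file_obj = file("file_path", "mode")  # Python 2 only; file() does not exist in Python 3

file_obj = open("file_path", "mode")  # Python 3 (open() is also available in Python 2)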

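The file modes are:

r: open the file read-only. The file must already exist.
w: open the file for writing only. If the file already exists, its content is overwritten. If it does not exist, a new file is created.
a: open the file for appending. If the file already exists, the file pointer is placed at the end, so new content is written after the existing content. If it does not exist, a new file is created for writing.
w+: open the file for reading and writing. If the file already exists, its content is overwritten. If it does not exist, a new file is created.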
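
The difference between "w" and "a" is easiest to see by running both against the same file. The following is only a sketch: the scratch directory and the file name modes_demo.txt are made up for the demonstration, and it uses the with statement (covered later in this post) so that each file is closed automatically.

```python
import os
import tempfile

# Hypothetical scratch file, created in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "modes_demo.txt")

# "w" creates the file, or empties it if it already exists.
with open(path, "w", encoding="utf-8") as f:
    f.write("first\n")

# "a" keeps the existing content and writes at the end.
with open(path, "a", encoding="utf-8") as f:
    f.write("second\n")

with open(path, "r", encoding="utf-8") as f:
    after_append = f.read()

# Opening with "w" again discards everything written so far.
with open(path, "w", encoding="utf-8") as f:
    f.write("third\n")

with open(path, "r", encoding="utf-8") as f:
    after_overwrite = f.read()

print(repr(after_append))     # both earlier lines are still there
print(repr(after_overwrite))  # only the last write survives
```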

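Writing content to a file:

file_obj.write("file content")

Closing the file handle:

file_obj.close()  # Close the handle after every file operation; otherwise buffered data may never reach the disk and the handle stays open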

# Open f.txt, write some lines, then close the file
f = open("f.txt", "w")
f.write("This is the first line\n")
f.write("This is the second line\n")
f.write("This is the third line\n")
f.write("This is the 4 line\n")
f.write("This is the 5 line\n")
f.close()
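
If write() raises an exception, a plain f.close() placed after it is never reached. One common way to guard against that is try/finally. This is a sketch rather than the only approach, and the temporary directory is an assumption added so the example does not touch real files.

```python
import os
import tempfile

# Hypothetical scratch location for the demo file.
path = os.path.join(tempfile.mkdtemp(), "f.txt")

f = open(path, "w", encoding="utf-8")
try:
    f.write("This is the first line\n")
finally:
    # Runs whether or not write() raised, so the handle is always released.
    f.close()

print(f.closed)
```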

Reading file content:

file_obj.read()  # loads the entire file into memory in one go

# Read the entire file
f = open("f.txt", "r")
aa = f.read()
print(aa)
f.close()

# Output:
This is the first line
This is the second line
This is the third line
This is the 4 line
This is the 5 line

file_obj.readlines()  # loads the entire file into memory in one go and splits it by line into a list of strings

# Read the entire file; the result is a list with one string per line
f = open("f.txt", "r")
aa = f.readlines()
print(aa)
f.close()

# Output:
['This is the first line\n', 'This is the second line\n', 'This is the third line\n', 'This is the 4 line\n', 'This is the 5 line\n']
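
readlines() keeps the trailing "\n" on every element. When the newlines are not wanted, reading the whole text and calling str.splitlines() is one option. A small sketch, using a made-up two-line file in a temporary directory:

```python
import os
import tempfile

# Hypothetical scratch file with two lines.
path = os.path.join(tempfile.mkdtemp(), "f.txt")
with open(path, "w", encoding="utf-8") as f:
    f.write("This is the first line\nThis is the second line\n")

with open(path, "r", encoding="utf-8") as f:
    with_newlines = f.readlines()            # each element ends with "\n"

with open(path, "r", encoding="utf-8") as f:
    without_newlines = f.read().splitlines()  # line endings removed

print(with_newlines)
print(without_newlines)
```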

# Print the file line by line
f = open("f.txt", "r")
for line in f:
    print(line)
f.close()

# Output: the blank lines appear because each line already ends with "\n" and print() adds another newline
This is the first line

This is the second line

This is the third line

This is the 4 line

This is the 5 line
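
To avoid the doubled newlines, either strip the line ending before printing or tell print() not to add its own. A sketch under the same assumptions as before (a throwaway file in a temporary directory):

```python
import os
import tempfile

# Hypothetical scratch file with two lines.
path = os.path.join(tempfile.mkdtemp(), "f.txt")
with open(path, "w", encoding="utf-8") as f:
    f.write("This is the first line\nThis is the second line\n")

collected = []
with open(path, "r", encoding="utf-8") as f:
    for line in f:
        clean = line.rstrip("\n")  # drop the newline read from the file
        collected.append(clean)
        print(clean)               # print() supplies the only newline

# Alternative: keep the line as read and suppress print()'s own newline.
with open(path, "r", encoding="utf-8") as f:
    for line in f:
        print(line, end="")
```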

Application:

# Check whether each line contains a given character
f = open("f.txt", "r")
for line in f:
    if "5" in line:
        print("this is the five line")
    else:
        print(line)
f.close()

# Output:
This is the first line

This is the second line

This is the third line

This is the 4 line

this is the five line
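
A small variation on the same idea is to record which lines matched instead of printing a fixed message. The sketch below uses enumerate() to number the lines; the file content mirrors the f.txt used above, but it is written to a temporary directory, and the matches list is a name invented for this example.

```python
import os
import tempfile

# Hypothetical scratch copy of the five-line file.
path = os.path.join(tempfile.mkdtemp(), "f.txt")
with open(path, "w", encoding="utf-8") as f:
    f.write("This is the first line\n")
    f.write("This is the second line\n")
    f.write("This is the third line\n")
    f.write("This is the 4 line\n")
    f.write("This is the 5 line\n")

matches = []
with open(path, "r", encoding="utf-8") as f:
    # enumerate(..., start=1) yields (line number, line) pairs.
    for number, line in enumerate(f, start=1):
        if "5" in line:
            matches.append((number, line.rstrip("\n")))

print(matches)
```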

Appending to a file:

file_obj = open("f.txt", "a")

# Append
f = open("f.txt", "a")
f.write("7")
f.write("8")
f.close()

# File content afterwards:
This is the first line
This is the second line
This is the third line
This is the 4 line
This is the 5 line
78

Second way to open a file: with a with statement the file is closed automatically, so no explicit f.close() is needed

# Another way to open a file
file2 = "f.txt"
with open(file2, "r") as f:
    aa = f.read()
    print(aa)

# Output:
This is the first line
This is the second line
This is the third line
This is the 4 line
This is the 5 line
78
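
The automatic close can be checked through the closed attribute of the file object, and it also happens when the block exits with an exception. A sketch, with a made-up file in a temporary directory and a deliberately raised ValueError standing in for a real failure:

```python
import os
import tempfile

# Hypothetical scratch file.
path = os.path.join(tempfile.mkdtemp(), "f.txt")

with open(path, "w", encoding="utf-8") as f:
    f.write("78")
closed_after_block = f.closed  # the with block has already closed f

try:
    with open(path, "r", encoding="utf-8") as g:
        raise ValueError("simulated failure while reading")
except ValueError:
    pass
closed_after_error = g.closed  # closed even though the block raised

print(closed_after_block, closed_after_error)
```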
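
"+" means the file can be read and written at the same time

r+: read and write without emptying the file. The file must already exist; writes start at the current pointer position and overwrite what is there, so seek to the end first if you want to append.
w+: write and read. The file is emptied (or created) first.
a+: like a, but the file can also be read. Writes always go to the end of the file.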
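
The three "+" modes behave quite differently with respect to existing content. The sketch below runs them in sequence on one throwaway file (the path and the sample strings are invented for the demonstration) and seeks back to the start before each read.

```python
import os
import tempfile

# Hypothetical scratch file with some initial content.
path = os.path.join(tempfile.mkdtemp(), "plus_modes.txt")
with open(path, "w", encoding="utf-8") as f:
    f.write("abcdef")

# r+ : existing content is kept; writing starts at position 0 and overwrites in place.
with open(path, "r+", encoding="utf-8") as f:
    f.write("XY")
    f.seek(0)
    r_plus = f.read()

# a+ : writes go to the end; seek(0) is needed before reading from the start.
with open(path, "a+", encoding="utf-8") as f:
    f.write("!")
    f.seek(0)
    a_plus = f.read()

# w+ : the file is emptied on open, then it can be written and read back.
with open(path, "w+", encoding="utf-8") as f:
    f.write("new")
    f.seek(0)
    w_plus = f.read()

print(r_plus, a_plus, w_plus)
```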
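
"U" means universal newlines: when reading, \r, \n and \r\n are all converted to \n (used together with r or r+). This is a Python 2 flag. Python 3 text mode does this conversion by default, and the "U" flag itself was deprecated and removed in Python 3.11.

"b" means binary mode (for example, uploading an ISO image over FTP). In Python 2 the flag can be omitted on Linux but is required on Windows when handling binary files. In Python 3 it matters on every platform, because binary mode returns bytes instead of str and skips decoding and newline conversion.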

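The effect of binary mode and of newline conversion can be seen by writing a few raw bytes and reading them back in different ways. This is a sketch for Python 3; the file name and the sample bytes are made up, and the newline="" argument of open() is used to switch the conversion off for comparison.

```python
import os
import tempfile

# Hypothetical scratch file containing all three newline styles.
path = os.path.join(tempfile.mkdtemp(), "newline_demo.bin")
raw = b"a\r\nb\rc\n"

with open(path, "wb") as f:
    f.write(raw)                 # binary mode writes the bytes unchanged

with open(path, "rb") as f:
    as_bytes = f.read()          # binary mode returns bytes, untouched

with open(path, "r", encoding="ascii") as f:
    as_text = f.read()           # text mode: \r\n and \r become \n

with open(path, "r", encoding="ascii", newline="") as f:
    untranslated = f.read()      # newline="" disables the conversion

print(as_bytes)
print(repr(as_text))
print(repr(untranslated))
```
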
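Note: in Python 3, f.read() in text mode counts characters (in Python 2.7 it counts bytes). f.tell() reports the pointer position, which for a UTF-8 file corresponds in practice to a byte offset, and f.seek() moves the pointer by that same byte-based position.

Mixing them is a bit of a trap, because in UTF-8 one Chinese character takes 3 bytes.

Supplement: if the file is opened in "rb" mode in Python 3, everything operates on bytes.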
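
To make the character/byte distinction concrete, the sketch below reads the first two units of the same file once in text mode and once in "rb" mode. The scratch path is an assumption; the sample string is the one used in the examples of this post.

```python
import os
import tempfile

# Hypothetical scratch file: one Chinese character followed by ASCII letters.
path = os.path.join(tempfile.mkdtemp(), "test.log")
with open(path, "w", encoding="utf-8") as f:
    f.write("无ddddd")

with open(path, "r", encoding="utf-8") as f:
    text_chunk = f.read(2)    # 2 characters: the Chinese character plus one "d"

with open(path, "rb") as f:
    byte_chunk = f.read(2)    # 2 bytes: only part of the Chinese character
    byte_pos = f.tell()       # in binary mode tell() is a plain byte offset

print(text_chunk, len(text_chunk))
print(byte_chunk, byte_pos)
print(os.path.getsize(path))  # 3 bytes for the Chinese character + 5 for "ddddd"
```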

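# Write a string that starts with a Chinese character, then read part of it back
f = open("test.log", "w", encoding="utf-8")
f.write("无ddddd")
f.close()

f = open("test.log", "r", encoding="utf-8")
ret = f.read(2)      # read 2 characters; with no argument, read() returns everything
print(ret)
f.close()

# Output:
无d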

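# Check the current pointer position
f = open("test.log", "r", encoding="utf-8")
print(f.tell())     # a freshly opened file starts at position 0
f.close()

f = open("test.log", "r", encoding="utf-8")
f.read(2)           # reads by characters
print(f.tell())     # the first character is Chinese (3 bytes), so after 2 characters the pointer is at byte 4
f.close()

# Output: 0, then 4
# read(2) consumed 2 characters but 4 bytes. Treating the character count 2 as a position would put the pointer inside the 3-byte Chinese character, and reading from there fails to decode (an error, or garbled data if you like).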

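# Set the pointer position explicitly
f = open("test.log", "r", encoding="utf-8")
f.seek(1)          # move the pointer to byte 1, inside the 3-byte Chinese character
ret = f.read()     # reads by characters; decoding starts mid-character, so this raises UnicodeDecodeError
f.close()
print(ret)

# Result: an error, because the pointer was placed inside the Chinese character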
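
One way to stay out of this trap in text mode is to seek only to positions that tell() returned earlier (or to 0), rather than to hand-computed numbers. The sketch below first reproduces the error and catches it, then shows the safer pattern. The scratch path and the variable names error_name and bookmark are invented for the example.

```python
import os
import tempfile

# Hypothetical scratch file: a 3-byte Chinese character followed by ASCII.
path = os.path.join(tempfile.mkdtemp(), "test.log")
with open(path, "w", encoding="utf-8") as f:
    f.write("无ddddd")

# Seeking into the middle of the Chinese character makes the next read fail.
error_name = None
with open(path, "r", encoding="utf-8") as f:
    f.seek(1)
    try:
        f.read()
    except UnicodeDecodeError as exc:
        error_name = type(exc).__name__

# Safer: remember a position with tell() and seek back to exactly that value.
with open(path, "r", encoding="utf-8") as f:
    f.read(1)              # consume the Chinese character
    bookmark = f.tell()    # position right after it
    f.seek(0)
    f.seek(bookmark)
    rest = f.read()

print(error_name)
print(rest)
```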

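# Keep the data before the pointer in the original file and delete everything after it
f = open("test.log", "r", encoding="utf-8")
f.seek(5)
print(f.read())    # reads what follows the pointer; the file itself is unchanged
f.close()

f = open("test.log", "r+", encoding="utf-8")   # needs read/write access
f.seek(5)
f.truncate()      # deletes the data after byte 5 and keeps what precedes it; this modifies the file
f.close()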
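
To confirm what truncate() left behind, the file can be read back and its size checked. A sketch on a throwaway copy of the same content (the path is an assumption):

```python
import os
import tempfile

# Hypothetical scratch file: 3 bytes for the Chinese character + 5 ASCII bytes.
path = os.path.join(tempfile.mkdtemp(), "test.log")
with open(path, "w", encoding="utf-8") as f:
    f.write("无ddddd")

with open(path, "r+", encoding="utf-8") as f:
    f.seek(5)       # byte 5 falls right after the character and two "d"s
    f.truncate()    # cut the file at the current position

with open(path, "r", encoding="utf-8") as f:
    remaining = f.read()

size_after = os.path.getsize(path)
print(remaining, size_after)
```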

# Supplement:

Read a file line by line and write its content to a new file

# Read a file line by line and write its content to a new file
with open("test.log") as read_file, open("new_test.log", "w") as write_file:
    for line in read_file:
        write_file.write(line)
        aa = line.startswith("back")    # check whether the line starts with "back"
        print(aa)     # prints True or False
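
The startswith() check can also decide which lines get copied at all. The following sketch copies only the lines that begin with "back". The source content and both file names are invented for the demonstration.

```python
import os
import tempfile

# Hypothetical source and destination files in a scratch directory.
workdir = tempfile.mkdtemp()
src = os.path.join(workdir, "test.log")
dst = os.path.join(workdir, "new_test.log")

with open(src, "w", encoding="utf-8") as f:
    f.write("backend a\nfrontend b\nbackup c\n")

with open(src, "r", encoding="utf-8") as read_file, \
        open(dst, "w", encoding="utf-8") as write_file:
    for line in read_file:
        if line.startswith("back"):   # keep only matching lines
            write_file.write(line)

with open(dst, "r", encoding="utf-8") as f:
    copied = f.read()

print(repr(copied))
```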

Does Python read files by characters or by bytes?

# Python 2
f = open('ha.log', 'r')
data = f.read()     # reads bytes
f.tell()            # byte-based
f.close()

# Python 3
f = open('ha.log', 'r')
data = f.read()     # reads characters
f.tell()            # position is byte-based
f.seek(5)           # byte-based
f.close()

f = open('ha.log', 'rb')
data = f.read()     # reads bytes
f.tell()            # byte-based
f.close()

Converting between bytes and strings:

In UTF-8, a common Chinese character takes 3 bytes; 1 byte = 8 bits

# In Python 3, bytes and strings can be converted to each other directly
s = "吴佩琪"
for item in s:
    print(item)    # the for loop iterates by character; in Python 2 it iterates by byte
s_bytes = bytes(s, "utf8")
print(s_bytes)      # string to bytes; in Python 2, bytes is just another name for str, so the distinction is nominal
new_str = str(s_bytes, "utf8")
print(new_str)      # bytes back to string

# Output:
吴
佩
琪
b'\xe5\x90\xb4\xe4\xbd\xa9\xe7\x90\xaa'
吴佩琪
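
The same conversion is more commonly written with str.encode() and bytes.decode(), and len() makes the 3-bytes-per-character ratio visible. A short sketch for Python 3, reusing the string from the example above:

```python
s = "吴佩琪"

encoded = s.encode("utf-8")        # str -> bytes
decoded = encoded.decode("utf-8")  # bytes -> str

char_count = len(s)        # counts characters
byte_count = len(encoded)  # counts bytes

# Iterating over bytes yields integers, one per byte.
first_char_bytes = list(encoded[:3])

print(char_count, byte_count)
print(first_char_bytes)
print(decoded == s)
```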

Posted 2016-04-21 17:07 by 木流