Tailing a log file in real time with Python

Recommended log-processing project: https://github.com/olajowon/loggrove

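First, let's try walking through a large log file with Python's open().

readlines() or readline()?

Overall, readlines() is no slower than calling readline() over and over from Python, because the former loops at the C level while a readline() loop runs at the Python level.

However, readlines() reads all of the data into memory at once, so memory usage can get too high, whereas readline() reads only one line per call. For large files you have to weigh this trade-off.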

If you do not need seek() to jump to an offset, for line in open('file') is faster.
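As a minimal sketch of the file-iteration form, the helper below walks a file one line at a time without loading it all into memory. The name count_matching and its keyword parameter are hypothetical, chosen only for illustration.

```python
def count_matching(path, keyword):
    """Count the lines that contain keyword, reading lazily line by line."""
    count = 0
    with open(path, 'r') as f:
        for line in f:  # the file object is its own line iterator
            if keyword in line:
                count += 1
    return count
```

For example, count_matching('logs.txt', 'ERROR') would return the number of lines containing ERROR, assuming logs.txt exists.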

Using readlines(), suitable for smaller log files

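import time

filepath = 'logs.txt'  # example path; adjust as needed
p = 0  # offset to resume reading from
with open(filepath, 'r') as f:
    f.seek(p, 0)
    while True:
        lines = f.readlines()  # everything appended since the last pass
        if lines:
            print(lines)
            p = f.tell()  # remember where we stopped
            f.seek(p, 0)
        time.sleep(1)  # poll once per second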
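One possible middle ground between the two calls, sketched here under the assumption that processing in batches is acceptable: readlines() takes an optional size hint, so each call returns only roughly that much data rather than the whole file. The name read_in_batches and the 64 KiB default are illustrative choices, not something from the original post.

```python
def read_in_batches(path, hint=64 * 1024):
    """Yield lists of lines, each batch totalling roughly `hint` characters."""
    with open(path, 'r') as f:
        while True:
            # readlines(hint) stops reading once the batch reaches about `hint`
            batch = f.readlines(hint)
            if not batch:  # an empty list means end of file
                break
            yield batch
```

Each batch still comes from a single readlines() call, while memory use is bounded by the hint rather than by the file size.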

Using readline(), to keep memory usage low

import time

with open('logs.txt', 'r') as f:
    while True:
        line = f.readline()
        if line:
            print(line)
        else:
            time.sleep(1)  # nothing new yet; wait instead of spinning
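The readline() loop above can also be wrapped in a generator that reports the offset after each line, so a caller can save it and resume later. This is only a sketch: follow, poll_interval, and max_idle_polls are hypothetical names, and max_idle_polls exists purely so the loop can be stopped (None follows forever).

```python
import time

def follow(path, offset=0, poll_interval=1.0, max_idle_polls=None):
    """Yield (line, next_offset) for each line read from path, starting at offset.

    offset should be 0 or a value previously returned by this generator.
    """
    idle = 0
    with open(path, 'r') as f:
        f.seek(offset)
        while True:
            line = f.readline()
            if line:
                idle = 0
                yield line, f.tell()  # tell() is the position just after this line
            else:
                idle += 1
                if max_idle_polls is not None and idle >= max_idle_polls:
                    return  # give up after too many empty polls
                time.sleep(poll_interval)
```

Note that a line still being written may come back without its trailing newline; this sketch does not try to reassemble such partial lines.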

################## divider ##########################

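Now let's try following the file with tail -F log.txt.

os.system() and commands.getstatusoutput() (the latter is Python 2 only) run a command once and return only after it exits, so they cannot stream output from a command like tail -F that never finishes. That leaves subprocess.Popen().

The subprocess module exists to start new processes and communicate with them. Its core is the Popen class, which creates a process and lets you interact with it.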

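import subprocess

# Pass the command as a list so no shell is needed;
# universal_newlines=True makes readline() return text rather than bytes
p = subprocess.Popen(['tail', '-F', 'log.txt'], stdout=subprocess.PIPE,
                     stderr=subprocess.PIPE, universal_newlines=True)
while True:
    line = p.stdout.readline()  # blocks until tail emits a line
    if line:
        print(line)
    else:
        break  # empty string means tail has exited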
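The same idea can be packaged as a generator that also cleans up the child process when the consumer stops reading. This is a sketch rather than a definitive implementation; stream_command is a hypothetical name, and the usage comment assumes a tail binary and a log.txt file are available.

```python
import subprocess

def stream_command(cmd):
    """Yield each line a command writes to stdout until the command exits."""
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, universal_newlines=True)
    try:
        # readline() returns '' only at EOF, which ends the iteration
        for line in iter(proc.stdout.readline, ''):
            yield line.rstrip('\n')
    finally:
        # Runs when the output ends or when the consumer stops early
        proc.stdout.close()
        if proc.poll() is None:
            proc.terminate()
        proc.wait()

# Usage (assumes `tail` and log.txt exist):
# for line in stream_command(['tail', '-F', 'log.txt']):
#     print(line)
```

Because the cleanup sits in a finally block, breaking out of the for loop or calling close() on the generator terminates tail instead of leaving it running.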

posted @ 2016-04-12 11:27 卑鄙的wo
