[Python Learning][Module Study][os] [3] os.walk
- os.walk source code (Python 2)
def walk(top, topdown=True, onerror=None, followlinks=False):
    """Directory tree generator.

    Example:

    import os
    from os.path import join, getsize
    for root, dirs, files in os.walk('python/Lib/email'):
        print root, "consumes",
        print sum([getsize(join(root, name)) for name in files]),
        print "bytes in", len(files), "non-directory files"
        if 'CVS' in dirs:
            dirs.remove('CVS')  # don't visit CVS directories
    """

    islink, join, isdir = path.islink, path.join, path.isdir

    try:
        names = listdir(top)
    except error, err:
        if onerror is not None:
            onerror(err)
        return

    dirs, nondirs = [], []
    for name in names:
        if isdir(join(top, name)):
            dirs.append(name)
        else:
            nondirs.append(name)

    if topdown:
        yield top, dirs, nondirs
    for name in dirs:
        new_path = join(top, name)
        if followlinks or not islink(new_path):
            for x in walk(new_path, topdown, onerror, followlinks):
                yield x
    if not topdown:
        yield top, dirs, nondirs

__all__.append("walk")
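The source shows that in top-down mode the tuple is yielded before the loop over dirs, so editing dirs in place changes which subdirectories get visited; in bottom-up mode the yield comes after the recursion, so the same edit has no effect. Here is a minimal sketch of that behaviour, written for Python 3 (the rest of this post uses Python 2). The tree layout and the helper names build_tree and visited_names are made up for illustration.

```python
import os
import tempfile

def build_tree(base):
    """Create a small hypothetical tree: base/keep/a.txt and base/CVS/b.txt."""
    for sub, fname in (("keep", "a.txt"), ("CVS", "b.txt")):
        os.makedirs(os.path.join(base, sub))
        with open(os.path.join(base, sub, fname), "w") as fh:
            fh.write("x")

def visited_names(base, topdown, skip=None):
    """Return the base names of the directories os.walk yields, in order."""
    seen = []
    for root, dirs, files in os.walk(base, topdown=topdown):
        if skip in dirs:
            # In-place edit: only influences the traversal when topdown=True,
            # because only then is the tuple yielded before the recursion.
            dirs.remove(skip)
        seen.append(os.path.basename(root))
    return seen

with tempfile.TemporaryDirectory() as base:
    build_tree(base)
    top_name = os.path.basename(base)
    pruned = visited_names(base, topdown=True, skip="CVS")
    bottom_up = visited_names(base, topdown=False, skip="CVS")

print(pruned)      # top directory first, CVS never visited
print(bottom_up)   # both subdirectories first, top directory last
```

With topdown=False the CVS directory still shows up, since it was already walked by the time the parent's dirs list is handed to the caller.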
Test code for os.walk()
import os

top = r'd:\temp\performance'
a = os.walk(top, topdown=True)  # returns a generator; nothing is listed yet

# Each a.next() call (Python 2) advances the walk by one directory.
parent, dirs, files = a.next()
print "parent = %s, dirs = %s, files = %s" % (parent, dirs, files)
parent, dirs, files = a.next()
print "parent = %s, dirs = %s, files = %s" % (parent, dirs, files)
parent, dirs, files = a.next()
print "parent = %s, dirs = %s, files = %s" % (parent, dirs, files)
parent, dirs, files = a.next()
print "parent = %s, dirs = %s, files = %s" % (parent, dirs, files)
Output:
parent = d:\temp\performance, dirs = ['antutu'], files = ['antutuvideo_startpage.xml']
parent = d:\temp\performance\antutu, dirs = ['log', 'screen'], files = []
parent = d:\temp\performance\antutu\log, dirs = [], files = ['log.txt']
parent = d:\temp\performance\antutu\screen, dirs = [], files = ['antututest_score.png']
Directory structure:
>>> top = r"d:\temp"
>>> listdir = os.listdir
>>> listdir, join = os.listdir, os.path.join
>>> listdir(top)
['0.png', '0.xml', '01.png', 'performance']
>>> listdir(join(top, 'performance'))
['antutu', 'antutuvideo_startpage.xml']
>>>
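To try the same experiment without a d:\temp\performance folder, the sketch below recreates that layout in a temporary directory and steps through the walk by hand. It is written for Python 3, where next(a) replaces a.next(); the helper name make_sample_tree is hypothetical, and dirs is sorted in place only to make the visiting order predictable.

```python
import os
import tempfile

def make_sample_tree(base):
    """Recreate the 'performance' layout from this post under `base`."""
    top = os.path.join(base, "performance")
    os.makedirs(os.path.join(top, "antutu", "log"))
    os.makedirs(os.path.join(top, "antutu", "screen"))
    for parts in (("antutuvideo_startpage.xml",),
                  ("antutu", "log", "log.txt"),
                  ("antutu", "screen", "antututest_score.png")):
        with open(os.path.join(top, *parts), "w") as fh:
            fh.write("")
    return top

with tempfile.TemporaryDirectory() as base:
    top = make_sample_tree(base)
    walker = os.walk(top, topdown=True)
    steps = []
    for _ in range(4):
        parent, dirs, files = next(walker)   # Python 3 spelling of walker.next()
        dirs.sort()                          # fix the order of the recursion
        steps.append((os.path.relpath(parent, top), list(dirs), sorted(files)))
    # A fifth call finds nothing left; the default avoids StopIteration.
    exhausted = next(walker, None) is None

for step in steps:
    print(step)
print("exhausted:", exhausted)
```

The four tuples match the four directories in the output above, with paths shown relative to the top directory.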
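Understanding yield in depth
- In an ordinary for...in... loop, what follows in is usually a list, and a list is an iterable object; tuples, strings and files are iterable in the same way. The list can be written out, as in mylist = [1, 2, 3], or built with a list comprehension, as in mylist = [x*x for x in range(3)].
The drawback is that every element is held in memory at once, so a very large data set consumes a lot of memory.
- A generator is also iterable, but it can be read only once, because each value is produced at the moment it is requested. For example, mygenerator = (x*x for x in range(3)). Note the parentheses (): this is a generator expression, not a list, whereas the example above uses [].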
>>> mylist = [x for x in range(1,4)]
>>> print mylist
[1, 2, 3]
>>> mygenerator = (x for x in range(1,4))
>>> print type(mygenerator)
<type 'generator'>
>>>
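The memory difference and the read-once behaviour can be checked directly. This is a rough Python 3 sketch: sys.getsizeof reports only the size of the container object itself, not of the integers it refers to, so treat the numbers as an indication rather than a full measurement. The variable names are made up for the example.

```python
import sys

n = 100000
squares_list = [x * x for x in range(n)]   # every value is stored right away
squares_gen = (x * x for x in range(n))    # values are produced on demand

list_size = sys.getsizeof(squares_list)    # grows with n
gen_size = sys.getsizeof(squares_gen)      # small and independent of n

first_pass = sum(squares_gen)    # consumes the generator
second_pass = sum(squares_gen)   # nothing left, so the sum is 0

print(list_size > gen_size)                           # True
print(first_pass == sum(squares_list), second_pass)   # True 0
```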
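- As I understand it, the key to a generator being iterable is that it has a next() method (named __next__() in Python 3). A for loop works by calling next() repeatedly until it catches a StopIteration exception. You can test this with the mygenerator defined above.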
>>> for i in mygenerator:print i
...
1
2
3
>>>
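The loop above can be spelled out by hand. The following Python 3 sketch is roughly what a for loop does behind the scenes; manual_for is a hypothetical helper written for this illustration, not something from the standard library.

```python
def manual_for(iterable):
    """Collect items the way a for loop would: call next() until StopIteration."""
    iterator = iter(iterable)
    items = []
    while True:
        try:
            item = next(iterator)   # calls iterator.__next__() in Python 3
        except StopIteration:       # the signal that nothing is left
            break
        items.append(item)
    return items

mygenerator = (x for x in range(1, 4))
first_run = manual_for(mygenerator)
second_run = manual_for(mygenerator)   # same generator, already exhausted

print(first_run)    # [1, 2, 3]
print(second_run)   # []
```

The second call returns an empty list, which is the "read only once" property in action.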
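- A function that contains yield is no longer an ordinary function but a generator function: calling it returns a generator, which can be iterated and works as described above.
- yield is a keyword similar to return: each iteration runs until it reaches a yield and hands back the value that follows it. The key point is that the next iteration resumes from the code right after the yield where the previous iteration stopped.
- In short: yield returns a value like return does, remembers the position it returned from, and the next iteration continues from just after that position.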
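To see the "remember the position" behaviour, here is a small Python 3 sketch that records when each part of a generator function actually runs. The function count_up and the trace list are invented for this example.

```python
def count_up(limit, trace):
    """Yield 1..limit and record in `trace` where execution starts and resumes."""
    trace.append("start")
    for n in range(1, limit + 1):
        yield n                                    # pause here, hand n back
        trace.append("resumed after yield %d" % n)
    trace.append("end")

trace = []
gen = count_up(2, trace)
ran_on_call = list(trace)   # still empty: calling the function runs no code

first = next(gen)           # runs from the top to the first yield
second = next(gen)          # resumes right after the first yield
leftover = list(gen)        # resumes after the second yield, then finishes

print(first, second, leftover)   # 1 2 []
print(trace)
```

The trace ends up as start, resumed after yield 1, resumed after yield 2, end. os.walk relies on the same mechanism: each recursive walk() call is itself a generator, and the outer loop re-yields its items one at a time.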
