[Python Study Notes][Module Study][os] [3] os.walk

 
  1.  os.walk source code

    

def walk(top, topdown=True, onerror=None, followlinks=False):
    """Directory tree generator.
    Example:
    import os
    from os.path import join, getsize
    for root, dirs, files in os.walk('python/Lib/email'):
        print root, "consumes",
        print sum([getsize(join(root, name)) for name in files]),
        print "bytes in", len(files), "non-directory files"
        if 'CVS' in dirs:
            dirs.remove('CVS')  # don't visit CVS directories

    """

    islink, join, isdir = path.islink, path.join, path.isdir

    try:
        names = listdir(top)
    except error, err:
        if onerror is not None:
            onerror(err)
        return

    dirs, nondirs = [], []
    for name in names:
        if isdir(join(top, name)):
            dirs.append(name)
        else:
            nondirs.append(name)

    if topdown:
        yield top, dirs, nondirs
    for name in dirs:
        new_path = join(top, name)
        if followlinks or not islink(new_path):
            for x in walk(new_path, topdown, onerror, followlinks):
                yield x
    if not topdown:
        yield top, dirs, nondirs

__all__.append("walk")
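A minimal sketch of the two behaviours the source above implies: with topdown=True the parent is yielded before its children, so editing dirs in place prunes the walk, while with topdown=False the children come first and edits to dirs have no effect. The directory layout and the helper names build_tree and visited are hypothetical, made up for illustration. The code uses print() and next()-free loops so it should run on both Python 2.7 and Python 3.

```python
import os
import shutil
import tempfile


def build_tree(base):
    """Create a small hypothetical layout: base/a/b and base/CVS, one file each."""
    for rel in ("a/b", "CVS"):
        os.makedirs(os.path.join(base, *rel.split("/")))
    for rel in ("top.txt", "a/a.txt", "a/b/b.txt", "CVS/entries.txt"):
        with open(os.path.join(base, *rel.split("/")), "w") as f:
            f.write("x")


def visited(base, topdown, skip=None):
    """Return the directories os.walk yields, relative to base, in visit order."""
    order = []
    for root, dirs, files in os.walk(base, topdown=topdown):
        if topdown:
            # In top-down mode the walk reads dirs *after* we get it back,
            # so removing a name here stops the walk from descending into it.
            if skip in dirs:
                dirs.remove(skip)
            dirs.sort()  # make the visit order deterministic
        order.append(os.path.relpath(root, base).replace(os.sep, "/"))
    return order


base = tempfile.mkdtemp()
build_tree(base)
top_down = visited(base, topdown=True, skip="CVS")
bottom_up = visited(base, topdown=False)
shutil.rmtree(base)

print(top_down)   # ['.', 'a', 'a/b']: parent first, CVS pruned
print(bottom_up)  # children before parents, '.' last; CVS is still visited
```

The order of sibling directories in the bottom-up list depends on what listdir returns, so only the parent/child ordering is relied on here.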

Testing os.walk()

import os

top = r'd:\temp\performance'
a = os.walk(top, topdown=True)  # a generator: no directory is listed until next() is called
# Each next() advances the walk by one directory.
# This tree has four directories, so a fifth call would raise StopIteration.
for _ in range(4):
    parent, dirs, files = next(a)
    print "parent = %s, dirs = %s, files = %s" % (parent, dirs, files)

Output:

parent = d:\temp\performance, dirs = ['antutu'], files = ['antutuvideo_startpage.xml']
parent = d:\temp\performance\antutu, dirs = ['log', 'screen'], files = []
parent = d:\temp\performance\antutu\log, dirs = [], files = ['log.txt']
parent = d:\temp\performance\antutu\screen, dirs = [], files = ['antututest_score.png']
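The docstring example in the source sums file sizes per directory with getsize and join. Here is a hedged sketch of that idea on a temporary tree that mirrors the directory names in the output above. The file contents are invented, so the byte counts are illustrative only and say nothing about the real d:\temp\performance folder.

```python
import os
import shutil
import tempfile
from os.path import getsize, join

# Hypothetical stand-in for d:\temp\performance: same names, made-up contents.
layout = {
    "antutuvideo_startpage.xml": b"<xml/>",                    # 6 bytes
    join("antutu", "log", "log.txt"): b"hello",                # 5 bytes
    join("antutu", "screen", "antututest_score.png"): b"\x89PNG",  # 4 bytes
}

top = tempfile.mkdtemp()
for rel, data in layout.items():
    full = join(top, rel)
    parent_dir = os.path.dirname(full)
    if not os.path.isdir(parent_dir):
        os.makedirs(parent_dir)
    with open(full, "wb") as f:
        f.write(data)

# One (root, dirs, files) triple per directory; sum the sizes of its files.
sizes = {}
for root, dirs, files in os.walk(top):
    rel_root = os.path.relpath(root, top)
    sizes[rel_root] = sum(getsize(join(root, name)) for name in files)
    print("%s consumes %d bytes in %d non-directory files"
          % (rel_root, sizes[rel_root], len(files)))

total = sum(sizes.values())
shutil.rmtree(top)
```

Note that files lists only the entries directly inside root, which is why the antutu directory itself reports 0 bytes.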

 

Directory structure:

>>> top = r"d:\temp"
>>> listdir = os.listdir
>>> listdir, join = os.listdir, os.path.join
>>> listdir(top)
['0.png', '0.xml', '01.png', 'performance']
>>> listdir(join(top, 'performance'))
['antutu', 'antutuvideo_startpage.xml']
>>>

 

Understanding yield in depth

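  • In an ordinary for ... in ... loop, what follows in is usually a list. A list is an iterable object; other iterables include linked lists, strings, and files. It can be written out, as in mylist = [1, 2, 3], or built with a list comprehension, as in mylist = [x*x for x in range(3)].
    The drawback is that every element is held in memory at once, so a very large data set costs a lot of memory.
  • A generator is also iterable, but it can be read only once, because each value is produced at the moment it is requested. For example, mygenerator = (x*x for x in range(3)). Note the (): this is a generator expression, not a list, whereas the example above uses [].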
    • >>> mylist = [x for x in range(1,4)]
      >>> print mylist
      [1, 2, 3]
      >>> mygenerator = (x for x in range(1,4))
      >>> print type(mygenerator)
      <type 'generator'>
      >>>
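To make the "read only once" point concrete, here is a small sketch that walks a list and a generator twice each. The variable names are made up for illustration. The last lines show the memory argument from the other side: sum() can consume a generator expression without ever building the full list.

```python
squares_list = [x * x for x in range(3)]   # all three values exist in memory now
squares_gen = (x * x for x in range(3))    # no value computed yet

# A list can be traversed as many times as you like.
first_pass_list = list(squares_list)
second_pass_list = list(squares_list)

# A generator is exhausted by the first full traversal.
first_pass_gen = list(squares_gen)
second_pass_gen = list(squares_gen)

print(first_pass_list, second_pass_list)   # same values twice
print(first_pass_gen, second_pass_gen)     # values, then an empty list

# Values are produced one at a time and discarded after use,
# so no 1000-element list is ever held in memory here.
n = 1000
total = sum(x * x for x in range(n))
```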

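  • As I understand it, a generator can be iterated because it has a next() method (spelled __next__() in Python 3): a for loop calls next() repeatedly until it catches a StopIteration exception. You can test this with the mygenerator above.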
    • >>> for i in mygenerator: print i
      ...
      1
      2
      3
      >>>
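Roughly what the for loop above does behind the scenes can be spelled out by hand. This is a sketch, not the interpreter's actual implementation; manual_for is a hypothetical helper name. The built-in next(it) is used instead of it.next() so the same code runs on Python 2.6+ and Python 3.

```python
def manual_for(iterable):
    """Collect the items of an iterable the way a for loop would fetch them."""
    it = iter(iterable)          # ask the object for its iterator
    seen = []
    while True:
        try:
            item = next(it)      # fetch one value
        except StopIteration:    # the iterator signals that it is finished
            break
        seen.append(item)
    return seen


mygenerator = (x for x in range(1, 4))
result = manual_for(mygenerator)
print(result)                    # [1, 2, 3]

# The generator is now exhausted, so a second pass finds nothing.
second = manual_for(mygenerator)
print(second)                    # []
```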

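  • A function that contains yield is no longer an ordinary function but a generator function: calling it returns a generator, which can be iterated and works as described above.
  • yield is a keyword similar to return: when an iteration reaches a yield, it returns the value that follows yield. The key point is that the next iteration resumes from the code right after the yield reached last time.
  • In short: yield returns a value like return, but remembers where it returned from, and the next iteration picks up from just after that point.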
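The "remember the position and resume after it" behaviour can be observed by recording what runs between calls. The sketch below uses a hypothetical generator function, countdown, and a trace list, both invented for illustration. os.walk relies on the same mechanism each time it executes yield top, dirs, nondirs.

```python
trace = []


def countdown(n):
    """Yield n, n-1, ..., 1 and record each step in trace."""
    trace.append("started")
    while n > 0:
        trace.append("about to yield %d" % n)
        yield n                                   # pause here, hand n to the caller
        trace.append("resumed after yielding %d" % n)
        n -= 1
    trace.append("finished")


gen = countdown(2)
trace_after_create = list(trace)   # calling the function runs none of its body

first = next(gen)                  # runs up to the first yield
trace_after_first = list(trace)

second = next(gen)                 # resumes right after the first yield
trace_after_second = list(trace)

remaining = list(gen)              # resumes again, the loop ends, StopIteration
print(first, second, remaining)    # 2 1 []
print(trace)
```

The local variable n keeps its value between calls, which is what lets the generator continue counting instead of starting over.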



  

posted @ 2017-07-26 16:14  liuzhipenglove