91 Suggestions for Improving Python Programs (Part 1)

Suggestion 1: Understand what Pythonic means

Quicksort example:

def quick_sort(array):
    less = []
    greater = []
    if len(array) <= 1:
        return array
    pivot = array[-1]        # use the last element as the pivot without mutating the caller's list
    for x in array[:-1]:
        if x <= pivot:
            less.append(x)
        else:
            greater.append(x)
    return quick_sort(less) + [pivot] + quick_sort(greater)

import random
quick_sort([random.randint(1, 1000) for i in range(10)])
Out[31]:
[16, 342, 385, 449, 552, 693, 827, 831, 890, 939]

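# Swap two variables
a, b = b, a
# Context management
with open(path, 'r') as f:
    do_sth_with(f)
# Do not chase clever tricks for their own sake
a = [1, 2, 3, 4]
a[::-1] # not recommended (admittedly, I have used this ever since I learned slicing)
list(reversed(a))   # recommended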

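Flask, gevent, and requests are recommended as code bases to study in depth.

Ternary operator: x if condition else y

Add blank lines where appropriate to make the code layout cleaner and easier to follow.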

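Suggestion 6: Four principles for writing functions

Keep functions short, and avoid deep nesting.

Function declarations should be sensible, simple, and easy to use.

Function parameters should be designed with backward compatibility in mind.

A function should do one thing only, and its statements should stay at a consistent level of granularity.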

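Suggestion 7: Collect constants in one file

How to work with constants in Python:

Use a naming convention to signal that a variable is a constant, for example an all-uppercase name.
Implement constant behavior with a custom class: name the file that holds the constants constant.py and define the constants in it.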

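# const.py (file name inferred from the `import const` statement below)
import sys

class _const:
    class ConstError(TypeError): pass
    class ConstCaseError(ConstError): pass

    def __setattr__(self, name, value):
        # Refuse to rebind a name that has already been set
        if name in self.__dict__:
            raise self.ConstError("Can't change const.%s" % name)
        # Constant names must be all uppercase
        if not name.isupper():
            raise self.ConstCaseError(
                'const name "%s" is not all uppercase' % name)
        self.__dict__[name] = value

# Replace this module object with a _const instance, so `import const` returns that instance
sys.modules[__name__] = _const()

# constant.py
import const
const.MY_CONSTANT = 1
const.MY_SECOND_CONSTANT = 2
const.MY_THIRD_CONSTANT = 'a'
const.MY_FORTH_CONSTANT = 'b'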

Other modules can then reference these constants as follows:

from constant import const
print(const.MY_CONSTANT)

Suggestion 8: Use assert statements to detect problems

>>> x = 1
>>> y = 2
>>> assert x == y, "not equals"
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AssertionError: not equals
# The assert statement above is equivalent to
>>> if __debug__ and not x == y:
...     raise AssertionError("not equals")
... 
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
AssertionError: not equals

Suggestion 10: Make full use of lazy evaluation

# Example: a generator that produces the Fibonacci sequence
def fib():
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b
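Because fib() never terminates, callers have to decide how many values to take. A minimal sketch, assuming itertools.islice is an acceptable way to do that (the name first_ten is only illustrative):

```python
from itertools import islice

def fib():
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b

# islice pulls values lazily, so only ten Fibonacci numbers are ever computed
first_ten = list(islice(fib(), 10))
print(first_ten)  # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```

Nothing is computed until islice asks for the next value, which is what makes an infinite sequence usable here.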

Suggestion 11: Understand the shortcomings of the enum substitutes

Python's dynamic features make it possible to emulate enumerations:

# Approach 1
class Seasons:
    Spring, Summer, Autumn, Winter = range(4)
# Approach 2
def enum(*posarg, **keysarg):
    return type("Enum", (object,), dict(zip(posarg, range(len(posarg))), **keysarg))
Seasons = enum("Spring", "Summer", "Autumn", Winter=1)
Seasons.Spring
# Approach 3
>>> from collections import namedtuple
>>> Seasons = namedtuple('Seasons', 'Spring Summer Autumn Winter')._make(range(4))
>>> Seasons.Spring
0
# Every approach above has flaws, however; for example, a "member" can be replaced with a duplicate value
>>> Seasons._replace(Spring=2)
Seasons(Spring=2, Summer=1, Autumn=2, Winter=3)
# Python 3.4 added the enum module; an enum can be subclassed only if the parent class defines no members
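For comparison, here is a small sketch of the standard enum module mentioned above. The class name MoreSeasons and the member Monsoon are made up for the demonstration; the point is only that members cannot be reassigned and that an enum with members cannot be extended.

```python
from enum import Enum

class Seasons(Enum):
    Spring = 0
    Summer = 1
    Autumn = 2
    Winter = 3

# Reassigning a member is rejected
try:
    Seasons.Spring = 2
    reassigned = True
except AttributeError:
    reassigned = False

# Subclassing an enum that already has members is rejected
try:
    class MoreSeasons(Seasons):   # hypothetical subclass, for illustration only
        Monsoon = 4
    extended = True
except TypeError:
    extended = False

print(reassigned, extended)       # False False
print(Seasons(1))                 # lookup by value gives Seasons.Summer
```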

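Suggestion 19: Use from...import statements sparingly

List the modules that are already loaded: sys.modules.items()

When a module is loaded, the interpreter actually does the following:

Search sys.modules for the module. If it is there, bind it in the current local namespace; if not, create a dictionary object for it and insert it into sys.modules.
Before loading, check whether the module's source file needs to be compiled, and compile it first if so.
Load it dynamically: execute the compiled bytecode in the module's namespace and put all the resulting objects into the module's dictionary.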

>>> dir()
['__builtins__', '__doc__', '__loader__', '__name__', '__package__', '__spec__']
>>> import test
testing module import
>>> dir()
['__builtins__', '__doc__', '__loader__', '__name__', '__package__', '__spec__', 'test']
>>> import sys
>>> 'test' in sys.modules.keys()
True
>>> id(test)
140367239464744
>>> id(sys.modules['test'])
140367239464744
>>> dir(test)
['__builtins__', '__cached__', '__doc__', '__file__', '__loader__', '__name__', '__package__', '__spec__', 'a', 'b']
>>> sys.modules['test'].__dict__.keys()
dict_keys(['__file__', '__builtins__', '__doc__', '__loader__', '__package__', '__spec__', '__name__', 'b', 'a', '__cached__'])

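As shown above, for a user-defined module the import mechanism creates a new module object and binds it in the current local namespace, and it also records the module in sys.modules; both names refer to the same object. In addition, a bytecode file appears next to test.py (in Python 3, under the __pycache__ subdirectory).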
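The same check can be made with a standard-library module, so it runs without a test.py on disk. This is a sketch that uses json purely as a stand-in for the user-defined module; the variable names are illustrative.

```python
import sys
import json
import json as json_again   # a second import is served from the sys.modules cache

same_object = sys.modules["json"] is json
same_on_reimport = json_again is json
# The module's namespace is the dictionary stored on the module object
shares_dict = vars(json) is sys.modules["json"].__dict__

print(same_object, same_on_reimport, shares_dict)  # True True True
```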

Suggestion 20: Prefer absolute imports when importing modules

Suggestion 22: Use with to close resources automatically

...

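Python also provides the contextlib module, which is implemented on top of generators. Its contextmanager decorator provides a function-level context manager that can be applied directly to a function or object, with no need to implement __enter__() and __exit__() by hand.

In Python, resources such as files need particular care: they must be closed properly once you are done with them. One way to close a file properly is try...finally:

f = None    # make sure f exists even if open() raises
try:
    f = open('/path/to/file', 'r')
    f.read()
finally:
    if f:
        f.close()

Writing try...finally everywhere is tedious. Python's with statement lets us use a resource conveniently without worrying about whether it gets closed, so the code above can be simplified to: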

with open('/path/to/file', 'r') as f:
    f.read()

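The file object returned by open() is not the only thing that works with the with statement. In fact, any object that correctly implements the context management protocol can be used in a with statement.

Context management is implemented through two methods, __enter__ and __exit__. For example, the following class implements both: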

class Query(object):

    def __init__(self, name):
        self.name = name

    def __enter__(self):
        print('Begin')
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        if exc_type:
            print('Error')
        else:
            print('End')

    def query(self):
        print('Query info about %s...' % self.name)

Now we can use our own resource object in a with statement:

with Query('Bob') as q:
    q.query()

@contextmanager

Writing __enter__ and __exit__ is still tedious, so the contextlib module in the standard library offers a simpler way. The code above can be rewritten as follows:

from contextlib import contextmanager

class Query(object):

    def __init__(self, name):
        self.name = name

    def query(self):
        print('Query info about %s...' % self.name)

@contextmanager
def create_query(name):
    print('Begin')
    q = Query(name)
    yield q
    print('End')

The @contextmanager decorator takes a generator function. The value produced by its yield statement is bound to var in with ... as var, and the with statement then works as usual:

with create_query('Bob') as q:
    q.query()

Often we want specific code to run automatically before and after a block of code; @contextmanager can do that too. For example:

@contextmanager
def tag(name):
    print("<%s>" % name)
    yield
    print("</%s>" % name)

with tag("h1"):
    print("hello")
    print("world")

Running the code above prints:

<h1>
hello
world
</h1>

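The order of execution is:

The with statement first runs the code before yield, so <h1> is printed;
at yield, control passes to the body of the with statement, so hello and world are printed;
finally the code after yield runs and prints </h1>.

So @contextmanager lets us simplify context management by writing a generator.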
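One detail worth checking: if the body of the with statement raises, the exception is re-raised inside the generator at the yield, so code after yield is skipped unless it sits in a finally block. The following sketch is a variation on the tag example (not the original author's code) that records events in a list instead of printing, so the order is easy to inspect:

```python
from contextlib import contextmanager

events = []

@contextmanager
def tag(name):
    events.append("<%s>" % name)
    try:
        yield
    finally:
        # Runs even when the with body raises, so the tag is always closed
        events.append("</%s>" % name)

try:
    with tag("h1"):
        events.append("hello")
        raise ValueError("boom")
except ValueError:
    events.append("error handled outside")

print(events)  # ['<h1>', 'hello', '</h1>', 'error handled outside']
```

Wrapping the yield in try...finally is the usual way to make a generator-based context manager release its resource on errors.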

@closing

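If an object does not implement the context management protocol, it cannot be used in a with statement. In that case, closing() can turn it into a context manager. For example, using urlopen() in a with statement: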

from contextlib import closing
from urllib.request import urlopen

with closing(urlopen('https://www.python.org')) as page:
    for line in page:
        print(line)

closing behaves like a generator decorated with @contextmanager, and such a generator is very simple to write:

@contextmanager
def closing(thing):
    try:
        yield thing
    finally:
        thing.close()

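Its job is to turn any object that has a close() method into a context manager, so that it works with the with statement.

contextlib also provides a few other decorators that help us write more concise code.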

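Suggestion 23: Use the else clause to simplify loops (and exception handling)

The else clause runs when the loop finishes normally, including when the loop condition is false from the start.

The else clause does not run when the loop is interrupted by a break statement.

The else clause is not limited to for loops; the same syntactic sugar also works with while and with try...except.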

for x in []:  # the loop body never runs
    print(x)
else:
    print('sd')
# Out: sd
while 1 != 1:  # the loop condition is false from the start
    pass
else:
    print('ok')
# Out: ok

for i in range(10):  # the for loop finishes normally
    pass
else:
    print('ok')
# Out: ok
i = 0
while i < 5:  # the while loop finishes normally
    print(i)
    i += 1
else:
    print('ok')
# Out: 0 to 4 (one per line), then ok

for i in range(10):   # the for loop is ended by break, so the else clause does not run
    break
else:
    print('ok')
i = 0
while i < 5:  # the while loop is ended by break, so the else clause does not run
    break
    i += 1
else:
    print('ok')
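The loops above only show when else fires. As a sketch of where it actually simplifies code, the two hypothetical helpers below use for...else to replace a "found" flag and try...except...else to keep the success path out of the try block:

```python
def find_first_negative(numbers):
    for n in numbers:
        if n < 0:
            result = n
            break            # a match was found, so the else clause is skipped
    else:
        result = None        # the loop ran to completion without a break
    return result

def double_if_int(text):
    try:
        value = int(text)
    except ValueError:
        return None          # conversion failed
    else:
        return value * 2     # runs only when the try block raised nothing

print(find_first_negative([3, -1, 4]))  # -1
print(find_first_negative([1, 2]))      # None
print(double_if_int("21"))              # 42
print(double_if_int("x"))               # None
```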

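Suggestion 24: Follow a few basic principles of exception handling

Principles of exception handling:

Mind the granularity of exception handling; avoid putting too much code inside a try block.

Be careful with a bare except clause that catches every exception; it is better to name the specific exceptions.

Mind the order in which exceptions are caught, and handle each exception at the appropriate level. Python matches except clauses in order against the exception class hierarchy, so put subclass exceptions in the earlier except clauses and parent-class exceptions in the later ones.

Use friendlier error messages, and follow the conventions for exception arguments.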
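To illustrate the ordering principle, here is a minimal sketch built on the fact that FileNotFoundError is a subclass of OSError. The function name and the path are hypothetical.

```python
def read_first_line(path):
    try:
        # Keep the try block small: only the code that can raise the handled errors
        with open(path, "r") as f:
            return f.readline()
    except FileNotFoundError:
        # The subclass comes first; otherwise the OSError clause would catch it
        return "missing"
    except OSError:
        return "os error"

print(read_first_line("/no/such/dir/no_such_file.txt"))  # missing
```

If the two except clauses were swapped, the FileNotFoundError branch could never be reached.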

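Suggestion 25: Avoid the pitfalls lurking in finally

When the finally block finishes, the temporarily saved exception is raised again. If the finally block raises a new exception or executes a return or break statement, however, the saved exception is lost and is therefore masked.

In real-world code, returning from inside finally is not recommended.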
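A short sketch of the pitfall, with illustrative function names: swallow() loses its exception because finally returns, while propagate() does cleanup in finally and still lets the exception reach the caller.

```python
def swallow():
    try:
        raise ValueError("this exception is lost")
    finally:
        return "from finally"    # the return discards the pending ValueError

def propagate():
    log = []
    try:
        try:
            raise ValueError("kept")
        finally:
            log.append("cleanup")   # no return here, so the exception continues
    except ValueError as exc:
        log.append(str(exc))
    return log

print(swallow())    # from finally
print(propagate())  # ['cleanup', 'kept']
```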

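Suggestion 26: Understand None properly and test correctly whether an object is empty

0 and '' evaluate to False, but note that [0] and [''] evaluate to True, because a non-empty list is true whatever it contains.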
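The distinction between "is None" and "is empty" can be sketched with a small hypothetical helper, describe(), that tests identity with None before testing truthiness:

```python
def describe(value):
    if value is None:        # identity test: only None itself matches
        return "missing"
    if not value:            # truthiness test: 0, '', [], {} are all falsy
        return "empty"
    return "has content"

print(describe(None))   # missing
print(describe([]))     # empty
print(describe(0))      # empty
print(describe([0]))    # has content
print(describe(['']))   # has content
```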

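Suggestion 27: Prefer join to + when concatenating strings

+ involves more memory operations: strings are immutable, so each + builds a new string and copies its operands, whereas join builds the result in one pass.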
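Both styles side by side, as a sketch only: the word list is made up, and no timing is shown because the actual difference depends on the interpreter and the input size.

```python
words = ["Python", "is", "fun"]

# join: one call that builds the final string directly
joined = " ".join(words)

# +: every step creates a new intermediate string
concatenated = ""
for w in words:
    concatenated += w + " "
concatenated = concatenated.rstrip()

print(joined)                   # Python is fun
print(joined == concatenated)   # True
```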

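Suggestion 29: Treat mutable and immutable objects differently

Everything in Python is an object, and every object has a unique identity (id), a type, and a value. Numbers, strings, and tuples are immutable; dictionaries, lists, and byte arrays (bytearray) are mutable.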

class Student(object):
    def __init__(self, name, course=[]):    # this is where the problem lies
        self.name = name
        self.course = course
    def addcourse(self, coursename):
        self.course.append(coursename)
    def printcourse(self):
        for item in self.course:
            print(item)

stuA = Student('Wang yi')
stuA.addcourse('English')
stuA.addcourse('Math')
print("{}'s course: ".format(stuA.name))
stuA.printcourse()
print('---------------------------')

Output:

Wang yi's course: 
English
Math
---------------------------

stuB = Student('Su san')
stuB.addcourse('Chinese')
stuB.addcourse('Physics')
print("{}'s course: ".format(stuB.name))
stuB.printcourse()
print('---------------------------')

Output:

Su san's course: 
English
Math
Chinese
Physics
---------------------------

What? The stuB instance never added the courses English and Math, yet they are printed. Let's add a stuC:

stuC = Student('Wang yi')
stuC.addcourse('xxxx')
stuC.addcourse('yy')
print("{}'s course: ".format(stuA.name))
stuA.printcourse()
print('---------------------------')

Output:

Wang yi's course: 
English
Math
Chinese
Physics
xxxx
yy
---------------------------
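Here is the trap: the courses added through stuC ended up together with all the earlier ones. Different instances are different objects, aren't they? So why does this happen?
The reason is:

A default argument is evaluated only once, when the function is defined; afterwards the result of that first evaluation is reused. course refers to that one list object, so every operation acts on the same list (the same id), and any change to the mutable object directly affects the original.

The default for course is a list, and lists are mutable, so the course attribute of every instance created without an explicit course points to the same object: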

stuA.course
#OUT:  ['English', 'Math', 'Chinese', 'Physics', 'xxxx', 'yy']
id(stuA.course)
# OUT: 47956336

stuB.course
#OUT:  ['English', 'Math', 'Chinese', 'Physics', 'xxxx', 'yy']
id(stuB.course)
# OUT: 47956336

Now look at name: it has no default, and every instance was passed its own string, so each instance's name refers to a different object:

id(stuA.name)
# OUT: 47961920
id(stuB.name)
# OUT: 47960288

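If the default argument is an immutable value or None, the problem goes away: the list is then created inside __init__, so every instance gets its own object:

class Student(object):
    def __init__(self, name='', course=None):    # None as the default, so no list is shared
        self.name = name
        self.course = [] if course is None else list(course)  # build a new list for each instance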
    def addcourse(self, coursename):
        self.course.append(coursename)
    def printcourse(self):
        for item in self.course:
            print(item)
stuA = Student('Wang yi')
stuA.addcourse('English')
stuA.addcourse('Math')
print("{}'s course: ".format(stuA.name))
stuA.printcourse()
print('---------------------------')
# OUT:
# Wang yi's course: 
# English
# Math
# ---------------------------
stuB = Student('Su san')
stuB.addcourse('Chinese')
stuB.addcourse('Physics')
print("{}'s course: ".format(stuB.name))
stuB.printcourse()
# OUT:
# Su san's course: 
# Chinese
# Physics

id(stuA.course)  # course is a list, but it is created when the instance is created: the default is None rather than [], so each instance's course is a different object
# OUT: 48503344
id(stuB.course)
# OUT: 48503424

id(stuA.name)   # name has the default '', but strings are immutable, so sharing the default is harmless; here each instance was passed its own name, so the ids differ
# OUT: 47963424
id(stuB.name)
# OUT: 47961792

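So when a parameter is a mutable type such as a list, dictionary, or bytearray and needs a default value, the best approach is to use None as the default and create the object when the function is called.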
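The same behavior shows up in plain functions. In the sketch below (append_bad and append_good are illustrative names), the shared default is visible on the function object itself through its __defaults__ attribute:

```python
def append_bad(item, bucket=[]):      # the list is created once, at definition time
    bucket.append(item)
    return bucket

def append_good(item, bucket=None):   # None marks "no argument given"
    if bucket is None:
        bucket = []                   # a fresh list on every call
    bucket.append(item)
    return bucket

first = append_bad(1)
second = append_bad(2)
print(first is second)                 # True: both calls returned the same list
print(append_bad.__defaults__)         # ([1, 2],)
print(append_good(1), append_good(2))  # [1] [2]
```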

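Suggestion 31: Remember that argument passing is neither call by value nor call by reference

The accurate description is call by object, or call by object reference: the function receives the object itself. Changes to a mutable object are visible both inside and outside the function, whereas a "modification" of an immutable object usually creates a new object and then rebinds the name to it.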
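A sketch that separates the two cases, using illustrative function names: mutating the object that was passed in is visible to the caller, while rebinding the parameter name is not.

```python
def mutate(items):
    items.append(99)        # changes the object the caller also refers to

def rebind(items):
    items = items + [99]    # builds a new list and rebinds only the local name
    return items

def increment(n):
    n += 1                  # ints are immutable, so this rebinds the local name
    return n

data = [1, 2]
mutate(data)
print(data)                 # [1, 2, 99]

new_list = rebind(data)
print(data, new_list)       # [1, 2, 99] [1, 2, 99, 99]

count = 5
result = increment(count)
print(count, result)        # 5 6
```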

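Suggestion 32: Beware of the hidden problems with default arguments

The main one: if a default argument is a mutable object, it is shared between the caller and the callee. This is the same issue as Suggestion 29.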

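Suggestion 34: Understand the difference between str() and repr()

Key points:

str() is aimed at users and returns a friendly, readable string; repr() is aimed at the Python interpreter and developers and returns the object's unambiguous internal representation.

Typing a at the interactive prompt calls repr() by default, while print(a) calls str().

The return value of repr() can usually be passed to eval() to recreate the object: obj == eval(repr(obj))

These two functions call the special methods __str__() and __repr__() respectively. As a rule, every class should define __repr__(),
and __str__() is worth adding when readability matters more than precision. When implementing __repr__(), try to make its return value something eval() can turn back into the object.

eval evaluates a string as an expression, so be careful about security when using it.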
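These points can be sketched with a hypothetical Point class whose __repr__ output is valid Python. For plain literals, ast.literal_eval is shown as a safer alternative to eval, since it accepts only literal syntax.

```python
import ast

class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __repr__(self):
        # Unambiguous, and valid as a Python expression
        return "Point(%r, %r)" % (self.x, self.y)

    def __str__(self):
        # Short, human-friendly form
        return "(%s, %s)" % (self.x, self.y)

    def __eq__(self, other):
        return isinstance(other, Point) and (self.x, self.y) == (other.x, other.y)

p = Point(1, 2)
print(repr(p))              # Point(1, 2)
print(str(p))               # (1, 2)

restored = eval(repr(p))    # acceptable here only because the string is our own
print(restored == p)        # True

# literal_eval parses literals only and never calls functions
values = ast.literal_eval(repr([1, 'a']))
print(values)               # [1, 'a']
```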

Suggestion 35: Know when to use staticmethod and when to use classmethod

posted on 2014-03-21 11:18 by myworldworld
