Python Miscellanea

This post will be updated irregularly.
Last update date: Aug 6, 2019

Strings

msg = 'line1\n'
msg += 'line2\n'
msg += 'line3\n'

Building a string this way is inefficient because each += creates a new string. Collect the pieces in a list and join them once:

lines = ['line1', 'line2', 'line3']
msg = '\n'.join(lines)  # no trailing newline; append '\n' if you need one

Do not rely on CPython’s efficient implementation of in-place string concatenation for statements in the form a += b or a = a + b. This optimization is fragile even in CPython (it only works for some types) and isn’t present at all in implementations that don’t use refcounting. In performance sensitive parts of the library, the ''.join() form should be used instead. This will ensure that concatenation occurs in linear time across various implementations.

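Similarly, prefer string formatting to chaining the + operator. How much faster it is depends on the Python version, so treat the labels below as a rule of thumb: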

# slow
msg = 'hello ' + my_var + ' world'

# faster
msg = 'hello %s world' % my_var

# or better:
msg = 'hello {} world'.format(my_var)
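Which form is actually fastest varies between interpreter versions, so it is worth measuring rather than assuming. Below is a minimal sketch using the standard timeit module; the value of my_var, the function names and the repetition count are arbitrary choices for illustration.

```python
import timeit

my_var = 'big'  # sample value, chosen for illustration


def concat():
    return 'hello ' + my_var + ' world'


def percent():
    return 'hello %s world' % my_var


def fmt():
    return 'hello {} world'.format(my_var)


# All three forms build the same string.
assert concat() == percent() == fmt() == 'hello big world'

for func in (concat, percent, fmt):
    # timeit.timeit accepts a callable; number is how many calls are timed.
    elapsed = timeit.timeit(func, number=100000)
    print('{:8s} {:.4f} s'.format(func.__name__, elapsed))
```

The printed timings are machine-dependent, which is why no expected numbers are shown.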

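Use string methods instead of the string module.
This advice dates from Python 2, where string methods are faster and share the same API with unicode strings. In Python 3 the function versions (such as string.upper) were removed from the string module, so the methods are the only option.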

Use ''.startswith() and ''.endswith() instead of string slicing to check for prefixes or suffixes.
startswith() and endswith() are cleaner and less error prone. For example:

Yes:

if foo.startswith('bar'):

No:

if foo[:3] == 'bar':
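To see why slicing is error prone, consider what happens when the prefix is edited but the slice length is not. The snippet below is a small illustration; the value of foo is made up.

```python
foo = 'barbaz'

# Slicing works only while the slice length matches the prefix length.
assert foo[:3] == 'bar'
assert foo[:3] != 'barb'        # prefix grew, slice did not: silently wrong
assert foo.startswith('barb')   # no length to keep in sync

# startswith() and endswith() also accept a tuple of candidates.
assert foo.endswith(('.txt', 'baz'))
assert not foo.startswith(('qux', 'zap'))
```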

Sets vs Lists

Lists are slightly faster than sets when you just want to iterate over the values.

Sets, however, are significantly faster than lists for membership tests (x in s): a set lookup takes roughly constant time on average, while a list has to be scanned item by item. Sets can only contain unique, hashable items though, and they are unordered.

Tuples perform almost exactly like lists in both cases; the main difference is that they are immutable.
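These claims are easy to check on your own machine. The sketch below times membership tests and full iteration for a list, a tuple and a set; the container size, the target value and the helper names are arbitrary choices for illustration, and the absolute numbers will differ between machines.

```python
import timeit

n = 10000
as_list = list(range(n))
as_tuple = tuple(as_list)
as_set = set(as_list)

target = n - 1  # worst case for the list and tuple: the last element


def time_membership(container, number=1000):
    # Time `target in container`, repeated `number` times.
    return timeit.timeit(lambda: target in container, number=number)


def time_iteration(container, number=100):
    # Time one full pass over the container, repeated `number` times.
    return timeit.timeit(lambda: sum(container), number=number)


for name, container in (('list', as_list), ('tuple', as_tuple), ('set', as_set)):
    print('{:5s} membership {:.5f} s  iteration {:.5f} s'.format(
        name, time_membership(container), time_iteration(container)))
```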

Generators - Memory Saving Techniques

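When you only need to iterate over the values once, use a generator expression (f(n) for n in range(m)) instead of a list comprehension [f(n) for n in range(m)]. The generator produces items on demand instead of holding all m results in memory.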

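# Generate the first `count` Fibonacci numbers
def fab(count):  # `count` avoids shadowing the builtin max()
    n, a, b = 0, 0, 1
    while n < count:
        yield b  # hand back one value, then pause until the next one is requested
        a, b = b, a + b
        n += 1

for n in fab(5):
    print(n)  # prints 1, 1, 2, 3, 5 on separate lines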
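The memory difference can be made visible with sys.getsizeof. This is a rough sketch: getsizeof reports only the container itself, not the integers it refers to, and the size m is an arbitrary choice.

```python
import sys

m = 100000

squares_list = [n * n for n in range(m)]  # builds all m values up front
squares_gen = (n * n for n in range(m))   # produces values on demand

# The list grows with m; the generator object stays small regardless of m.
print(sys.getsizeof(squares_list))
print(sys.getsizeof(squares_gen))

# Both give the same values when consumed.
assert sum(squares_gen) == sum(squares_list)

# A generator can be consumed only once; it is now exhausted.
assert list(squares_gen) == []
```

The last assertion is the main trade-off: if the values are needed more than once, a list (or re-creating the generator) is required.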

Some Useful Container Types

defaultdict, OrderedDict, Counter, deque, namedtuple. These are not builtin functions; they are provided by the standard collections module.

See [1] for example.
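A brief sketch of what each of the five types is typically used for. All of them are imported from the standard collections module; the sample data is made up for illustration.

```python
from collections import Counter, OrderedDict, defaultdict, deque, namedtuple

# defaultdict: missing keys get a default value from the factory (here, list).
groups = defaultdict(list)
for word in ['apple', 'avocado', 'banana']:
    groups[word[0]].append(word)
assert groups['a'] == ['apple', 'avocado']

# Counter: tally hashable items.
counts = Counter('banana')
assert counts['n'] == 2
assert counts.most_common(1) == [('a', 3)]

# deque: fast appends and pops at both ends.
queue = deque([1, 2, 3])
queue.appendleft(0)
queue.append(4)
assert queue.popleft() == 0
assert list(queue) == [1, 2, 3, 4]

# namedtuple: a tuple whose fields can also be accessed by name.
Point = namedtuple('Point', ['x', 'y'])
p = Point(x=1, y=2)
assert p.x + p.y == 3
assert p == (1, 2)

# OrderedDict: remembers insertion order and can move keys around.
od = OrderedDict([('a', 1), ('b', 2), ('c', 3)])
od.move_to_end('a')
assert list(od) == ['b', 'c', 'a']
```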

Avoid “RuntimeError: dictionary changed size during iteration” Error

In Python 2.x, calling keys() returns a copy of the keys as a list, which you can iterate over while modifying the dict:

for i in d.keys():

Note that this doesn't work in Python 3.x, because keys() returns a view of the dict's keys instead of a list.

Another way is to call list() to force a copy of the keys to be made. This one also works in Python 3.x:

for i in list(d):
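Put together, the failing and the working pattern look like this in Python 3. The dict contents and the make() helper are made up for illustration; the last lines show a common alternative, building a new dict instead of deleting in place.

```python
def make():
    # Hypothetical sample data: negative values are the ones to drop.
    return {'a': 1, 'b': -2, 'c': 3, 'd': -4}


# Deleting while iterating over the dict itself fails in CPython 3.
d = make()
raised = False
try:
    for key in d:
        if d[key] < 0:
            del d[key]
except RuntimeError as exc:
    raised = True
    print('failed:', exc)

# Iterating over a snapshot of the keys is safe.
d = make()
for key in list(d):
    if d[key] < 0:
        del d[key]
assert d == {'a': 1, 'c': 3}

# Alternative: build a filtered copy and leave the original untouched.
filtered = {k: v for k, v in make().items() if v >= 0}
assert filtered == d
```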

Return None Explicitly

Be consistent in return statements. Either all return statements in a function should return an expression, or none of them should. If any return statement returns an expression, any return statements where no value is returned should explicitly state this as return None, and an explicit return statement should be present at the end of the function (if reachable).

Yes:

import math

def foo(x):
    if x >= 0:
        return math.sqrt(x)
    else:
        return None

def bar(x):
    if x < 0:
        return None
    return math.sqrt(x)

No:

import math

def foo(x):
    if x >= 0:
        return math.sqrt(x)

def bar(x):
    if x < 0:
        return
    return math.sqrt(x)

'2 * x' Is Faster Than 'x << 1'

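According to [9], this seems to be because CPython 3.5 optimizes multiplication of small integers in a way that it does not optimize left shifts by small amounts. This is an implementation detail, so it may not hold on other versions or interpreters.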
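Since the result depends on the interpreter, it is easy to re-check with timeit. A minimal sketch; the sample value of x and the repetition count are arbitrary.

```python
import timeit

x = 12345  # arbitrary small integer chosen for illustration

# The two expressions are equivalent for integers.
assert 2 * x == x << 1

# The globals argument of timeit.timeit requires Python 3.5 or later.
mul = timeit.timeit('2 * x', globals={'x': x}, number=1000000)
shift = timeit.timeit('x << 1', globals={'x': x}, number=1000000)

print('2 * x  : {:.4f} s'.format(mul))
print('x << 1 : {:.4f} s'.format(shift))
```

Either way the difference is tiny, so readability is usually the better reason to pick one form over the other.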

Comparing Decimals

See [10] and [11].
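A few comparisons that illustrate the usual pitfalls, as a sketch using only the standard decimal module. The main point is to construct Decimal values from strings (or ints), not from floats.

```python
from decimal import Decimal

# Binary floats cannot represent 0.1 exactly, so this comparison fails.
assert 0.1 + 0.2 != 0.3

# Decimals built from strings hold the exact decimal value.
assert Decimal('0.1') + Decimal('0.2') == Decimal('0.3')

# Comparison is by numeric value; trailing zeros do not matter.
assert Decimal('1.0') == Decimal('1.00')

# A Decimal built from a float inherits the float's rounding error.
assert Decimal(0.1) != Decimal('0.1')

# Ordering comparisons between Decimals work numerically as well.
assert Decimal('1.10') < Decimal('1.2')
```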

References

[1] Intermediate Python — Python Tips 0.1 documentation. https://book.pythontips.com/en/latest/
[2] performance - Python Sets vs Lists - Stack Overflow. https://stackoverflow.com/questions/2831212/python-sets-vs-lists
[3] PyBites – 5 tips to speed up your Python code. https://pybit.es/faster-python.html
[4] PythonSpeed/PerformanceTips - Python Wiki. https://wiki.python.org/moin/PythonSpeed/PerformanceTips
[5] A Brief Analysis of Python yield | Runoob (菜鸟教程). https://www.runoob.com/w3cnote/python-yield-used-analysis.html
[6] Generators - Liao Xuefeng's official website (廖雪峰的官方网站). https://www.liaoxuefeng.com/wiki/1016959663602400/1017318207388128
[7] python - How to avoid "RuntimeError: dictionary changed size during iteration" error? - Stack Overflow. Retrieved July 10, 2019, from https://stackoverflow.com/questions/11941817/how-to-avoid-runtimeerror-dictionary-changed-size-during-iteration-error
[8] PEP 8: The Style Guide for Python Code. Retrieved July 12, 2019, from https://pep8.org/
[9] Times-two faster than bit-shift, for Python 3.x integers? - Stack Overflow. https://stackoverflow.com/questions/37053379/times-two-faster-than-bit-shift-for-python-3-x-integers
[10] decimal — Decimal fixed point and floating point arithmetic — Python 3.7.4 documentation. https://docs.python.org/3/library/decimal.html
[11] python decimal comparison - Stack Overflow. https://stackoverflow.com/questions/1062008/python-decimal-comparison

posted @ 2019-07-07 21:19  resolvent