[Notes] Google Python Course
The original source is here
Intro and string
- Python sets `__name__` to `__main__` when a file is executed directly
- `from package_name import func as alias` binds `func` to the name `alias` in the current namespace, so you can call it directly without the `package_name.` prefix
- Use `str` instead of the old `string` module
- Use `//` for integer division instead of `/`
- Something on `str`:
  - A raw string: `r'raw_string'` (treats everything literally); a unicode string: `u'unicode_string'`
  - Print without a newline: `print mystr,` (note the trailing comma)
  - `s.lower()`, `s.upper()` -- returns the lowercase or uppercase version of the string
  - `s.strip()` -- returns a string with whitespace removed from the start and end
  - `s.isalpha()`/`s.isdigit()`/`s.isspace()`... -- tests if all the string chars are in the various character classes
  - `s.startswith('other')`, `s.endswith('other')` -- tests if the string starts or ends with the given other string
  - `s.find('other')` -- searches for the given other string (not a regular expression) within s, and returns the first index where it begins or -1 if not found
  - `s.replace('old', 'new')` -- returns a string where all occurrences of 'old' have been replaced by 'new'
  - `s.split('delim')` -- returns a list of substrings separated by the given delimiter. The delimiter is not a regular expression, it's just text. `'aaa,bbb,ccc'.split(',')` -> `['aaa', 'bbb', 'ccc']`. As a convenient special case `s.split()` (with no arguments) splits on all whitespace chars.
  - `s.join(list)` -- opposite of split(), joins the elements in the given list together using the string as the delimiter. e.g. `'---'.join(['aaa', 'bbb', 'ccc'])` -> `aaa---bbb---ccc`
  - `s.encode('utf-8')` -- converts unicode to utf-8. **The built-in print does not work fully with unicode strings.**
  - `after = unicode(before, 'utf-8')` -- converts utf-8 to unicode
  - The reason why Python doesn't have `s.len()` but instead has `s.__len__`: link
  - Note: `my_str[m:n]` does not include `my_str[n]`; it ends at `my_str[n-1]`. However, `my_str[m:]` reaches to the end.
  - `str` is immutable, i.e., cannot be changed. (You can convert it into a list to assign values to items)
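The string methods and slicing rules above can be tried out in a few lines. This is a minimal sketch in Python 3 syntax (the notes themselves use Python 2 `print` statements); the sample strings and variable names are made up for illustration.

```python
# Sample data is illustrative only; Python 3 syntax.
s = '  Hello, World  '

trimmed = s.strip()                   # whitespace removed from both ends
lowered = trimmed.lower()
found_at = trimmed.find('World')      # index where the first match begins
missing_at = trimmed.find('xyz')      # -1 when the substring is absent

parts = 'aaa,bbb,ccc'.split(',')      # split on a literal delimiter
joined = '---'.join(parts)            # join is the opposite of split

head = trimmed[0:5]                   # characters 0..4; index 5 is excluded
tail = trimmed[7:]                    # from index 7 to the end

# str is immutable, so edit a list of characters and join it back.
chars = list(trimmed)
chars[0] = 'J'
edited = ''.join(chars)

print(trimmed, lowered, found_at, missing_at)
print(parts, joined, head, tail, edited)
```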
- Formatted output: `var_to_print = ("%formatter1 %formatter2") % (tuple, to_print)`
- Group multi-line code with `()`:

      # add parens to make the long-line work:
      text = ("%d little pigs come out or I'll %s and %s and %s" %
              (3, 'huff', 'puff', 'blow down'))

- Do not put the boolean test in parentheses, e.g., `if some_boolean_exp:` (note the trailing colon)
- Difference between `del` and setting to `None`: link
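As a small illustration of the `%` formatting and the `del` versus `None` point above, here is a sketch in Python 3 syntax. The variable names are made up; the behavior shown (assigning `None` keeps the name bound, `del` removes the name) is standard Python.

```python
# % formatting with a tuple of values, wrapped in parentheses
# so the expression can span two lines.
text = ("%d little pigs come out or I'll %s and %s and %s" %
        (3, 'huff', 'puff', 'blow down'))
print(text)

# Setting a name to None keeps the name; it now refers to None.
value = 'data'
value = None
still_bound = value is None

# del removes the name itself, so using it afterwards raises NameError.
del value
try:
    value
    name_removed = False
except NameError:
    name_removed = True

print(still_bound, name_removed)
```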
List, tuple and sorting
- List assignment with `=` only makes the new name point to the old list (no copy is made)
- Examples for `for` and `in` (use `for` in favor of your own loop. Use `range` to generate loop indices if you need):

      list = ['larry', 'curly', 'moe']
      if 'curly' in list:
          print 'yay'

      ## print the numbers from 0 through 99
      for i in range(100):
          print i

- Example on `while`:

      ## Access every 3rd element in a list
      i = 0
      while i < len(a):
          print a[i]
          i = i + 3

- List methods:
  - `list.append(elem)` -- adds a single element to the end of the list. Common error: does not return the new list, just modifies the original.
  - `list.insert(index, elem)` -- inserts the element at the given index, shifting elements to the right.
  - `list.extend(list2)` -- adds the elements in list2 to the end of the list. Using + or += on a list is similar to using extend().
  - `list.index(elem)` -- searches for the given element from the start of the list and returns its index. Throws a ValueError if the element does not appear (use "in" to check without a ValueError).
  - `list.remove(elem)` -- searches for the first instance of the given element and removes it (throws ValueError if not present)
  - `list.sort()` -- sorts the list in place (does not return it). (The sorted() function shown below is preferred.)
  - `list.reverse()` -- reverses the list in place (does not return it)
  - `list.pop(index)` -- removes and returns the element at the given index. Returns the rightmost element if index is omitted (roughly the opposite of append()).
  - Common error: note that the above methods **do not** return the modified list, they just modify the original list and return None.
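The aliasing and in-place behavior described above can be checked with a short sketch (Python 3 syntax; the list contents and names are illustrative):

```python
colors = ['red', 'blue']
alias = colors                     # no copy: both names refer to one list
snapshot = colors[:]               # a slice creates a new, independent list

result = colors.append('green')    # modifies in place and returns None
colors.insert(0, 'pink')           # shifts existing elements to the right
colors.extend(['black', 'white'])  # adds each element of the other list
position = colors.index('blue')    # index of the first occurrence
colors.remove('black')             # removes the first occurrence
last = colors.pop()                # removes and returns the rightmost element

print(result, position, last)
print(colors, alias is colors, snapshot)
```

Because `alias` and `colors` are the same object, every change is visible through both names, while `snapshot` keeps the original two elements.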
- Favor `new_list = sorted(list, key=func)` over `list.sort(key=func)` (which doesn't return the sorted list). `sorted()` works on any iterable while `sort()` is a list method only. However, `sort()` is slightly faster than `sorted()` if the elements to sort are already in a list. `key` maps each element to a "proxy" value that is used for the comparison
- Python's sort is stable: elements that compare equal keep their original relative order. So sorting an alphabetically ordered list by length leaves the elements in alphabetical order when the length is equal.

      ## "key" argument specifying str.lower function to use for sorting
      print sorted(strs, key=str.lower)  ## ['aa', 'BB', 'CC', 'zz']

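A quick sketch of `sorted()` with `key` and of sort stability, in Python 3 syntax. The word lists are invented for the example.

```python
strs = ['ccc', 'aaaa', 'd', 'bb', 'a', 'zz']

# Stable sort: words of equal length keep their original relative order.
by_length = sorted(strs, key=len)

# Sort alphabetically first, then by length: ties stay alphabetical.
alpha_then_length = sorted(sorted(strs), key=len)

# key=str.lower compares lowercase "proxy" values, ignoring case.
case_insensitive = sorted(['zz', 'BB', 'aa', 'CC'], key=str.lower)

# sorted() accepts any iterable, for example a tuple.
from_tuple = sorted(('b', 'c', 'a'))

# list.sort() works in place and returns None.
in_place_result = strs.sort()

print(by_length)
print(alpha_then_length)
print(case_insensitive, from_tuple, in_place_result, strs)
```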
- Tuple `(elem1, elem2...)` is immutable but can contain mutable elements (like a list). A tuple only holds references: the tuple itself cannot be changed, but an object it refers to can still be changed through that object's own methods. See this for explanation.
- To create a size-1 tuple, the lone element must be followed by a comma: `tuple = ('hi',)  ## size-1 tuple`
- A way of assigning with a tuple: `(x, y, z) = (42, 13, "hike")`
- List comprehension:
  - Example:

        nums = [1, 2, 3, 4]
        squares = [ n * n for n in nums ]   ## [1, 4, 9, 16]

  - Conditional evaluation:

        ## Select values <= 2
        nums = [2, 8, 1, 6]
        small = [ n for n in nums if n <= 2 ]  ## [2, 1]

        ## Select fruits containing 'a', change to upper case
        fruits = ['apple', 'cherry', 'bannana', 'lemon']
        afruits = [ s.upper() for s in fruits if 'a' in s ]  # note the "if"
        ## ['APPLE', 'BANNANA']
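The tuple and list comprehension notes above can be sketched as follows (Python 3 syntax; the data and names are illustrative):

```python
# A tuple is immutable, but a list stored inside it can still change.
record = ('larry', [90, 85])
record[1].append(77)              # allowed: mutates the inner list
try:
    record[0] = 'moe'             # not allowed: tuples reject item assignment
    assignment_failed = False
except TypeError:
    assignment_failed = True

single = ('hi',)                  # the trailing comma makes a size-1 tuple
just_a_string = ('hi')            # without the comma this is only a string

x, y, z = (42, 13, 'hike')        # tuple assignment unpacks by position

# Comprehension with a condition: squares of the even numbers below 10.
even_squares = [n * n for n in range(10) if n % 2 == 0]

print(record, assignment_failed, single, just_a_string)
print(x, y, z, even_squares)
```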
Dict, Hash, and Files
- Looping through the keys of a `dict` happens in an arbitrary order. Use `for key in sorted(dict.keys())` to loop in sorted key order
- More `dict` examples:

      ## Get the .keys() list:
      print dict.keys()  ## ['a', 'o', 'g']

      ## Likewise, there's a .values() list of values
      print dict.values()  ## ['alpha', 'omega', 'gamma']

      ## Common case -- loop over the keys in sorted order,
      ## accessing each key/value
      for key in sorted(dict.keys()):
          print key, dict[key]

      ## .items() is the dict expressed as (key, value) tuples
      print dict.items()  ## [('a', 'alpha'), ('o', 'omega'), ('g', 'gamma')]

      ## This loop syntax accesses the whole dict by looping
      ## over the .items() tuple list, accessing one (key, value)
      ## pair on each iteration.
      for k, v in dict.items(): print k, '>', v
      ## a > alpha    o > omega     g > gamma

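For comparison, the same sorted-key loop can be written in Python 3 syntax. This is only a sketch; the dictionary name `greek` is made up, and collecting the output in a list is a choice made here so the result is easy to inspect.

```python
greek = {'o': 'omega', 'a': 'alpha', 'g': 'gamma'}

# Loop over the keys in sorted order, accessing each key/value.
lines = []
for key in sorted(greek.keys()):
    lines.append('%s > %s' % (key, greek[key]))

# .items() yields (key, value) pairs; sorting them orders by key.
pairs = sorted(greek.items())

print(lines)
print(pairs)
```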
- `iterkeys()`, `itervalues()` and `iteritems()` are slightly faster because they return iterators instead of building full lists
- `dict` formatted output:

      hash = {}
      hash['word'] = 'garfield'
      hash['count'] = 42
      s = 'I want %(count)d copies of %(word)s' % hash  # %d for int, %s for string
      # 'I want 42 copies of garfield'

- Difference between `del` and `None`: this. `del` can also be used to delete list and dict entries
- File open modes: `rU`: convert whatever EOL to '\n', `r`: read, `w`: overwrite, `a`: append
- `f.readlines()` reads the whole file into memory as a list of lines, `f.read()` reads the whole file into a single string
- `import codecs` for reading a unicode file
- `sys.exit(0)`: exit the program immediately (status 0 means success, a non-zero status signals an error)
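The file and dict notes above fit together in a small word-count sketch. It is written in Python 3, where the built-in `open()` accepts an `encoding` argument (an assumption of this example; the notes use `codecs` under Python 2). The temporary file, its contents, and all variable names are hypothetical.

```python
import os
import shutil
import tempfile

workdir = tempfile.mkdtemp()                 # scratch directory for the demo
path = os.path.join(workdir, 'sample.txt')   # hypothetical file name

# 'w' overwrites (or creates) the file.
with open(path, 'w', encoding='utf-8') as f:
    f.write('caf\u00e9 garfield\ngarfield odie\n')

# readlines() gives a list of lines; read() gives one string.
with open(path, 'r', encoding='utf-8') as f:
    lines = f.readlines()
with open(path, 'r', encoding='utf-8') as f:
    text = f.read()

# 'a' appends to the end of the existing file.
with open(path, 'a', encoding='utf-8') as f:
    f.write('nermal\n')
with open(path, 'r', encoding='utf-8') as f:
    appended = f.read()

# Count words in a dict, then delete one entry with del.
counts = {}
for word in text.split():
    counts[word] = counts.get(word, 0) + 1
del counts['odie']

# Dict-based formatting, as in the garfield example.
report = 'I want %(count)d copies of %(word)s' % {
    'word': 'garfield', 'count': counts['garfield']}

shutil.rmtree(workdir)                       # clean up the scratch directory
print(lines, counts, report)
```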
Regex
import re
# use an r'' raw string so backslashes reach the regex engine unchanged
match = re.search(r'pattern', str)  # returns a match object for the first match only, or None
matches = re.findall(r'pattern', str)  # returns a list of all matching strings
- Python's regex syntax is Perl-style, similar to Perl Compatible Regular Expressions (PCRE) though not identical
- a, X, 9, < -- ordinary characters just match themselves exactly. The meta-characters which do not match themselves because they have special meanings are: . ^ $ * + ? { [ ] \ | ( ) (details below)
- `.` (a period) -- matches any single character except newline '\n'
- `\w` -- (lowercase w) matches a "word" character: a letter or digit or underbar [a-zA-Z0-9_]. Note that although "word" is the mnemonic for this, it only matches a single word char, not a whole word. `\W` (upper case W) matches any non-word character.
- `\b` -- boundary between word and non-word
- `\s` -- (lowercase s) matches a single whitespace character -- space, newline, return, tab, form feed [ \n\r\t\f]. `\S` (upper case S) matches any non-whitespace character.
- `\t`, `\n`, `\r` -- tab, newline, return
- `\d` -- decimal digit [0-9] (some older regex utilities do not support \d, but they all support \w and \s)
- `^` = start, `$` = end -- match the start or end of the string
- `\` -- inhibit the "specialness" of a character. So, for example, use `\.` to match a period or `\\` to match a backslash. If you are unsure if a character has special meaning, such as '@', you can put a backslash in front of it, `\@`, to make sure it is treated just as a character.
- `+` -- 1 or more occurrences of the pattern to its left, e.g. 'i+' = one or more i's
- `*` -- 0 or more occurrences of the pattern to its left
- `?` -- match 0 or 1 occurrences of the pattern to its left
- Regex matching is greedy by default; add a trailing `?` (as in `*?` or `+?`) to do non-greedy matching (stop as soon as you can)
- `[]` character set. A leading `^` inverts the set and `-` indicates a range
- `()` groups patterns for output extraction; use `(?: )` to make a group non-capturing
- Example:

      str = 'purple alice@google.com, blah monkey bob@abc.com blah dishwasher'
      tuples = re.findall(r'([\w\.-]+)@([\w\.-]+)', str)
      print tuples  ## [('alice', 'google.com'), ('bob', 'abc.com')]
      for tuple in tuples:
          print tuple[0]  ## username
          print tuple[1]  ## host

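The difference between `re.search` and `re.findall`, greedy versus non-greedy matching, and capturing versus non-capturing groups can be seen in a short sketch (Python 3 syntax). The email text is the one used above; the HTML snippet and variable names are made up for illustration.

```python
import re

text = 'purple alice@google.com, blah monkey bob@abc.com blah dishwasher'

# search() stops at the first match and returns a match object (or None).
first = re.search(r'[\w.-]+@[\w.-]+', text)
first_email = first.group() if first else None

# findall() with two groups returns a list of (group1, group2) tuples.
pairs = re.findall(r'([\w.-]+)@([\w.-]+)', text)

# (?: ) groups without capturing, so only the host group is returned.
hosts = re.findall(r'(?:[\w.-]+)@([\w.-]+)', text)

# Greedy .* runs to the last '>', non-greedy .*? stops at the first one.
html = '<b>foo</b> and <i>so on</i>'
greedy = re.search(r'<.*>', html).group()
lazy = re.search(r'<.*?>', html).group()

print(first_email, pairs, hosts)
print(greedy, lazy)
```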
- flags `(r'pattern', str, flag)`:
  - `re.IGNORECASE` -- ignore upper/lowercase differences for matching, so 'a' matches both 'a' and 'A'.
  - `re.DOTALL` -- allow dot (.) to match newline -- normally it matches anything but newline. This can trip you up -- you think .* matches everything, but by default it does not go past the end of a line. Note that `\s` (whitespace) includes newlines, so if you want to match a run of whitespace that may include a newline, you can just use `\s*`
  - `re.MULTILINE` -- Within a string made of many lines, allow `^` and `$` to match the start and end of each line. Normally `^`/`$` would just match the start and end of the whole string.
- `re.sub(pat, replacement, str)` substitution:

      str = 'purple alice@google.com, blah monkey bob@abc.com blah dishwasher'
      ## re.sub(pat, replacement, str) -- returns new string with all replacements,
      ## \1 is group(1), \2 group(2) in the replacement
      print re.sub(r'([\w\.-]+)@([\w\.-]+)', r'\1@yo-yo-dyne.com', str)
      ## purple alice@yo-yo-dyne.com, blah monkey bob@yo-yo-dyne.com blah dishwasher
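A brief sketch of the three flags and of `re.sub`, in Python 3 syntax. The two-line sample string and the variable names are illustrative.

```python
import re

# IGNORECASE: 'python' also matches 'PYTHON'.
found = re.search(r'python', 'Google PYTHON Course', re.IGNORECASE).group()

two_lines = 'first line\nsecond line'

# Without DOTALL the dot stops at the newline, so there is no match.
no_dotall = re.search(r'first.*second', two_lines)
with_dotall = re.search(r'first.*second', two_lines, re.DOTALL).group()

# Without MULTILINE ^ matches only at the start of the whole string.
starts_default = re.findall(r'^\w+', two_lines)
starts_multiline = re.findall(r'^\w+', two_lines, re.MULTILINE)

# sub(): \1 in the replacement refers to the first captured group.
text = 'purple alice@google.com, blah monkey bob@abc.com blah dishwasher'
replaced = re.sub(r'([\w.-]+)@([\w.-]+)', r'\1@yo-yo-dyne.com', text)

print(found, no_dotall, repr(with_dotall))
print(starts_default, starts_multiline)
print(replaced)
```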
Utils
The os and os.path modules include many functions to interact with the file system. The shutil module can copy files.
- `filenames = os.listdir(dir)` -- list of filenames in that directory path (not including . and ..). The filenames are just the names in the directory, not their absolute paths.
- `os.path.join(dir, filename)` -- given a filename from the above list, use this to put the dir and filename together to make a path
- `os.path.abspath(path)` -- given a path, return an absolute form, e.g. /home/nick/foo/bar.html
- `os.path.dirname(path)`, `os.path.basename(path)` -- given dir/foo/bar.html, return the dirname "dir/foo" and basename "bar.html"
- `os.path.exists(path)` -- true if it exists
- `os.mkdir(dir_path)` -- makes one dir, `os.makedirs(dir_path)` makes all the needed dirs in this path
- `shutil.copy(source-path, dest-path)` -- copy a file (dest path directories should exist)
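The calls listed above can be exercised safely inside a temporary directory. This is a minimal sketch in Python 3 syntax; the directory layout and file names are hypothetical, and `tempfile` and `shutil.rmtree` are used here only to set up and clean the scratch area.

```python
import os
import shutil
import tempfile

root = tempfile.mkdtemp()                    # scratch directory for the demo
nested = os.path.join(root, 'a', 'b')
os.makedirs(nested)                          # creates both 'a' and 'a/b'

source = os.path.join(nested, 'foo.txt')     # hypothetical file
with open(source, 'w') as f:
    f.write('hello')

dest = os.path.join(root, 'foo_copy.txt')
shutil.copy(source, dest)                    # dest directory must already exist

names = sorted(os.listdir(root))             # bare names, not full paths
paths = [os.path.join(root, name) for name in names]

source_dir = os.path.dirname(source)
source_name = os.path.basename(source)
copy_exists = os.path.exists(dest)
all_absolute = all(os.path.isabs(os.path.abspath(p)) for p in paths)

shutil.rmtree(root)                          # remove the scratch directory
print(names, source_name, copy_exists, all_absolute)
```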
The commands module is a simple way to run an external command and capture its output.
- commands module docs
- `(status, output) = commands.getstatusoutput(cmd)` -- runs the command, waits for it to exit, and returns its status int and output text as a tuple. The command is run with its standard output and standard error combined into the one output text. The status will be non-zero if the command failed. Since the standard-err of the command is captured, if it fails, we need to print some indication of what happened.
- `output = commands.getoutput(cmd)` -- as above, but without the status int.
- There is a `commands.getstatus()` but it does something else, so don't use it -- dumbest bit of method naming ever!
- If you want more control over the running of the sub-process, see the "popen2" module (http://docs.python.org/lib/module-popen2.html)
- There is also a simple `os.system(cmd)` which runs the command and dumps its output onto your output and returns its error code. This works if you want to run the command but do not need to capture its output into your python data structures.
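The `commands` module exists only in Python 2. In Python 3 the same two helpers are available as `subprocess.getstatusoutput` and `subprocess.getoutput`, so a sketch of the pattern above might look like this. The `echo` and `exit` commands are assumptions chosen because most shells provide them.

```python
import subprocess
import sys

# Run a command through the shell and capture (status, output) together.
status, output = subprocess.getstatusoutput('echo hello')
if status != 0:
    # stderr is merged into output, so report it ourselves on failure.
    sys.stderr.write('command failed: ' + output + '\n')

# Same call without the status int.
output_only = subprocess.getoutput('echo hello')

# A failing command yields a non-zero status.
failed_status, failed_output = subprocess.getstatusoutput('exit 3')

print(status, output, output_only, failed_status)
```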
Python debugger pdb
Exception handling try/except:
    try:
        ## Either of these two lines could throw an IOError, say
        ## if the file does not exist or the read() encounters a low level error.
        f = open(filename, 'rU')
        text = f.read()
        f.close()
    except IOError:
        ## Control jumps directly to here if any of the above lines throws IOError.
        sys.stderr.write('problem reading:' + filename)
    ## In any case, the code then continues with the line after the try/except
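The same pattern can be wrapped in a small helper. This is a sketch in Python 3 syntax, where `IOError` is an alias of `OSError`; the function name `read_text` and the file paths are hypothetical, and the `with` statement is a substitution for the explicit `f.close()` above.

```python
import os
import sys
import tempfile


def read_text(filename):
    """Return the file's text, or None if it cannot be read."""
    try:
        # 'with' closes the file even if read() raises.
        with open(filename, 'r') as f:
            return f.read()
    except IOError:
        # Control jumps here if open() or read() raises IOError.
        sys.stderr.write('problem reading: ' + filename + '\n')
        return None


# A path that should not exist, to trigger the except branch.
missing = read_text(os.path.join(tempfile.gettempdir(), 'no_such_dir', 'x.txt'))

# A real temporary file, to show the normal path.
handle, path = tempfile.mkstemp()
os.close(handle)
with open(path, 'w') as f:
    f.write('hello')
present = read_text(path)
os.remove(path)

print(missing, present)
```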
The module urllib provides url fetching -- making a url look like a file you can read from. The urlparse module can take apart and put together urls.
- urllib module docs
- ufile = urllib.urlopen(url) -- returns a file like object for that url
- text = ufile.read() -- can read from it, like a file (readlines() etc. also work)
- info = ufile.info() -- the meta info for that request. info.gettype() is the MIME type, e.g. 'text/html'
- baseurl = ufile.geturl() -- gets the "base" url for the request, which may be different from the original because of redirects
- urllib.urlretrieve(url, filename) -- downloads the url data to the given file path
- urlparse.urljoin(baseurl, url) -- given a url that may or may not be full, and the baseurl of the page it comes from, return a full url. Use geturl() above to provide the base url.
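To illustrate `urljoin` without touching the network, here is a sketch in Python 3, where the `urlparse` functions live in `urllib.parse`. The `example.com` URLs are placeholders, not pages referenced by these notes.

```python
from urllib.parse import urljoin, urlparse

base_url = 'http://example.com/a/b.html'     # placeholder page address

# A relative link is resolved against the directory of the base url.
relative = urljoin(base_url, 'c.html')

# A link starting with '/' is resolved against the host root.
rooted = urljoin(base_url, '/img/x.png')

# A link that is already a full url is returned unchanged.
absolute = urljoin(base_url, 'http://other.org/z')

# urlparse() takes a url apart into named pieces.
pieces = urlparse('http://example.com/a/b.html?x=1')

print(relative, rooted, absolute)
print(pieces.scheme, pieces.netloc, pieces.path, pieces.query)
```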