Strings

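3.1.2. Strings
Python can manipulate strings. A string is enclosed in single quotes ('...') or double quotes ("..."), and a backslash (\) can be used to escape quotes.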

>>> 'spam eggs' # single quotes
'spam eggs'
>>> 'doesn\'t' # use \' to escape the single quote...
"doesn't"
>>> "doesn't" # ...or use double quotes instead
"doesn't"
>>> '"Yes," he said.'
'"Yes," he said.'
>>> "\"Yes,\" he said."
'"Yes," he said.'
>>> '"Isn\'t," she said.'
'"Isn\'t," she said.'
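As a quick check of the quoting rules above, the following sketch compares two spellings of the same text. The variable names are illustrative only, and print is called with a single parenthesized argument so that the snippet should run unchanged on both Python 2 and Python 3.

```python
# Two spellings of the same seven-character string.
escaped = 'doesn\'t'        # single quotes, inner quote escaped
double_quoted = "doesn't"   # double quotes, no escape needed
assert escaped == double_quoted
assert len(escaped) == 7    # the backslash is not part of the string

# The same idea with double quotes inside the text.
a = '"Yes," he said.'
b = "\"Yes,\" he said."
assert a == b

print(escaped)
print(a)
```

The choice of quote character only affects how the literal is written, not the string that results.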

 

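In interactive mode, the output string is enclosed in quotes and special characters are escaped with a backslash (\). If the string contains a single quote and no double quotes, it is enclosed in double quotes; otherwise it is enclosed in single quotes. The print statement produces more readable output: it omits the enclosing quotes and prints escaped and special characters as the characters they stand for.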

>>> '"Isn\'t," she said.'
'"Isn\'t," she said.'
>>> print '"Isn\'t," she said.'
"Isn't," she said.
>>> s = 'First line.\nSecond line.' # \n means newline
>>> s # without print, \n is included in the output
'First line.\nSecond line.'
>>> print s # with print, \n produces a new line
First line.
Second line.
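The difference between the echoed form and the printed form can also be inspected in a script. This is a minimal sketch that assumes the built-in repr(), which returns the quoted, escaped form that the interactive prompt displays; the variable names are illustrative.

```python
s = 'First line.\nSecond line.'

# repr() gives the quoted, escaped form that the interactive prompt echoes.
echoed = repr(s)
assert echoed == "'First line.\\nSecond line.'"

# The string itself contains one real newline character.
assert '\n' in s

# With a single quote and no double quotes, repr() picks double quotes.
assert repr("doesn't") == '"doesn\'t"'

print(echoed)  # quoted form, with \n shown as two characters
print(s)       # readable form, printed on two lines
```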

If you do not want characters prefaced by \ to be interpreted as special characters, you can use raw strings by adding an r before the first quote.

>>> print 'C:\some\name' # here \n means newline!
C:\some
ame
>>> print r'C:\some\name' # note the r before the quote
C:\some\name

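String literals can span multiple lines. One way is to use triple quotes: """...""" or '''...'''. The text between the quotes is kept as written, and end-of-line characters are automatically included in the string; you can prevent this by adding a \ at the end of the line.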

print """\
Usage: thingy [OPTIONS]
-h Display this usage message
-H hostname Hostname to connect to
"""

This produces the following output. Note that the initial newline is not included.

Usage: thingy [OPTIONS]
-h Display this usage message
-H hostname Hostname to connect to
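To make the effect of the trailing backslash visible, the following sketch builds the same text with and without it and compares the results. The variable names are illustrative.

```python
with_backslash = """\
Usage: thingy [OPTIONS]
"""

without_backslash = """
Usage: thingy [OPTIONS]
"""

# The backslash right after the opening quotes removes the first newline.
assert with_backslash == 'Usage: thingy [OPTIONS]\n'
assert without_backslash == '\nUsage: thingy [OPTIONS]\n'
assert len(without_backslash) == len(with_backslash) + 1

print(repr(with_backslash))
print(repr(without_backslash))
```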

Strings can be concatenated (glued together) with the + operator and repeated with the * operator.

>>> # 3 times 'un', followed by 'ium'
>>> 3 * 'un' + 'ium'
'unununium'

Two or more string literals (that is, the ones enclosed in quotes) placed next to each other are automatically concatenated.

>>> 'Py' 'thon'
'Python'

This only works with literals; it does not work with variables or expressions.

>>> prefix = 'Py'
>>> prefix 'thon' # can't concatenate a variable and a string literal
...
SyntaxError: invalid syntax
>>> ('un' * 3) 'ium'
...
SyntaxError: invalid syntax

If you want to concatenate variables, or a variable and a literal, use the + operator.

>>> prefix + 'thon'
'Python'

Automatic concatenation of adjacent literals is particularly useful when you want to break up a long string.

>>> text = ('Put several strings within parentheses '
...         'to have them joined together.')
>>> text
'Put several strings within parentheses to have them joined together.'
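One detail worth checking when splitting a long string this way is that adjacent literals are joined exactly as written, with no separator added. The sketch below illustrates that with a deliberately broken variant; the name missing_space is made up for the example.

```python
# Adjacent literals are joined exactly as written: no space is inserted.
missing_space = ('Put several strings within parentheses'
                 'to have them joined together.')
assert 'parenthesesto' in missing_space

# Keeping the trailing space inside the first literal gives the intended text.
text = ('Put several strings within parentheses '
        'to have them joined together.')
assert text == 'Put several strings within parentheses to have them joined together.'

# Variables and expressions still need the + operator.
prefix = 'Py'
assert prefix + 'thon' == 'Python'
assert 3 * 'un' + 'ium' == 'unununium'
```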

Strings can be indexed (subscripted), with the first character having index 0.

>>> word = 'Python'
>>> word[0] # character in position 0
'P'
>>> word[5] # character in position 5
'n'

Indices may also be negative numbers, which count from the right. Note that since -0 is the same as 0, negative indices start from -1.

>>> word[-1] # last character
'n'
>>> word[-2] # second-last character
'o'
>>> word[-6]
'P'
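The relationship between negative and non-negative indices can be checked directly. This sketch relies on the built-in len(), which returns the length of a string, and is only an illustration of the rule above.

```python
word = 'Python'
n = len(word)  # 6

# For k from 1 to n, word[-k] is the same character as word[n - k].
for k in range(1, n + 1):
    assert word[-k] == word[n - k]

assert word[-1] == word[5] == 'n'
assert word[-6] == word[0] == 'P'
```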

In addition to indexing, strings also support slicing. While indexing is used to obtain individual characters, slicing lets you obtain a substring.

>>> word[0:2] # characters from position 0 (included) to 2 (excluded)
'Py'
>>> word[2:5] # characters from position 2 (included) to 5 (excluded)
'tho'

Note how the start is always included and the end is always excluded. This ensures that s[:i] + s[i:] is always equal to s.

>>> word[:2] + word[2:]
'Python'
>>> word[:4] + word[4:]
'Python'
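The identity s[:i] + s[i:] == s can be verified for a range of values of i. The sketch below also tries negative and out-of-range values, on the understanding that slicing tolerates them; the range of -10 to 10 is an arbitrary choice for the example.

```python
word = 'Python'

# Split the word at every position from -10 to 10 and glue it back together.
for i in range(-10, 11):
    assert word[:i] + word[i:] == word

# Two concrete splits, one with a negative index.
assert (word[:2], word[2:]) == ('Py', 'thon')
assert (word[:-2], word[-2:]) == ('Pyth', 'on')
```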

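Slice indices have useful defaults: an omitted first index defaults to zero, and an omitted second index defaults to the length of the string being sliced.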

>>> word[:2] # character from the beginning to position 2 (excluded)
'Py'
>>> word[4:] # characters from position 4 (included) to the end
'on'
>>> word[-2:] # characters from the second-last (included) to the end
'on'

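One way to remember how slices work is to think of the indices as pointing between characters, with the left edge of the first character numbered 0. The right edge of the last character of a string of n characters then has index n.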

 +---+---+---+---+---+---+
 | P | y | t | h | o | n |
 +---+---+---+---+---+---+
 0   1   2   3   4   5   6
-6  -5  -4  -3  -2  -1

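The first row of numbers gives the positions of the indices 0 to 6 in the string; the second row gives the corresponding negative indices. The slice from i to j consists of all characters between the edges labeled i and j.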

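For non-negative indices, the length of a slice is the difference of the indices, provided both are within bounds. For example, the length of word[1:3] is 2. Attempting to index with a number that is too large results in an error.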

>>> word[42] # the word only has 6 characters
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: string index out of range

However, out-of-range slice indices are handled gracefully when used for slicing.

>>> word[4:42]
'on'
>>> word[42:]
''
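The contrast between indexing and slicing can be put side by side in a short script. This is a sketch only: it uses a try/except block to catch the IndexError, and the flag name raised is made up for the example.

```python
word = 'Python'

# In-bounds, non-negative indices: the slice length is j - i.
i, j = 1, 3
assert len(word[i:j]) == j - i

# Indexing past the end raises IndexError, which can be caught.
try:
    word[42]
    raised = False
except IndexError:
    raised = True
assert raised

# Slicing clamps out-of-range bounds, so the result can be shorter than j - i.
assert word[4:42] == 'on'
assert word[42:] == ''
assert len(word[4:42]) == 2
```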

Python strings cannot be changed: they are immutable. Therefore, assigning to an indexed position in a string results in an error.

>>> word[0] = 'J'
...
TypeError: 'str' object does not support item assignment
>>> word[2:] = 'py'
...
TypeError: 'str' object does not support item assignment

If you need a different string, you should create a new one.

>>> 'J' + word[1:]
'Jython'
>>> word[:2] + 'py'
'Pypy'
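The same idea can be wrapped in a small function. The helper below, replace_char, is hypothetical and not part of Python or of the original text; it is a sketch that assumes a non-negative index within the bounds of the string.

```python
def replace_char(s, index, new_char):
    """Return a copy of s with the character at index replaced.

    Hypothetical helper for illustration; assumes 0 <= index < len(s).
    """
    # Slicing and + build a new string; s itself is never modified.
    return s[:index] + new_char + s[index + 1:]


word = 'Python'
changed = replace_char(word, 0, 'J')
assert changed == 'Jython'
assert word == 'Python'   # the original string is unchanged
print(changed)
```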

The built-in function len() returns the length of a string.

>>> s = 'supercalifragilisticexpialidocious'
>>> len(s)
34

 

posted @ 2015-06-21 21:04  xiaolong92