The urlencode and quote functions in Python

from urllib import urlencode, quote, quote_plus    # Python 2 (the examples below were run on Python 2.7)

 

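1. urlencode:

Commonly used to turn parameters into the query string of a URL. Rules:

It accepts either a list of pairs, [(key1, value1), (key2, value2), ...], or a dict, {'key1': 'value1', 'key2': 'value2', ...}.
It returns a string of the form key2=value2&key1=value1. With a dict, the order of the pairs is not guaranteed in Python 2.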

    from urllib import urlencode
    # dict input: values are UTF-8 byte strings
    params = {'name': u'老王'.encode('utf8'), 'sex': u'男'.encode('utf8')}
    urlencode(params)
    # out:    name=%E8%80%81%E7%8E%8B&sex=%E7%94%B7
    # list-of-pairs input: same result, with the order fixed by the list
    params = [('name', u'老王'.encode('utf8')), ('sex', u'男'.encode('utf8'))]
    urlencode(params)
    # out:    name=%E8%80%81%E7%8E%8B&sex=%E7%94%B7

 

    from urllib import urlencode
    params = [('name', u'老王'.encode('utf8')), ('sex', u'男'.encode('utf8'))]
    new_params = urlencode(params)
    url = 'http://www.hello.world/'
    # append the encoded query string to the base URL
    url = url + '?' + new_params
    print url
    # out:       http://www.hello.world/?name=%E8%80%81%E7%8E%8B&sex=%E7%94%B7
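For readers on Python 3, the same URL can be sketched as follows. This is an assumption beyond the examples above, which target Python 2: in Python 3 the function lives in urllib.parse and percent-encodes str values as UTF-8 by default, so no manual .encode() call is needed. The host name is the same placeholder as above.

```python
from urllib.parse import urlencode

# A list of pairs keeps the parameter order explicit.
params = [('name', '老王'), ('sex', '男')]

# str values are percent-encoded as UTF-8 by default in Python 3.
query = urlencode(params)

url = 'http://www.hello.world/' + '?' + query
print(url)
```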

 

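2. quote:

Percent-encodes the non-ASCII characters in a URL; passing the argument "safe=string.printable" is enough.

Rule: printable ASCII characters are left unencoded; non-ASCII characters are percent-encoded.

For example: 'http://www.hello.world/你好世界' --->>> 'http://www.hello.world/%E4%BD%A0%E5%A5%BD%E4%B8%96%E7%95%8C'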

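    from urllib import quote
    import string
    # byte string in the encoding of the source file (UTF-8 here)
    url = 'http://www.hello.world/你好世界'
    # every printable ASCII character is marked safe, so only the non-ASCII bytes are escaped
    url_encode = quote(url, safe=string.printable)
    print url_encode
    # out:      http://www.hello.world/%E4%BD%A0%E5%A5%BD%E4%B8%96%E7%95%8C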
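One consequence worth seeing: string.printable also contains the space and '%', so safe=string.printable passes those through unescaped. The sketch below is written for Python 3 (urllib.parse, an assumption beyond the Python 2 code above) and compares it with a narrower safe set; the set ':/?&=#' is an illustrative choice, not something the original post prescribes.

```python
import string
from urllib.parse import quote

url = 'http://www.hello.world/你好 世界'   # note the space in the path

# safe=string.printable: only non-ASCII bytes are escaped, the space survives.
loose = quote(url, safe=string.printable)

# Narrower safe set (hypothetical choice): keep common URL delimiters,
# escape everything else that quote does not always treat as safe.
strict = quote(url, safe=':/?&=#')

print(loose)
print(strict)
```

Which set fits depends on whether the input is a whole URL or a single path or query component; for a single component, delimiters such as '/' or '&' usually need escaping too.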

 

3. quote_plus:

Similar to quote, except that spaces in the argument are converted to +.
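The difference is easiest to see side by side. A minimal sketch, assuming Python 3's urllib.parse (the Python 2 functions in urllib have the same default safe values); the sample string is made up for illustration.

```python
from urllib.parse import quote, quote_plus

text = 'hello world/a+b'   # illustrative input, not from the original post

# quote: space -> %20, '/' is kept (default safe='/'), '+' -> %2B
quoted = quote(text)

# quote_plus: space -> '+', '/' -> %2F (default safe=''), literal '+' -> %2B
plus_quoted = quote_plus(text)

print(quoted)
print(plus_quoted)
```

quote_plus is the form expected in query strings and form data, which is why urlencode uses it for keys and values.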

 

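4. Whether you use urlencode or quote, the text must first be converted to a byte string in a character encoding the web page can use, such as utf8, gbk, or gb2312. It must not be a unicode object!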

    str_ = u"http://www.baidu.com/这是网页的字符"   # a unicode object: quoting it without encoding first raises an error
    print quote(str_, safe=string.printable)
    # OUT:  raises KeyError: u'\u8fd9'
    Traceback (most recent call last):
      File "/home/~/Desktop/scripts/tmp/tmp/tmp1.py", line 13, in <module>
        print quote(str_, safe=string.printable)
      File "/usr/lib/python2.7/urllib.py", line 1298, in quote
        return ''.join(map(quoter, s))
    KeyError: u'\u8fd9'

 

 

    str_ = "http://www.baidu.com/这是网页的字符"    # a byte string already in a web encoding (UTF-8 here) is quoted normally
    print quote(str_, safe=string.printable)
    # OUT:  http://www.baidu.com/%E8%BF%99%E6%98%AF%E7%BD%91%E9%A1%B5%E7%9A%84%E5%AD%97%E7%AC%A6

 

    str_ = "http://www.baidu.com/这是网页的字符".decode("utf8").encode("gbk")    # re-encoded as GBK, also a web encoding, so quoting works
    print quote(str_, safe=string.printable)
    # OUT:  http://www.baidu.com/%D5%E2%CA%C7%CD%F8%D2%B3%B5%C4%D7%D6%B7%FB

 

    str_ = "http://www.baidu.com/这是网页的字符".decode("utf8").encode("gb2312")    # re-encoded as GB2312, also a web encoding, so quoting works
    print quote(str_, safe=string.printable)
    # OUT:  http://www.baidu.com/%D5%E2%CA%C7%CD%F8%D2%B3%B5%C4%D7%D6%B7%FB
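In Python 3 the situation is reversed: quote accepts str directly and takes the target encoding as a parameter, so the decode/encode step is not written by hand. The following is a sketch under that assumption (Python 3 and urllib.parse are not part of the original examples), using the same sample text.

```python
from urllib.parse import quote, unquote

path = '这是网页的字符'

as_utf8 = quote(path)                  # UTF-8 is the default encoding
as_gbk = quote(path, encoding='gbk')   # same text, percent-encoded GBK bytes

print(as_utf8)
print(as_gbk)

# Decoding must use the same encoding that was used for quoting.
round_trip = unquote(as_gbk, encoding='gbk')
print(round_trip == path)
```

The two outputs differ because the bytes differ, so the receiving side has to know which encoding the sender used.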

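Likewise, before calling urlencode the text must be converted to a byte string in a web character set. If a value is a unicode object with non-ASCII characters, an exception is raised: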

    import json
    import string
    from urllib import quote, urlencode

    # unicode values: urlencode calls str() on them, which fails for non-ASCII text
    params = [('name', u'老王'), ('sex', u'男')]
    p=urlencode(params)
    url = 'http://www.hello.world/' + '?' + p
    print url
    # OUT:  raises UnicodeEncodeError
    Traceback (most recent call last):
      File "/home/~/Desktop/scripts/tmp/tmp/tmp1.py", line 7, in <module>
        p=urlencode(params)
      File "/usr/lib/python2.7/urllib.py", line 1342, in urlencode
        v = quote_plus(str(v))
    UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-1: ordinal not in range(128)

 

The values must first be converted to a web character set:

    params = [('name', u'老王'.encode("gbk")), ('sex', u'男'.encode("gbk"))]    # web character sets: utf8, gbk, gb2312
    p=urlencode(params)
    url = 'http://www.hello.world/' + '?' + p
    print url
    # OUT:  http://www.hello.world/?name=%C0%CF%CD%F5&sex=%C4%D0
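A possible Python 3 counterpart, offered as a sketch rather than as part of the original Python 2 walkthrough: urllib.parse.urlencode takes an encoding argument, and parse_qsl can decode the result again when given the same encoding.

```python
from urllib.parse import urlencode, parse_qsl

params = [('name', '老王'), ('sex', '男')]

# Percent-encode the values as GBK bytes instead of the default UTF-8.
query = urlencode(params, encoding='gbk')
print(query)

# Round trip: parsing with the same encoding recovers the original pairs.
decoded = parse_qsl(query, encoding='gbk')
print(decoded)
```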

 












posted on 2018-03-28 14:30  myworldworld