Python crawling notes

Making requests

URL parsing

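from urllib import parse

# Base URL ends in "?" so the encoded query string can be appended directly
url = "http://www.baidu.com/s?"
info = {"wd": "kidd"}
# urlencode turns the dict into "key=value" pairs joined by "&"
url = url + parse.urlencode(info)
print(url)  # http://www.baidu.com/s?wd=kidd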
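Going the other way, the same module can split a URL back into its parts. This is a minimal sketch that assumes the URL built above; the variable names `parts` and `params` are only illustrative.

```python
from urllib import parse

url = "http://www.baidu.com/s?wd=kidd"

# urlparse splits the URL into scheme, netloc, path, query, and so on
parts = parse.urlparse(url)
print(parts.netloc)  # www.baidu.com
print(parts.path)    # /s
print(parts.query)   # wd=kidd

# parse_qs turns the query string back into a dict; each value is a list
# because the same key may appear more than once in a query string
params = parse.parse_qs(parts.query)
print(params)  # {'wd': ['kidd']}
```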

 

URL encoding and decoding

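Why is this needed?

Characters such as ?, =, / and + have their own meaning inside a URL, so a request whose value contains them can be misread. For example, + in a query string is conventionally decoded as a space. If you search with http://www.baidu.com/s?wd=/a+b=?/ directly, the result will not match the text you meant to search for.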
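The conflict can be seen without sending anything. The sketch below uses `parse_qs` as a stand-in for the receiving side; how a real server decodes the query is up to that server, so treat this as an illustration rather than a description of what Baidu does.

```python
from urllib import parse

raw_url = "http://www.baidu.com/s?wd=/a+b=?/"

# Decode the query string the way a form-style parser would
received = parse.parse_qs(parse.urlparse(raw_url).query)
print(received)  # {'wd': ['/a b=?/']}
```

The "+" has turned into a space, so the value that arrives is no longer the one that was typed.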

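from urllib import parse

# Encode: quote() percent-encodes special characters ("/" is left as-is by default)
url = "http://www.baidu.com/s?wd="
info = parse.quote("/a+b=?/")
url += info
print(url)  # http://www.baidu.com/s?wd=/a%2Bb%3D%3F/

# Decode: unquote() reverses the percent-encoding
parse_url = parse.unquote(url)
print(parse_url)  # http://www.baidu.com/s?wd=/a+b=?/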
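Note that the output above still contains unencoded "/" characters, because `quote` treats "/" as safe by default. A small sketch of the alternatives follows; the sample value with an extra space is made up for illustration.

```python
from urllib import parse

value = "/a+b=?/ c"

# Default: "/" is treated as safe and left alone, a space becomes %20
default_quoted = parse.quote(value)
# safe="" also encodes "/"
strict_quoted = parse.quote(value, safe="")
# quote_plus encodes "/" and writes spaces as "+", the form-encoding convention
plus_quoted = parse.quote_plus(value)

print(default_quoted)  # /a%2Bb%3D%3F/%20c
print(strict_quoted)   # %2Fa%2Bb%3D%3F%2F%20c
print(plus_quoted)     # %2Fa%2Bb%3D%3F%2F+c
```

For a value that goes into a query string, `safe=""` or `quote_plus` is usually the safer choice; `unquote_plus` reverses `quote_plus`.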

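I could not find a dedicated encode/decode function in requests itself, although requests does percent-encode query parameters automatically when they are passed through params. If there is a more direct way, please let me know.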
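On the question of doing this with requests: passing the value through `params` lets requests do the encoding. The sketch below only prepares the request and never sends it, so it needs no network access; it assumes requests is installed.

```python
import requests

# Build the request locally; prepare() computes the final URL without sending it
req = requests.Request("GET", "http://www.baidu.com/s", params={"wd": "/a+b=?/"})
prepared = req.prepare()
print(prepared.url)
```

In normal use the same effect comes from `requests.get(url, params={...})`, and `response.url` shows the encoded URL that was requested.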

 

POST requests with requests

When data is not a dict

import requests

# A plain string is sent as the raw request body
data = "name=kidd"
response = requests.post("http://httpbin.org/post", data=data)
print(response.text)

The response: the payload ends up in data

{
  "args": {}, 
  "data": "name=kidd", 
  "files": {}, 
  "form": {}, 
  "headers": {
    "Accept": "*/*", 
    "Accept-Encoding": "gzip, deflate", 
    "Content-Length": "9", 
    "Host": "httpbin.org", 
    "User-Agent": "python-requests/2.23.0", 
    "X-Amzn-Trace-Id": "Root=1-5edeee36-d00dd8b083c14254ec60605a"
  }, 
  "json": null, 
  "origin": "39.77.220.193", 
  "url": "http://httpbin.org/post"
}

When data is a dict

import requests

# A dict is form-encoded by requests before sending
data = {"name": "kidd"}
response = requests.post("http://httpbin.org/post", data=data)
print(response.text)

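The response: the payload ends up in form. The submission only counts as a successful form post when the data shows up in form; note that this request carries a Content-Type of application/x-www-form-urlencoded, which the string version did not.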

{
  "args": {}, 
  "data": "", 
  "files": {}, 
  "form": {
    "name": "kidd"
  }, 
  "headers": {
    "Accept": "*/*", 
    "Accept-Encoding": "gzip, deflate", 
    "Content-Length": "9", 
    "Content-Type": "application/x-www-form-urlencoded", 
    "Host": "httpbin.org", 
    "User-Agent": "python-requests/2.23.0", 
    "X-Amzn-Trace-Id": "Root=1-5edeeee5-f0544530bbb1b22824acd930"
  }, 
  "json": null, 
  "origin": "39.77.220.193", 
  "url": "http://httpbin.org/post"
}
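The difference between the two responses comes from the headers, not the body. The sketch below prepares both requests locally without sending them (it assumes requests is installed). The last part, setting the header by hand on a string body, is an assumption about how to get a string treated as a form; whether the payload then lands in form depends on the server.

```python
import requests

url = "http://httpbin.org/post"

# Prepare both requests locally instead of sending them
as_string = requests.Request("POST", url, data="name=kidd").prepare()
as_dict = requests.Request("POST", url, data={"name": "kidd"}).prepare()

# The bodies are identical...
print(as_string.body)  # name=kidd
print(as_dict.body)    # name=kidd

# ...but only the dict version gets a form Content-Type header
print(as_string.headers.get("Content-Type"))  # None
print(as_dict.headers.get("Content-Type"))    # application/x-www-form-urlencoded

# Assumption: declaring the Content-Type explicitly marks the string body as a form
manual = requests.Request(
    "POST",
    url,
    data="name=kidd",
    headers={"Content-Type": "application/x-www-form-urlencoded"},
).prepare()
print(manual.headers.get("Content-Type"))
```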

 

posted @ 2020-06-09 10:12  Sun先生