Web Scraping Assignment (3)

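from bs4 import BeautifulSoup
# Parse the sample page with the built-in html.parser backend.
html=BeautifulSoup("<!DOCTYPE html>\n<html>\n<head>\n<meta charset='utf-8'>\n<title>菜鸟教程(runoob.com)</title>\n</head>\n<body>\n<h1>我的第一标题</h1>\n<p id='frist'>我的第一段落。</p>\n</body>\n</table>\n</html>","html.parser")
# Print the <head> element, followed by the student ID marker.
print(html.head,"last two digits of student ID: 02")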
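Besides `html.head`, individual elements of the parsed tree can be read directly. The following is a minimal sketch, assuming a trimmed copy of the same sample markup; the names `soup`, `title_text`, `heading`, and `first_para` are illustrative and not part of the original assignment.

```python
from bs4 import BeautifulSoup

# Trimmed copy of the sample page (assumption: same tags and ids as the assignment).
soup = BeautifulSoup(
    "<html><head><title>菜鸟教程(runoob.com)</title></head>"
    "<body><h1>我的第一标题</h1><p id='frist'>我的第一段落。</p></body></html>",
    "html.parser",
)

# Tag names are available as attributes; .string returns the tag's single text child.
title_text = soup.title.string
heading = soup.h1.get_text()

# find(id=...) looks an element up by its id attribute ('frist' is the id used in the sample).
first_para = soup.find(id="frist").get_text()

print(title_text, heading, first_para)
```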

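import re
# Collect runs of characters in the range U+1100 to U+FFFD (Chinese characters and fullwidth punctuation) from the page text.
text=html.text
matches=re.findall('[\u1100-\uFFFD]+',text)
print(matches)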
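The range U+1100 to U+FFFD is wide: it also matches fullwidth punctuation such as "。" (U+3002). If only Chinese characters are wanted, one option is to restrict the class to the CJK Unified Ideographs block. The sketch below assumes that narrower range and a trimmed copy of the sample markup; `SAMPLE`, `HAN_RUN`, and `extract_han` are hypothetical names introduced for illustration.

```python
import re
from typing import List

from bs4 import BeautifulSoup

# Trimmed copy of the sample page (assumption: same text content as the assignment).
SAMPLE = (
    "<html><head><title>菜鸟教程(runoob.com)</title></head>"
    "<body><h1>我的第一标题</h1><p id='frist'>我的第一段落。</p></body></html>"
)

# CJK Unified Ideographs only; punctuation such as '。' (U+3002) lies outside this range.
HAN_RUN = re.compile(r"[\u4e00-\u9fff]+")


def extract_han(markup: str) -> List[str]:
    """Return runs of Chinese characters found in the text of the markup."""
    # A newline separator keeps text from adjacent tags from merging into one run.
    text = BeautifulSoup(markup, "html.parser").get_text("\n")
    return HAN_RUN.findall(text)


print(extract_han(SAMPLE))
```

Passing a separator to `get_text` matters here: without it, the heading and the paragraph would be joined and matched as a single run.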

posted @ 2020-12-13 21:40 dt_1005
