Summary (spider):
import scrapy
from yswPro.items import YswproItem
from selenium import webdriver

class YswSpider(scrapy.Spider):
    name = 'ysw'
    # allowed_domains = … Read more
posted @ 2021-10-22 08:32 小毂 Views (45) Comments (0) Recommend (0)
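A minimal sketch of what the full spider behind this summary might look like, assuming a headless Chrome driver that a downloader middleware could reuse; the stand-in item class, its field, and the start URL are assumptions, since the summary only shows the imports and the class header.

import scrapy
from selenium import webdriver


class YswproItem(scrapy.Item):
    # Stand-in for yswPro.items.YswproItem; the real field names are unknown.
    title = scrapy.Field()


class YswSpider(scrapy.Spider):
    name = 'ysw'
    # allowed_domains is left commented out, as in the original summary.
    start_urls = ['https://example.com/']  # placeholder start URL (assumed)

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # Headless browser intended to be shared with a downloader middleware
        # (not shown) for rendering JavaScript-heavy pages.
        options = webdriver.ChromeOptions()
        options.add_argument('--headless')
        self.driver = webdriver.Chrome(options=options)

    def parse(self, response):
        item = YswproItem()
        item['title'] = response.xpath('//title/text()').get()
        yield item

    def closed(self, reason):
        # Shut the browser down when the spider finishes.
        self.driver.quit()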
Summary:
import requests
import json

if __name__ == '__main__':
    url = 'https://movie.douban.com/j/chart/top_list'
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows … Read more
posted @ 2021-10-22 08:28 小毂 Views (50) Comments (0) Recommend (0)
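A minimal sketch of the requests + json approach from this summary; the query parameters and the output file name are assumptions about how this endpoint is typically called, since the summary breaks off inside the headers dict.

import json

import requests

if __name__ == '__main__':
    url = 'https://movie.douban.com/j/chart/top_list'
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64)'
    }
    params = {
        'type': '24',            # movie category id (assumed)
        'interval_id': '100:90',
        'action': '',
        'start': '0',            # offset of the first record
        'limit': '20',           # number of records per request
    }
    response = requests.get(url=url, params=params, headers=headers)
    movies = response.json()     # the endpoint returns JSON
    # Persist the raw JSON, mirroring the json import in the summary.
    with open('douban_top_list.json', 'w', encoding='utf-8') as fp:
        json.dump(movies, fp, ensure_ascii=False, indent=2)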
Summary:
import requests
from lxml import etree

if __name__ == '__main__':
    url = 'https://www.runoob.com/python3/python3-examples.html'
    headers = {
        'User-Agent': 'M… Read more
posted @ 2021-10-22 08:27 小毂 Views (34) Comments (0) Recommend (0)
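A minimal sketch of the requests + lxml flow from this summary; the XPath and the printed fields are assumptions, since the summary ends inside the headers dict.

import requests
from lxml import etree

if __name__ == '__main__':
    url = 'https://www.runoob.com/python3/python3-examples.html'
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64)'
    }
    response = requests.get(url=url, headers=headers)
    response.encoding = 'utf-8'
    tree = etree.HTML(response.text)
    # Grab every named link on the page as a rough stand-in for the example list.
    for a in tree.xpath('//a[@href]'):
        title = a.xpath('string(.)').strip()
        href = a.xpath('./@href')[0]
        if title:
            print(title, href)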
Summary:
import requests
import re
import xlwt

if __name__ == "__main__":
    wb = xlwt.Workbook()
    ws = wb.add_sheet('电影')
    url = 'https://maoyan.com/board/%d'
    headers = … Read more
posted @ 2021-10-22 08:26 小毂 Views (27) Comments (0) Recommend (0)
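A minimal sketch of the requests + re + xlwt pipeline from this summary; the regular expression, the board id, and the output file name are assumptions, since only the workbook setup and the URL template survive the cut.

import re

import requests
import xlwt

if __name__ == "__main__":
    wb = xlwt.Workbook()
    ws = wb.add_sheet('电影')
    ws.write(0, 0, '片名')  # header row

    url_template = 'https://maoyan.com/board/%d'
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64)'
    }

    row = 1
    for board_id in (4,):  # assumed board id for the %d placeholder
        response = requests.get(url_template % board_id, headers=headers)
        # Assumed pattern: movie titles sit in a title="..." attribute on the board page.
        for title in re.findall(r'title="([^"]+)"', response.text):
            ws.write(row, 0, title)
            row += 1

    wb.save('maoyan.xls')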
Summary:
import requests
import xlwt
from lxml import etree

if __name__ == "__main__":
    wb = xlwt.Workbook()
    ws = wb.add_sheet('网站')
    url = 'https://www.hao123.com/' … Read more
posted @ 2021-10-22 08:25 小毂 Views (37) Comments (3) Recommend (0)
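A minimal sketch of the requests + xlwt + lxml variant from this summary; the XPath, the column layout, and the output file name are assumptions, since the summary stops right after the URL.

import requests
import xlwt
from lxml import etree

if __name__ == "__main__":
    wb = xlwt.Workbook()
    ws = wb.add_sheet('网站')
    ws.write(0, 0, '名称')
    ws.write(0, 1, '链接')

    url = 'https://www.hao123.com/'
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64)'
    }
    response = requests.get(url=url, headers=headers)
    response.encoding = 'utf-8'
    tree = etree.HTML(response.text)

    # Collect every link with visible text on the navigation page (assumed selector).
    row = 1
    for a in tree.xpath('//a[@href and normalize-space(text())]'):
        ws.write(row, 0, a.xpath('string(.)').strip())
        ws.write(row, 1, a.xpath('./@href')[0])
        row += 1

    wb.save('hao123.xls')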
Summary:
1. Upload spark-2.4.0-bin-hadoop2.6.tgz to /opt and unpack it into /usr/local:
tar -zxf /opt/spark-2.4.0-bin-hadoop2.6.tgz -C /usr/local/
Go into /usr/local/spark-2.4.0-bin-hadoop2… Read more
posted @ 2021-10-21 19:11 小毂 Views (95) Comments (0) Recommend (0)
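A small Python sketch, not part of the original post, for checking the unpacked installation by running the bundled SparkPi example in local mode; the install path comes from the post, and bin/run-example ships with the Spark distribution.

import subprocess

SPARK_HOME = '/usr/local/spark-2.4.0-bin-hadoop2.6'  # path from the post

if __name__ == '__main__':
    # bin/run-example submits one of the bundled example jobs through spark-submit;
    # a successful SparkPi run confirms the unpacked distribution works.
    subprocess.run([f'{SPARK_HOME}/bin/run-example', 'SparkPi', '10'], check=True)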
Summary:
# For a newly created VM, change the hostname to match the ones used in the document
hostnamectl set-hostname master
hostnamectl set-hostname slave1
hostnamectl set-hostname slave2
Passwordless SSH login (.ssh)
1) Generate a key pair (to distribute to every node)
ssh-keyg… Read more
posted @ 2021-10-21 18:43 小毂 Views (61) Comments (0) Recommend (0)
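A small Python sketch, not part of the original post, for verifying that passwordless SSH works from the current node; the hostnames master, slave1, and slave2 come from the post, and BatchMode=yes makes ssh fail instead of prompting for a password.

import subprocess

NODES = ['master', 'slave1', 'slave2']  # hostnames set in the post

if __name__ == '__main__':
    for node in NODES:
        # Run a trivial remote command; a non-zero return code means
        # key-based login to that node is not set up yet.
        result = subprocess.run(
            ['ssh', '-o', 'BatchMode=yes', '-o', 'ConnectTimeout=5', node, 'hostname'],
            capture_output=True, text=True,
        )
        status = 'OK' if result.returncode == 0 else 'FAILED'
        print(f'{node}: {status} {result.stdout.strip()}')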