Python - Post Category - seonwee

A Wrapper for Common openpyxl Operations

Summary: from openpyxl import * class excel(): def __init__(self,file): self.file = file self.wb = load_workbook(self.file) sheets = self.wb.get_sheet_names() …

posted @ 2021-11-21 22:53 seonwee Views(118) Comments(0) Recommended(0)

Text Cosine Similarity Analysis

Summary: # -*- coding: utf-8 -*- # @Time : 2021/10/11 23:19 # @Author : DaWeiGuo # @File : xiangsidu.py # @Software: PyCharm # -*- coding: utf-8 -*- import jie …
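The summary is cut off before any of the similarity code appears, so the following is only a minimal sketch of cosine similarity over term counts, not the post's implementation. The function name `cosine_similarity` is hypothetical, and plain whitespace splitting stands in for whatever word segmenter the post uses.

```python
import math
from collections import Counter


def cosine_similarity(tokens_a, tokens_b):
    """Cosine similarity between two token lists, using raw term counts."""
    counts_a, counts_b = Counter(tokens_a), Counter(tokens_b)
    # Only tokens present in both texts contribute to the dot product.
    shared = counts_a.keys() & counts_b.keys()
    dot = sum(counts_a[t] * counts_b[t] for t in shared)
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        # An empty text has no direction, so treat it as dissimilar.
        return 0.0
    return dot / (norm_a * norm_b)


# Assumption: whitespace splitting replaces a real Chinese word segmenter.
score = cosine_similarity("the cat sat".split(), "the cat ran".split())
```

For Chinese text, the token lists would come from a segmenter rather than `str.split()`; the similarity computation itself stays the same.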

posted @ 2021-11-09 12:45 seonwee Views(190) Comments(0) Recommended(0)

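Scrapy Basics: Following Up on URLs Extracted from Page Content

Summary: Target site: https://www.dytt8.net/index.htm. Create the project. Create the spider. Project structure. Decide what to scrape: as shown below, click any entry to open it and write XPath expressions for it. 5.1 XPath expression for the description text. 5.2 XPath expression for the link. 5.3 XPath expression for the image link. Define the Item. 7. …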
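The spider code is not part of this summary, so here is a standard-library sketch of the idea behind following up on extracted links: collect each `href` from a listing page and resolve it against the page URL before requesting it. It does not use Scrapy. `LinkCollector`, `follow_up_urls`, and the sample HTML are hypothetical and do not reflect the real page structure.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def follow_up_urls(base_url, html):
    """Return absolute URLs for the links found in a listing page."""
    collector = LinkCollector()
    collector.feed(html)
    # Relative links must be resolved against the page they came from.
    return [urljoin(base_url, href) for href in collector.hrefs]


# Made-up listing fragment, used only to exercise the helper.
sample_html = (
    '<ul><li><a href="/html/gndy/1.html">A</a></li>'
    '<li><a href="2.html">B</a></li></ul>'
)
urls = follow_up_urls("https://www.dytt8.net/index.htm", sample_html)
```

In a Scrapy spider, the equivalent step would typically be an XPath query for the links followed by a new request per resolved URL, with a second callback parsing the detail page.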

posted @ 2021-09-09 16:19 seonwee Views(410) Comments(0) Recommended(0)

selenium: Basic Usage

Summary: from selenium import webdriver path = "path to the driver" browser = webdriver.Chrome(path) browser.get("https://www.baidu.com") # open the page browser.page_source # get the page source (a property, not a method) …

posted @ 2021-09-08 11:11 seonwee Views(61) Comments(0) Recommended(0)

selenium: Headless Configuration Template

Summary: from selenium import webdriver from selenium.webdriver.chrome.options import Options def share_browser(): chrome_options = Options() chrome_options.ad …
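The template is truncated in this summary, so the sketch below shows one common way such a headless setup is written, not the post's exact code. The helper `headless_arguments` and the chosen flags and window size are assumptions; `share_browser` reuses the name from the summary but its body here is illustrative, and it needs selenium plus a matching chromedriver to actually run.

```python
def headless_arguments(window_size=(1920, 1080)):
    """Chrome command-line flags commonly used for headless runs (assumed set)."""
    width, height = window_size
    return ["--headless", "--disable-gpu", f"--window-size={width},{height}"]


def share_browser():
    """Build a headless Chrome driver from the flags above."""
    # Imported lazily so the flag helper works without selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    chrome_options = Options()
    for argument in headless_arguments():
        chrome_options.add_argument(argument)
    return webdriver.Chrome(options=chrome_options)


flags = headless_arguments()
```

Calling `share_browser()` returns a driver that can be used like a regular one, for example `browser.get(url)` followed by reading `browser.page_source`.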

posted @ 2021-09-07 22:15 seonwee Views(467) Comments(0) Recommended(0)

selenium: Basic Usage of PhantomJS

Summary: from selenium import webdriver path = 'phantomjs.exe' # path where it is stored on your own machine browser = webdriver.PhantomJS(path) url = 'https://www.baidu.com' browser.get(url) br …

posted @ 2021-09-07 22:12 seonwee Views(108) Comments(0) Recommended(0)

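A Simple Scrapy Example Using Dangdang

Summary: A simple Scrapy example using Dangdang. The address to scrape is http://category.dangdang.com/cp01.01.02.00.00.00.html. Create the project by running scrapy startproject <project name> on the command line (named dangdang here). The project structure is as follows: on the command line, c …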

posted @ 2021-09-07 20:22 seonwee Views(517) Comments(0) Recommended(0)

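Simulated Login and Grade Scraping in Python for the Hunan Qiangzhi Technology Academic Affairs System (finance college)

Summary: I actually wrote an earlier post about this (old post on Zhihu). While using the academic affairs system, I happened to notice that the login URL you land on after logging in and then logging out does not ask for a CAPTCHA. Since I had already written a simulated-login crawler for this system, I figured a login without a CAPTCHA would be much easier to handle (the previous attempt took a lot of fiddling, haha, and I ended up scraping the grades with selenium; compared with scraping directly, selenium's …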

posted @ 2021-01-24 14:27 seonwee Views(2770) Comments(0) Recommended(0)

seonwee

"Passion is the reason and the answer for everything."
