马蹄哒哒

July 1, 2020

Summary: >>> from docx import Document >>> word=Document(r'F:\word练习\qq.docx') >>> for 段落 in word.paragraphs: print(段落.text) Heading 1 / I am a level-2 heading / It rained this afternoon, but I still feel hot / I am a level-1 heading Read more

posted @ 2020-07-01 23:51 马蹄哒哒 Views(259) Comments(0) Recommended(0)

June 28, 2020

Splitting one row into multiple rows

Summary: import pandas a=pandas.DataFrame({'Country': ['China,US', 'Japan,EU', 'UK,Australia', 'Singapore,Netherland'], 'Number': [100, 150, 120, 90], 'Value': Read more
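The summary above is cut off before the actual split, so here is a minimal sketch of one way to do it, assuming pandas 0.25 or later (for `DataFrame.explode`). It reuses the `Country` and `Number` data from the summary and leaves out the truncated `Value` column. The variable `b` is a name chosen for illustration and may not match the full post.

```python
import pandas

# Sample data taken from the summary; the truncated 'Value' column is left out.
a = pandas.DataFrame({
    'Country': ['China,US', 'Japan,EU', 'UK,Australia', 'Singapore,Netherland'],
    'Number': [100, 150, 120, 90],
})

# Turn each comma-separated string into a list, then give every list element its own row.
b = a.assign(Country=a['Country'].str.split(',')).explode('Country')

# explode() repeats the original index, so rebuild a clean 0..n-1 index.
b = b.reset_index(drop=True)
print(b)
```

Each original row yields two rows here, and the `Number` value is copied onto both.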

posted @ 2020-06-28 18:24 马蹄哒哒 Views(341) Comments(0) Recommended(0)

Using the cut method

Summary: import pandas a=pandas.read_excel(r'D:\scrapy网络爬虫\nba.xlsx') bins=[0,5000000,max(a['Salary'])] group_by=['底','高'] a['new_col']=pandas.cut(a['Salary'], Read more
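Since the summary stops in the middle of the `pandas.cut` call, the following is a rough sketch of how such a call is typically completed. The salaries are made-up stand-ins for the `Salary` column of the Excel file, and the English labels `'low'` and `'high'` are an assumption standing in for the `group_by` labels in the summary.

```python
import pandas

# Hypothetical salaries standing in for the 'Salary' column of nba.xlsx.
a = pandas.DataFrame({'Salary': [1000000, 5000000, 7500000, 25000000]})

# Two bins: (0, 5000000] and (5000000, max salary]. Intervals are right-closed by default.
bins = [0, 5000000, a['Salary'].max()]
labels = ['low', 'high']

# Assign each salary the label of the bin it falls into.
a['new_col'] = pandas.cut(a['Salary'], bins=bins, labels=labels)
print(a)
```

Because the bins are right-closed, a salary of exactly 5000000 lands in the lower bin.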

posted @ 2020-06-28 14:36 马蹄哒哒 Views(282) Comments(0) Recommended(0)

June 27, 2020

Merging multiple rows into one row

Summary: import pandas a=pandas.read_excel() def abc(x): return ','.join(x.values) b=a.groupby(['列名1'])['列名2'].apply(abc) c=b.reset_index() print(c) Read more
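The snippet in the summary has no input file, so here is a small self-contained sketch of the same groupby-and-join idea. The data and the column names `group` and `item` are hypothetical placeholders for the two column names in the summary.

```python
import pandas

# Hypothetical data; 'group' and 'item' stand in for the two column names in the summary.
a = pandas.DataFrame({
    'group': ['x', 'x', 'y', 'y', 'y'],
    'item': ['a', 'b', 'c', 'd', 'e'],
})

def abc(x):
    # x is the Series of 'item' values that belong to one group.
    return ','.join(x.values)

# One joined string per group, indexed by 'group'.
b = a.groupby(['group'])['item'].apply(abc)

# Move 'group' back from the index into an ordinary column.
c = b.reset_index()
print(c)
```

Note that `','.join` only works when the joined column holds strings; numeric columns would need something like `x.astype(str)` first.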

posted @ 2020-06-27 22:00 马蹄哒哒 Views(587) Comments(0) Recommended(0)

June 26, 2020

Combining multiple Series into a DataFrame and resetting the index

Summary: a=['序号',1,2,3,4,5] b=['成本',20,45,12,34,67] import pandas c=pandas.Series(a) d=pandas.Series(b) e=pandas.DataFrame(list(zip(c,d))) print(e) 0 1 0 序号 成本 Read more
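In the printed output above, the labels 序号 (serial number) and 成本 (cost) end up as the first data row rather than as column names. The rest of the post is not shown, so the following is only one possible way to get named columns and then reset the index, using `pandas.concat`. The variables `f` and `g` are names introduced for this sketch.

```python
import pandas

a = ['序号', 1, 2, 3, 4, 5]
b = ['成本', 20, 45, 12, 34, 67]

# Use the first element of each list as the Series name and the rest as its values.
c = pandas.Series(a[1:], name=a[0])
d = pandas.Series(b[1:], name=b[0])

# Place the Series side by side; their names become the column labels.
e = pandas.concat([c, d], axis=1)

# Make the first column the index, then reset_index() to restore a default 0..n-1 index.
f = e.set_index(a[0])
g = f.reset_index()
print(g)
```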

posted @ 2020-06-26 16:52 马蹄哒哒 Views(2808) Comments(0) Recommended(0)

June 24, 2020

Scraping BOSS Zhipin with CrawlSpider

Summary: from scrapy.spiders import CrawlSpider from boss.items import BossItem class ZhiPinSpider(CrawlSpider): name='Zhipin' allowed_domains=['zhipin.com'] start_urls=['https://www.zhipin.com/c1000 Read more

posted @ 2020-06-24 14:06 马蹄哒哒 Views(400) Comments(1) Recommended(0)

Setting up the data pipeline in pipelines.py

Summary: from scrapy.exporters import JsonLinesItemExporter class BossPipleline(object): def __init__(self): self.fp=open('jobs.json','wb') self.exporter=JsonL Read more
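The summary uses Scrapy's `JsonLinesItemExporter`, but the code is cut off. To show the overall shape of an item pipeline without requiring Scrapy to be installed, the sketch below swaps in the standard-library `json` module; it is not the author's code. The class name `JsonLinesPipeline` and the `path` parameter are hypothetical. The method names `open_spider`, `process_item`, and `close_spider` are the hooks Scrapy calls on a pipeline object.

```python
import json


class JsonLinesPipeline:
    """Hypothetical pipeline that writes one JSON object per line."""

    def __init__(self, path='jobs.json'):
        self.path = path
        self.fp = None

    def open_spider(self, spider):
        # Called once when the spider starts: open the output file.
        self.fp = open(self.path, 'w', encoding='utf-8')

    def process_item(self, item, spider):
        # Called for every scraped item: write it as one JSON line.
        # ensure_ascii=False keeps Chinese text readable in the file.
        self.fp.write(json.dumps(dict(item), ensure_ascii=False) + '\n')
        return item

    def close_spider(self, spider):
        # Called once when the spider finishes: close the file.
        self.fp.close()
```

In a real project the pipeline also has to be enabled through the `ITEM_PIPELINES` setting in settings.py.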

posted @ 2020-06-24 13:50 马蹄哒哒 Views(263) Comments(0) Recommended(0)

Adding proxy IPs in middlewares.py

Summary: import random import base64 # Method 1 # set the proxy IP class IpProxyDownLoadMiddleWares(object): Proxys=['178.44.170.152:8080','110.44.113.182:8080','209.126.124.73 Read more
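The summary is cut off inside the proxy list, so here is a minimal sketch of what a random-proxy downloader middleware commonly looks like. The class name, the `PROXIES` attribute, and the addresses (taken from the documentation-reserved 192.0.2.0/24 range) are placeholders, not the author's values. It relies on Scrapy's convention that the proxy for a request is read from `request.meta['proxy']`.

```python
import random


class RandomProxyMiddleware:
    # Placeholder proxy addresses; replace them with working proxies.
    PROXIES = ['192.0.2.10:8080', '192.0.2.11:8080']

    def process_request(self, request, spider):
        # Pick one proxy at random for each outgoing request.
        proxy = random.choice(self.PROXIES)
        # Scrapy's built-in HttpProxyMiddleware reads the proxy from request.meta['proxy'].
        request.meta['proxy'] = 'http://' + proxy
        # Returning None tells Scrapy to keep processing the request normally.
        return None
```

The middleware only takes effect once it is listed in the `DOWNLOADER_MIDDLEWARES` setting in settings.py.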

posted @ 2020-06-24 12:45 马蹄哒哒 Views(332) Comments(0) Recommended(0)

June 23, 2020

Scraping Douban Movies

Summary: import requests import time from lxml import etree import json # function that fetches a page def getpage(url): try: headers={'User-Agent':'Mozilla/5.0 (Linux; Android 6.0; Read more

posted @ 2020-06-23 21:45 马蹄哒哒 Views(146) Comments(0) Recommended(0)

Drawing animated plots with animation

Summary: import numpy from matplotlib import pyplot from matplotlib import animation def update_points(num): point_ani.set_data(x[num],y[num]) # update the point's position; here the (x[n Read more
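The summary shows only the update function, so the sketch below fills in a plausible surrounding setup. The sine curve, the frame count, and the interval are assumptions, since the data used in the post is not visible. It keeps the names `update_points` and `point_ani` from the summary and passes one-element lists to `set_data`, because recent matplotlib versions expect sequences rather than scalars.

```python
import numpy
import matplotlib
matplotlib.use('Agg')  # non-interactive backend so the sketch runs without a display
from matplotlib import pyplot
from matplotlib import animation

# Assumed data: one period of a sine curve.
x = numpy.linspace(0, 2 * numpy.pi, 100)
y = numpy.sin(x)

fig, ax = pyplot.subplots()
ax.plot(x, y)                               # the static curve
point_ani, = ax.plot([x[0]], [y[0]], 'ro')  # the red point that will move


def update_points(num):
    # Move the point to the num-th sample of the curve.
    point_ani.set_data([x[num]], [y[num]])
    return point_ani,


# Call update_points once per frame, 100 ms apart; blit=True redraws only the point.
ani = animation.FuncAnimation(fig, update_points, frames=len(x), interval=100, blit=True)
```

To actually watch the animation, drop the `matplotlib.use('Agg')` line and call `pyplot.show()`, or save it with `ani.save(...)` using an installed writer.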

posted @ 2020-06-23 17:55 马蹄哒哒 Views(681) Comments(0) Recommended(0)
