布都御魂

November 13, 2023

Summary: def del_nt(title_list): title_new = [] for title_old in title_list: title = re.sub(r'\s', '', title_old) if title == '': pass else: title_new.append(ti Read more

posted @ 2023-11-13 09:07 布都御魂 Views(21) Comments(0) Recommended(0)

August 28, 2023

Convert a column of a file into a list with pandas

Summary: import pandas as pd import numpy as np path = '产业布局-企业.xlsx' # detail page links title = pd.read_excel(path, usecols=[2]) title_arr = np.asarray(title.stack()) # Dataf Read more

posted @ 2023-08-28 14:15 布都御魂 Views(160) Comments(0) Recommended(0)
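
A minimal sketch of the idea in the entry above, assuming pandas is installed. The DataFrame is built in memory with made-up column names and values (all hypothetical) so the example runs without the spreadsheet; with the real file, the frame would come from `pd.read_excel(path, usecols=[2])` instead. It uses `Series.tolist()` rather than the `np.asarray(title.stack())` route in the excerpt.

```python
import pandas as pd

# Hypothetical stand-in for the spreadsheet; column names and values are made up.
df = pd.DataFrame({
    "id": [1, 2, 3],
    "name": ["a", "b", "c"],
    "link": ["https://example.com/1", "https://example.com/2", "https://example.com/3"],
})

# Select the third column by position (index 2) and convert it to a plain Python list.
link_list = df.iloc[:, 2].tolist()
print(link_list)
```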

August 24, 2023

Remove Chinese punctuation and spaces from an article, replace them with English commas, and get the last two tags

Summary: def update_biaoqian(tag_list, title): if tag_list==['']: print('没有标签，取标题作为标签') titless = re.sub(r'\s', ',', title) tag_list = title.replace('、', ',').re Read more

posted @ 2023-08-24 14:16 布都御魂 Views(46) Comments(0) Recommended(0)
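
The excerpt above is cut off, so the following is only a sketch of the technique named in the title, not the author's function. The name `last_two_tags` and the set of punctuation marks in the character class are assumptions chosen for illustration.

```python
import re

def last_two_tags(text):
    """Split text on whitespace and Chinese punctuation, then return the last two tags."""
    # Replace each run of whitespace or listed Chinese punctuation with a single English comma.
    normalized = re.sub(r"[\s、，。；：！？]+", ",", text)
    # Drop empty pieces left by leading or trailing separators.
    tags = [tag for tag in normalized.split(",") if tag]
    return tags[-2:]

print(last_two_tags("python、pandas excel，regex"))  # ['excel', 'regex']
```

Slicing with `[-2:]` returns fewer than two tags when the text has fewer, instead of raising an error.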

August 23, 2023

Remove whitespace from a list

Summary: def del_nt(title_list): title_new = [] for title_old in title_list: title = re.sub(r'\s', '', title_old) if title == '': pass else: title_new.append(ti Read more

posted @ 2023-08-23 10:43 布都御魂 Views(18) Comments(0) Recommended(0)
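
Since the excerpt above is truncated, here is a compact sketch of the same idea rather than the author's exact code: strip all whitespace from each string and drop the entries that end up empty. The function name `strip_whitespace` is hypothetical.

```python
import re

def strip_whitespace(items):
    """Remove all whitespace inside each string and drop strings that end up empty."""
    cleaned = (re.sub(r"\s", "", item) for item in items)
    return [item for item in cleaned if item]

print(strip_whitespace(["  a b\n", "\t", "", "c"]))  # ['ab', 'c']
```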

July 27, 2023

How to fix re.error: multiple repeat

Summary: The content contains special characters; escape it with re.escape(pattern) first. Read more

posted @ 2023-07-27 15:27 布都御魂 Views(461) Comments(0) Recommended(1)
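
To illustrate the fix described above, this sketch first reproduces the error and then escapes the text. The search text `2**10` and the sample sentence are made up for the example; any string containing repeated quantifier characters used as a literal would behave similarly.

```python
import re

keyword = "2**10"  # hypothetical literal text that happens to contain regex metacharacters
text = "the value of 2**10 is 1024"

error_message = None
try:
    # Used as a raw pattern, the second '*' tries to repeat the first one.
    re.search(keyword, text)
except re.error as exc:
    error_message = str(exc)
    print(error_message)

# re.escape backslash-escapes the metacharacters so the text is matched literally.
match = re.search(re.escape(keyword), text)
print(match.group(0))
```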

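July 25, 2023

Replace every br tag in a string with two consecutive br tags

Summary: # Collapse each run of br tags into a single br tag content = re.sub(r"(<br>)\1+", r"\1", content) # Replace each br tag with two br tags content = re.sub("<br>", '<br><br>', content) print(f'展示图片原图片:{picurl}') Read more

posted @ 2023-07-25 16:46 布都御魂 Views(51) Comments(0) Recommended(0)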
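
The two substitutions above can be wrapped in a small function to show why the order matters: collapsing runs first guarantees that every break ends up as exactly two tags. The function name `double_br` is hypothetical, and the sketch only handles the literal `<br>` spelling, not variants such as `<br/>`.

```python
import re

def double_br(content):
    # Collapse each run of consecutive <br> tags into one <br>.
    content = re.sub(r"(<br>)\1+", r"\1", content)
    # Then expand every remaining <br> into exactly two.
    return content.replace("<br>", "<br><br>")

print(double_br("a<br>b<br><br><br>c"))  # a<br><br>b<br><br>c
```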

July 19, 2023

Handling errors such as "\xF0\x9F\x8F\x80" when inserting data into MySQL

Summary: Remove the emoji from the content import emoji import re def del_emoji(text): text = emoji.demojize(text) result = re.sub(r':\S+?:', ' ', text) result = result.replace("(●'◡' Read more

posted @ 2023-07-19 16:29 布都御魂 Views(100) Comments(0) Recommended(0)
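
The entry above uses the third-party `emoji` package. As a standard-library alternative (a different technique, not the author's), the sketch below removes every character above U+FFFF, since those are the ones that need four bytes in UTF-8 and that MySQL's 3-byte `utf8` charset cannot store. The function name is hypothetical. Note that this also removes non-emoji characters outside the Basic Multilingual Plane, and that switching the table and connection to `utf8mb4` avoids the need to strip anything.

```python
import re

# Characters above U+FFFF take four bytes in UTF-8.
_NON_BMP = re.compile(r"[\U00010000-\U0010FFFF]")

def strip_four_byte_chars(text):
    """Remove every character that would need four bytes in UTF-8."""
    return _NON_BMP.sub("", text)

# b"\xF0\x9F\x8F\x80" is the UTF-8 encoding of a single four-byte character.
sample = "nice shot " + b"\xF0\x9F\x8F\x80".decode("utf-8") + "!"
print(strip_four_byte_chars(sample))  # nice shot !
```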

July 14, 2023

Remove the width and height attributes from img tags and add a br tag before and after each img tag

Summary: # Extract the img tags tree_img = etree.HTML(content) width = tree_img.xpath('//img//@width')[0] height = tree_img.xpath('//img//@height')[0] # Replace width= and height= Read more

posted @ 2023-07-14 09:58 布都御魂 Views(112) Comments(0) Recommended(0)
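
The excerpt above reads the attribute values with lxml and is cut off before the replacement step. The sketch below shows one possible way to get the same end result using only the `re` module; it is an assumption about the goal, not the author's code, and the function name `clean_img_tags` is hypothetical. A regular expression is adequate for simple, well-formed markup but is not a general HTML parser.

```python
import re

# Matches a width or height attribute with a quoted or unquoted value.
_SIZE_ATTR = re.compile(
    r"""\s+(?:width|height)\s*=\s*(?:"[^"]*"|'[^']*'|[^\s>]+)""",
    re.IGNORECASE,
)

def clean_img_tags(content):
    def fix(match):
        # Strip the size attributes from this one img tag, then wrap it in br tags.
        tag = _SIZE_ATTR.sub("", match.group(0))
        return f"<br>{tag}<br>"
    return re.sub(r"<img\b[^>]*>", fix, content, flags=re.IGNORECASE)

print(clean_img_tags('<p><img src="a.png" width="600" height="400"></p>'))
# <p><br><img src="a.png"><br></p>
```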

July 12, 2023

矿机之家

Summary: import hashlib import random import re import time from lxml import etree import pymysql import requests def strip_tags(string, allowed_tags=''): if a Read more

posted @ 2023-07-12 16:00 布都御魂 Views(69) Comments(0) Recommended(0)

July 11, 2023

Combine multiple lists into a dictionary

Summary: list1 = ['组', '2023-1-1', '2023-1-2', '2023-1-3', '总业绩'] list2 = ['一组', '1', '2', '3', '6'] list3 = ['二组', '4', '5', '6', '15'] list4 = ['三组', '7', '8 Read more

posted @ 2023-07-11 16:45 布都御魂 Views(34) Comments(0) Recommended(0)
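
The excerpt above stops before the combining step, so the target shape of the dictionary is not known. One plausible layout, sketched here as an assumption, treats `list1` as the header row and keys each remaining row by its group name using `zip`. Only the lists shown in full in the excerpt are used.

```python
list1 = ['组', '2023-1-1', '2023-1-2', '2023-1-3', '总业绩']
list2 = ['一组', '1', '2', '3', '6']
list3 = ['二组', '4', '5', '6', '15']

header, rows = list1, [list2, list3]

# Key each row by its first cell (the group name) and pair the
# remaining cells with the matching header labels.
combined = {row[0]: dict(zip(header[1:], row[1:])) for row in rows}

print(combined['一组']['总业绩'])  # 6
```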
