September 2019 Archive

Abstract: #items.py import scrapy class InsistItem(scrapy.Item): comment=scrapy.Field() #pipelines.py import json class InsistPipeline(object): def __init__(self): self.f=open('tencent.json','... Read more
posted @ 2019-09-24 09:48 晨曦yd
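The items/pipelines pattern this post previews can be sketched without the Scrapy runtime. The file name 'tencent.json' and the `comment` field come from the summary; the write mode, encoding, and JSON-lines format are assumptions, since the original `open()` call is cut off, and `process_item`/`close_spider` follow Scrapy's standard pipeline hooks:

```python
import json

class InsistPipeline:
    """Sketch of a JSON export pipeline: serialize each scraped
    item as one JSON line in the output file (mode and encoding
    are assumed; the original open() call is truncated)."""
    def __init__(self, path='tencent.json'):
        self.f = open(path, 'w', encoding='utf-8')

    def process_item(self, item, spider):
        # ensure_ascii=False keeps Chinese text human-readable in the file
        self.f.write(json.dumps(dict(item), ensure_ascii=False) + '\n')
        return item

    def close_spider(self, spider):
        self.f.close()
```

Scrapy calls `process_item` once per scraped item and `close_spider` when the crawl ends, which is why the file handle is opened once in `__init__` rather than per item.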
Abstract: # only the first page is crawled here items.py import scrapy # define the data to crawl class InsistItem(scrapy.Item): image_urls=scrapy.Field() tengxun.py import scrapy from insist.items import InsistItem import json class TengxunSpider(scrapy.Sp... Read more
posted @ 2019-09-20 23:05 晨曦yd
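This spider imports `json` alongside an `InsistItem` that only carries an `image_urls` field, which suggests it reads image URLs out of a JSON API response. A minimal sketch of that extraction step; the `'data'` and `'img_url'` key names are hypothetical placeholders, since the real response layout is cut off in the summary:

```python
import json

def extract_image_urls(body):
    """Collect image URLs from a JSON API response body.
    The 'data'/'img_url' keys are stand-ins for the real
    API's field names, which the summary does not show."""
    payload = json.loads(body)
    return [entry['img_url'] for entry in payload.get('data', [])]
```

In the actual spider, the returned list would be assigned to `item['image_urls']` so a downstream images pipeline can download each URL.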
Abstract: scrapy startproject insist # create the project scrapy genspider teng carees.tencent.com # create the spider (spider name + domain) items.py # the information to crawl import scrapy class InsistItem(scrapy.Item): # define the fields for your item here like: posi... Read more
posted @ 2019-09-20 08:36 晨曦yd
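For reference, the `scrapy genspider teng carees.tencent.com` command above generates a spider module of roughly this shape. The base class is omitted here so the sketch runs without Scrapy installed, and the domain string mirrors the command exactly as the post typed it:

```python
# Rough shape of spiders/teng.py as produced by
# `scrapy genspider teng carees.tencent.com`
# (scrapy.Spider base class omitted so this runs standalone)
class TengSpider:
    name = 'teng'                              # argument 1: used by `scrapy crawl teng`
    allowed_domains = ['carees.tencent.com']   # argument 2: the domain
    start_urls = ['http://carees.tencent.com/']

    def parse(self, response):
        # filled in by hand afterwards to populate InsistItem fields
        pass
```

The generated `parse` stub is where the fields declared in items.py (the truncated `posi...` field and any others) would be extracted from each response.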
Abstract: # -*- coding: utf-8 -*- # this is only the spider file; it runs under PyCharm, and when running from the command line in the terminal you must use the spider's name import scrapy from insist.items import InsistItem class InsistsSpider(scrapy.Spider): name = 'insists' allowed_domains = ['itcast.c... Read more
posted @ 2019-09-15 22:17 晨曦yd
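The note about using the spider's name on the command line reflects how `scrapy crawl` resolves spiders: by the class's `name` attribute ('insists'), not by the file or class name. A small stand-in illustrating that lookup:

```python
class InsistsSpider:
    # Mirrors the summary: the class is InsistsSpider, but the
    # crawl command must use the `name` attribute below.
    name = 'insists'
    allowed_domains = ['itcast.cn']

def find_spider(name, spider_classes):
    """Resolve a spider class by its `name` attribute, the way
    `scrapy crawl <name>` does (a simplified stand-in for
    Scrapy's own spider loader)."""
    for cls in spider_classes:
        if cls.name == name:
            return cls
    raise KeyError(f'no spider named {name!r}')
```

So in this project the terminal command would be `scrapy crawl insists`; passing the class name `InsistsSpider` instead would fail the lookup.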
Abstract: import urllib.request import re import urllib import csv from selenium import webdriver from lxml import etree import requests x=0 header=['日期','开盘价','最高价','最低价','收盘价','涨跌额','涨跌幅','成交量','成交金额','振幅','换... Read more
posted @ 2019-09-03 12:37 晨曦yd
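This stock-data script opens with a `csv` import and a header list of Chinese column names (date, open, high, low, close, change, …; the last entry is cut off in the summary). Writing rows under such a header can be sketched like this, using only the columns visible before the summary truncates:

```python
import csv

# The columns legible in the summary; the full header in the
# post continues past this subset.
HEADER = ['日期', '开盘价', '最高价', '最低价', '收盘价']  # date, open, high, low, close

def write_stock_csv(rows, fileobj):
    """Write the header row followed by the data rows, matching
    the csv usage the script's imports suggest."""
    writer = csv.writer(fileobj)
    writer.writerow(HEADER)
    writer.writerows(rows)
```

When the target file is opened for this, `open(path, 'w', newline='', encoding='utf-8-sig')` is a common choice so Excel renders the Chinese headers correctly, though the post's exact call is not shown.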