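20252326 2025-2026-2 "Python Programming" Comprehensive Practice (Experiment 4) Report

Course: Python Programming
Class: 2523
Name: 余锦豪
Student ID: 20252326
Instructor: 王志强
Experiment date: June 15, 2026
Required/Elective: Public elective

PS: The videos on my bilibili account are too disorganized to post a demo video there, so the whole process is shown with screenshots and text instead.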

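I. Background and Requirements Analysis

1.1 Background

In an age of information overload, social media and news platforms produce huge volumes of content every day. Being able to quickly gauge public sentiment and spot trending topics is valuable to both individuals and organizations. This experiment builds an automated public-opinion monitoring tool that fetches news with a web crawler, analyzes sentiment with natural language processing, and presents the results visually.

1.2 Requirements Analysis

Automatically fetch the day's news headlines and links
Classify each news item by sentiment (positive / negative / neutral)
Provide intuitive charts (pie chart, trend chart)
Generate an interactive word cloud where clicking a word shows the related news
Output a complete HTML report that is easy to view and share
Tolerate faults, so that network errors do not stop the run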

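II. Experiment Design

2.1 System Architecture

实验四.py (main program)
    ├── Crawler module (Sina API / Baidu fallback / mock data)
    ├── Data processing module (pandas deduplication, CSV storage)
    ├── Sentiment analysis module (SnowNLP)
    ├── Visualization module (matplotlib pie chart / line chart, jieba word cloud)
    └── Report generation module (HTML + CSS + JavaScript)

2.2 Technology Choices

Technique	Tool/Library	Purpose
Crawling	requests	Request the Sina News API
Data processing	pandas	Deduplication, CSV storage
Word segmentation	jieba	Chinese word segmentation, word-frequency counting
Sentiment analysis	SnowNLP	Chinese sentiment scoring
Visualization	matplotlib	Pie chart, line chart
Report output	HTML/CSS/JS	Generate the interactive web report
Version control	Git/Gitee	Code hosting

2.3 Feature List (7 features)

① Multi-source news crawling (Sina API + Baidu fallback + mock data as a last resort)
② Data cleaning and persistent CSV storage
③ Chinese sentiment analysis based on SnowNLP
④ Sentiment distribution pie chart + 7-day trend line chart
⑤ Interactive word cloud (click a word → related news pops up → jump to the original article)
⑥ Automatic generation of an illustrated HTML report
⑦ Exception handling and fault tolerance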

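III. User Guide

3.1 How to Run the Program

Make sure Python 3 and the required libraries are installed (requests, pandas, snownlp, jieba, matplotlib, wordcloud). The Baidu fallback in the source code additionally imports bs4 (BeautifulSoup).
Open the project folder in PyCharm or a terminal and run the main program 实验四.py.
The program runs the crawler, sentiment analysis, and chart generation in turn, then prints the report path to the console.
Open the generated HTML file in a browser to view the full public-opinion report.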
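
Before the first run it can help to check which dependencies are actually installed. The snippet below is a minimal sketch, not part of the project: `REQUIRED` and `missing_modules` are hypothetical names, and the list only covers the packages that the pasted source imports at module level.

```python
from importlib.util import find_spec

# Import names from the dependency list above (assumption: the pip package
# name matches the import name, which holds for these five libraries).
REQUIRED = ["requests", "pandas", "snownlp", "jieba", "matplotlib"]


def missing_modules(names):
    """Return the module names that cannot be found in the current environment."""
    return [name for name in names if find_spec(name) is None]


missing = missing_modules(REQUIRED)
if missing:
    print("Missing packages, install with: pip install " + " ".join(missing))
else:
    print("All required packages are available.")
```

`find_spec` only looks the module up and does not import it, so the check stays fast even for heavy libraries such as pandas.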

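3.2 How to Interact with the Report

At the top of the report you first see the sentiment index and the news statistics, which give an overview of the day's public opinion.
Scroll down to see the pie chart and the trend chart.
Click any word in the word-cloud area and a list of headlines containing that word appears lower on the page.
Click a headline and the browser jumps to the original article on Sina News for the full story.
All news data is also saved as CSV files in the data/ folder and can be opened with Excel.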

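IV. Core Implementation

4.1 News Crawler

The crawler calls Sina News's rolling-feed endpoint and reads the JSON response directly, which avoids parsing HTML. When the network fails or fewer than 10 items come back, it falls back to Baidu News; if that also returns fewer than 5 items, it uses mock data.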

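import requests

def fetch_news_sina(num=50):
    # Query string elided here; the full URL is in the source code at the end
    url = f'https://feed.mix.sina.com.cn/api/roll/get?...'
    headers = {'User-Agent': 'Mozilla/5.0'}
    resp = requests.get(url, headers=headers, timeout=10)
    data = resp.json()
    titles = []
    for item in data['result']['data']:
        titles.append((item['title'], item['url']))
    return titles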
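
The fallback order described above can be sketched independently of any real network call. The example below is an illustration under assumptions: `fetch_with_fallback` and the three stand-in sources are hypothetical names, while the minimum counts (10 and 5) follow the thresholds used in the source code.

```python
from typing import Callable, List, Tuple

News = List[Tuple[str, str]]  # (title, link) pairs


def fetch_with_fallback(sources: List[Tuple[Callable[[], News], int]],
                        last_resort: Callable[[], News]) -> News:
    """Try each source in order; accept the first one that returns enough items."""
    for fetch, min_items in sources:
        try:
            news = fetch()
        except Exception as exc:  # a failing source must not stop the run
            print(f"[!] source failed: {exc}")
            continue
        if len(news) >= min_items:
            return news
    return last_resort()


# Stand-in sources for illustration only (no network access)
def broken_source() -> News:
    raise ConnectionError("network unreachable")


def sparse_source() -> News:
    return [("only one headline", "https://example.com/1")]


def demo_source() -> News:
    return [(f"demo headline {i}", f"https://example.com/demo/{i}") for i in range(20)]


result = fetch_with_fallback([(broken_source, 10), (sparse_source, 5)], demo_source)
print(len(result))
```

Here the first source raises, the second returns too few items, and the mock source supplies the data, which mirrors the Sina → Baidu → mock chain.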

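4.2 Sentiment Analysis

SnowNLP computes a sentiment score between 0 and 1 for each headline: above 0.6 is positive, below 0.4 is negative, and anything in between is neutral.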

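from snownlp import SnowNLP

def classify(title):
    score = SnowNLP(title).sentiments  # sentiment score in the range 0~1
    if score > 0.6:
        return '积极'  # positive
    elif score < 0.4:
        return '消极'  # negative
    return '中立'  # neutral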
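
The threshold rule itself does not depend on SnowNLP and can be checked on plain numbers. A minimal sketch follows; `label_from_score` is a hypothetical helper, and the English labels are used only for readability.

```python
def label_from_score(score: float, pos: float = 0.6, neg: float = 0.4) -> str:
    """Map a 0~1 sentiment score to a label using the two thresholds."""
    if score > pos:
        return "positive"
    if score < neg:
        return "negative"
    return "neutral"


for s in (0.85, 0.6, 0.5, 0.4, 0.12):
    print(s, label_from_score(s))
```

Both comparisons are strict, so scores of exactly 0.6 and 0.4 fall into the neutral band.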

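4.3 Interactive Word Cloud

jieba segments each headline to build an inverted index that maps each word to the list of news items containing it, and JavaScript on the front end shows the related news when a word is clicked. This work also fixed the issue of Chinese words in the word cloud rendering as empty boxes.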

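from collections import defaultdict
import jieba

word_to_news = defaultdict(list)
for _, row in df.iterrows():  # df: DataFrame with 'title' and 'link' columns
    title, link = row['title'], row['link']
    # Deduplicate per headline so each headline is listed once per word
    unique_words = set(jieba.lcut(title))
    for w in unique_words:
        word_to_news[w].append({'title': title, 'link': link})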
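
To see the shape of the inverted index without installing jieba, the sketch below passes the tokenizer in as a parameter and uses whitespace splitting on English sample headlines. `build_index` and the sample rows are hypothetical; the filters (length greater than 1, stopword removal) follow the source code.

```python
from collections import Counter, defaultdict


def build_index(rows, tokenize, stopwords=frozenset()):
    """Return (word counts, word -> list of news items) for (title, link) rows."""
    index = defaultdict(list)
    counts = Counter()
    for title, link in rows:
        words = [w for w in tokenize(title) if len(w) > 1 and w not in stopwords]
        counts.update(words)          # frequency counts every occurrence
        for w in set(words):          # the index lists each headline once per word
            index[w].append({"title": title, "link": link})
    return counts, dict(index)


rows = [
    ("stocks rally as stocks rebound", "https://example.com/a"),
    ("rain warning issued", "https://example.com/b"),
    ("stocks slip on rain fears", "https://example.com/c"),
]
counts, index = build_index(rows, str.split, stopwords={"as", "on"})
print(counts["stocks"], len(index["stocks"]))
```

With jieba installed, `jieba.lcut` could be passed in place of `str.split` to segment Chinese headlines.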

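4.4 HTML Report Generation

The statistics and charts are embedded into an HTML template, styled with CSS, with JavaScript handling the word-cloud interaction.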
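
A reduced version of this step is sketched below. It is not the project's template: `PAGE` and `render_report` are hypothetical, and `string.Template` is swapped in for the f-string used in the source so that CSS and JavaScript braces would not need doubling. Headline text is escaped before it is placed into the markup.

```python
import json
from html import escape
from string import Template

PAGE = Template("""<!DOCTYPE html>
<html lang="zh-CN"><head><meta charset="UTF-8"><title>$title</title></head>
<body>
<h1>$title</h1>
<p>Sentiment index: $index</p>
<ul>$items</ul>
<script>var wordNews = $word_news;</script>
</body></html>""")


def render_report(title, index, news, word_news):
    """Fill the template with escaped text and a JSON payload for the script."""
    items = "".join(
        f'<li><a href="{escape(link)}" target="_blank">{escape(text)}</a></li>'
        for text, link in news
    )
    # "</" is rewritten so a headline can never close the <script> tag early
    payload = json.dumps(word_news, ensure_ascii=False).replace("</", "<\\/")
    return PAGE.substitute(title=escape(title), index=f"{index:.2f}",
                           items=items, word_news=payload)


page = render_report(
    "Daily report", 0.25,
    [("Tom & Jerry <live>", "https://example.com/?a=1&b=2")],
    {"Tom": [{"title": "Tom & Jerry <live>", "link": "https://example.com/?a=1&b=2"}]},
)
print(page)
```

The JSON payload keeps the raw characters because the script reads it as data. Any code that later writes those strings into the page with innerHTML would need its own escaping.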

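V. Results and Analysis

5.1 Running the Program

The run output shows that the program completed news crawling, sentiment analysis, chart generation, and report output in sequence, with no errors anywhere in the pipeline.

5.2 Today's Overview

The top of the report shows the sentiment index, the total number of news items, and the count for each sentiment category, giving a quick read on the day's overall tone.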
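
In the source code the sentiment index is the difference between the positive and negative counts divided by the total, with ±0.2 as the cut-offs for the "warm" and "cold" descriptions. A small worked example follows; the function names and the sample counts are made up for illustration.

```python
def sentiment_index(pos: int, neg: int, total: int) -> float:
    """(positive - negative) / total, or 0 when there is no news."""
    return (pos - neg) / total if total else 0.0


def describe(index: float) -> str:
    """Translate the index into the three temperature descriptions."""
    if index > 0.2:
        return "warm"
    if index < -0.2:
        return "cold"
    return "steady"


idx = sentiment_index(pos=22, neg=10, total=50)  # (22 - 10) / 50
print(f"{idx:.2f}", describe(idx))
```

The index ranges from -1 (all negative) to 1 (all positive). Neutral items only enter through the denominator, so a day with many neutral headlines is pulled toward 0.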

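5.3 Sentiment Distribution Pie Chart

The pie chart uses a different color for each sentiment category and shows the share of each at a glance.

5.4 7-Day Trend

The trend chart shows how the share of positive news has changed over the past week, which makes the direction of public opinion visible.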
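
The data behind such a trend line can be computed from the daily CSV files with the standard library alone. The sketch below is a variant written for illustration: `positive_ratio_series` is a hypothetical name, and unlike the pasted source, which fills missing days with random placeholder values, this version skips days that have no file.

```python
import csv
import os
import tempfile
from datetime import date, timedelta


def positive_ratio_series(data_dir, today, days=7, positive_label="积极"):
    """Return [(MM-DD, positive share)] for each of the last `days` days that has a CSV."""
    series = []
    for offset in range(days - 1, -1, -1):
        day = today - timedelta(days=offset)
        path = os.path.join(data_dir, f"{day:%Y-%m-%d}.csv")
        if not os.path.exists(path):
            continue  # skip missing days instead of inventing values
        with open(path, newline="", encoding="utf-8-sig") as f:
            labels = [row["sentiment"] for row in csv.DictReader(f)]
        ratio = labels.count(positive_label) / len(labels) if labels else 0.0
        series.append((f"{day:%m-%d}", ratio))
    return series


# Demo with two small sample files in a temporary folder
with tempfile.TemporaryDirectory() as tmp:
    today = date(2026, 6, 15)
    samples = {
        today: ["积极", "消极", "积极", "中立"],
        today - timedelta(days=2): ["消极", "中立"],
    }
    for day, labels in samples.items():
        path = os.path.join(tmp, f"{day:%Y-%m-%d}.csv")
        with open(path, "w", newline="", encoding="utf-8-sig") as f:
            writer = csv.writer(f)
            writer.writerow(["title", "sentiment"])
            writer.writerows((f"headline {i}", lab) for i, lab in enumerate(labels))
    series = positive_ratio_series(tmp, today)

print(series)
```

The file naming (YYYY-MM-DD.csv), the `sentiment` column, and the utf-8-sig encoding match what the source code writes.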

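5.5 Word Cloud and Interaction

Words shown in a larger font are the day's most frequent words. Clicking any word brings up the list of related news, and clicking a headline jumps to the original Sina article.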
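
The font size follows the linear rule used in the source code: 14px plus up to 26px, scaled by the word's frequency relative to the most frequent word. A minimal sketch; the helper names are hypothetical and the sample frequencies are made up.

```python
from html import escape


def font_size(freq: int, max_freq: int, smallest: int = 14, extra: int = 26) -> int:
    """Scale linearly from `smallest` px up to `smallest + extra` px."""
    return int(smallest + (freq / max_freq) * extra)


def cloud_spans(word_freq):
    """Build the clickable <span> elements from (word, frequency) pairs sorted by frequency."""
    if not word_freq:
        return ""
    max_freq = word_freq[0][1]
    return "\n".join(
        f'<span class="cloud-word" style="font-size:{font_size(freq, max_freq)}px;" '
        f'data-word="{escape(word)}">{escape(word)}</span>'
        for word, freq in word_freq
    )


sizes = [font_size(freq, 8) for freq in (8, 4, 1)]
print(sizes)
print(cloud_spans([("股市", 8), ("暴雨", 4)]))
```

The `data-word` attribute carries the key that the click handler uses to look up the related news.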

5.6 Data Storage

All news data is saved to a CSV file containing the title, link, sentiment label, and score, ready for later analysis.

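VI. Analysis and Reflection

This experiment combined crawling, data processing, sentiment analysis, and visualization, tying together several topics from the course. A few typical problems came up during development. Crawling occasionally failed because of network fluctuations, so a backup data source and mock data were added to keep the program stable. The Chinese word cloud initially rendered as empty boxes, and after some research it turned out that a Chinese font path has to be specified for it to display correctly. Turning the word cloud from a static image into a clickable one required organizing the segmentation results into an inverted index and using JavaScript for the front-end popup. This part took the most debugging time, but in the end clicking a word traces back to the original article.

The project is written in modules: crawling, analysis, plotting, and report generation are independent of each other, so changing one part does not affect the others and the code structure stays fairly clear. I also attempted a cloud deployment. It did not go fully live because of permission issues, but it gave me a first contact with Serverless architecture and API gateways.

The project code is hosted on Gitee at:
https://gitee.com/rainfallllll/python-project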

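VII. Summary

This experiment produced a "Daily Public-Opinion Thermometer" system whose main functions are:

Crawl real-time news from the Sina API and save it as a CSV file.
Run sentiment analysis with SnowNLP, classifying news as positive, negative, or neutral.
Generate a sentiment distribution pie chart and a 7-day sentiment trend line chart.
Provide a clickable interactive word cloud: click a word to see the related news, then click again to jump to the original article.
Automatically generate an HTML report containing all charts and the news list.

Through this experiment I consolidated my Python fundamentals, learned to use third-party libraries, and went through the whole process from requirements to implementation.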

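VIII. Huawei Cloud Deployment and Gitee Hosting

8.1 Deployment Plan

This experiment attempted to deploy the public-opinion report to Huawei Cloud using the Serverless architecture of FunctionGraph.

8.2 Steps

Create an HTTP function and configure the Python runtime
Purchase a dedicated API Gateway instance (using the course voucher)
Configure the trigger and set the security authentication to NONE
Mirror the code to Huawei Cloud CodeArts Repo

8.3 Outcome

Public access was not fully achieved because of account permissions and network restrictions. Even so, I worked through creating a cloud function, configuring the API gateway, and using the voucher, and gained hands-on familiarity with Serverless architecture.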

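IX. Course Summary and Reflections

9.1 Review of Course Content

This semester's Python course followed the textbook chapter by chapter, moving from fundamentals to applications. The main topics were:

Chapter 1: Getting Started with Python
Covered Python's history and language characteristics (cross-platform, interpreted, object-oriented, dynamically typed), set up the development environment, learned the basics of IDLE and PyCharm, and wrote the first print("Hello world!") program. The instructor compared Python and C to "egg fried rice" and "rice with toppings": Python is like egg fried rice, ready to run as soon as it is written, with no separate compile step.

Chapter 2: Python Syntax Basics
Covered variable definitions, basic data types (integer, float, string, boolean), operators, and comments, along with identifier naming rules and indentation conventions.

Chapter 3: Flow Control
Covered sequential, branching (if-elif-else), and loop (while, for) structures, enough to write programs with conditions and repeated logic. The number-guessing game the instructor built with us in class, stepped through in the debugger, made the execution order easy to follow.

Chapter 4: Sequences
Covered the four built-in container types (list, tuple, dictionary, set) and what distinguishes them: a list is like a shopping list that can be changed freely, a tuple is like words carved in stone, a dictionary is like an address book with one-to-one entries, and a set is like a spice box that removes duplicates automatically.

Chapter 5: Strings
Covered common string methods, an introduction to regular expressions, and f-string formatting. A single [::-1] slice is enough to reverse a string.

Chapter 6: Functions
Covered defining and calling functions, parameters and arguments, return values, and scope, along with the modular idea of wrapping repeated logic in a function.

Chapter 7: Object-Oriented Programming
Covered the core concepts of classes and objects. The instructor's analogy, "a class is a recipe, an object is the dish cooked from it", helped me understand the three pillars of encapsulation, inheritance, and polymorphism.

Chapter 8: Modules
Covered modules and packages, the different forms of import, and Python's rich ecosystem of standard and third-party libraries.

Chapter 9: Exception Handling
Covered the try-except structure. The instructor compared it to having a backup plan while cooking: if there is too much salt, add sugar to rescue the dish, and if the pan burns, turn off the heat and start over.

Chapter 10: File Operations
Covered opening, reading, writing, and closing files, as well as the with statement, so that data can be persisted to disk.

Chapter 11: Database Operations
An introduction to basic create, read, update, and delete operations in SQLite and the role of databases in data persistence.

Chapter 12: Data Visualization
Covered the matplotlib library and how to draw common charts such as line charts and pie charts.

Chapter 13: Web Crawling
Covered the requests library, sending HTTP requests and parsing JSON data, and the rules for crawling legally.

Chapter 14: Web Programming Basics
A first look at the basic structure of the Flask framework, route definitions, API design, and the concept of RESTful APIs.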

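9.2 Reflections on the Course

After a semester of learning Python with Mr. Wang, I went from no background at all to completing a comprehensive project on my own, and I gained a lot. The classroom atmosphere was relaxed and the lectures were humorous, with many everyday analogies that made abstract concepts easy to grasp. At the beginning I froze whenever my code threw an error. Now I can use the error message to locate the problem and analyze its cause, and my ability to solve problems independently has clearly improved.

In this comprehensive experiment I tied together crawling, sequences, functions, exception handling, file operations, and visualization, and I came to appreciate that programming has to be practiced by hand and cannot be learned by reading alone. Programming is not about memorizing syntax. It is about developing logical thinking and learning to turn an idea into code step by step. Python's concise syntax and rich third-party libraries showed me that programming is an enjoyable way to build practical tools. I will keep writing code and practicing so that what I learned this semester is put to real use.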

X. References

DeepSeek

XI. Comments and Suggestions

None

The source code is pasted below:

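"""
Daily Public-Opinion Thermometer: news crawling + sentiment analysis + visualization + interactive HTML report
Features:
    1. Crawl the day's news headlines and links (Sina / Baidu / mock data)
    2. Clean and deduplicate the data, then save it as CSV
    3. Chinese sentiment analysis (SnowNLP)
    4. Generate a pie chart and a trend line chart
    5. Generate an illustrated daily HTML briefing with a clickable word cloud (click a word → related news pops up → jump to the original article)
    6. Can be extended with scheduled tasks
"""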

import os
import random
import json
import requests
import pandas as pd
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt
import matplotlib.font_manager as fm
from datetime import datetime, timedelta
from snownlp import SnowNLP
import jieba
import logging
from collections import Counter, defaultdict

# Suppress jieba's dictionary-loading log messages
jieba.setLogLevel(logging.INFO)

# ------------------------- Configuration -------------------------
DATA_DIR = 'data'
REPORT_DIR = 'reports'
IMG_DIR = 'images'

# Word-cloud font (only for a possibly leftover static word cloud; kept as a fallback)
FONT_PATH = 'C:/Windows/Fonts/simhei.ttf'

# Chinese fonts for matplotlib
plt.rcParams['font.sans-serif'] = ['SimHei', 'Microsoft YaHei', 'WenQuanYi Micro Hei', 'Arial Unicode MS']
plt.rcParams['axes.unicode_minus'] = False

for d in [DATA_DIR, REPORT_DIR, IMG_DIR]:
    os.makedirs(d, exist_ok=True)

# ------------------------- 1. News crawler (with links) -------------------------
def fetch_news_sina(num=50):
    """Sina News rolling-feed endpoint; returns [(title, link), ...]"""
    url = f'https://feed.mix.sina.com.cn/api/roll/get?pageid=153&lid=2509&k=&num={num}&page=1&r=0.5&callback='
    headers = {'User-Agent': 'Mozilla/5.0'}
    try:
        resp = requests.get(url, headers=headers, timeout=10)
        resp.encoding = 'utf-8'
        data = resp.json()
        news = []
        for item in data.get('result', {}).get('data', []):
            title = item.get('title', '')
            link = item.get('url', '')
            if title and link:
                news.append((title.strip(), link.strip()))
        return news
    except Exception as e:
        print(f'[!] 新浪API爬取失败: {e}')
        return []

def fetch_news_baidu():
    """Baidu News fallback; returns [(title, link), ...]"""
    try:
        from bs4 import BeautifulSoup
        url = 'https://news.baidu.com/'
        headers = {'User-Agent': 'Mozilla/5.0'}
        resp = requests.get(url, headers=headers, timeout=10)
        resp.encoding = 'utf-8'
        soup = BeautifulSoup(resp.text, 'html.parser')
        news = []
        for tag in soup.select('.mod-new-1 .title a, .hotnews a, .ulist a'):
            title = tag.get_text(strip=True)
            link = tag.get('href')
            if title and link and len(title) > 5:
                news.append((title, link))
        return news[:50]
    except Exception as e:
        print(f'[!] 百度新闻爬取失败: {e}')
        return []

def generate_demo_news():
    """Mock data; each headline gets a Baidu search link for demonstration"""
    demo = [
        "今日股市大涨，投资者信心回升",
        "气象台发布暴雨预警，南方多地将受影响",
        "科学家发现新型材料，可大幅降低电池成本",
        "某地发生交通事故，暂无人员伤亡",
        "国庆假期旅游人数创新高",
        "欧洲杯决赛即将上演，球迷热情高涨",
        "教育部发布双减政策新细则",
        "全球芯片短缺问题或持续至明年",
        "新型疫苗进入临床试验阶段",
        "马斯克宣布星链计划取得新进展",
        "本地蔬菜价格持续上涨",
        "疫情反复致航空业亏损严重",
        "研究发现喝咖啡可能降低心脏病风险",
        "某明星被曝税务问题，引发热议",
        "全国铁路今日预计发送旅客800万人次"
    ]
    random.shuffle(demo)
    news = []
    for title in demo * 4:
        search_url = f'https://www.baidu.com/s?wd={requests.utils.quote(title)}'
        news.append((title, search_url))
    return news

def fetch_all_news():
    """Try all sources in order; returns [(title, link), ...]"""
    print('[1/6] 正在爬取新闻...')
    news = fetch_news_sina()
    if not news or len(news) < 10:
        print('[!] 新浪返回数量不足，尝试百度新闻...')
        news = fetch_news_baidu()
    if not news or len(news) < 5:
        print('[!] 网络爬取失败，使用模拟新闻数据（仅供演示）')
        news = generate_demo_news()
    print(f'[√] 共获取 {len(news)} 条新闻')
    return news

# ------------------------- 2. Data processing and sentiment analysis -------------------------
def analyze_sentiment(title):
    """Return (label, score); falls back to neutral / 0.5 if SnowNLP fails"""
    try:
        s = SnowNLP(title)
        score = s.sentiments
        if score > 0.6:
            return '积极', score
        elif score < 0.4:
            return '消极', score
        else:
            return '中立', score
    except Exception:
        return '中立', 0.5

def process_news(news_list_with_links):
    """Takes [(title, link), ...] and returns a DataFrame"""
    df = pd.DataFrame(news_list_with_links, columns=['title', 'link'])
    df.drop_duplicates(subset='title', inplace=True)
    today_str = datetime.now().strftime('%Y-%m-%d')
    df['date'] = today_str

    print('[2/6] 正在进行情感分析...')
    results = df['title'].apply(analyze_sentiment)
    df['sentiment'] = [r[0] for r in results]
    df['score'] = [r[1] for r in results]

    csv_path = os.path.join(DATA_DIR, f'{today_str}.csv')
    df.to_csv(csv_path, index=False, encoding='utf-8-sig')
    print(f'[√] 数据已保存至 {csv_path}')
    return df

# ------------------------- 3. Visualization (pie chart, trend chart) -------------------------
def plot_sentiment_pie(df):
    counts = df['sentiment'].value_counts()
    labels = ['积极', '中立', '消极']
    sizes = [counts.get('积极',0), counts.get('中立',0), counts.get('消极',0)]
    colors = ['#4CAF50', '#FFC107', '#F44336']
    plt.figure(figsize=(6,6))
    plt.pie(sizes, explode=(0.02,0.02,0.02), labels=labels, colors=colors,
            autopct='%1.1f%%', shadow=True, startangle=140, textprops={'fontsize':12})
    plt.title(f'今日舆情情感分布 ({df["date"].iloc[0]})', fontsize=14)
    plt.tight_layout()
    path = os.path.join(IMG_DIR, 'sentiment_pie.png')
    plt.savefig(path, dpi=150)
    plt.close()
    print(f'[√] 饼图已生成: {path}')
    return path

def plot_history_trend():
    today = datetime.now().date()
    dates, pos_ratios = [], []
    for i in range(6, -1, -1):
        d = today - timedelta(days=i)
        fname = os.path.join(DATA_DIR, f'{d.strftime("%Y-%m-%d")}.csv')
        if os.path.exists(fname):
            df_day = pd.read_csv(fname)
            total = len(df_day)
            pos = len(df_day[df_day['sentiment'] == '积极'])
            ratio = pos/total if total else 0
            dates.append(d.strftime('%m-%d'))
            pos_ratios.append(ratio)
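        else:
            # No CSV for this day: plot a random placeholder (demo only, not measured data).
            # The oldest day (i == 6) is skipped when its file is missing.
            if i < 6:
                dates.append(d.strftime('%m-%d'))
                pos_ratios.append(random.uniform(0.35, 0.7))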
    if not dates:
        dates = ['无数据']
        pos_ratios = [0.5]
    plt.figure(figsize=(8,4))
    plt.plot(dates, pos_ratios, marker='o', linestyle='-', color='#2196F3', linewidth=2)
    plt.ylim(0,1)
    plt.xlabel('日期')
    plt.ylabel('积极情绪占比')
    plt.title('近7天积极情绪变化趋势')
    plt.grid(True, alpha=0.3)
    plt.tight_layout()
    path = os.path.join(IMG_DIR, 'history_trend.png')
    plt.savefig(path, dpi=150)
    plt.close()
    print(f'[√] 趋势图已生成: {path}')
    return path

# ------------------------- 4. Data for the clickable word cloud -------------------------
def generate_wordcloud_data(df):
    """Return (word-frequency list, inverted index) for the interactive HTML word cloud"""
    stopwords = {'的','了','在','是','我','有','和','就','不','人','都','一',
                 '一个','上','也','很','到','说','要','去','你','会','着',
                 '没有','看','好','自己','这','他','她','它','们','那','些',
                 '为','所以','因为','可以','这个','那个','什么','怎么'}

    word_to_news = defaultdict(list)
    all_words = []

    for _, row in df.iterrows():
        title = row['title']
        link = row.get('link', '')
        if not link or pd.isna(link):
            link = f'https://www.baidu.com/s?wd={requests.utils.quote(title)}'
        words = jieba.lcut(title)
        unique_words = set()
        for w in words:
            if len(w) > 1 and w not in stopwords:
                all_words.append(w)
                unique_words.add(w)
        for w in unique_words:
            word_to_news[w].append({'title': title, 'link': link})

    word_counts = Counter(all_words)
    top_words = word_counts.most_common(50)  # top 50 most frequent words
    return top_words, word_to_news

# ------------------------- 5. HTML report generation (with interactive word cloud) -------------------------
def generate_report(df, pie_img, trend_img):
    """Generate the HTML report with a clickable word cloud"""
    today = datetime.now()
    total = len(df)
    pos_count = len(df[df['sentiment'] == '积极'])
    neg_count = len(df[df['sentiment'] == '消极'])
    neu_count = total - pos_count - neg_count
    sentiment_index = (pos_count - neg_count) / total if total else 0
    if sentiment_index > 0.2:
        temp_desc = "🔥 偏暖"
    elif sentiment_index < -0.2:
        temp_desc = "❄️ 偏冷"
    else:
        temp_desc = "🌤️ 平稳"

    # HTML list of all news items (titles and links are escaped before embedding)
    from html import escape  # imports only `escape`; the name `html` is used below for the page string
    color = {'积极':'#4CAF50', '消极':'#F44336', '中立':'#FF9800'}
    news_items = ""
    for _, row in df.iterrows():
        c = color.get(row['sentiment'], '#333')
        title = row['title']
        link = row.get('link', '')
        if not link or pd.isna(link):
            link = f'https://www.baidu.com/s?wd={requests.utils.quote(title)}'
        news_items += f'''<li style="color:{c}; margin-bottom:5px;">
            【{row['sentiment']}】
            <a href="{escape(link)}" target="_blank" style="color:{c}; text-decoration:underline;">{escape(title)}</a>
        </li>'''

    # Word-cloud data
    word_freq, word_to_news = generate_wordcloud_data(df)
    # Rewrite "</" so a headline cannot terminate the <script> block early
    word_news_json = json.dumps(word_to_news, ensure_ascii=False).replace('</', '<\\/')

    # Build the word-cloud <span> elements; font size scales with relative frequency
    max_freq = word_freq[0][1] if word_freq else 1
    cloud_spans = ""
    for word, freq in word_freq:
        font_size = int(14 + (freq / max_freq) * 26)
        cloud_spans += f'<span class="cloud-word" style="font-size:{font_size}px;" data-word="{word}">{word}</span>\n'

    html = f"""<!DOCTYPE html>
<html lang="zh-CN">
<head>
<meta charset="UTF-8"><title>舆情温度计报告 {today.strftime('%Y%m%d')}</title>
<style>
body{{font-family:'Microsoft YaHei',sans-serif;margin:20px;background:#f0f2f5}}
.container{{max-width:900px;margin:auto;background:white;padding:30px;border-radius:12px;box-shadow:0 4px 12px rgba(0,0,0,0.1)}}
h1{{color:#2c3e50;border-bottom:3px solid #4CAF50;padding-bottom:10px}}
.summary{{background:#f0f4c3;padding:15px;border-radius:8px;margin:20px 0;font-size:18px}}
.stats{{display:flex;justify-content:space-around;margin:25px 0}}
.stat{{text-align:center}}.stat .num{{font-size:28px;font-weight:bold}}
img{{max-width:100%;margin:15px 0;border-radius:8px;box-shadow:0 2px 8px rgba(0,0,0,0.1)}}
ul{{line-height:2; list-style-type:none; padding-left:0;}}
.cloud-container {{
    background: #fafafa;
    border-radius: 12px;
    padding: 20px;
    margin: 20px 0;
    text-align: center;
    line-height: 2.5;
}}
.cloud-word {{
    display: inline-block;
    margin: 5px 10px;
    cursor: pointer;
    color: #2c3e50;
    transition: all 0.3s;
    font-weight: bold;
}}
.cloud-word:hover {{
    color: #e74c3c;
    transform: scale(1.2);
}}
.news-popup {{
    background: white;
    border: 2px solid #4CAF50;
    border-radius: 8px;
    padding: 15px;
    margin: 10px 0;
    display: none;
    box-shadow: 0 2px 8px rgba(0,0,0,0.15);
}}
.news-popup h3 {{
    margin-top: 0;
    color: #4CAF50;
}}
.news-popup ul {{
    list-style-type: disc;
    padding-left: 20px;
}}
.footer{{text-align:center;color:#999;margin-top:30px;font-size:14px}}
</style>
</head>
<body>
<div class="container">
<h1>🌡️ 每日舆情温度计</h1>
<p>报告生成时间：{today.strftime('%Y-%m-%d %H:%M:%S')}</p>
<div class="summary">
    <span>今日情绪指数：{sentiment_index:.2f}  {temp_desc}</span><br>
    基于 <strong>{total}</strong> 条新闻分析得出
</div>
<div class="stats">
    <div class="stat"><div class="num" style="color:#4CAF50">{pos_count}</div>😊 积极</div>
    <div class="stat"><div class="num" style="color:#FF9800">{neu_count}</div>😐 中立</div>
    <div class="stat"><div class="num" style="color:#F44336">{neg_count}</div>😢 消极</div>
</div>
<h2>📊 情感分布</h2>
<img src="../images/sentiment_pie.png" alt="饼图">
<h2>📈 近7天情绪变化趋势</h2>
<img src="../images/history_trend.png" alt="趋势图">
<h2>☁️ 可点击词云（点击词语查看相关新闻）</h2>
<div class="cloud-container">
    {cloud_spans}
</div>
<div id="news-popup-area"></div>
<h2>📰 全部新闻标题</h2>
<ul>{news_items}</ul>
<div class="footer">本报告由Python自动生成 · 舆情温度计实验项目</div>
</div>
<script>
var wordNews = {word_news_json};
var words = document.querySelectorAll('.cloud-word');
var popupArea = document.getElementById('news-popup-area');

words.forEach(function(wordSpan) {{
    wordSpan.addEventListener('click', function() {{
        var word = this.getAttribute('data-word');
        var newsList = wordNews[word];
        popupArea.innerHTML = '';
        if (!newsList || newsList.length === 0) {{
            popupArea.innerHTML = '<div class="news-popup" style="display:block;"><p>暂无相关新闻</p></div>';
            return;
        }}
        var html = '<div class="news-popup" style="display:block;"><h3>包含 “<strong>' + word + '</strong>” 的新闻：</h3><ul>';
        newsList.forEach(function(item) {{
            html += '<li><a href="' + item.link + '" target="_blank" style="color:#2c3e50; text-decoration:underline;">' + item.title + '</a></li>';
        }});
        html += '</ul></div>';
        popupArea.innerHTML = html;
        popupArea.scrollIntoView({{ behavior: 'smooth', block: 'nearest' }});
    }});
}});
</script>
</body>
</html>"""
    report_name = f"report_{today.strftime('%Y%m%d_%H%M%S')}.html"
    report_path = os.path.join(REPORT_DIR, report_name)
    with open(report_path, 'w', encoding='utf-8') as f:
        f.write(html)
    print(f'[√] HTML报告已生成: {report_path}')
    return report_path

# ------------------------- 6. Main workflow -------------------------
def run_daily_task():
    print('='*50)
    print(f'舆情分析启动 {datetime.now().strftime("%Y-%m-%d %H:%M:%S")}')
    print('='*50)

    news = fetch_all_news()
    df = process_news(news)

    print('[3/6] 生成可视化图表...')
    pie = plot_sentiment_pie(df)
    trend = plot_history_trend()
    # The static word-cloud image is no longer generated; the cloud is built interactively in the HTML

    print('[4/6] 生成HTML简报（含可点击词云）...')
    report = generate_report(df, pie, trend)

    print('[5/6] 任务完成！')
    print(f'[6/6] 请打开以下文件查看报告：\n  {report}')
    try:
        import webbrowser
        # Use an absolute path so the browser can locate the local file
        webbrowser.open(os.path.abspath(report))
    except Exception:
        pass

if __name__ == '__main__':
    run_daily_task()

posted on 2026-06-15 18:26 三饺化缘