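A Detailed Guide: Generating a sitemap.xml for Your Website
To generate a sitemap.xml file, a crawler has to visit the site and collect every valid link. The complete solution is described below.
Step 1: Install the required Python libraries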
pip install requests beautifulsoup4 lxml
Step 2: Create the Python crawler script (sitemap_generator.py)
import requests
Step 3: Run the script
python sitemap_generator.py
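How it runs:
Crawler logic:
- Starts at the home page https://www.91kaiye.cn/ and performs a breadth-first search
- Automatically filters out off-site links, anchors, and invalid URLs
- Records the last-modified date of each page (defaults to the current day)
- Sets the update frequency to daily and the priority to 0.8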
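The crawler logic above can be sketched roughly as follows. This is a minimal illustration rather than the original script: the names `LinkParser`, `normalize`, `fetch_html`, `crawl`, and the `max_pages` cap are hypothetical, and the standard library's `html.parser` stands in for BeautifulSoup so that the traversal itself has no third-party dependency. Only the default fetcher uses `requests`.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse

START_URL = "https://www.91kaiye.cn/"  # home page named in the article


class LinkParser(HTMLParser):
    """Collects the href of every <a> tag (stdlib stand-in for BeautifulSoup)."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def normalize(base_url, href, site_host):
    """Return an absolute same-site URL without its #fragment, or None."""
    url, _fragment = urldefrag(urljoin(base_url, href.strip()))
    parts = urlparse(url)
    if parts.scheme not in ("http", "https") or parts.netloc != site_host:
        return None  # mailto:, javascript:, off-site links, and similar
    return url


def fetch_html(url):
    """Default fetcher: returns the page HTML, or None on any failure."""
    import requests  # imported lazily so the crawl logic runs without it

    try:
        resp = requests.get(url, timeout=10)
    except requests.RequestException:
        return None
    is_html = "text/html" in resp.headers.get("Content-Type", "")
    return resp.text if resp.ok and is_html else None


def crawl(start_url, fetch=fetch_html, max_pages=500):
    """Breadth-first crawl; returns same-site URLs in discovery order."""
    site_host = urlparse(start_url).netloc
    queue, seen, pages = deque([start_url]), {start_url}, []
    while queue and len(pages) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue  # unreachable or non-HTML: leave it out of the sitemap
        pages.append(url)
        parser = LinkParser()
        parser.feed(html)
        for href in parser.hrefs:
            link = normalize(url, href, site_host)
            if link and link not in seen:
                seen.add(link)
                queue.append(link)
    return pages
```

Passing the fetcher in as a parameter keeps the traversal testable offline: `crawl(START_URL)` hits the live site, while `crawl(START_URL, fetch=some_dict.get)` walks an in-memory set of pages.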
Output file:
- The script writes sitemap.xml, with one entry per crawled page carrying the last-modified date, update frequency, and priority described above.
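As an illustration of the output format, the sketch below builds a sitemap with the standard library's `xml.etree.ElementTree`. The function name `build_sitemap` and its parameters are assumptions made for this example; the namespace is the one defined by the sitemaps.org protocol, and the defaults mirror the daily frequency and 0.8 priority mentioned earlier.

```python
import datetime
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls, lastmod=None, changefreq="daily", priority="0.8"):
    """Return sitemap XML as bytes, with one <url> entry per page."""
    # Default the last-modified date to today, as the article describes.
    lastmod = lastmod or datetime.date.today().isoformat()
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for page in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = page  # special characters are escaped on output
        ET.SubElement(entry, "lastmod").text = lastmod
        ET.SubElement(entry, "changefreq").text = changefreq
        ET.SubElement(entry, "priority").text = priority
    if hasattr(ET, "indent"):  # pretty-printing is available from Python 3.9
        ET.indent(urlset)
    return ET.tostring(urlset, encoding="utf-8", xml_declaration=True)


if __name__ == "__main__":
    with open("sitemap.xml", "wb") as fh:
        fh.write(build_sitemap(["https://www.91kaiye.cn/"]))
```

Building the tree with an XML library rather than string concatenation means characters such as `&` in query strings are escaped automatically, which the sitemap protocol requires.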
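Notes:
Anti-crawling measures:
- If the site has anti-crawling mechanisms, you may need to:
  - Add a time.sleep(1) delay between requests
  - Use proxy IPs
  - Set more realistic request headers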
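One way to combine these three measures is sketched below. The `Throttle` class, the `polite_get` helper, the header values, and the `PROXIES` placeholder are all hypothetical; only `requests.get` and its `headers`, `proxies`, and `timeout` parameters come from the real library.

```python
import time

# Illustrative header values; adjust them to match a real browser if needed.
HEADERS = {
    "User-Agent": "Mozilla/5.0 (compatible; sitemap-generator/1.0)",
    "Accept-Language": "zh-CN,zh;q=0.9,en;q=0.8",
}
# Placeholder: set to e.g. {"https": "http://127.0.0.1:8080"} to route via a proxy.
PROXIES = None


class Throttle:
    """Enforces a minimum delay between consecutive requests."""

    def __init__(self, delay=1.0, clock=time.monotonic, sleep=time.sleep):
        self.delay = delay
        self.clock = clock
        self.sleep = sleep
        self.last = None  # time of the previous request, if any

    def wait(self):
        if self.last is not None:
            remaining = self.delay - (self.clock() - self.last)
            if remaining > 0:
                self.sleep(remaining)
        self.last = self.clock()


def polite_get(url, throttle):
    """Fetch a URL after waiting out the throttle, with custom headers."""
    import requests  # imported lazily; only needed for real network calls

    throttle.wait()
    return requests.get(url, headers=HEADERS, proxies=PROXIES, timeout=10)
```

Unlike a bare `time.sleep(1)` after every request, the throttle only sleeps for whatever part of the delay has not already elapsed, so time spent parsing a page counts toward the pause.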
Dynamic content:
- For pages rendered with JavaScript (e.g. Vue/React), switch to Selenium or Playwright.
Optimization suggestions:
- Run the script on the server on a schedule (e.g. once a week)
- Submit the sitemap to Google Search Console
- Add the following line to robots.txt:
Sitemap: https://www.91kaiye.cn/sitemap.xml
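To check that the directive is present, or to append it when it is missing, something like the following could be used. It relies on the standard library's `urllib.robotparser` (its `site_maps()` method requires Python 3.8 or later); the helper names `declared_sitemaps` and `ensure_sitemap_line` are assumptions made for this sketch.

```python
from urllib.robotparser import RobotFileParser

SITEMAP_URL = "https://www.91kaiye.cn/sitemap.xml"


def declared_sitemaps(robots_txt):
    """Return the list of Sitemap URLs declared in a robots.txt string."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.site_maps() or []  # site_maps() returns None when empty


def ensure_sitemap_line(robots_txt, sitemap_url=SITEMAP_URL):
    """Return robots.txt content with the Sitemap line appended if absent."""
    if sitemap_url in declared_sitemaps(robots_txt):
        return robots_txt
    sep = "" if robots_txt.endswith("\n") or not robots_txt else "\n"
    return f"{robots_txt}{sep}Sitemap: {sitemap_url}\n"
```

The function is idempotent, so it is safe to call from a scheduled job each time the sitemap is regenerated.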
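Alternative: use an online tool
If you would rather not run any code, an online service can generate the file for you.
Once the file is generated, upload sitemap.xml to the site's root directory and submit it through the Baidu and Google webmaster tools.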
