GPT-4官方插件收集到Excel中

1、下载json文件,格式如下。https://github.com/copilot-us/chatgpt-plugins/tree/cefe4e2a917fa07c6f1651fff2e32c45ef98a7a7

2、这里为了方便用户查看,调用百度翻译把描述翻译成中文(若不用可注释掉),需要申请百度翻译开放平台开发者,我这里申请为高级版。申请流程:http://api.fanyi.baidu.com/doc/12

3、按照使用文档编写url。百度翻译通用文本翻译使用文档:http://api.fanyi.baidu.com/doc/21

4、把需要的内容写入Excel里。

import json
import requests
import random
import hashlib
import urllib
from openpyxl import Workbook


def baiduTranslate(translate_text):
    """
    百度翻译接口
    :param translate_text: 需要翻译的文本-英
    :return: 翻译后文本-中
    """
    appid = "xxx"
    secretKey = xxx"
    myurl = 'http://api.fanyi.baidu.com/api/trans/vip/translate'
    fromLang = 'en'  # 原文语种
    toLang = 'zh'  # 译文语种
    salt = random.randint(3276, 65536)

    sign = appid + translate_text + str(salt) + secretKey
    sign = hashlib.md5(sign.encode()).hexdigest()
    url = myurl + '?q=' + urllib.parse.quote(translate_text) + '&from=' + fromLang + \
            '&to=' + toLang + '&appid=' + appid + '&salt=' + str(salt) + '&sign=' + sign
    try:
        response = requests.get(url)
        result = response.json()
        return result['trans_result'][0]['dst']
    except Exception as e:
        print(e)


with open("./chatgpt_plugins.json", "r", encoding="utf-8") as f:
    contents = f.read()
    text_dict = json.loads(contents)

items = text_dict["items"]

wb = Workbook()
ws4 = wb.create_sheet(title="plugins", index=0)
sheet = wb.active
sheet["a1"] = "插件名"
sheet["b1"] = "描述"

c = 1
for i in items:
    name = i["namespace"]
    manifest = i["manifest"]
    description = manifest["description_for_human"]
    # trans_des = baiduTranslate(description)
    sheet.cell(row=c, column=1).value = name
    sheet.cell(row=c, column=2).value = description
    c += 1

wb.save("plugins.xlsx")

 5、由于在别的网站能获取到类别,于是想在上面插件名+描述的基础上加上分类,把该网站的表格复制下来到“sheet21.xlsx” 这个文件上。https://www.startuphub.ai/a-list-of-78-chatgpt-plugins-currently-available-and-their-use-case/?__cf_chl_tk=aItOx.OMz_TtAUw8jpYdwryroMFtsbWJf90goOvLnzQ-1688974153-0-gaNycGzNFpA

-复制后的格式大概是这样,有些分类是空的,需要检查手动加上空行,否则后面取值可能导致插件名:分类没有一一对上。

-需要用到列表切片,[::3]表示从0开始每隔3取值,[2::3]表示从2开始每隔3取值。

(单纯用Excel可能有更方便的转换方式,我这里用最笨的方法顺便学习读文件和字符串、列表等操作)

6、读取该表,以插件名:分类的形式存在一个字典中。

7、读取上面生成的“plugins.xlsx”,判断插件名是否在字典中,是则把分类写到该行的第三列。由于两个网站的插件名可能格式不太一致,需要replace方法替换“ ”、“_”等字符,并且统一小写(虽然还是有些插件名不统一,先不管)。

from openpyxl import load_workbook

workbook = load_workbook("sheet21.xlsx")
sheet = workbook.active

row_datas = []
for row in sheet.iter_rows():
    row_data = [cell.value for cell in row]
    row_datas.append(row_data)

name = row_datas[::3] 
name_type = row_datas[2::3]
# print(name_type)

dict_name = {}
for i, n in enumerate(name):
    # print(n[0])
    r_name = n[0].replace(" ", "")
    p_name = r_name.lower()
    dict_name[p_name] = name_type[i][0]

# print(dict_name)

workbook_1 = load_workbook("plugins.xlsx")
sheet_1 = workbook_1["plugins"]


for index, row in enumerate(sheet_1.iter_rows()):
    row_data = [cell.value for cell in row]
    row_name = row_data[0].replace("_", "")
    l_row_name = row_name.lower()
    # print(index, l_row_name)
    if l_row_name in dict_name.keys():
        # print(l_row_name, dict_name[l_row_name])
        sheet_1.cell(row=index+1, column=3).value = dict_name[l_row_name]

workbook_1.save("plugins.xlsx")

 

posted @ 2023-07-18 11:39  jeyeshield  阅读(354)  评论(0编辑  收藏  举报