Building Hierarchical Labels for ImageNet

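[link] provides the hierarchy information between labels. It is stored in a data.json file, which can be downloaded by clicking "Download JSON" in the second-to-last cell.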

Part of data.json is shown below. When a class has sub-classes, they are nested under "children":

{
    "id": "fall11",
    "name": "ImageNet 2011 Fall Release",
    "children": [
        {
            "id": "n07565083",
            "name": "menu",
            "sift": "0802",
            "index": 922
        },
        {
            "id": "n07831146",
            "name": "carbonara",
            "sift": "0998",
            "index": 959
        }
    ]
}
...

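Because the structure is fairly complex, I want to store each class's hierarchy path as its own list, keyed by index, for convenient later use. The following code does this: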

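import json

src_file = '/path/to/data.json'
dest_file = '/path/to/res.json'

# Load the hierarchy tree from the JSON file
with open(src_file, 'r', encoding='utf-8') as f:
    tree = json.load(f)

# Maps each leaf class index to its list of names from the root down to the leaf
res = dict()

def recursive_collect_path(tree, path):
    if 'children' in tree:
        # Internal node: descend into each child, extending the path with this node's name
        for child in tree['children']:
            recursive_collect_path(child, path + [tree['name']])
    else:
        # Leaf node: record the full path; if an index is visited more than once, the last path wins
        res[tree['index']] = path + [tree['name']]

recursive_collect_path(tree, [])

# Sort res by index
res = dict(sorted(res.items(), key=lambda item: item[0]))
# Note: json.dump writes the integer keys as strings ("0", "1", ...)
with open(dest_file, 'w', encoding='utf-8') as f:
    json.dump(res, f, ensure_ascii=False, indent=4)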
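As a quick sanity check, the same traversal can be run on a tiny hand-made tree. This is only a sketch: `collect_paths` is a hypothetical variant that returns the dictionary instead of filling a global, and the intermediate "food" node (with its id "x0") is made up for illustration; only the two leaf entries are taken from the excerpt above.

```python
def collect_paths(node, path=()):
    """Return {index: [names from root to leaf]} for every leaf under node."""
    here = list(path) + [node["name"]]
    if "children" not in node:
        # Leaf node: it carries the class index
        return {node["index"]: here}
    paths = {}
    for child in node["children"]:
        paths.update(collect_paths(child, here))
    return paths

# Toy tree in the same shape as data.json; the "food" node is an assumption
toy_tree = {
    "id": "fall11",
    "name": "ImageNet 2011 Fall Release",
    "children": [
        {"id": "x0", "name": "food", "children": [
            {"id": "n07565083", "name": "menu", "sift": "0802", "index": 922},
            {"id": "n07831146", "name": "carbonara", "sift": "0998", "index": 959},
        ]},
    ],
}

toy_paths = collect_paths(toy_tree)
print(toy_paths[922])  # ['ImageNet 2011 Fall Release', 'food', 'menu']
```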

Part of the saved res.json file looks like this:

{
    "0": [
        "ImageNet 2011 Fall Release",
        "organism, being",
        "animal, animate being, beast, brute, creature, fauna",
        "chordate",
        "vertebrate, craniate",
        "aquatic vertebrate",
        "fish",
        "bony fish",
        "teleost fish, teleost, teleostan",
        "soft-finned fish, malacopterygian",
        "cypriniform fish",
        "cyprinid, cyprinid fish",
        "tench, Tinca tinca"
    ]
}
...
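For later use, one thing to keep in mind is that JSON object keys are always strings, so after reloading res.json the keys are "0", "1", ... rather than integers. Below is a minimal sketch of how the reloaded paths might be used. The helper names `restore_int_keys` and `common_ancestors` are hypothetical, and the sample paths are made up and shortened so the example runs without the real file.

```python
import json

def restore_int_keys(raw):
    """Convert the string keys produced by a JSON round trip back to int."""
    return {int(k): v for k, v in raw.items()}

def common_ancestors(path_a, path_b):
    """Return the shared prefix of two root-to-leaf paths."""
    shared = []
    for a, b in zip(path_a, path_b):
        if a != b:
            break
        shared.append(a)
    return shared

# Made-up, shortened paths standing in for the contents of res.json
sample = {
    0: ["root", "animal", "fish", "tench"],
    1: ["root", "animal", "bird", "hen"],
}

# Simulate saving and reloading: the int keys come back as strings
reloaded = json.loads(json.dumps(sample))
paths = restore_int_keys(reloaded)

print(list(reloaded.keys()))                 # ['0', '1']
print(common_ancestors(paths[0], paths[1]))  # ['root', 'animal']
```

With the real file, `reloaded` would instead come from `json.load` on res.json. The length of the shared prefix gives a rough measure of how closely two classes are related in the hierarchy.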
posted @ 2025-05-22 21:11  片刻的自由