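Building Hierarchical Labels for ImageNet
[link] provides the hierarchy among the ImageNet labels, stored in a data.json file. To get it, click "Download JSON" in the second-to-last cell of that page.
Part of data.json is shown below. When a category has subcategories, they are nested under "children":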
{
    "id": "fall11",
    "name": "ImageNet 2011 Fall Release",
    "children": [
        {
            "id": "n07565083",
            "name": "menu",
            "sift": "0802",
            "index": 922
        },
        {
            "id": "n07831146",
            "name": "carbonara",
            "sift": "0998",
            "index": 959
        },
    ]
}
...
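Because the file is deeply nested, I want to store each class's path through the hierarchy as its own list, keyed by the class index, so it is easier to use later. The following code does this: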
import json

src_file = '/path/to/data.json'
dest_file = '/path/to/res.json'

# Read the JSON data
with open(src_file, 'r', encoding='utf-8') as f:
    tree = json.load(f)

# Maps each leaf's index to the list of names from the root down to that leaf
res = dict()

def recursive_collect_path(tree, path):
    if 'children' in tree:
        # Internal node: descend, extending the path with this node's name
        for child in tree['children']:
            recursive_collect_path(child, path + [tree['name']])
    else:
        # Leaf node: record the full path under its class index
        res[tree['index']] = path + [tree['name']]

recursive_collect_path(tree, [])

# Sort res by index
res = dict(sorted(res.items(), key=lambda item: item[0]))

with open(dest_file, 'w', encoding='utf-8') as f:
    json.dump(res, f, ensure_ascii=False, indent=4)
Part of the saved res.json file is shown below (json.dump writes the integer index keys as strings):
{
    "0": [
        "ImageNet 2011 Fall Release",
        "organism, being",
        "animal, animate being, beast, brute, creature, fauna",
        "chordate",
        "vertebrate, craniate",
        "aquatic vertebrate",
        "fish",
        "bony fish",
        "teleost fish, teleost, teleostan",
        "soft-finned fish, malacopterygian",
        "cypriniform fish",
        "cyprinid, cyprinid fish",
        "tench, Tinca tinca"
    ]
}
...
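As one possible way to use the saved file later, here is a minimal sketch that reads data in the res.json format and finds the ancestors two classes share. The helper names `load_paths` and `common_ancestors` are my own, and the `sample` string holds shortened, illustrative paths rather than the real file contents. The sketch compares nodes by name, so it assumes no two different nodes along the compared paths have the same name.

```python
import json

# Shortened, illustrative data in the same shape as res.json (NOT the real
# file contents). A real run would read '/path/to/res.json' instead.
sample = '''
{
    "0": ["ImageNet 2011 Fall Release", "organism, being", "fish", "tench, Tinca tinca"],
    "1": ["ImageNet 2011 Fall Release", "organism, being", "fish", "goldfish, Carassius auratus"],
    "922": ["ImageNet 2011 Fall Release", "menu"]
}
'''

def load_paths(text):
    """Parse res.json-style text and turn the string keys back into ints."""
    return {int(key): path for key, path in json.loads(text).items()}

def common_ancestors(paths, a, b):
    """Return the leading part of the hierarchy shared by classes a and b."""
    shared = []
    for name_a, name_b in zip(paths[a], paths[b]):
        if name_a != name_b:
            break
        shared.append(name_a)
    return shared

paths = load_paths(sample)
print(common_ancestors(paths, 0, 1))
print(common_ancestors(paths, 0, 922))
```

The length of the shared prefix gives a rough measure of how closely two classes are related: the longer it is, the deeper their lowest common ancestor sits in the hierarchy.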
