Loading Your Own Data

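A PyTorch input pipeline usually follows three steps. Data typically reaches the model in this order:
① Create a Dataset object. It must implement the two methods __len__() and __getitem__(); this is also where transform is applied to preprocess and augment the data.
② Create a DataLoader object. It iterates over the Dataset object, and you generally do not need to implement any of its other methods.
③ Loop over the DataLoader object and feed img and label to the model for training.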
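
The three steps can be tried end to end on synthetic data before wiring in real images. The following is a minimal sketch, not the author's code: ToyDataset, its random 3x8x8 "images", and the batch size of 4 are all assumptions made for illustration.

```python
import torch
from torch.utils.data import Dataset, DataLoader


class ToyDataset(Dataset):
    """Hypothetical dataset: random 3x8x8 'images' with 0/1 labels."""

    def __init__(self, num_samples=10):
        generator = torch.Generator().manual_seed(0)
        self.images = torch.rand(num_samples, 3, 8, 8, generator=generator)
        self.labels = torch.arange(num_samples) % 2

    def __len__(self):
        # Number of samples in the dataset.
        return len(self.images)

    def __getitem__(self, index):
        # Return one (image, label) pair.
        return self.images[index], self.labels[index]


# Step 1: create the Dataset object.
toy_dataset = ToyDataset(num_samples=10)
# Step 2: wrap it in a DataLoader, which batches the samples.
toy_loader = DataLoader(toy_dataset, batch_size=4, shuffle=False)
# Step 3: loop over the DataLoader; a model would consume img and label here.
batch_shapes = []
for img, label in toy_loader:
    batch_shapes.append((tuple(img.shape), tuple(label.shape)))
print(batch_shapes)
```

With 10 samples and a batch size of 4, the last batch holds only the 2 remaining samples.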

Step 1: Create a Dataset object.

import os

from torch.utils.data import Dataset


class ImgDataset(Dataset):
    def __init__(self, annotations_path, images_path, shape=(600, 600), data_class="CAM1"):
        self.shape = shape
        self.images_path = images_path
        # read_data and Transforms are helpers defined elsewhere (not shown in this post)
        self.images_name, self.images_height, self.images_width, self.images_label, self.images_bbox \
            = read_data(annotations_path, data_class)
        self.transform = Transforms()

    def __len__(self):  # number of samples in the dataset
        return len(self.images_name)

    def __getitem__(self, index):  # return one sample
        # os.path.join works whether or not images_path ends with a separator
        image_path = os.path.join(self.images_path, self.images_name[index])
        image_height = self.images_height[index]
        image_width = self.images_width[index]
        image_bbox = self.images_bbox[index]

        img, box = self.transform(image_path, image_height, image_width, image_bbox, self.shape)
        # img = self.images_data[index]
        # box = self.bbox_data[index]
        label = self.images_label[index]
        return img, box, label
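
The Transforms helper is not shown in the post, but since it receives the original height and width, the bounding box, and the target shape, one thing it plausibly does is rescale the box to the resized image. Here is a rough sketch of that arithmetic only. The function name scale_bbox, the (xmin, ymin, xmax, ymax) box format, and the (height, width) ordering of shape are all assumptions, not details taken from the original code.

```python
def scale_bbox(bbox, image_height, image_width, shape=(600, 600)):
    """Rescale an (xmin, ymin, xmax, ymax) box to a resized image.

    Assumes `shape` is (target_height, target_width); the post does not
    specify the ordering or the box format.
    """
    target_height, target_width = shape
    scale_x = target_width / image_width
    scale_y = target_height / image_height
    xmin, ymin, xmax, ymax = bbox
    return (xmin * scale_x, ymin * scale_y, xmax * scale_x, ymax * scale_y)


# A 1200x400 (width x height) image resized to 600x600:
# x coordinates shrink by 0.5, y coordinates grow by 1.5.
scaled = scale_bbox((100, 50, 300, 200), image_height=400, image_width=1200)
print(scaled)  # (50.0, 75.0, 150.0, 300.0)
```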

Step 2: Create a DataLoader object.

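import torch
from torch.utils.data import DataLoader

dataset = ImgDataset(annotations_path, images_path)  # load the full dataset; both paths are required arguments
train_set, test_set = torch.utils.data.random_split(dataset, [num_train_samples, len(dataset) - num_train_samples])  # split into training and test sets
train_loader = DataLoader(dataset=train_set, batch_size=args.batch_size, shuffle=True, num_workers=4)  # batch the data for feeding to the network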
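
random_split draws a different partition on every run unless it is given a seeded generator. As a small sketch of a reproducible 80/20 split, the example below uses a stand-in TensorDataset of 10 samples; the ratio, the seed, and all variable names are assumptions chosen for illustration.

```python
import torch
from torch.utils.data import TensorDataset, random_split

# Stand-in dataset: 10 samples whose label equals their index.
full_set = TensorDataset(torch.arange(10).float().unsqueeze(1), torch.arange(10))

num_train = len(full_set) * 8 // 10  # 80% of the samples for training
split_generator = torch.Generator().manual_seed(42)  # fixed seed, so the split is repeatable
train_part, test_part = random_split(
    full_set, [num_train, len(full_set) - num_train], generator=split_generator
)

# Each sample lands in exactly one of the two subsets.
train_ids = {int(label) for _, label in train_part}
test_ids = {int(label) for _, label in test_part}
print(len(train_part), len(test_part))  # 8 2
```

The two lengths passed to random_split must sum to len(dataset), which is why the test size is computed by subtraction.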

Step 3: Loop over the DataLoader object.

import time

n_iter = len(train_loader)  # number of batches per epoch
for epoch in range(args.num_epochs):
    for i, (img, box, label) in enumerate(train_loader):
        # forward
        if i % 5 == 0:
            print(f'[{time.strftime("%Y-%m-%d %H:%M:%S", time.localtime())} | epoch: {epoch}/{args.num_epochs}  |  iter: {i}/{n_iter}  |  img: {img.shape}  |  box: {box.shape}  |  labels: {label.shape}]')
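
The iteration count printed in the log is the number of batches per epoch, which len() of a DataLoader reports directly. The sketch below shows how that count depends on drop_last; the 10 dummy samples and the batch size of 4 are assumed values for illustration.

```python
import math

import torch
from torch.utils.data import DataLoader, TensorDataset

# 10 dummy samples with 3 features each, plus integer labels.
samples = TensorDataset(torch.zeros(10, 3), torch.zeros(10, dtype=torch.long))

loader = DataLoader(samples, batch_size=4)                        # keeps the short last batch
loader_drop = DataLoader(samples, batch_size=4, drop_last=True)   # discards it

n_iter_keep = len(loader)       # ceil(10 / 4) = 3
n_iter_drop = len(loader_drop)  # floor(10 / 4) = 2
last_batch_size = [x.shape[0] for x, _ in loader][-1]  # 2 leftover samples

print(n_iter_keep, n_iter_drop, last_batch_size)  # 3 2 2
```

Setting drop_last=True can be useful when a model or loss assumes every batch has the same size.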

References: https://blog.csdn.net/kdongyi/article/details/103272579
https://blog.csdn.net/qq_27825451/article/details/97130890

posted @ 2021-03-10 20:42 Guang'Jun
