利用ResNet-50进行犬种鉴定

作者:如缕清风

本文为博主原创,未经允许,请勿转载:https://www.cnblogs.com/warren2123/p/15033224.html


 

一、前言

        本文基于残差网络模型,通过对ResNet-50模型进行微调,对不同狗狗品种数据集进行鉴定。

        Dog Breed Identification数据集包含20579张不同size的彩色图片,共分为120类犬种。其中训练集包含10222张图片,测试集包含10357张图片。犬种数据集样本图如下所示。

二、基于Fine Tuning构建ResNet-50模型

        随着模型深度的提升,出现网络出现退化问题,因此残差网络应运而生。通过微调已经构建好的模型,能够在相似的数据集运用,不再需要重新训练模型。本文模型构建分为四个部分:数据读取及预处理、构建ResNet-50模型以及模型微调、定义模型超参数以及评估方法、参数优化。

1、数据读取及预处理

        本文采用pandas对数据进行预处理,以及通过GPU对运算进行提速。

import os, torch, torchvision
import torch.nn as nn
import torch.nn.functional as F
import numpy as np
import pandas as pd
from torch.utils.data import DataLoader, Dataset
from torchvision import datasets, models, transforms
from PIL import Image
from sklearn.model_selection import StratifiedShuffleSplit
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

        通过pandas读取csv文件,显示前5条数据,发现原数据只存在两列数据,id对应图片名,breed对应犬种名称,总共有120种。

data_root = 'data'
all_labels_df = pd.read_csv(os.path.join(data_root, 'labels.csv'))
all_labels_df.head()

        根据犬种名称,将其转化为id的形式进行一一对应,并在数据集中增加新列,表示犬种类别的id标签

breeds = all_labels_df.breed.unique()
breed2idx = dict((breed, idx) for idx, breed in enumerate(breeds))
idx2breed = dict((idx, breed) for idx, breed in enumerate(breeds))
all_labels_df['label_idx'] = all_labels_df['breed'].map(breed2idx)

        将训练集分割出10%当作验证集,得到训练集和验证集两个数据集。

dataset_names = ['train', 'valid']
stratified_split = StratifiedShuffleSplit(n_splits=1, test_size=0.1, random_state=0)
train_split_idx, val_split_idx = next(iter(stratified_split.split(all_labels_df.id, all_labels_df.breed)))
train_df = all_labels_df.iloc[train_split_idx].reset_index()
val_df = all_labels_df.iloc[val_split_idx].reset_index()

        由于当前处理后的数据不能作为PyTorch的输入,所以需要自定义一个数据的读取。通过DogDataset类实现如下:

class DogDataset(Dataset):
    def __init__(self, labels_df, img_path, transform=None):
        self.labels_df = labels_df
        self.img_path = img_path
        self.transform = transform
    
    def __len__(self):
        return self.labels_df.shape[0]
    
    def __getitem__(self, idx):
        image_name = os.path.join(self.img_path, self.labels_df.id[idx]) + '.jpg'
        img = Image.open(image_name)
        label = self.labels_df.label_idx[idx]
        
        if self.transform:
            img = self.transform(img)
        
        return img, label

        下一步,定义了数据的预处理方法,包括:Resize、Crop、Normalize等三种主要方法,数据增强采用了RandomResizedCrop、RandomHorizontalFlip、RandomRotation。图片读取的方式采用PyTorch的DataLoader。

train_transforms = transforms.Compose([
    transforms.Resize(img_size),
    transforms.RandomResizedCrop(img_size),
    transforms.RandomHorizontalFlip(),
    transforms.RandomRotation(30),
    transforms.ToTensor(),
    transforms.Normalize(img_mean, img_std)
])

val_transforms = transforms.Compose([
    transforms.Resize(img_size),
    transforms.CenterCrop(img_size),
    transforms.ToTensor(),
    transforms.Normalize(img_mean, img_std)
])

image_transforms = {'train': train_transforms, 'valid': val_transforms}

train_dataset = DogDataset(train_df, os.path.join(data_root, 'train'), transform=image_transforms['train'])
val_dataset = DogDataset(val_df, os.path.join(data_root, 'train'), transform=image_transforms['valid'])

image_dataset = {'train': train_dataset, 'valid': val_dataset}

image_dataloader = {x: DataLoader(image_dataset[x], batch_size=batch_size, shuffle=True, num_workers=0) for x in dataset_names}
dataset_sizes = {x: len(image_dataset[x]) for x in dataset_names}

2、构建ResNet-50模型以及模型微调

        残差网络是由一系列残差块组成,网络的一层通常可以看做[公式], 而残差网络的一个残差块可以表示为[公式],也就是[公式],在单位映射中,[公式]便是观测值,而[公式]是预测值,所以[公式]便对应着残差,因此叫做残差网络。残差块的示意图如下所示。

        ResNet-50模型是将残差块堆叠而成,达到50层的深层神经网络。模型架构图如下所示。

        通过torchvision快速实现ResNet-50模型,由于当前模型是通过ImageNet进行训练,而ImageNet有1000个分类,本文的犬种只有120类,所以需要对模型进行重新配置。首先将模型的所有参数进行冻结,再修改模型的输出层。

model_ft = models.resnet50(pretrained=True)

for param in model_ft.parameters():
    param.requires_grad = False

num_fc_ftr = model_ft.fc.in_features
model_ft.fc = nn.Linear(num_fc_ftr, len(breeds))
model_ft = model_ft.to(device)

3、定义模型超参数以及评估方法

        模型的输入图片大小、图片预处理的MEAN、STD通过以下定义,损失函数采用交叉熵误差计算,优化函数采用Adam。

IMG_SIZE = 224
IMG_MEAN = [0.485, 0.456, 0.406]
IMG_STD = [0.229, 0.224, 0.225]

LR = 0.001
EPOCHES = 20
BATCHSIZE = 256

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam([{'params': model_ft.fc.parameters()}], lr=LR)

4、参数优化

        以下是通过定义的训练次数进行模型的参数优化过程,每一次训练输出模型的训练误差、测试误差、准确率。

for epoch in range(1, EPOCHES):
    model.train()
    for batch_idx, data in enumerate(train_loader):
        x, y = data
        x = x.to(device)
        y = y.to(device)
        optimizer.zero_grad()
        y_hat = model(x)
        loss = criterion(y_hat, y)
        loss.backward()
        optimizer.step()
    print('Train Epoch: {}\t Loss: {:.6f}'.format(epoch, loss.item()))
    
    model.eval()
    test_loss = 0
    correct = 0
    with torch.no_grad():
        for i, data in enumerate(test_loader):
            x, y = data
            x = x.to(device)
            y = y.to(device)
            optimizer.zero_grad()
            y_hat = model(x)
            test_loss += criterion(y_hat, y).item()
            pred = y_hat.max(1, keepdim=True)[1]
            correct += pred.eq(y.view_as(pred)).sum().item()
    
    test_loss /= len(test_loader.dataset)
    print('\nTest set: Average loss: {:.4f}, Accuracy: {}/{} ({:.0f}%)\n'.format(test_loss, correct, len(val_dataset), 100. * correct / len(val_dataset)))

        训练输出如下所示:

Train Epoch: 1	 Loss: 2.652317
Test set: Average loss: 0.0080, Accuracy: 679/1023 (66%)
Train Epoch: 2	 Loss: 1.943187
Test set: Average loss: 0.0048, Accuracy: 765/1023 (75%)
Train Epoch: 3	 Loss: 1.791019
Test set: Average loss: 0.0038, Accuracy: 787/1023 (77%)
Train Epoch: 4	 Loss: 1.551752
Test set: Average loss: 0.0033, Accuracy: 815/1023 (80%)
Train Epoch: 5	 Loss: 1.534580
Test set: Average loss: 0.0030, Accuracy: 808/1023 (79%)
Train Epoch: 6	 Loss: 1.454419
Test set: Average loss: 0.0029, Accuracy: 803/1023 (78%)
Train Epoch: 7	 Loss: 1.365119
Test set: Average loss: 0.0027, Accuracy: 828/1023 (81%)
Train Epoch: 8	 Loss: 1.234558
Test set: Average loss: 0.0027, Accuracy: 812/1023 (79%)
Train Epoch: 9	 Loss: 1.290068
Test set: Average loss: 0.0026, Accuracy: 836/1023 (82%)
Train Epoch: 10	 Loss: 1.474308
Test set: Average loss: 0.0026, Accuracy: 826/1023 (81%)
Train Epoch: 11	 Loss: 1.262610
Test set: Average loss: 0.0026, Accuracy: 840/1023 (82%)
Train Epoch: 12	 Loss: 1.240969
Test set: Average loss: 0.0026, Accuracy: 815/1023 (80%)
Train Epoch: 13	 Loss: 1.098669
Test set: Average loss: 0.0025, Accuracy: 825/1023 (81%)
Train Epoch: 14	 Loss: 1.006308
Test set: Average loss: 0.0026, Accuracy: 822/1023 (80%)
Train Epoch: 15	 Loss: 1.144400
Test set: Average loss: 0.0025, Accuracy: 819/1023 (80%)
Train Epoch: 16	 Loss: 1.033793
Test set: Average loss: 0.0025, Accuracy: 829/1023 (81%)
Train Epoch: 17	 Loss: 1.105277
Test set: Average loss: 0.0024, Accuracy: 839/1023 (82%)
Train Epoch: 18	 Loss: 1.132692
Test set: Average loss: 0.0024, Accuracy: 833/1023 (81%)
Train Epoch: 19	 Loss: 1.225546
Test set: Average loss: 0.0025, Accuracy: 830/1023 (81%)

三、总结

        本文通过微调ResNet-50模型,对犬种图片数据进行鉴定,分类效果达到81%~82%左右,可以进一步调整超参数、更换更深的残差网络实现更高的分类效果。通过细分预测结果,得出准确率在50%及以下的犬类品种有12中,其中miniature_poodle、appenzeller、siberian_husky在30%以下。

Accuracy of            siberian_husky : 20 %
Accuracy of               appenzeller : 25 %
Accuracy of          miniature_poodle : 25 %
Accuracy of                toy_poodle : 37 %
Accuracy of              walker_hound : 42 %
Accuracy of                    collie : 44 %
Accuracy of          english_foxhound : 44 %
Accuracy of      bouvier_des_flandres : 44 %
Accuracy of                bloodhound : 50 %
Accuracy of           irish_wolfhound : 50 %
Accuracy of staffordshire_bullterrier : 50 %
Accuracy of                rottweiler : 50 %

 

posted @ 2021-07-20 09:25  如缕清风  阅读(549)  评论(0编辑  收藏  举报