A Simple Model for Cat vs. Dog Image Classification

This Colab notebook builds an image classifier that tells cats from dogs in pictures. It uses fast.ai, a high-level deep learning library built on PyTorch.

The Oxford-IIIT Pet Dataset, a collection of labeled cat and dog images, serves as the training data. fast.ai’s ImageDataLoaders class loads the images, assigns a label to each one, holds out part of the data for validation, and resizes every image so the model receives inputs of a uniform size.

The classifier starts from ResNet-34, a 34-layer convolutional network pretrained on ImageNet. We fine-tune it on the Oxford-IIIT Pet Dataset so that it learns to distinguish cats from dogs.

We evaluate the model with the error_rate metric, which is the fraction of validation images the model classifies incorrectly (that is, 1 minus accuracy). The goal of fine-tuning is to drive this number as low as possible.

Install Necessary Libraries

!pip install -Uqq fastbook
!pip install fastai

from fastai import *
from fastai.vision.all import *

Download the dataset. It contains both images and annotations; we use only the images.

path = untar_data(URLs.PETS)
path.ls()

files = get_image_files(path/"images")
len(files)

The code below defines a function, label_func, that labels an image as ‘Cat’ or ‘Dog’ based on the first letter of its filename: in this dataset, filenames of cat breeds start with an uppercase letter and filenames of dog breeds start with a lowercase letter.

It then creates a DataLoaders object named dls with ImageDataLoaders.from_name_func. This method loads the images listed in files, labels each one by passing its filename to label_func, and resizes every image to 224×224 pixels via item_tfms.

Finally, dls.show_batch() displays a batch of images from the dataset.

def label_func(f):
    return 'Cat' if f[0].isupper() else 'Dog'

dls = ImageDataLoaders.from_name_func(path, files, label_func, item_tfms=Resize(224))
dls.show_batch()
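To see what label_func does in isolation, the sketch below applies it to a few filenames. The names are illustrative examples that follow the dataset's Breed_number.jpg pattern; they are assumptions, not output from this notebook.

```python
def label_func(f):
    # Cat breeds are capitalized in the Oxford-IIIT Pet filenames; dog breeds are lowercase
    return 'Cat' if f[0].isupper() else 'Dog'

# Illustrative filenames (assumed, following the dataset's naming pattern)
sample_names = ['Abyssinian_1.jpg', 'beagle_1.jpg', 'Bengal_12.jpg', 'pug_7.jpg']

labels = {name: label_func(name) for name in sample_names}
for name, label in labels.items():
    print(f'{name} -> {label}')
```

Because from_name_func passes only the filename, not the full path, to label_func, the check on f[0] looks at the first letter of the breed name.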


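This code calls vision_learner to create a Learner named learner from the DataLoaders object dls, a pretrained ResNet-34 architecture, and the error_rate metric for evaluation.

Then, learner.fine_tune(1) adapts the model to the dataset in two phases. With the default settings it first trains only the newly added head for one epoch while the pretrained layers stay frozen, then unfreezes the whole network and trains it for the one requested epoch. This is why the training output contains two tables.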

Finally, learner.show_results() displays a sample of the model’s predictions on the validation set, allowing you to visually assess its performance.

learner = vision_learner(dls, resnet34, metrics=error_rate)
learner.fine_tune(1)

learner.show_results()

epoch	train_loss	valid_loss	error_rate	time
0	0.143795	0.030142	0.010825	00:53

epoch	train_loss	valid_loss	error_rate	time
0	0.073295	0.021773	0.006089	00:59
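The error_rate column in the tables above is the fraction of validation images the model gets wrong, so accuracy is 1 minus that value. The following is a minimal sketch in plain Python: simple_error_rate is a hypothetical stand-in for the fast.ai metric, the label lists are made up for illustration, and only the 0.006089 figure comes from the output above.

```python
def simple_error_rate(predictions, targets):
    """Fraction of predictions that differ from the true labels."""
    wrong = sum(p != t for p, t in zip(predictions, targets))
    return wrong / len(targets)

# Hypothetical predictions and ground truth for five images
preds   = ['Cat', 'Dog', 'Dog', 'Cat', 'Dog']
targets = ['Cat', 'Dog', 'Cat', 'Cat', 'Dog']

toy_error = simple_error_rate(preds, targets)  # one mistake out of five
print(f'toy error rate: {toy_error:.2f}')

# Converting the final error_rate reported in the second table into accuracy
final_error = 0.006089
final_accuracy = 1 - final_error
print(f'accuracy: {final_accuracy:.2%}')
```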


Testing Model

Here we search DuckDuckGo for up to 50 dog and 50 cat image URLs, pick 10 of them at random, and download those images to test the model’s predictions.

from fastbook import *
import random

urls = search_images_ddg('dog', max_images=50)
urls += search_images_ddg('cat', max_images=50)

urls = random.sample(urls, 10)


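# Make sure the target folder exists before downloading into it
Path('images').mkdir(exist_ok=True)

for i, url in enumerate(urls):
  try:
    download_url(url, f'images/{i}.jpg')
  except Exception:
    # Skip URLs that cannot be fetched and keep going
    print(f'Failed to download {url}')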

files = get_image_files("images")

for file in files:
  img = PILImage.create(file)
  img.show()

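  # predict returns the decoded label, its index in the vocab, and the per-class probabilities
  is_what, idx, probs = learner.predict(file)

  print(f"Is this a: {is_what}.")
  # Probability of the predicted class
  print(f"Probability: {probs[idx].item():.6f}")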
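learner.predict returns three values: the decoded label, the index of that label in the class vocabulary, and a tensor with one probability per class. The sketch below mimics that structure with plain Python lists to show how the index selects the probability of the predicted class. The vocabulary order ['Cat', 'Dog'] and the numbers are assumptions for illustration, not output from the model.

```python
vocab = ['Cat', 'Dog']   # assumed class order (alphabetical)
probs = [0.97, 0.03]     # made-up per-class probabilities for one image

# Index of the most probable class, analogous to the second value returned by predict
idx = max(range(len(probs)), key=probs.__getitem__)
label = vocab[idx]

print(f'Is this a: {label}.')
print(f'Probability of predicted class: {probs[idx]:.6f}')
print(f'Probability at index 1 ({vocab[1]}): {probs[1]:.6f}')
```

Indexing with a fixed position such as probs[1] always reports the probability of the class at that position, whichever class was predicted, while probs[idx] follows the prediction.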