Computer Vision with Python: Analyze Images Using OpenCV and Deep Learning


Part 7: Computer Vision with OpenCV and Deep Learning



👁️ What Is Computer Vision?

Computer Vision (CV) enables machines to “see” and interpret visual data. It powers:

  • Face detection & recognition
  • Object recognition & tracking
  • Medical imaging & diagnostics
  • Self-driving cars & autonomous navigation
  • Industrial automation & defect detection


Tools We Will Use

  • OpenCV: Image processing and manipulation
  • TensorFlow/Keras: Build CNN models for classification
  • Pre-trained Models: Fast, accurate image recognition using MobileNet, VGG16, ResNet

Upgrade pip first, then install the libraries:

python -m pip install --upgrade pip
pip install opencv-python tensorflow matplotlib scikit-learn


🖼️ Step-by-Step: Image Classification with CNN

Step 1: Load & Preprocess CIFAR-10 Dataset

CIFAR-10 is a standard benchmark dataset of 60,000 32x32 color images in 10 classes, split into 50,000 training images and 10,000 test images.

import tensorflow as tf
from tensorflow.keras.datasets import cifar10
import matplotlib.pyplot as plt

# Load dataset
(X_train, y_train), (X_test, y_test) = cifar10.load_data()
X_train, X_test = X_train / 255.0, X_test / 255.0

# Display sample image
plt.imshow(X_train[0])
plt.title(f"Label: {y_train[0][0]}")
plt.axis('off')
plt.show()
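The label in the plot title is an integer from 0 to 9, which is not very readable. A small lookup table fixes that. The list below follows the standard CIFAR-10 class order; the helper name `label_to_name` is just an illustrative choice and not part of Keras.

```python
# Standard CIFAR-10 class order: the list index equals the integer label.
CIFAR10_CLASSES = [
    "airplane", "automobile", "bird", "cat", "deer",
    "dog", "frog", "horse", "ship", "truck",
]

def label_to_name(label):
    """Map an integer label, or a 1-element row such as y_train[i], to a class name."""
    try:
        label = label[0]          # y_train has shape (N, 1), so unwrap the row
    except (TypeError, IndexError):
        pass                      # already a plain scalar
    return CIFAR10_CLASSES[int(label)]

print(label_to_name(6))    # frog
print(label_to_name([9]))  # truck
```

With the data loaded as above, `plt.title(label_to_name(y_train[0]))` would show the class name instead of the number.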


Step 2: Build a Convolutional Neural Network

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout

model = Sequential([
    Conv2D(32, (3,3), activation='relu', input_shape=(32,32,3)),
    MaxPooling2D((2,2)),
    Conv2D(64, (3,3), activation='relu'),
    MaxPooling2D((2,2)),
    Flatten(),
    Dense(128, activation='relu'),
    Dropout(0.3),
    Dense(10, activation='softmax')
])
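It helps to check by hand what this architecture does to the image size and how many weights it has. The sketch below redoes the arithmetic in plain Python, assuming the Keras defaults used above (stride 1 and no padding for Conv2D, pooling stride equal to the pool size). The helper names are made up for this illustration.

```python
def conv_out(size, kernel):
    """Width/height after a stride-1 convolution without padding."""
    return size - kernel + 1

def pool_out(size, pool):
    """Width/height after max pooling with stride equal to the pool size."""
    return size // pool

def conv_params(kernel, in_channels, filters):
    """Weights plus one bias per filter."""
    return (kernel * kernel * in_channels + 1) * filters

def dense_params(inputs, units):
    """Weights plus one bias per unit."""
    return (inputs + 1) * units

size = 32
size = conv_out(size, 3)           # 30 -> feature map 30x30x32
p_conv1 = conv_params(3, 3, 32)    # 896
size = pool_out(size, 2)           # 15
size = conv_out(size, 3)           # 13 -> feature map 13x13x64
p_conv2 = conv_params(3, 32, 64)   # 18496
size = pool_out(size, 2)           # 6
flat = size * size * 64            # 2304 values after Flatten
p_dense1 = dense_params(flat, 128) # 295040
p_dense2 = dense_params(128, 10)   # 1290

total = p_conv1 + p_conv2 + p_dense1 + p_dense2
print(flat, total)  # 2304 315722
```

Calling `model.summary()` on the model above should report the same shapes and parameter counts. Note that almost all of the weights sit in the first Dense layer, not in the convolutions.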


Step 3: Compile & Train the Model

model.compile(
    optimizer='adam',
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy']
)

history = model.fit(
    X_train, y_train,
    epochs=15,
    batch_size=64,
    validation_split=0.1
)
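The loss name `sparse_categorical_crossentropy` means the labels are plain integers (not one-hot vectors) and the loss is the negative log of the probability the model assigned to the correct class. Here is a minimal NumPy sketch of that formula, written for illustration only; Keras uses its own implementation.

```python
import math
import numpy as np

def sparse_categorical_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean of -log(probability assigned to the true class)."""
    y_true = np.asarray(y_true).reshape(-1).astype(int)   # accepts shape (N,) or (N, 1)
    y_prob = np.clip(np.asarray(y_prob, dtype=float), eps, 1.0)
    picked = y_prob[np.arange(len(y_true)), y_true]        # probability of the true class
    return float(-np.log(picked).mean())

# Two samples, three classes (made-up numbers)
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.1, 0.8]])
labels = np.array([[0], [2]])
loss = sparse_categorical_crossentropy(labels, probs)
print(round(loss, 4))  # 0.2899

# A model that guesses uniformly over 10 classes scores ln(10)
baseline = sparse_categorical_crossentropy([3], [[0.1] * 10])
print(round(baseline, 4))  # 2.3026
```

The second number is a useful sanity check: an untrained 10-class model should start with a loss near 2.30, and training should push it down from there.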


Step 4: Evaluate Performance

test_loss, test_acc = model.evaluate(X_test, y_test)
print(f"Test Accuracy: {test_acc*100:.2f}%")
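A single accuracy number hides which classes the model confuses. The sketch below builds a confusion matrix and per-class accuracy with NumPy on made-up labels. For the model above you would use `y_true = y_test.ravel()` and `y_pred = model.predict(X_test).argmax(axis=1)` instead of the toy arrays.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, num_classes):
    """Rows are true classes, columns are predicted classes."""
    cm = np.zeros((num_classes, num_classes), dtype=int)
    np.add.at(cm, (y_true, y_pred), 1)   # count each (true, predicted) pair
    return cm

# Toy data: 3 classes, 6 samples
y_true = np.array([0, 0, 1, 1, 2, 2])
y_pred = np.array([0, 1, 1, 1, 2, 0])

cm = confusion_matrix(y_true, y_pred, num_classes=3)
per_class_acc = cm.diagonal() / cm.sum(axis=1)   # correct / total for each true class

print(cm)
print(per_class_acc)  # [0.5 1.  0.5]
```

scikit-learn, installed earlier, offers the same thing as `sklearn.metrics.confusion_matrix` if you prefer a library call.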


Step 5: Visualize Training History

import matplotlib.pyplot as plt

# Accuracy plot
plt.plot(history.history['accuracy'], label='Train Accuracy')
plt.plot(history.history['val_accuracy'], label='Validation Accuracy')
plt.title('Accuracy Over Epochs')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.legend()
plt.show()

# Loss plot
plt.plot(history.history['loss'], label='Train Loss')
plt.plot(history.history['val_loss'], label='Validation Loss')
plt.title('Loss Over Epochs')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.legend()
plt.show()
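The plots are mainly there to spot overfitting: training loss keeps falling while validation loss turns around. The same check can be done in code. The numbers below are invented and only shaped like `history.history`; the helper `best_epoch` is a hypothetical name.

```python
# Hypothetical values with the same structure as history.history
hist = {
    "loss":     [1.60, 1.25, 1.05, 0.90, 0.78, 0.66],
    "val_loss": [1.35, 1.15, 1.02, 0.98, 1.01, 1.07],
}

def best_epoch(val_losses):
    """Return the 1-based epoch with the lowest validation loss."""
    return min(range(len(val_losses)), key=val_losses.__getitem__) + 1

best = best_epoch(hist["val_loss"])

# Gap between validation and training loss; a growing gap suggests overfitting
gap = [round(v - t, 2) for t, v in zip(hist["loss"], hist["val_loss"])]

print(best)  # 4
print(gap)   # [-0.25, -0.1, -0.03, 0.08, 0.23, 0.41]
```

In this made-up run, training past epoch 4 only widens the gap, so stopping there (or adding more regularization) would be a reasonable next step.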


Step 6: Predict Custom Images with OpenCV

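import cv2
import numpy as np

# cv2.imread returns None (it does not raise) if the file cannot be read
image = cv2.imread('your_image.jpg')
if image is None:
    raise FileNotFoundError("Could not read 'your_image.jpg'")

# OpenCV loads images in BGR order; the model was trained on RGB images
rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
resized = cv2.resize(rgb, (32, 32)) / 255.0
reshaped = resized.reshape(1, 32, 32, 3)

# prediction has shape (1, 10): one probability per class
prediction = model.predict(reshaped)
print("Predicted class:", prediction.argmax())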

Tip: Make sure your custom image has the same size and format as the training images: 32x32 pixels, three color channels in RGB order, and pixel values scaled to the range 0 to 1.
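One easy detail to miss here is channel order: OpenCV returns pixels as BGR, while the CIFAR-10 training images are RGB. The sketch below shows the conversion and the final input format using NumPy only, on a synthetic image that stands in for the output of `cv2.imread` plus `cv2.resize`. The function name `to_model_input` is an assumption made for this example.

```python
import numpy as np

def to_model_input(bgr_uint8):
    """Turn a 32x32 BGR uint8 image into a (1, 32, 32, 3) RGB float batch in [0, 1]."""
    if bgr_uint8.shape != (32, 32, 3):
        raise ValueError(f"expected shape (32, 32, 3), got {bgr_uint8.shape}")
    rgb = bgr_uint8[..., ::-1]                  # reverse the channel axis: BGR -> RGB
    scaled = rgb.astype("float32") / 255.0      # same scaling as the training data
    return scaled[np.newaxis, ...]              # add the batch dimension

# Synthetic stand-in for a loaded image: pure blue in BGR order
fake_bgr = np.zeros((32, 32, 3), dtype=np.uint8)
fake_bgr[..., 0] = 255

batch = to_model_input(fake_bgr)
print(batch.shape)      # (1, 32, 32, 3)
print(batch[0, 0, 0])   # [0. 0. 1.]  -> blue is now the last (RGB) channel
```

Reversing the last axis does the same channel swap as `cv2.cvtColor(img, cv2.COLOR_BGR2RGB)`.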



Step 7: Use Pretrained Model (MobileNetV2)

import numpy as np
from tensorflow.keras.applications import MobileNetV2
from tensorflow.keras.applications.mobilenet_v2 import preprocess_input, decode_predictions
from tensorflow.keras.preprocessing import image

# Load model
pretrained_model = MobileNetV2(weights='imagenet')

# Prepare image
img = image.load_img('your_image.jpg', target_size=(224, 224))
x = image.img_to_array(img)
x = np.expand_dims(x, axis=0)
x = preprocess_input(x)

# Predict
preds = pretrained_model.predict(x)
print("Predicted:", decode_predictions(preds, top=1)[0])
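Notice that this code does not divide by 255 the way Step 1 did. Each pretrained model expects its own input scaling, and for MobileNetV2 `preprocess_input` maps pixel values from 0..255 to the range -1..1. The NumPy function below is a sketch of that scaling for illustration; in real code keep calling `preprocess_input` itself.

```python
import numpy as np

def mobilenet_v2_scale(x):
    """Sketch of MobileNetV2 input scaling: map 0..255 to -1..1."""
    return np.asarray(x, dtype="float32") / 127.5 - 1.0

pixels = np.array([0.0, 127.5, 255.0])
scaled = mobilenet_v2_scale(pixels)
print(scaled)  # [-1.  0.  1.]
```

If you feed this model images that were already divided by 255, every input ends up close to -1 and the predictions become meaningless, so apply exactly one scaling step.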


📌 Bonus: Real-World Datasets

  • Fashion-MNIST: Clothing item classification
  • CelebA: Celebrity face images with attribute labels, for face attribute recognition
  • COCO Dataset: Multi-object detection and segmentation
  • MNIST Digits: For beginner-friendly digit recognition


💡 Practice Challenges

  • Try other datasets like CelebA, COCO, or Fashion-MNIST
  • Detect faces using cv2.CascadeClassifier
  • Implement edge detection, blurring, and contour detection with OpenCV
  • Build a real-time webcam object classifier
  • Use data augmentation to increase the effective size and variety of the training set
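As a starting point for the augmentation challenge, here is a minimal NumPy sketch of two common transforms: a random horizontal flip and a random shift of a few pixels. The function name and the `max_shift` default are assumptions for this example; Keras also ships ready-made augmentation layers you could use instead.

```python
import numpy as np

def augment(image, rng, max_shift=4):
    """Randomly flip an HxWxC image left-right and shift it by up to max_shift pixels."""
    if rng.random() < 0.5:
        image = image[:, ::-1, :]                      # horizontal flip
    h, w, _ = image.shape
    # Zero-pad the borders, then cut a random HxW window out of the padded image
    padded = np.pad(image, ((max_shift, max_shift), (max_shift, max_shift), (0, 0)))
    top = rng.integers(0, 2 * max_shift + 1)
    left = rng.integers(0, 2 * max_shift + 1)
    return padded[top:top + h, left:left + w, :]

rng = np.random.default_rng(0)
image = rng.random((32, 32, 3))        # random stand-in for one CIFAR-10 image
augmented = augment(image, rng)
print(augmented.shape)  # (32, 32, 3)
```

Augmentation does not add files to the dataset. It produces a slightly different version of each image every time it is applied, which is what helps generalization.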


⚙️ Hyperparameter Tuning Ideas

Hyperparameter         Options
Filters in Conv2D      32, 64, 128
Dense layer units      64, 128, 256
Dropout rate           0.1–0.5
Batch size             32, 64, 128
Learning rate          0.001, 0.0005, 0.0001
Activation function    relu, tanh, gelu
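Trying every combination from this table gets expensive quickly. The sketch below enumerates the full grid and then draws a few random configurations to try first. The dictionary keys are names chosen for this example, and the three dropout values are sample points picked from the listed range.

```python
import itertools
import random

search_space = {
    "filters": [32, 64, 128],
    "dense_units": [64, 128, 256],
    "dropout": [0.1, 0.3, 0.5],            # three points from the 0.1-0.5 range
    "batch_size": [32, 64, 128],
    "learning_rate": [0.001, 0.0005, 0.0001],
    "activation": ["relu", "tanh", "gelu"],
}

# Full grid: every combination of one value per hyperparameter
all_configs = [
    dict(zip(search_space, values))
    for values in itertools.product(*search_space.values())
]
print(len(all_configs))  # 729

# Random search: train only a handful of randomly chosen configurations
random.seed(0)
trials = random.sample(all_configs, 5)
for config in trials:
    print(config)
```

With 729 combinations and several minutes per training run, a full grid search is rarely practical, which is why sampling a small number of configurations is a common first step.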


Advanced Topics (For Ambitious Learners)

  • Convolutional Neural Networks (CNNs) - deeper architectures like ResNet, VGG16
  • Transfer Learning - leverage pretrained weights for faster results
  • Image Augmentation - rotate, flip, zoom, shear images to improve generalization
  • Object Detection - YOLO, SSD for real-time detection
  • Segmentation - Mask R-CNN for pixel-level classification
  • Batch Normalization & Regularization - stabilize and speed up training
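To give a feel for the last item, here is a NumPy sketch of what batch normalization computes during training: each feature is shifted and scaled to roughly zero mean and unit variance across the batch, then rescaled by two learnable parameters. This is a simplified illustration with fixed `gamma` and `beta`; a real layer learns them and uses running averages at inference time.

```python
import numpy as np

def batch_norm(x, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize each feature (column) over the batch, then scale and shift."""
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    return gamma * (x - mean) / np.sqrt(var + eps) + beta

rng = np.random.default_rng(0)
# Made-up activations: batch of 64 samples, 128 features, far from zero-centered
activations = rng.normal(loc=5.0, scale=3.0, size=(64, 128))

normalized = batch_norm(activations)
print(abs(normalized.mean(axis=0)).max() < 1e-6)   # True
print(abs(normalized.std(axis=0) - 1).max() < 1e-3)  # True
```

In Keras this corresponds to adding a `BatchNormalization()` layer, typically after a Conv2D or Dense layer.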


🛠️ Mini-Projects You Can Build Next

  • Face mask detection using webcam
  • Real-time traffic sign classifier
  • Pet breed recognition app
  • Fruit quality detection for agriculture
  • Medical X-ray abnormality detection


🎓 What You’ve Learned:

  • How to preprocess and classify images using CNNs
  • Using OpenCV for custom image processing
  • Applying pretrained models for instant predictions
  • Working with real-world datasets
  • Advanced CV techniques for better performance


📝 Computer Vision Cheat Sheet


# Layers
Conv2D(filters, kernel, activation) - Convolution
MaxPooling2D(pool_size) - Pooling
Flatten() - Flatten layer
Dense(units, activation) - Fully connected
Dropout(rate) - Prevent overfitting

# Activation Functions
relu, tanh, softmax

# Compilation
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])

# Training
model.fit(X_train, y_train, epochs=10, batch_size=64, validation_split=0.1)

# Evaluation
test_loss, test_acc = model.evaluate(X_test, y_test)

# Prediction
predictions = model.predict(X_test)   # shape (num_images, 10)
np.argmax(predictions[i])             # predicted class index for test image i

# OpenCV Tips
cv2.imread('image.jpg')
cv2.resize(img, (width, height))
cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
cv2.CascadeClassifier(cv2.data.haarcascades + 'haarcascade_frontalface_default.xml')
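To make the first two layer entries in this cheat sheet concrete, the sketch below implements a single-channel convolution and 2x2 max pooling by hand in NumPy. It is a teaching illustration with one hand-picked filter, not how Keras computes these layers internally, and the function names are made up for the example.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide the kernel over a 2D image (stride 1, no padding)."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool(feature_map, size=2):
    """Keep the maximum of each non-overlapping size x size block."""
    h = feature_map.shape[0] // size * size
    w = feature_map.shape[1] // size * size
    blocks = feature_map[:h, :w].reshape(h // size, size, w // size, size)
    return blocks.max(axis=(1, 3))

# 6x6 image: dark left half, bright right half
image = np.zeros((6, 6))
image[:, 3:] = 1.0

# Hand-picked filter that responds to dark-to-bright vertical edges
kernel = np.array([[-1.0, 0.0, 1.0],
                   [-1.0, 0.0, 1.0],
                   [-1.0, 0.0, 1.0]])

features = np.maximum(conv2d_valid(image, kernel), 0)  # ReLU activation
pooled = max_pool(features)

print(features[0])  # [0. 3. 3. 0.]
print(pooled)       # [[3. 3.] [3. 3.]]
```

The filter fires only where the window straddles the edge, and pooling halves the resolution while keeping that strong response. A trained Conv2D layer learns many such filters instead of using hand-written ones.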


❓ FAQs

1. Can I classify my own images?

Yes. Resize your images to match the training data, normalize the pixel values the same way, and call model.predict() on your own model or on a pretrained one.

2. Which dataset is best for learning CV?

Start with CIFAR-10 or Fashion-MNIST if you are a beginner, then move on to CelebA or COCO for more advanced projects.

3. How can I improve accuracy?

Use deeper CNNs, data augmentation, transfer learning, dropout, and batch normalization.

4. Do I need a GPU?

No. A GPU speeds up training considerably, but small models like the one in this tutorial can be trained on a CPU.

5. Can OpenCV detect faces in real-time?

Yes. With Haar cascades or DNN-based detectors, you can process webcam streams live.






🧭 What’s Next?

In Part 8, we’ll explore Reinforcement Learning (RL), where agents learn by trial and error using rewards and penalties.