Feature Extraction in OpenCV: A Comprehensive Tutorial

A detailed tutorial on feature extraction: what it is, how it works, and how you can use it

Updated March 21, 2023



Welcome, fellow computer vision enthusiasts! Today, we’re going to explore the fascinating world of feature extraction in OpenCV. We’ll dive into the theory behind this fundamental concept, illustrate its application with engaging code examples, and ensure you leave with a solid understanding. By the end of this tutorial, even beginners will feel like experts. So, let’s get started!

What are features, and why do we need them?

Imagine trying to describe an object to someone who has never seen it before. You’d probably mention some distinguishing characteristics like shape, color, or texture. Similarly, in computer vision, we use these distinguishing characteristics, called features, to describe and identify objects in images.

Feature extraction is the process of isolating these characteristics from an image, allowing us to build a compact and meaningful representation of the object. By doing so, we can teach our algorithms to recognize and differentiate between various objects in images, even when they appear in different scales, rotations, or lighting conditions.

Feature extraction: a two-step process

Feature extraction in OpenCV typically involves two main steps:

  1. Feature detection: Identifying key points (or interest points) in an image where the features are most prominent.

  2. Feature description: Creating a descriptor (a numeric representation) of the region surrounding each key point, which can be used for matching and comparing features across different images.
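
To make these two steps concrete, here is a minimal sketch that runs them separately, using SIFT (introduced below) as the detector/descriptor pair; it assumes OpenCV 4.4+ and a local file named example_image.jpg:

import cv2

# Load the image in grayscale (feature detectors work on single-channel input)
image = cv2.imread('example_image.jpg', cv2.IMREAD_GRAYSCALE)

sift = cv2.SIFT_create()

# Step 1: feature detection -- find the interest points
key_points = sift.detect(image, None)

# Step 2: feature description -- describe the region around each point
key_points, descriptors = sift.compute(image, key_points)

print(f'{len(key_points)} key points, descriptor shape: {descriptors.shape}')

In practice, you will usually call detectAndCompute, which fuses both steps, as the examples below do.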

Now that we’ve outlined the process, let’s dive into the theory behind some popular feature extraction algorithms.

1. SIFT (Scale-Invariant Feature Transform)

SIFT is a well-known feature extraction algorithm that can identify and describe local features in images. The idea behind SIFT is to detect distinctive points that are invariant to scale and rotation, making them suitable for matching and recognizing objects under various transformations.

How does SIFT work?

Scale-space extrema detection: SIFT builds a scale-space pyramid by repeatedly blurring the image with Gaussian filters of increasing scale (σ) and subtracting adjacent blur levels to form Difference-of-Gaussians (DoG) images. Pixels that are local maxima or minima relative to their neighbors in both space and adjacent scales become candidate key points.

Key point localization: SIFT refines the detected key points to eliminate low-contrast or edge points, ensuring that the remaining points are stable and distinctive.

Orientation assignment: For each key point, SIFT computes the gradient magnitude and orientation around the point. A histogram of gradient orientations is then created, and the dominant orientations are assigned to the key point.

Descriptor generation: SIFT generates a descriptor for each key point by creating a histogram of gradient orientations for a 16x16 neighborhood around the point. The descriptor is a 128-dimensional vector (4x4 histograms with 8 orientations each) that captures the local image appearance around the key point.

Code example using OpenCV’s SIFT implementation:

import cv2

# Load the image
image = cv2.imread('example_image.jpg', cv2.IMREAD_GRAYSCALE)

# Create a SIFT object
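# (requires OpenCV 4.4+; SIFT's patent expired in 2020, so it no longer lives in the contrib module)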
sift = cv2.SIFT_create()

# Detect key points and compute descriptors
key_points, descriptors = sift.detectAndCompute(image, None)

# Draw key points on the image
result = cv2.drawKeypoints(image, key_points, None, flags=cv2.DRAW_MATCHES_FLAGS_DRAW_RICH_KEYPOINTS)

# Display the result
cv2.imshow('SIFT Features', result)
cv2.waitKey(0)
cv2.destroyAllWindows()
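
Once you have key points and descriptors, you can match them across images. Here is a hedged sketch of brute-force matching with Lowe's ratio test; the second file, example_image_2.jpg, is a hypothetical stand-in for another view of the same scene:

import cv2

# Load two views of the same scene in grayscale
img1 = cv2.imread('example_image.jpg', cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread('example_image_2.jpg', cv2.IMREAD_GRAYSCALE)

sift = cv2.SIFT_create()
kp1, des1 = sift.detectAndCompute(img1, None)
kp2, des2 = sift.detectAndCompute(img2, None)

# Brute-force matcher with the L2 norm, the right metric for SIFT's float descriptors
bf = cv2.BFMatcher(cv2.NORM_L2)
matches = bf.knnMatch(des1, des2, k=2)

# Lowe's ratio test: keep a match only if it is clearly better than the runner-up
good = []
for pair in matches:
    if len(pair) == 2 and pair[0].distance < 0.75 * pair[1].distance:
        good.append(pair[0])

result = cv2.drawMatches(img1, kp1, img2, kp2, good, None,
                         flags=cv2.DRAW_MATCHES_FLAGS_NOT_DRAW_SINGLE_POINTS)
cv2.imshow('SIFT Matches', result)
cv2.waitKey(0)
cv2.destroyAllWindows()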

2. SURF (Speeded-Up Robust Features)

SURF is an enhanced version of SIFT that aims to provide a faster and more efficient feature extraction process. It uses a Hessian matrix-based approach for key point detection, which allows for faster computation, and a more compact 64-dimensional descriptor that accelerates matching. SURF also offers an “upright” variant (U-SURF) that skips orientation assignment entirely; it is even faster but gives up full rotation invariance, so it suits scenarios where the camera stays roughly level.

How does SURF work?

Scale-space extrema detection: Similar to SIFT, SURF searches for extrema across scales. However, it approximates the determinant of the Hessian matrix with box filters evaluated on an integral image, so responses at any scale can be computed in constant time, which makes the process far more efficient.

Key point localization: SURF refines the detected key points by interpolating the local extrema in scale and image space, ensuring that the remaining points are stable and distinctive.

Orientation assignment: For each key point, SURF computes a dominant orientation using the responses of Haar wavelets within a circular region around the key point.

Descriptor generation: SURF generates a descriptor for each key point by computing Haar wavelet responses for a square region around the key point. The descriptor is a 64-dimensional vector that captures the local image appearance around the key point, making it more compact than SIFT’s descriptor.

Code example using OpenCV’s SURF implementation:

import cv2

# Load the image
image = cv2.imread('example_image.jpg', cv2.IMREAD_GRAYSCALE)

# Create a SURF object
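# NOTE: SURF is patented; this call only works with an opencv-contrib build
# compiled with OPENCV_ENABLE_NONFREE, and raises an error otherwise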
surf = cv2.xfeatures2d.SURF_create()

# Detect key points and compute descriptors
key_points, descriptors = surf.detectAndCompute(image, None)

# Draw key points on the image
result = cv2.drawKeypoints(image, key_points, None, flags=cv2.DRAW_MATCHES_FLAGS_DRAW_RICH_KEYPOINTS)

# Display the result
cv2.imshow('SURF Features', result)
cv2.waitKey(0)
cv2.destroyAllWindows()
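
SURF also exposes a few knobs worth knowing. A short sketch of the common ones (same nonfree-build caveat as above):

import cv2

# Raise the Hessian threshold to keep fewer but stronger key points
surf = cv2.xfeatures2d.SURF_create(hessianThreshold=400)

# Upright mode (U-SURF): skip orientation assignment for extra speed,
# at the cost of rotation invariance
surf.setUpright(True)

# Extended mode: 128-dimensional descriptors instead of the default 64
surf.setExtended(True)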

3. ORB (Oriented FAST and Rotated BRIEF)

ORB is another popular feature extraction algorithm that combines the strengths of two algorithms: FAST (Features from Accelerated Segment Test) for key point detection and BRIEF (Binary Robust Independent Elementary Features) for descriptor generation. ORB is designed to be fast, efficient, and rotation invariant, making it suitable for real-time applications.

How does ORB work?

Multi-scale key point detection: ORB builds an image pyramid and runs the FAST corner detector at each pyramid level, identifying corner-like structures at multiple scales.

Key point localization: ORB ranks the detected corners with the Harris corner measure and applies non-maximum suppression, retaining only the most stable and distinctive points.

Orientation assignment: For each key point, ORB computes a dominant orientation using the intensity centroid method, which allows for rotation invariance.

Descriptor generation: ORB generates a descriptor for each key point using a rotation-aware variant of the BRIEF descriptor (“steered” BRIEF, refined into rBRIEF). The descriptor is a binary vector, 256 bits (32 bytes) by default, that captures the local image appearance around the key point, making it faster to match and more memory-efficient than SIFT's and SURF's floating-point descriptors.

Code example using OpenCV’s ORB implementation:

import cv2

# Load the image
image = cv2.imread('example_image.jpg', cv2.IMREAD_GRAYSCALE)

# Create an ORB object
orb = cv2.ORB_create()

# Detect key points and compute descriptors
key_points, descriptors = orb.detectAndCompute(image, None)

# Draw key points on the image
result = cv2.drawKeypoints(image, key_points, None, flags=cv2.DRAW_MATCHES_FLAGS_DRAW_RICH_KEYPOINTS)

# Display the result
cv2.imshow('ORB Features', result)
cv2.waitKey(0)
cv2.destroyAllWindows()
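
Because ORB descriptors are binary strings, they should be matched with the Hamming norm rather than L2. A minimal sketch, reusing the hypothetical second image from the SIFT matching example:

import cv2

img1 = cv2.imread('example_image.jpg', cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread('example_image_2.jpg', cv2.IMREAD_GRAYSCALE)

orb = cv2.ORB_create()
kp1, des1 = orb.detectAndCompute(img1, None)
kp2, des2 = orb.detectAndCompute(img2, None)

# Hamming distance counts differing bits, which suits ORB's binary descriptors;
# crossCheck keeps only pairs that are each other's best match
bf = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(bf.match(des1, des2), key=lambda m: m.distance)

# Draw the 30 closest matches
result = cv2.drawMatches(img1, kp1, img2, kp2, matches[:30], None,
                         flags=cv2.DRAW_MATCHES_FLAGS_NOT_DRAW_SINGLE_POINTS)
cv2.imshow('ORB Matches', result)
cv2.waitKey(0)
cv2.destroyAllWindows()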

Comparing feature extraction algorithms

Now that we’ve explored SIFT, SURF, and ORB, let’s briefly compare them:

  1. SIFT: Provides highly distinctive features that are invariant to scale and rotation, making it suitable for various computer vision tasks. However, it is computationally expensive and may not be ideal for real-time applications.
  2. SURF: Offers similar performance to SIFT but with faster computation and a more compact descriptor, making it a popular choice for many applications. However, it is still not as fast as ORB, and it remains patented, so it requires a contrib build of OpenCV (SIFT's patent, by contrast, has expired).
  3. ORB: Designed for real-time applications, ORB is fast, efficient, and rotation invariant. However, its binary descriptors may be less distinctive than SIFT or SURF descriptors, which could affect the matching accuracy in certain scenarios.

Ultimately, the choice of feature extraction algorithm depends on the specific application and its requirements, such as computational efficiency, feature distinctiveness, and robustness to different transformations.
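
If you want numbers for your own images rather than rules of thumb, a rough single-run timing sketch like the one below makes the trade-offs concrete (it assumes example_image.jpg and includes SURF only when a nonfree contrib build is available):

import time

import cv2

image = cv2.imread('example_image.jpg', cv2.IMREAD_GRAYSCALE)

detectors = {'SIFT': cv2.SIFT_create(), 'ORB': cv2.ORB_create()}
try:
    detectors['SURF'] = cv2.xfeatures2d.SURF_create()
except (AttributeError, cv2.error):
    print('SURF unavailable in this build; skipping')

for name, detector in detectors.items():
    start = time.perf_counter()
    key_points, descriptors = detector.detectAndCompute(image, None)
    elapsed = time.perf_counter() - start
    print(f'{name}: {len(key_points)} key points in {elapsed:.3f}s, '
          f'descriptor shape {descriptors.shape}')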

Wrapping up

Congratulations! You’ve successfully explored the world of feature extraction in OpenCV. We’ve discussed the importance of features in computer vision and delved into the theory behind popular feature extraction algorithms like SIFT, SURF, and ORB. We also provided code examples to help you apply these concepts in practice.

Remember, as with any technique, practice makes perfect. So, keep experimenting with different feature extraction algorithms and applications, and you’ll soon master this essential skill in computer vision. Happy coding!

