What is the mathematical formula of the convolution operation on a 2D image?

by EITCA Academy / Thursday, 23 May 2024 / Published in Artificial Intelligence, EITC/AI/ADL Advanced Deep Learning, Advanced computer vision, Convolutional neural networks for image recognition

The convolution operation is a fundamental process in the realm of convolutional neural networks (CNNs), particularly in the domain of image recognition. This operation is pivotal in extracting features from images, allowing deep learning models to understand and interpret visual data. The mathematical formulation of the convolution operation on a 2D image is essential for grasping how CNNs process and analyze images.

Mathematically, the convolution operation for a 2D image can be expressed as follows:

\[ (I * K)(x, y) = \sum_{i=-m}^{m} \sum_{j=-n}^{n} I(x+i, y+j) \cdot K(i, j) \]

Where:
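– \( I \) represents the input image.
– \( K \) denotes the kernel or filter, indexed from its center so that \( K(0, 0) \) is the middle element.
– \( (x, y) \) are the coordinates of the output pixel.
– \( m \) and \( n \) are the half-width and half-height of the kernel, respectively, so the kernel is \( 2m+1 \) elements wide and \( 2n+1 \) elements high.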

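In this equation, the kernel \( K \) slides over the input image \( I \). At each location, the kernel values are multiplied element-wise with the image patch beneath them, and the products are summed to produce a single output pixel value. This process is repeated for each pixel in the output feature map, resulting in a transformed image that highlights specific features based on the kernel's values. Strictly speaking, the formula above is cross-correlation: the textbook definition of convolution flips the kernel, using \( I(x-i, y-j) \) in place of \( I(x+i, y+j) \). Deep learning frameworks generally implement the unflipped form and still call it convolution, and because the kernel values are learned, the distinction has no practical effect in a CNN.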
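As a minimal sketch of the sum above, the following pure-Python function evaluates it at a single output location. The function name `conv_at`, the list-of-rows image layout, and the choice of zero-based coordinates with x as the column index and y as the row index are assumptions made for illustration; they are not part of the original text.

```python
def conv_at(image, kernel, x, y):
    """Evaluate sum_i sum_j I(x+i, y+j) * K(i, j) at one location (x, y).

    Assumptions for this sketch: image and kernel are lists of rows,
    x is the zero-based column index, y is the zero-based row index,
    the kernel has odd width and height, and the kernel fits entirely
    inside the image at (x, y).
    """
    n = len(kernel) // 2       # half-height of the kernel
    m = len(kernel[0]) // 2    # half-width of the kernel
    total = 0
    for i in range(-m, m + 1):          # horizontal offset
        for j in range(-n, n + 1):      # vertical offset
            # K(i, j) is indexed from the kernel center, so shift by (m, n)
            total += image[y + j][x + i] * kernel[j + n][i + m]
    return total


image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]

# A kernel with a single 1 at its center returns the pixel itself.
identity = [[0, 0, 0],
            [0, 1, 0],
            [0, 0, 0]]
center_value = conv_at(image, identity, 1, 1)
print(center_value)  # 5

# A single 1 to the right of the center picks the right-hand neighbour,
# which shows that the kernel is applied without flipping.
right_neighbour = [[0, 0, 0],
                   [0, 0, 1],
                   [0, 0, 0]]
right_value = conv_at(image, right_neighbour, 1, 1)
print(right_value)  # 6
```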

The convolution operation can be better understood through a step-by-step example. Consider a simple 3×3 kernel \( K \) and a 5×5 input image \( I \):

\[ K = \begin{bmatrix}
1 & 0 & -1 \\
1 & 0 & -1 \\
1 & 0 & -1
\end{bmatrix} \]

\[ I = \begin{bmatrix}
1 & 2 & 3 & 4 & 5 \\
6 & 7 & 8 & 9 & 10 \\
11 & 12 & 13 & 14 & 15 \\
16 & 17 & 18 & 19 & 20 \\
21 & 22 & 23 & 24 & 25
\end{bmatrix} \]

To compute the convolution without padding, we place the kernel at every position where it fits entirely inside the input image and perform the following steps:

1. Position the kernel: Place the kernel over the top-left 3×3 region of the image, so that its center lies on the pixel with value 7.
2. Element-wise multiplication: Multiply each element of the kernel by the corresponding element of the image.
3. Summation: Sum the results of the element-wise multiplication.
4. Move the kernel: Shift the kernel to the next position and repeat steps 2-3.

For the first position (top-left corner), where the kernel is centered on \( (1, 1) \) in zero-based coordinates, the calculation is as follows:

\[ \begin{aligned}
(I * K)(1, 1) &= (1 \cdot 1) + (2 \cdot 0) + (3 \cdot -1) \\
&\quad + (6 \cdot 1) + (7 \cdot 0) + (8 \cdot -1) \\
&\quad + (11 \cdot 1) + (12 \cdot 0) + (13 \cdot -1) \\
&= 1 + 0 - 3 + 6 + 0 - 8 + 11 + 0 - 13 \\
&= -6
\end{aligned} \]

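This result, -6, is the top-left value of the output feature map. Repeating this process for each position of the kernel over the input image generates the entire output feature map, which is 3×3 here because the 3×3 kernel fits into the 5×5 image at only three positions along each axis. For this particular image every output value is -6: within any 3×3 window, each row increases by 2 from its left pixel to its right pixel, so each of the three rows contributes -2.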
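The worked example can be checked with a short pure-Python sketch that slides the kernel over every position where it fits and collects the sums. The function name `correlate2d_valid` and the list-of-rows representation are illustrative assumptions, not something the original text prescribes.

```python
def correlate2d_valid(image, kernel):
    """Slide the kernel over every position where it fits inside the image
    (no padding, stride 1) and return the feature map as a list of rows.
    The kernel is applied without flipping, as in the worked example."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    feature_map = []
    for top in range(out_h):
        row = []
        for left in range(out_w):
            # Element-wise multiply the kernel with the patch under it, then sum
            row.append(sum(image[top + a][left + b] * kernel[a][b]
                           for a in range(kh) for b in range(kw)))
        feature_map.append(row)
    return feature_map


K = [[1, 0, -1],
     [1, 0, -1],
     [1, 0, -1]]

I = [[1, 2, 3, 4, 5],
     [6, 7, 8, 9, 10],
     [11, 12, 13, 14, 15],
     [16, 17, 18, 19, 20],
     [21, 22, 23, 24, 25]]

feature_map = correlate2d_valid(I, K)
for row in feature_map:
    print(row)
# [-6, -6, -6]
# [-6, -6, -6]
# [-6, -6, -6]
```

The first entry reproduces the hand calculation; the remaining entries are the same value because the image has a constant horizontal gradient.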

The convolution operation is typically accompanied by additional concepts such as padding and stride:

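– Padding: Adding extra pixels around the border of the input image, often with zeros (zero-padding), to control the spatial dimensions of the output feature map. With a stride of 1 and an odd kernel size k, padding of (k - 1) / 2 pixels on each side (often called "same" padding) produces an output feature map with the same dimensions as the input image, so information at the borders is not lost. With no padding (often called "valid" padding), the output shrinks by k - 1 pixels along each axis.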
– Stride: The step size by which the kernel moves across the input image. A stride of 1 means the kernel moves one pixel at a time, while a stride of 2 means the kernel moves two pixels at a time. Stride affects the spatial dimensions of the output feature map, with larger strides resulting in smaller output dimensions.
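The effect of padding and stride can be illustrated with a small pure-Python sketch. The function `conv2d` and its `padding` and `stride` parameters are hypothetical names chosen for this example; it assumes zero-padding applied equally on all four sides and the same stride in both directions.

```python
def conv2d(image, kernel, padding=0, stride=1):
    """Unflipped 2D convolution (cross-correlation) with zero-padding and stride.

    Sketch assumptions: image and kernel are lists of rows, the same padding is
    added on every side, and the same stride is used horizontally and vertically.
    """
    kh, kw = len(kernel), len(kernel[0])
    width = len(image[0])

    # Surround the image with `padding` rows and columns of zeros
    blank_row = [0] * (width + 2 * padding)
    padded = [blank_row[:] for _ in range(padding)]
    padded += [[0] * padding + list(row) + [0] * padding for row in image]
    padded += [blank_row[:] for _ in range(padding)]

    out = []
    # Move the kernel's top-left corner in steps of `stride`
    for top in range(0, len(padded) - kh + 1, stride):
        out_row = []
        for left in range(0, len(padded[0]) - kw + 1, stride):
            out_row.append(sum(padded[top + a][left + b] * kernel[a][b]
                               for a in range(kh) for b in range(kw)))
        out.append(out_row)
    return out


K = [[1, 0, -1]] * 3
I = [[1, 2, 3, 4, 5],
     [6, 7, 8, 9, 10],
     [11, 12, 13, 14, 15],
     [16, 17, 18, 19, 20],
     [21, 22, 23, 24, 25]]

valid = conv2d(I, K)                          # no padding: 3x3 output
same = conv2d(I, K, padding=1)                # padding 1: 5x5 output
strided = conv2d(I, K, padding=1, stride=2)   # padding 1, stride 2: 3x3 output

print(len(valid), len(valid[0]))       # 3 3
print(len(same), len(same[0]))         # 5 5
print(len(strided), len(strided[0]))   # 3 3
print(same[0][0])                      # -9, the padded zeros now take part in the sum
```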

The convolution operation's output dimensions can be calculated using the following formula:

\[ \text{Output Width} = \left\lfloor \frac{\text{Input Width} - \text{Kernel Width} + 2 \cdot \text{Padding}}{\text{Stride}} \right\rfloor + 1 \]

\[ \text{Output Height} = \left\lfloor \frac{\text{Input Height} - \text{Kernel Height} + 2 \cdot \text{Padding}}{\text{Stride}} \right\rfloor + 1 \]

These formulas ensure that the spatial dimensions of the output feature map are correctly determined based on the input image dimensions, kernel size, padding, and stride.
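As a quick check of these formulas, the helper below computes one output dimension. The name `conv_output_size` is an assumption for illustration, and the sketch assumes the padded input is at least as large as the kernel, so the numerator is non-negative.

```python
def conv_output_size(input_size, kernel_size, padding=0, stride=1):
    """Output size along one axis: floor((input - kernel + 2*padding) / stride) + 1.

    Integer division implements the floor; this sketch assumes the padded input
    is at least as large as the kernel.
    """
    return (input_size - kernel_size + 2 * padding) // stride + 1


# 5x5 image, 3x3 kernel, as in the worked example above
print(conv_output_size(5, 3))                        # 3  (no padding, stride 1)
print(conv_output_size(5, 3, padding=1))             # 5  (output matches the input size)
print(conv_output_size(5, 3, padding=1, stride=2))   # 3  (larger stride, smaller output)
```

The same function applies to height and width independently, so non-square images or kernels only require calling it once per axis.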

In the context of convolutional neural networks, multiple convolutional layers are stacked together, each with its own set of learnable kernels. These layers progressively extract higher-level features from the input image, enabling the network to recognize complex patterns and objects. The kernels in each layer are learned during the training process through backpropagation, optimizing the network's performance on the given task.

Convolutional layers are often followed by activation functions, such as ReLU (Rectified Linear Unit), which introduce non-linearity into the model. This non-linearity allows the network to learn more complex representations. Additionally, pooling layers, such as max pooling or average pooling, are used to reduce the spatial dimensions of the feature maps, making the model more computationally efficient and less prone to overfitting.
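A minimal sketch of these two steps, applied to a small made-up feature map, might look as follows. The function names `relu` and `max_pool2d` and the 4×4 input values are illustrative assumptions; the pooling sketch uses no padding.

```python
def relu(feature_map):
    """Apply ReLU element-wise: negative values become 0, others are unchanged."""
    return [[max(0, value) for value in row] for row in feature_map]


def max_pool2d(feature_map, size=2, stride=2):
    """Take the maximum over each size x size window, moving in steps of `stride`."""
    height, width = len(feature_map), len(feature_map[0])
    pooled = []
    for top in range(0, height - size + 1, stride):
        row = []
        for left in range(0, width - size + 1, stride):
            row.append(max(feature_map[top + a][left + b]
                           for a in range(size) for b in range(size)))
        pooled.append(row)
    return pooled


# Hypothetical 4x4 feature map with mixed signs
fmap = [[-1, 2, -3, 4],
        [5, -6, 7, -8],
        [-9, 10, -11, 12],
        [13, -14, 15, -16]]

activated = relu(fmap)
pooled = max_pool2d(activated)   # 2x2 windows, stride 2: 4x4 -> 2x2

print(activated[0])  # [0, 2, 0, 4]
print(pooled)        # [[5, 7], [13, 15]]
```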

A practical example of a convolutional neural network for image recognition is the famous LeNet-5 architecture, designed for handwritten digit recognition. LeNet-5 consists of multiple convolutional and pooling layers, followed by fully connected layers. The convolutional layers extract features from the input images, while the fully connected layers perform the final classification.

To illustrate the convolution operation in the context of LeNet-5, consider the first convolutional layer, which takes a 32×32 input image and applies six 5×5 kernels with a stride of 1 and no padding. The output feature maps have dimensions of 28×28, calculated as follows:

\[ \text{Output Width} = \left\lfloor \frac{32 - 5 + 2 \cdot 0}{1} \right\rfloor + 1 = 28 \]

\[ \text{Output Height} = \left\lfloor \frac{32 - 5 + 2 \cdot 0}{1} \right\rfloor + 1 = 28 \]

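Each of the six kernels produces a separate 28×28 feature map, capturing different aspects of the input image. In the original LeNet-5, these feature maps are then subsampled by a 2×2 average pooling layer with a stride of 2 and passed through a sigmoid-type squashing non-linearity, resulting in 14×14 feature maps. Modern re-implementations often substitute ReLU and 2×2 max pooling, which yields the same 14×14 dimensions.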

The subsequent layers in LeNet-5 continue to apply convolution and pooling operations, progressively reducing the spatial dimensions while increasing the depth of the feature maps. The final fully connected layers perform the classification based on the extracted features, outputting the predicted digit class.
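The spatial sizes through the network can be traced with the output-size formula. The layer list below (C1, S2, C3, S4, C5, each with a 5×5 convolution or a 2×2 pooling window of stride 2 and no padding) follows the commonly cited description of the original LeNet-5 and goes beyond what this text states, so treat it as an assumption of the sketch; the helper name `out_size` is also illustrative.

```python
def out_size(size, kernel, padding=0, stride=1):
    """Output size along one axis for a convolution or pooling window."""
    return (size - kernel + 2 * padding) // stride + 1


# (layer name, window size, stride), assumed from the usual LeNet-5 description
layers = [
    ("C1 convolution 5x5", 5, 1),
    ("S2 pooling 2x2", 2, 2),
    ("C3 convolution 5x5", 5, 1),
    ("S4 pooling 2x2", 2, 2),
    ("C5 convolution 5x5", 5, 1),
]

size = 32            # LeNet-5 takes 32x32 input images
sizes = [size]
for name, window, stride in layers:
    size = out_size(size, window, stride=stride)
    sizes.append(size)
    print(f"{name}: {size}x{size}")

print(sizes)  # [32, 28, 14, 10, 5, 1]
```

The first two steps reproduce the 28×28 and 14×14 sizes derived above; the final 1×1 maps are what the fully connected layers consume.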

The convolution operation is a cornerstone of convolutional neural networks, enabling the extraction of meaningful features from images. The mathematical formulation of the convolution operation involves sliding a kernel over the input image, performing element-wise multiplication, and summing the results. Additional concepts such as padding and stride play important roles in controlling the spatial dimensions of the output feature map. Convolutional layers, combined with activation functions and pooling layers, form the building blocks of powerful image recognition models like LeNet-5, capable of recognizing complex patterns and objects in visual data.
