Can the activation function be only implemented by a step function (resulting with either 0 or 1)?

by dkarayiannakis / Tuesday, 18 June 2024 / Published in Artificial Intelligence, EITC/AI/DLPP Deep Learning with Python and PyTorch, Neural network, Training model

The assertion that the activation function in neural networks can only be implemented by a step function, which results in outputs of either 0 or 1, is a common misconception. While step functions, such as the Heaviside step function, were among the earliest activation functions used in neural networks, modern deep learning frameworks, including those built with Python and PyTorch, employ a variety of activation functions that offer continuous, differentiable outputs. These functions are important for enabling the training of deep neural networks through gradient-based optimization methods such as backpropagation.

Step Functions and Their Limitations

A step function, specifically the binary step function, is defined mathematically as follows:

    \[ f(x) =  \begin{cases}  0 & \text{if } x < 0 \\ 1 & \text{if } x \geq 0  \end{cases} \]

The binary step function maps input values to either 0 or 1, depending on whether the input is below or above a certain threshold (typically zero). This function is non-linear and can be used to create a simple model of a neuron that either "fires" (output 1) or does not "fire" (output 0).

However, the binary step function has significant limitations:

1. Non-Differentiability: The binary step function is not differentiable at the threshold point (x=0), and its derivative is zero everywhere else. Gradient-based training relies on gradients to update the weights of the network, so a derivative that is either undefined or zero gives backpropagation no signal to work with: no weight would ever change, which makes gradient descent ineffective.

2. Limited Expressiveness: The binary step function's output is binary, which limits the function's ability to model complex relationships in the data. More nuanced and continuous activation functions allow for the representation of more complex patterns and interactions.
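The first limitation can be checked numerically. The sketch below is a minimal illustration in plain Python and is not taken from the article; the helper names (`step`, `sigmoid`, `numerical_grad`) and the sample points are assumptions chosen for the example. It estimates derivatives with a central finite difference and suggests why a step activation gives gradient descent nothing to work with, while a smooth function such as the sigmoid does.

```python
import math

def step(x):
    """Binary (Heaviside) step: 0 for x < 0, 1 for x >= 0."""
    return 1.0 if x >= 0 else 0.0

def sigmoid(x):
    """Logistic sigmoid, used here only as a smooth point of comparison."""
    return 1.0 / (1.0 + math.exp(-x))

def numerical_grad(f, x, h=1e-4):
    """Central finite-difference estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

# Sample points away from the threshold at x = 0 (illustrative values)
points = [-2.0, -0.5, 0.5, 2.0]
step_grads = [numerical_grad(step, x) for x in points]
sigmoid_grads = [numerical_grad(sigmoid, x) for x in points]

for x, gs, gg in zip(points, step_grads, sigmoid_grads):
    print(f"x = {x:5.1f}  step gradient = {gs:.4f}  sigmoid gradient = {gg:.4f}")
```

Every step gradient is zero, so a weight update of the form `w -= lr * grad` would leave the weights unchanged, whereas the sigmoid gradients are all positive.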

Modern Activation Functions

To address the limitations of the step function, a variety of continuous and differentiable activation functions have been developed. These functions are designed to introduce non-linearity into the network while being amenable to gradient-based optimization. Some of the most commonly used activation functions include:

1. Sigmoid Function:

The sigmoid function maps any real-valued number to a value between 0 and 1, which can be interpreted as a probability. It is defined as:

    \[ f(x) = \frac{1}{1 + e^{-x}} \]

The sigmoid function is differentiable and has a smooth gradient, which makes it suitable for training with gradient descent. However, it suffers from the vanishing gradient problem, where the gradients become very small for extreme values of x, slowing down the training process.
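The vanishing gradient effect follows from the sigmoid's derivative, f'(x) = f(x)(1 - f(x)), which peaks at 0.25 for x = 0 and shrinks toward zero as |x| grows. The following plain-Python sketch is an illustration only; the function names and sample inputs are chosen for this example.

```python
import math

def sigmoid(x):
    """Logistic sigmoid: 1 / (1 + e^(-x))."""
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_grad(x):
    """Derivative of the sigmoid: s(x) * (1 - s(x))."""
    s = sigmoid(x)
    return s * (1.0 - s)

# The gradient is largest at x = 0 and decays quickly as the input saturates
for x in (0.0, 2.0, 5.0, 10.0):
    print(f"x = {x:5.1f}  sigmoid = {sigmoid(x):.6f}  gradient = {sigmoid_grad(x):.6f}")
```

Because backpropagation multiplies such derivatives layer by layer, several saturated sigmoid layers in a row can shrink the gradient to almost nothing.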

2. Hyperbolic Tangent (Tanh) Function:

The tanh function is similar to the sigmoid function but maps input values to a range between -1 and 1. It is defined as:

    \[ f(x) = \tanh(x) = \frac{e^x - e^{-x}}{e^x + e^{-x}} \]

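The tanh function has a steeper gradient than the sigmoid function (its derivative peaks at 1, compared with 0.25 for the sigmoid) and is zero-centered, which keeps the activations passed to the next layer centered around zero and can make the optimization process more efficient. However, it also suffers from the vanishing gradient problem, because it saturates for large positive or negative inputs.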
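The relationship between tanh and the sigmoid can be made concrete: tanh(x) = 2 * sigmoid(2x) - 1, so tanh is a rescaled, zero-centered sigmoid, and its derivative is 1 - tanh(x)^2. The sketch below checks both facts in plain Python; the helper names and sample inputs are illustrative assumptions.

```python
import math

def sigmoid(x):
    """Logistic sigmoid: 1 / (1 + e^(-x))."""
    return 1.0 / (1.0 + math.exp(-x))

def tanh_grad(x):
    """Derivative of tanh: 1 - tanh(x)^2."""
    return 1.0 - math.tanh(x) ** 2

sample_inputs = (-3.0, -1.0, 0.0, 1.0, 3.0)
for x in sample_inputs:
    rescaled = 2.0 * sigmoid(2.0 * x) - 1.0  # should match tanh(x)
    print(f"x = {x:5.1f}  tanh = {math.tanh(x):+.6f}  "
          f"2*sigmoid(2x)-1 = {rescaled:+.6f}  gradient = {tanh_grad(x):.6f}")
```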

3. Rectified Linear Unit (ReLU):

The ReLU function is one of the most popular activation functions in modern neural networks. It is defined as:

    \[ f(x) = \max(0, x) \]

ReLU is computationally efficient and helps mitigate the vanishing gradient problem by providing a constant gradient of 1 for positive input values. However, it can suffer from the "dying ReLU" problem: because the gradient is zero for negative inputs, a neuron whose input is negative for every training example outputs zero, receives no weight updates, and stops learning.
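A small plain-Python sketch can illustrate both properties. The function names and the list of pre-activation values are hypothetical, chosen only to show a neuron stuck in the negative region; the gradient at exactly x = 0 is set to 0 here by convention.

```python
def relu(x):
    """ReLU: max(0, x)."""
    return max(0.0, x)

def relu_grad(x):
    """Gradient of ReLU: 1 for x > 0, 0 for x < 0 (0 at x == 0 by convention here)."""
    return 1.0 if x > 0 else 0.0

# Positive inputs keep a gradient of 1 regardless of magnitude
print(relu_grad(0.5), relu_grad(1e6))

# Hypothetical neuron whose pre-activation is negative for every sample
pre_activations = [-3.2, -0.7, -1.5, -0.1]
outputs = [relu(z) for z in pre_activations]
grads = [relu_grad(z) for z in pre_activations]
print(outputs, grads)  # all zeros: no signal reaches this neuron's weights
```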

4. Leaky ReLU:

Leaky ReLU is a variant of ReLU that allows a small, non-zero gradient when the input is negative. It is defined as:

    \[ f(x) =     \begin{cases}     x & \text{if } x \geq 0 \\    \alpha x & \text{if } x < 0     \end{cases} \]

where \alpha is a small constant (e.g., 0.01). Leaky ReLU helps address the dying ReLU problem by ensuring that neurons continue to learn even when the input is negative.
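As a minimal plain-Python sketch of the definition above (the function names are illustrative, and alpha = 0.01 is the example value from the text):

```python
def leaky_relu(x, alpha=0.01):
    """Leaky ReLU: x for x >= 0, alpha * x otherwise."""
    return x if x >= 0 else alpha * x

def leaky_relu_grad(x, alpha=0.01):
    """Gradient of Leaky ReLU: 1 for x >= 0, alpha otherwise."""
    return 1.0 if x >= 0 else alpha

for x in (-5.0, -1.0, 0.0, 2.0):
    print(f"x = {x:5.1f}  output = {leaky_relu(x):+.3f}  gradient = {leaky_relu_grad(x):.2f}")
```

The gradient for negative inputs is small but never zero, so a neuron in the negative region can still receive weight updates.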

5. Parametric ReLU (PReLU):

PReLU is an extension of Leaky ReLU where the slope of the negative part of the function is learned during training. It is defined as:

    \[ f(x) =     \begin{cases}     x & \text{if } x \geq 0 \\    \alpha x & \text{if } x < 0     \end{cases} \]

where \alpha is a learnable parameter. PReLU can adapt to the data during training, potentially improving model performance.
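In PyTorch, the learnable slope can be inspected directly. The sketch below assumes `torch` is installed and uses `nn.PReLU` with its default settings (a single shared slope initialised to 0.25); the input values are arbitrary. It shows that the slope is an ordinary trainable parameter that receives a gradient during the backward pass.

```python
import torch
import torch.nn as nn

prelu = nn.PReLU()  # default: one shared slope, initialised to 0.25
x = torch.tensor([-2.0, -0.5, 0.0, 1.5])
y = prelu(x)
print(y)  # negative inputs are scaled by the current slope

# The slope is registered as a parameter, so an optimizer would update it
print(prelu.weight, prelu.weight.requires_grad)

# Backpropagate a dummy objective (sum of outputs) to the slope:
# d(sum)/d(alpha) is the sum of the negative inputs, here -2.0 + -0.5
y.sum().backward()
slope_grad = prelu.weight.grad
print(slope_grad)
```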

6. Exponential Linear Unit (ELU):

ELU is another activation function designed to improve learning by addressing the vanishing gradient problem. It is defined as:

    \[ f(x) =     \begin{cases}     x & \text{if } x \geq 0 \\    \alpha (e^x - 1) & \text{if } x < 0     \end{cases} \]

where \alpha is a positive constant. Unlike ReLU, ELU curves smoothly toward -\alpha for negative inputs instead of cutting off at zero, so it can produce negative outputs, which helps keep the mean activation close to zero.
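The behaviour for negative inputs can be sketched in plain Python. The function names are illustrative, and alpha = 1.0 is an assumed default for this example rather than a value stated in the article.

```python
import math

def elu(x, alpha=1.0):
    """ELU: x for x >= 0, alpha * (e^x - 1) otherwise."""
    return x if x >= 0 else alpha * (math.exp(x) - 1.0)

def elu_grad(x, alpha=1.0):
    """Gradient of ELU: 1 for x >= 0, alpha * e^x otherwise."""
    return 1.0 if x >= 0 else alpha * math.exp(x)

for x in (-100.0, -2.0, -0.5, 0.0, 2.0):
    print(f"x = {x:7.1f}  output = {elu(x):+.6f}  gradient = {elu_grad(x):.6f}")
```

Large negative inputs saturate at -alpha, while inputs just below zero still have a gradient close to 1 when alpha = 1.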

7. Softmax Function:

The softmax function is commonly used in the output layer of classification networks. It converts logits (raw prediction scores) into probabilities by exponentiating the logits and normalizing them. It is defined as:

    \[ f(x_i) = \frac{e^{x_i}}{\sum_{j} e^{x_j}} \]

where x_i is the input to the i-th neuron, and the denominator is the sum of exponentials of all inputs. The softmax function ensures that the output is a valid probability distribution, with values between 0 and 1 that sum to 1.
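A minimal plain-Python sketch of the formula is shown below. Subtracting the maximum logit before exponentiating is a common numerical-stability step that is added here as an assumption; it does not change the result, because softmax is unaffected by adding the same constant to every input. The function name and sample logits are illustrative.

```python
import math

def softmax(logits):
    """Softmax over a list of logits, shifted by the maximum for numerical stability."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([2.0, 1.0, 0.1])
print(probs, sum(probs))

# Without the shift, math.exp(1000.0) would overflow
large = softmax([1000.0, 1000.0])
print(large)
```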

Implementation in PyTorch

In PyTorch, these activation functions are readily available, both as layers in torch.nn (for example nn.ReLU, nn.Sigmoid and nn.Softmax) and as functions such as torch.relu, torch.sigmoid and torch.softmax, and can be easily integrated into neural network models.
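The following is a minimal sketch of how the activation functions discussed above might be used in PyTorch; it is an illustration written for this text rather than a definitive implementation. It assumes `torch` is installed, and the `SmallNet` class, its layer sizes and the sample tensor are hypothetical.

```python
import torch
import torch.nn as nn

x = torch.tensor([-2.0, -0.5, 0.0, 0.5, 2.0])

# Built-in activation modules from torch.nn
activations = {
    "sigmoid": nn.Sigmoid(),
    "tanh": nn.Tanh(),
    "relu": nn.ReLU(),
    "leaky_relu": nn.LeakyReLU(negative_slope=0.01),
    "prelu": nn.PReLU(),
    "elu": nn.ELU(alpha=1.0),
    "softmax": nn.Softmax(dim=0),
}

with torch.no_grad():
    for name, fn in activations.items():
        print(f"{name:>10}: {fn(x)}")

class SmallNet(nn.Module):
    """Hypothetical two-layer network: ReLU in the hidden layer, raw logits out."""

    def __init__(self, in_features=4, hidden=8, num_classes=3):
        super().__init__()
        self.fc1 = nn.Linear(in_features, hidden)
        self.act = nn.ReLU()
        self.fc2 = nn.Linear(hidden, num_classes)

    def forward(self, inputs):
        return self.fc2(self.act(self.fc1(inputs)))

model = SmallNet()
logits = model(torch.randn(5, 4))      # batch of 5 random samples
probs = torch.softmax(logits, dim=1)   # per-sample class probabilities
print(probs.sum(dim=1))                # each row sums to 1
```

Swapping `nn.ReLU()` for another module from the dictionary changes the hidden-layer activation without touching the rest of the model.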

Choosing the Right Activation Function

The choice of activation function depends on various factors, including the specific problem being addressed, the architecture of the neural network, and empirical performance. Here are some guidelines for choosing activation functions:

1. Hidden Layers: ReLU and its variants (Leaky ReLU, PReLU, ELU) are commonly used in hidden layers due to their computational efficiency and ability to mitigate the vanishing gradient problem.

2. Output Layer: The activation function for the output layer depends on the type of task:
- For binary classification, the sigmoid function is often used to produce a probability.
- For multi-class classification, the softmax function is used to produce a probability distribution over classes.
- For regression tasks, a linear activation function (or no activation function) is typically used to produce continuous output values.

3. Experimental Validation: It is often beneficial to experiment with different activation functions and evaluate their performance on the specific task. Empirical results can provide insights into which activation function works best for the given data and model architecture.
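The output-layer guidelines above can be sketched in PyTorch as follows. One practical point, added here as context rather than taken from the article: `nn.CrossEntropyLoss` and `nn.BCEWithLogitsLoss` apply the softmax or sigmoid internally, so when training with these losses the model usually returns raw logits and the activation is applied only to obtain probabilities. The tensors and sizes are hypothetical, and `torch` is assumed to be installed.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Multi-class classification: hypothetical batch of 4 samples, 3 classes
logits = torch.randn(4, 3)
targets = torch.tensor([0, 2, 1, 2])
class_probs = torch.softmax(logits, dim=1)  # probabilities for reporting

# CrossEntropyLoss takes raw logits and applies log-softmax itself
loss_from_logits = nn.CrossEntropyLoss()(logits, targets)
loss_manual = F.nll_loss(torch.log_softmax(logits, dim=1), targets)

# Binary classification: one logit per sample
binary_logits = torch.randn(4)
binary_targets = torch.tensor([1.0, 0.0, 1.0, 0.0])
binary_probs = torch.sigmoid(binary_logits)
bce_from_logits = nn.BCEWithLogitsLoss()(binary_logits, binary_targets)
bce_manual = nn.BCELoss()(binary_probs, binary_targets)

# Regression: a linear output layer with no activation
regression_head = nn.Linear(3, 1)
prediction = regression_head(torch.randn(4, 3))

print(loss_from_logits.item(), loss_manual.item())
print(bce_from_logits.item(), bce_manual.item())
print(prediction.shape)
```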

In the field of deep learning, activation functions play an important role in enabling neural networks to learn and model complex patterns in data. While step functions were used in early neural networks, modern deep learning frameworks employ a variety of continuous, differentiable activation functions that address the limitations of step functions and enhance the training process. By understanding the properties and applications of different activation functions, practitioners can make informed decisions about which functions to use in their models, ultimately leading to better performance and more accurate predictions.

Tagged under: Activation Functions, Artificial Intelligence, Deep Learning, Gradient Descent, Neural Networks, PyTorch