In the example keras.layers.Dense(128, activation=tf.nn.relu) is it possible that we overfit the model if we use the number 784 (28*28)?

by ASAD BAIG / Tuesday, 07 October 2025 / Published in Artificial Intelligence, EITC/AI/TFF TensorFlow Fundamentals, Introduction to TensorFlow, Basic computer vision with ML

The question concerns the `Dense` layer in a neural network built with Keras and TensorFlow: specifically, the number of units chosen for the layer and its implications for overfitting, with reference to an input dimensionality of 28×28 = 784 features (commonly a flattened grayscale image from a dataset such as MNIST).

Let us begin by clarifying the syntax and context:

```python
keras.layers.Dense(128, activation=tf.nn.relu)
```

sets up a fully connected (dense) layer with 128 output units and the ReLU activation function. If you instead use:

```python
keras.layers.Dense(784, activation=tf.nn.relu)
```

the dense layer will have 784 output units. The question asks whether choosing this number of units, matching the input size, could lead to overfitting.

Model Capacity and Overfitting

Overfitting describes a scenario where a model learns the training data too well, including its noise and outliers, resulting in poor generalization to new, unseen data. Overfitting is heavily influenced by a model's capacity, which is determined by the number of learnable parameters (weights and biases) in the network.

In a dense layer, the number of parameters can be calculated as:

number_of_parameters = (input_dim * output_dim) + output_dim

For an input dimension of 784 and an output dimension of 784:
– Weights: 784 * 784 = 614,656
– Biases: 784
– Total parameters: 615,440

Contrast this with a smaller layer size, such as 128:
– Weights: 784 * 128 = 100,352
– Biases: 128
– Total parameters: 100,480
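The parameter counts above can be verified with a few lines of plain Python (a minimal sketch; the `dense_params` helper is illustrative, not a Keras API):

```python
def dense_params(input_dim: int, output_dim: int) -> int:
    """Learnable parameters in a fully connected layer:
    one weight per input-output pair, plus one bias per output unit."""
    return input_dim * output_dim + output_dim

# 784-unit layer on a 784-dimensional input (flattened 28x28 image)
print(dense_params(784, 784))  # 615440
# 128-unit layer on the same input
print(dense_params(784, 128))  # 100480
```

The same figures are reported by `model.summary()` in Keras once the model is built.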

When the number of output units is increased, the model gains the ability to represent more complex functions. However, this increased capacity also escalates the risk of memorizing the training data, particularly if the dataset is not sufficiently large or diverse, which is a classic overfitting scenario.

Relation Between Output Units and Overfitting

The number of units in a dense layer should be chosen based on the complexity of the task and the amount of available data. Using 784 units in the first dense layer after a 784-dimensional input does not inherently guarantee overfitting, but it does significantly raise the model’s capacity. If the training set is small, or if the data does not warrant such complexity, the model is likely to fit noise and irrelevant patterns, leading to overfitting.

Specifically, in the context of the MNIST dataset (handwritten digit recognition), the input images are of size 28×28 pixels, flattened to 784 features. The task of classifying digits is relatively straightforward, and empirical evidence shows that architectures with fewer units (such as 128, 64, or even 32 per dense layer) are often sufficient to achieve high accuracy. Using 784 units, matching the input size, is typically unnecessary and can result in a network that is too powerful for the task, learning idiosyncrasies in the training data that do not generalize.

Practical Example

Consider two models trained on the MNIST dataset:

– Model A: Uses a single dense layer with 128 units and ReLU activation, followed by a softmax output layer with 10 units (for the ten digits).
– Model B: Uses a single dense layer with 784 units and ReLU activation, followed by the same softmax output layer.

Both models are trained for the same number of epochs. Model B will have over six times more parameters than Model A. While Model B may initially achieve lower training loss, it is much more susceptible to overfitting, as evidenced by a larger gap between training and validation accuracies after several epochs.

Empirical Evidence

Empirical results from experiments and literature support the idea that increasing the number of units in dense layers can improve performance up to a certain point, after which gains plateau or even deteriorate due to overfitting. Regularization techniques such as dropout, L1/L2 regularization, and early stopping are commonly employed to combat overfitting, but reducing the number of model parameters by lowering the number of units is a primary and effective strategy.

Best Practices for Selecting Number of Units

– Start Small: Begin with a smaller number of units, and increase only if the model underfits (i.e., both training and validation error are high).
– Monitor Performance: Use validation data to monitor the generalization performance. If validation loss starts to increase while training loss continues to decrease, overfitting is occurring.
– Regularization: Employ dropout layers or weight regularization if using a large number of units is necessary for the task.
– Dataset Size and Complexity: For large, complex datasets, higher capacity may be justified, but for well-structured datasets like MNIST, simpler models are preferable.
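The monitoring and early-stopping advice above can be sketched as a small loop in plain Python (illustrative only; in practice Keras provides an `EarlyStopping` callback, and the loss values below are made up):

```python
def early_stop_epoch(val_losses, patience=2):
    """Return the epoch index at which training should stop: the first
    epoch after which validation loss has failed to improve for
    `patience` consecutive epochs. Returns the last epoch if the stop
    condition is never triggered."""
    best = float("inf")
    epochs_since_best = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best = loss
            epochs_since_best = 0
        else:
            epochs_since_best += 1
            if epochs_since_best >= patience:
                return epoch
    return len(val_losses) - 1

# Validation loss improves, then rises as the model starts to overfit
history = [0.90, 0.45, 0.30, 0.28, 0.31, 0.35, 0.40]
print(early_stop_epoch(history))  # 5
```

Stopping at that point keeps the weights from the region where the model still generalized, rather than continuing to fit noise in the training set.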

Illustrative Scenario

Suppose you are building a neural network for digit recognition using the MNIST dataset. The input layer receives 784 features (flattened 28×28 image). You decide between the following architectures:

– Option 1: `Dense(128, activation='relu')`
– Option 2: `Dense(784, activation='relu')`

After training both models:

– Option 1 achieves 98% accuracy on training and 97.5% on validation data.
– Option 2 achieves 99% accuracy on training but drops to 96% on validation data.

This demonstrates that Option 2, with higher capacity, fits the training data better but does not generalize as well, a classic sign of overfitting.

Theoretical Perspective: Universal Approximation and Practical Constraints

The Universal Approximation Theorem states that a feedforward network with a single hidden layer can approximate any continuous function on compact subsets of ℝⁿ to arbitrary accuracy, given sufficiently many units. However, this is a theoretical result and does not address generalization, computational efficiency, or practical dataset constraints.

In practice, increasing the number of units beyond what is warranted by the data and task complexity leads to diminishing returns and overfitting. The goal is to find the smallest model that achieves satisfactory accuracy, balancing bias and variance.

Summary of Key Points

– Setting the number of units in `Dense` to 784 (equal to the input size) substantially increases the model's capacity.
– Higher capacity increases the risk of overfitting, especially when the dataset is small or the task is simple.
– Overfitting can result in poor performance on unseen data, even if training accuracy is high.
– Empirical results support the use of fewer units for tasks like MNIST digit recognition.
– Regularization and careful monitoring of validation performance are necessary when using large numbers of units.
– Model design should consider the complexity of the problem, the size of the dataset, and the need for generalization.
