Is in-sample accuracy compared to out-of-sample accuracy one of the most important features of model performance?

by Martyna Helman / Monday, 08 September 2025 / Published in Artificial Intelligence, EITC/AI/DLPP Deep Learning with Python and PyTorch, Introduction, Introduction to deep learning with Python and Pytorch

In-sample accuracy compared to out-of-sample accuracy is a fundamental concept in deep learning, and understanding the distinction between these two metrics is of central importance for building, evaluating, and deploying neural network models using Python and PyTorch. This topic directly relates to the core objective of machine learning and deep learning: to develop models that generalize well to new, unseen data, rather than merely memorizing patterns present in the training data.

Definitions and Context

In-sample accuracy refers to the performance of a model on the data used to fit or train the model. This dataset is commonly known as the training set. By contrast, out-of-sample accuracy measures the model's performance on data that were not used during the training phase. These data are drawn from the same distribution as the training data but are held out specifically to test the model's ability to generalize. In practice, out-of-sample accuracy is estimated using validation sets and test sets.
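Holding data out can be sketched in PyTorch with `torch.utils.data.random_split`. The dataset below is a randomly generated stand-in, and the sizes, feature dimension, and 80/10/10 split are illustrative assumptions:

```python
import torch
from torch.utils.data import TensorDataset, random_split

# Hypothetical dataset of 10,000 samples with 20 features each
features = torch.randn(10_000, 20)
labels = torch.randint(0, 2, (10_000,))
dataset = TensorDataset(features, labels)

# 80% training (in-sample), 10% validation, 10% test (out-of-sample);
# the model must never see the held-out portions during training
train_set, val_set, test_set = random_split(
    dataset, [8_000, 1_000, 1_000],
    generator=torch.Generator().manual_seed(42)  # reproducible split
)
```

Training accuracy computed on `train_set` is the in-sample figure; accuracy on `val_set` or `test_set` estimates out-of-sample performance.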

The distinction is important due to a central risk in deep learning: *overfitting*. Overfitting occurs when a model learns not only the underlying patterns but also the noise or idiosyncrasies of the training data. As a result, while the model may achieve extremely high accuracy in-sample, it often fails to perform well on new, out-of-sample data.

Theoretical Foundations: Generalization

The ultimate goal of predictive modeling is not to perform well on the training data, but to accurately predict the outcome for new, unseen instances. This property is known as *generalization*. Generalization ability is what makes a deep learning model valuable in real-world applications.

A model with high in-sample accuracy but low out-of-sample accuracy is not considered effective. Such a model has likely fit the training data too closely (overfit), capturing noise and spurious patterns that do not generalize. Conversely, a model with similar in-sample and out-of-sample accuracy is said to generalize well. Achieving a balance between underfitting (too simple, poor performance everywhere) and overfitting (too complex, poor out-of-sample performance) is a central challenge in deep learning.

Practical Illustration in PyTorch

Suppose one builds an image classifier in PyTorch to distinguish between cats and dogs. The dataset consists of 10,000 labeled images, split into 8,000 images for training and 2,000 for testing.

– If the model achieves 98% accuracy on the training set but only 65% on the test set, this discrepancy suggests overfitting. The neural network has learned to identify intricate details in the training images that do not generalize to the test images. In this situation, the in-sample accuracy is not indicative of real-world performance.

– If the model achieves 90% accuracy on the training set and 88% on the test set, this small difference indicates that the model has learned general features relevant to both sets. The out-of-sample accuracy provides a reliable estimate of how the model will perform on new data in deployment.

Causes and Detection of Overfitting

Several factors contribute to poor out-of-sample accuracy relative to in-sample accuracy:

– Model Complexity: Deep neural networks with many parameters have enormous capacity to fit complex datasets, including memorizing random noise. If the network is too large relative to the dataset, overfitting is likely.

– Insufficient Data: When the dataset is small, the network may learn dataset-specific patterns that do not hold more generally.

– Lack of Regularization: Techniques such as weight decay, dropout, and data augmentation are designed to prevent overfitting by constraining the model or increasing the effective size of the training data.

Detecting overfitting is straightforward: one monitors both training and validation accuracies during training. A large gap, with high training accuracy and low validation/test accuracy, is a hallmark of overfitting.

Methods to Improve Out-of-Sample Accuracy

A number of strategies can be employed to narrow the gap between in-sample and out-of-sample accuracy:

1. Regularization:
Adding penalty terms to the loss function (e.g., L1 or L2 regularization) discourages overly complex models. In PyTorch, this can be implemented by including weight decay in the optimizer.

2. Dropout:
Randomly setting a subset of activations to zero during training helps prevent co-adaptation of neurons, forcing the network to learn more robust features.

3. Data Augmentation:
For image data, random cropping, flipping, rotation, and color jittering artificially enlarge the dataset and expose the model to a wider variety of input patterns.

4. Early Stopping:
Monitoring validation accuracy and halting training when it ceases to improve prevents the model from continuing to fit the training data at the expense of generalization.

5. Cross-validation:
Splitting the data into several folds and training multiple models provides a more reliable estimate of out-of-sample performance, especially when data is scarce.
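Items 1, 2, and 4 above can be sketched in PyTorch as follows. The architecture, dropout rate, weight decay value, and patience threshold are all illustrative assumptions, not prescribed settings:

```python
import torch
import torch.nn as nn

# Dropout (item 2) built into an illustrative two-layer classifier
model = nn.Sequential(
    nn.Linear(20, 64),
    nn.ReLU(),
    nn.Dropout(p=0.5),   # zeroes half of the activations at random during training
    nn.Linear(64, 2),
)

# L2 regularization (item 1) via the optimizer's weight_decay argument
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-4)

class EarlyStopper:
    """Early stopping (item 4): signals a stop when validation accuracy
    fails to improve for `patience` consecutive epochs."""
    def __init__(self, patience=5):
        self.patience = patience
        self.best_val_acc = 0.0
        self.epochs_without_improvement = 0

    def should_stop(self, val_acc):
        if val_acc > self.best_val_acc:
            self.best_val_acc = val_acc
            self.epochs_without_improvement = 0
        else:
            self.epochs_without_improvement += 1
        return self.epochs_without_improvement >= self.patience
```

At the end of each epoch, one would call `should_stop(val_accuracy)` and break out of the training loop when it returns `True`.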

The Bias-Variance Tradeoff

The difference between in-sample and out-of-sample accuracy is closely related to the bias-variance tradeoff. High bias implies the model is too simple (underfitting), leading to poor performance everywhere. High variance implies the model is too complex, fitting noise in the training data and performing poorly out-of-sample.

– Low bias, high variance: High training accuracy, low test accuracy.
– High bias, low variance: Low accuracy across both datasets.

The ideal scenario is to find a balance where both in-sample and out-of-sample accuracies are high and similar.

Role in Model Selection and Evaluation

When developing deep learning models, relying solely on in-sample accuracy leads to misleading conclusions about a model's predictive power. Out-of-sample accuracy is the preferred metric for model selection, hyperparameter tuning, and comparison between models.

For example, when using PyTorch to design a convolutional neural network (CNN) for digit recognition, one may experiment with different architectures, learning rates, or regularization schemes. The model yielding the highest test or validation accuracy is selected, not the one with the highest training accuracy.
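As a toy sketch of this selection rule (the configurations and accuracy figures below are invented placeholders, not measured results):

```python
# Hypothetical results from three candidate runs:
# (hyperparameters, training accuracy, validation accuracy) -- placeholder numbers
candidates = [
    ({"lr": 1e-2}, 0.99, 0.70),   # highest training accuracy, but overfit
    ({"lr": 1e-3}, 0.95, 0.85),   # best generalization
    ({"lr": 1e-4}, 0.90, 0.82),
]

# Select by out-of-sample (validation) accuracy, not training accuracy
best_config, best_train_acc, best_val_acc = max(candidates, key=lambda c: c[2])
```

Note that the selected configuration is not the one with the highest training accuracy.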

Case Study: CIFAR-10 Classification

Consider training a PyTorch model on the CIFAR-10 dataset, a standard benchmark for image classification. The dataset consists of 60,000 images in 10 classes, with 50,000 images for training and 10,000 for testing.

– An unregularized deep CNN might achieve 99% accuracy on the training set, but only 70% on the test set.
– Applying dropout and data augmentation, the model might achieve 95% training accuracy, but 85% on the test set.

Even though the in-sample accuracy is lower in the second scenario, the out-of-sample accuracy is substantially higher, indicating better generalization and a more useful model.

PyTorch-Specific Practices

In PyTorch, training and evaluating models are typically handled by two separate phases:

– Training Phase: The model is set to `model.train()`, gradients are computed, and parameters are updated.
– Evaluation Phase: The model is set to `model.eval()`, and accuracy is computed on the validation or test set. Importantly, layers like dropout and batch normalization behave differently during evaluation.

This separation ensures that in-sample accuracy (measured during training) and out-of-sample accuracy (measured during evaluation) are not conflated.
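The behavioral difference can be demonstrated directly on a dropout layer (the tensor size and dropout rate here are arbitrary choices):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
dropout = nn.Dropout(p=0.5)
x = torch.ones(1, 8)

dropout.train()          # training mode: activations dropped, survivors rescaled by 1/(1-p)
train_out = dropout(x)

dropout.eval()           # evaluation mode: dropout is an identity operation
eval_out = dropout(x)
```

In evaluation mode the output equals the input exactly, while in training mode each element is either zeroed or scaled to 2.0, which is why conflating the two phases would distort measured accuracies.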

Importance in Real-World Applications

In practical deployments, the data encountered by a model are never exactly the same as the training data. Whether predicting medical outcomes, recommending products, or detecting objects in images, the ability to perform well on new data is vital.

A model that only performs well in-sample is often of little use. For instance, a medical diagnostic tool with high in-sample accuracy but low out-of-sample accuracy could lead to misdiagnosis when deployed, with potentially severe consequences.

Quantitative Metrics and Reporting

While accuracy is a common metric, especially for balanced classification tasks, similar considerations apply to other performance metrics such as precision, recall, F1-score, and area under the curve (AUC). In all cases, the focus should be on out-of-sample (validation/test) metrics when reporting and comparing models.
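To make the imbalanced-classes point concrete, here is a small self-contained sketch with fabricated placeholder labels: a degenerate model that always predicts the majority class attains 90% accuracy yet an F1-score of zero.

```python
# Hypothetical out-of-sample labels and predictions for an imbalanced task
# (90 negatives, 10 positives; the model predicts "negative" for everything)
y_true = [0] * 90 + [1] * 10
y_pred = [0] * 100

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Precision, recall, and F1 computed from the confusion-matrix counts
tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
precision = tp / (tp + fp) if (tp + fp) else 0.0
recall = tp / (tp + fn) if (tp + fn) else 0.0
f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
```

Here `accuracy` is 0.9 while `f1` is 0.0, so out-of-sample accuracy alone would paint a misleadingly rosy picture.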

Example: Code Implementation in PyTorch

Below is a basic illustration of how to monitor both in-sample and out-of-sample accuracy in a PyTorch workflow (it assumes `model`, `criterion`, `optimizer`, `train_loader`, `val_loader`, and `num_epochs` are already defined):

```python
# Training loop
for epoch in range(num_epochs):
    model.train()
    correct_train = 0
    total_train = 0
    for inputs, labels in train_loader:
        # Forward pass, backward pass, optimizer step
        optimizer.zero_grad()
        outputs = model(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()
        # Compute training accuracy
        _, predicted = torch.max(outputs.data, 1)
        total_train += labels.size(0)
        correct_train += (predicted == labels).sum().item()

    train_accuracy = 100 * correct_train / total_train

    # Validation phase
    model.eval()
    correct_val = 0
    total_val = 0
    with torch.no_grad():
        for inputs, labels in val_loader:
            outputs = model(inputs)
            _, predicted = torch.max(outputs.data, 1)
            total_val += labels.size(0)
            correct_val += (predicted == labels).sum().item()
    val_accuracy = 100 * correct_val / total_val

    print(f'Epoch {epoch}, Training Accuracy: {train_accuracy:.2f}%, Validation Accuracy: {val_accuracy:.2f}%')
```

This approach allows practitioners to track the evolution of both metrics epoch by epoch and to detect overfitting early.

Challenges and Nuances

– Data Distribution Shift: Out-of-sample accuracy is meaningful only if the training and test data are drawn from the same distribution. If the deployment environment differs significantly from the training data (a phenomenon known as distribution shift), even high test accuracy may not guarantee good real-world performance.

– Imbalanced Datasets: In cases where classes are imbalanced, accuracy alone may not capture model performance adequately. Alternative metrics should be considered, but the distinction between in-sample and out-of-sample performance remains relevant.

– Hyperparameter Tuning: Hyperparameters such as learning rate, batch size, and regularization strength are often selected based on out-of-sample metrics to avoid overfitting to the training set.

Summary

Understanding the difference between in-sample and out-of-sample accuracy is foundational for anyone learning deep learning with Python and PyTorch. This concept:

– Emphasizes the central goal of predictive modeling: generalization.
– Illustrates the dangers of overfitting and the limitations of relying solely on training accuracy.
– Guides the correct use of regularization techniques and model selection procedures.
– Informs the design of experiments and the interpretation of results.
– Lays the groundwork for more advanced topics, such as transfer learning, domain adaptation, and uncertainty estimation.

A solid grasp of this distinction leads to more reliable, interpretable, and deployable models, and is therefore a critical aspect of practical deep learning education and practice.
