How to use TensorFlow Serving?

by kenlpascual / Thursday, 29 May 2025 / Published in Artificial Intelligence, EITC/AI/GCML Google Cloud Machine Learning, First steps in Machine Learning, Plain and simple estimators

TensorFlow Serving is an open-source system developed by Google for serving machine learning models, particularly those built using TensorFlow, in production environments. Its primary purpose is to provide a flexible, high-performance serving system for deploying new algorithms and experiments while maintaining the same server architecture and APIs. This framework is widely adopted for model deployment because it manages multiple models and model versions and handles inference requests efficiently.

Introduction to TensorFlow Serving

TensorFlow Serving supports the deployment of trained models for inference (prediction) in a scalable and efficient way. It is designed to handle real-time predictions (online serving) and offers features such as model version management, hot-swapping of models, and advanced configuration options for model deployment.

The system is typically used in scenarios where a trained model needs to be exposed as a service, accessible via API calls. This enables seamless integration into production applications where predictions are required.

Step 1: Preparing a Trained Model

Before using TensorFlow Serving, a model must be trained and exported in the TensorFlow SavedModel format. The SavedModel is the universal serialization format for TensorFlow models, containing the graph, variables, and metadata necessary for serving.

Suppose a simple estimator model is built using TensorFlow’s high-level Estimator API:

python
import numpy as np
import tensorflow as tf

# Define a simple linear regression estimator
feature_columns = [tf.feature_column.numeric_column("x", shape=[1])]
estimator = tf.estimator.LinearRegressor(feature_columns=feature_columns)

# Prepare training data (approximately y = -x + 1)
x_train = np.array([[1.], [2.], [3.], [4.]])
y_train = np.array([[0.], [-1.], [-2.], [-3.]])

input_fn = tf.compat.v1.estimator.inputs.numpy_input_fn(
    {"x": x_train},
    y_train,
    batch_size=1,
    num_epochs=None,
    shuffle=True
)

# Train the estimator
estimator.train(input_fn=input_fn, steps=1000)

# Define how serving requests are mapped to model inputs
def serving_input_receiver_fn():
    inputs = {"x": tf.compat.v1.placeholder(shape=[None, 1], dtype=tf.float32)}
    return tf.estimator.export.ServingInputReceiver(inputs, inputs)

# Export the trained model in the SavedModel format
export_dir = estimator.export_saved_model('exported_model', serving_input_receiver_fn)

After training, the exported model is available in the `exported_model` directory, typically with a timestamped subdirectory representing the model version.
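A quick way to verify the export is to list the output directory; a sketch (the actual timestamp directory name will differ on each run):

sh
# Each subdirectory of exported_model/ is a version named by a Unix timestamp
ls exported_model/
# The versioned directory contains the serialized graph and weights:
# saved_model.pb and variables/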

Step 2: Installing TensorFlow Serving

TensorFlow Serving can be installed and run in several ways: natively on Linux, via Docker containers, or by building from source. The Docker approach is the most convenient and is officially supported.

To install Docker on your system, refer to the official Docker documentation. Once Docker is available, the TensorFlow Serving image can be pulled:

sh
docker pull tensorflow/serving

Step 3: Serving the Model with TensorFlow Serving

Assuming the exported model is located in `/models/my_model/1/`, where `1` is the version number, run the TensorFlow Serving Docker container as follows:

sh
docker run -p 8501:8501 --name=tf_serving_linear \
  --mount type=bind,source=/models/my_model,target=/models/my_model \
  -e MODEL_NAME=my_model -t tensorflow/serving

Explanation of the parameters:
- `-p 8501:8501` maps the container’s port 8501 to the host, exposing the REST API.
- `--name=tf_serving_linear` assigns a name to the container.
- `--mount type=bind,source=...,target=...` mounts the local model directory into the Docker container.
- `-e MODEL_NAME=my_model` specifies the model name TensorFlow Serving will serve.
- `-t tensorflow/serving` specifies the TensorFlow Serving image.

The directory structure for models should be:

/models/
  my_model/
    1/
      saved_model.pb
      variables/

TensorFlow Serving automatically detects the version subdirectory (`1`), allowing for easy model versioning and upgrades.

Step 4: Making Predictions via REST API

Once the server is running, predictions can be made via HTTP POST requests to the REST API endpoint.

Here is an example using `curl` to send a prediction request:

sh
curl -d '{"instances": [{"x": [1.0]}, {"x": [2.0]}]}' \
     -H "Content-Type: application/json" \
     http://localhost:8501/v1/models/my_model:predict

- The `'instances'` key contains a list of input examples, each matching the input signature expected by the model (`x` in this case).

TensorFlow Serving returns a prediction response in JSON format:

json
{
  "predictions": [[output_1], [output_2]]
}

Where `output_1` and `output_2` are the predicted values for the inputs 1.0 and 2.0, respectively.
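The same request can also be issued programmatically; below is a minimal sketch using Python's third-party `requests` library (assuming the server from Step 3 is running locally):

python
import json
import requests  # third-party HTTP client: pip install requests

# REST endpoint exposed by the container started in Step 3
url = "http://localhost:8501/v1/models/my_model:predict"
payload = {"instances": [{"x": [1.0]}, {"x": [2.0]}]}

response = requests.post(url, data=json.dumps(payload))
response.raise_for_status()
print(response.json()["predictions"])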

Step 5: Making Predictions via gRPC API

TensorFlow Serving also supports gRPC, which provides better performance and is commonly used in high-throughput production environments.

Example Python code using the gRPC API:

python
import grpc
import tensorflow as tf
# The generated gRPC stubs ship in the tensorflow-serving-api package
from tensorflow_serving.apis import predict_pb2, prediction_service_pb2_grpc

# Connect to the TensorFlow Serving gRPC endpoint
channel = grpc.insecure_channel('localhost:8500')
stub = prediction_service_pb2_grpc.PredictionServiceStub(channel)

# Prepare the prediction request
request = predict_pb2.PredictRequest()
request.model_spec.name = 'my_model'
request.model_spec.signature_name = 'serving_default'
request.inputs['x'].CopyFrom(
    tf.make_tensor_proto([[1.0], [2.0]], shape=[2, 1])
)

# Make the prediction (the second argument is the timeout in seconds)
result = stub.Predict(request, 10.0)
print(result)

- The server must be started with the gRPC port published (`-p 8500:8500`); the TensorFlow Serving image listens for gRPC requests on port 8500 by default.
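For example, the Step 3 command can be extended to publish both the gRPC and REST ports:

sh
docker run -p 8500:8500 -p 8501:8501 --name=tf_serving_linear \
  --mount type=bind,source=/models/my_model,target=/models/my_model \
  -e MODEL_NAME=my_model -t tensorflow/serving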

Step 6: Model Versioning and Management

TensorFlow Serving is designed to efficiently handle model versioning. The directory structure allows multiple versions of a model to coexist. For example:

/models/
  my_model/
    1/
    2/

If a new version (e.g., `2`) is added, TensorFlow Serving can automatically switch to the new version without downtime, depending on configuration. By default, the highest numbered version is served.

To specify the model version in a request, the REST API provides an endpoint:

http://localhost:8501/v1/models/my_model/versions/2:predict

This enables canarying, blue-green deployments, and rollback strategies.
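Which versions are loaded can also be controlled through a model configuration file (covered in Step 7). A sketch of the `model_version_policy` option, here pinning versions 1 and 2 explicitly instead of the default of serving only the latest version:

  model_config_list: {
    config: {
      name: 'my_model',
      base_path: '/models/my_model',
      model_platform: 'tensorflow',
      model_version_policy: {
        specific: {
          versions: 1,
          versions: 2
        }
      }
    }
  }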

Step 7: Advanced Configuration

TensorFlow Serving supports more advanced features, such as serving multiple models simultaneously, monitoring, and custom batching.

- Serving Multiple Models:

Create a `models.config` file:

  model_config_list: {
    config: {
      name: 'model1',
      base_path: '/models/model1',
      model_platform: 'tensorflow'
    },
    config: {
      name: 'model2',
      base_path: '/models/model2',
      model_platform: 'tensorflow'
    }
  }
  

Start the server with:

sh
  docker run -p 8501:8501 \
    --mount type=bind,source=/models,target=/models \
    -t tensorflow/serving \
    --model_config_file=/models/models.config
  
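Each model is then addressed by its configured name in the request path; for example, querying `model1` (the `x` feature in the body is illustrative and must match that model's actual input signature):

sh
curl -d '{"instances": [{"x": [1.0]}]}' \
     -H "Content-Type: application/json" \
     http://localhost:8501/v1/models/model1:predict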

- Monitoring:

TensorFlow Serving can expose metrics via a Prometheus-compatible endpoint (conventionally `/monitoring/prometheus/metrics`) when started with a monitoring configuration file.
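A sketch of such a monitoring configuration, passed to the server via the `--monitoring_config_file` flag:

  prometheus_config: {
    enable: true,
    path: "/monitoring/prometheus/metrics"
  }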

- Batching:

Server-side request batching can substantially improve throughput for high-volume workloads. Batching is enabled with the `--enable_batching` flag and tuned through a batching parameters file passed via `--batching_parameters_file`.
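A sketch of a batching parameters file (the values are illustrative and should be tuned to the workload):

  max_batch_size: { value: 32 }
  batch_timeout_micros: { value: 1000 }
  num_batch_threads: { value: 4 }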

Step 8: Security and Production Considerations

In production, ensure that TensorFlow Serving endpoints are secured using authentication and authorization layers, as well as network-level protections (for example, running behind a reverse proxy or API gateway).

Logging, monitoring, and alerting are critical for production deployments. Integrate TensorFlow Serving with centralized logging and monitoring solutions to track usage, performance, and failures.

Example End-to-End Workflow

Training and Exporting a Model:

1. Train a simple estimator as shown in Step 1.
2. Export the model to the SavedModel format, e.g., `/models/linear/1/`.

Starting TensorFlow Serving:

sh
docker run -p 8501:8501 --name=tf_serving_example \
  --mount type=bind,source=/models/linear,target=/models/linear \
  -e MODEL_NAME=linear -t tensorflow/serving

Making a Prediction:

sh
curl -d '{"instances": [{"x": [5.0]}]}' \
     -H "Content-Type: application/json" \
     http://localhost:8501/v1/models/linear:predict

Response:

json
{
  "predictions": [[predicted_value]]
}

Where `predicted_value` is the trained model's output for the input `5.0`.

Troubleshooting Common Issues

1. Model Not Found: Ensure the model directory structure is correct and the `MODEL_NAME` environment variable corresponds to the correct directory.
2. Signature Mismatch: The exported model’s input signature must match the input provided during prediction requests. Use the `saved_model_cli` tool to inspect the SavedModel signature (see the example after this list).
3. Port Conflicts: Ensure the specified ports (8501 for REST, 8500 for gRPC) are not in use by other processes.
4. File Permissions: Verify that Docker has permission to access the model files on the host machine.
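As referenced in item 2, `saved_model_cli` ships with TensorFlow and prints the signatures of an exported model:

sh
# Show all signatures, inputs, and outputs of the exported SavedModel
saved_model_cli show --dir /models/linear/1 --all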

Integration with Google Cloud

TensorFlow Serving can be integrated with Google Cloud AI Platform for managed deployments. However, the fundamental principles of exporting models, serving, and querying remain consistent. Google Cloud AI Platform provides a managed service for serving TensorFlow models, abstracting away the infrastructure management.
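For illustration, a hedged sketch of deploying the same SavedModel on the (classic) AI Platform Prediction service using the `gcloud` CLI; the bucket path is a placeholder and the runtime version depends on current platform support:

sh
# Copy the exported SavedModel (the directory containing saved_model.pb)
# to a Cloud Storage bucket (placeholder path)
gsutil cp -r exported_model/ gs://your-bucket/models/linear/1/

# Create a model resource and deploy a version backed by the SavedModel
gcloud ai-platform models create linear --regions=us-central1
gcloud ai-platform versions create v1 \
  --model=linear \
  --origin=gs://your-bucket/models/linear/1/ \
  --framework=tensorflow \
  --runtime-version=2.11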

Summary

TensorFlow Serving provides a robust and flexible solution for serving TensorFlow models in a production environment. It supports both REST and gRPC interfaces, enables version management, and integrates smoothly into scalable deployment architectures. Starting with model export, proceeding through Docker-based serving, and culminating in API-based inference, TensorFlow Serving streamlines the transition from model development to real-world deployment. Its compatibility with both simple estimators and complex models makes it a versatile tool in the machine learning deployment toolkit.

