Learnitweb

Tensor in Machine Learning

In machine learning, especially in deep learning, tensors are the fundamental data structures used to store and manipulate data. Whether you are training a neural network, passing inputs through layers, or computing gradients, you are working with tensors.

1. What is a Tensor?

A tensor is a multi-dimensional array: a container that holds numbers along any number of dimensions (axes). It is a generalization of scalars, vectors, and matrices.

Mathematically, a tensor is an object whose components, written as an array, obey specific transformation rules when the coordinate system changes.

In programming and machine learning, we simplify this to:

A tensor is a structured collection of numbers arranged across one or more dimensions.

2. Scalars, Vectors, Matrices, and Tensors

To understand tensors, it helps to first understand lower-dimensional data structures:

Object | Description                       | Tensor Rank | Example
Scalar | Single number                     | 0           | 5 or π
Vector | 1D array of numbers               | 1           | [1, 2, 3]
Matrix | 2D array of numbers               | 2           | [[1, 2], [3, 4]]
Tensor | 3D or higher-dimensional array    | 3 or more   | [[[1, 2], [3, 4]]]

So:

  • A scalar is a rank-0 tensor
  • A vector is a rank-1 tensor
  • A matrix is a rank-2 tensor
  • Anything with 3 or more dimensions is considered a higher-rank tensor

3. Tensor Rank and Dimensions

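The rank (also called order or degree) of a tensor is the number of dimensions (axes). This is not the same as the rank of a matrix in linear algebra. In NumPy, the number of axes is available as the ndim attribute.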

For example:

import numpy as np

a = np.array(5)                 # Scalar: rank 0
b = np.array([1, 2, 3])         # Vector: rank 1
c = np.array([[1, 2], [3, 4]])  # Matrix: rank 2
d = np.array([[[1], [2]], [[3], [4]]])  # Tensor: rank 3

Their shapes are:

  • a.shape is ()
  • b.shape is (3,)
  • c.shape is (2, 2)
  • d.shape is (2, 2, 1)

The number of elements along each axis defines the shape of the tensor.
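
As a quick check of how rank and shape relate, the sketch below builds a small rank-3 tensor and inspects it. The variable name t and the sizes 2, 3, and 4 are arbitrary choices for illustration.

```python
import numpy as np

# A rank-3 tensor: 2 blocks, each with 3 rows and 4 columns.
t = np.arange(24).reshape(2, 3, 4)

print(t.ndim)   # 3, the number of axes (the rank)
print(t.shape)  # (2, 3, 4), the number of elements along each axis
print(t.size)   # 24, the total number of elements (2 * 3 * 4)
```

The total number of elements is always the product of the entries in the shape.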

4. Why Tensors Matter in Machine Learning

Tensors are essential in machine learning because:

  • All inputs, outputs, weights, and activations in ML/DL models are stored as tensors
  • They can represent structured data of any shape: sequences, images, audio, etc.
  • ML frameworks are optimized for tensor computations on GPUs/TPUs
  • Gradient computation (backpropagation) applies the chain rule across tensor operations
  • The efficiency of tensor operations affects the training time and scalability of models
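
To make the first few points concrete, here is a minimal sketch of one fully connected layer written in NumPy. The layer sizes (10 inputs, 4 units), the random values, and the ReLU activation are assumptions chosen for illustration, not part of any particular model.

```python
import numpy as np

rng = np.random.default_rng(0)

inputs = rng.normal(size=(32, 10))   # batch of 32 samples, 10 features each
weights = rng.normal(size=(10, 4))   # layer weights: 10 inputs -> 4 units
bias = np.zeros(4)                   # one bias value per unit

# A single matrix multiplication processes the whole batch at once.
pre_activations = inputs @ weights + bias
activations = np.maximum(pre_activations, 0.0)  # ReLU: clamp negatives to 0

print(activations.shape)  # (32, 4)
```

Inputs, weights, bias, and activations are all tensors here, and the whole layer reduces to a handful of array operations. That is the kind of workload frameworks accelerate on GPUs and TPUs.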

5. Real-World Example: Image Classification using Deep Learning

Suppose you’re building a machine learning model to classify animal images (e.g., cats, dogs, horses). You’ve collected thousands of images, and you want to train a neural network using them.

Step 1: A Single Image as a Tensor

Let’s say each image is:

  • Width: 64 pixels
  • Height: 64 pixels
  • Color channels: 3 (Red, Green, Blue – RGB)

This means every image is stored as a 3-dimensional tensor of shape:

(64, 64, 3)
  • Axis 0: 64 rows (height)
  • Axis 1: 64 columns (width)
  • Axis 2: 3 values for RGB channels

Sample Pixel Tensor for One Image:

image = [
  [ [255, 0, 0], [254, 1, 0], ..., [0, 0, 255] ],  # row 1 (64 pixels)
  ...
  [ [34, 67, 89], [10, 20, 30], ..., [200, 200, 200] ]  # row 64
]

Each innermost list like [255, 0, 0] represents the color values for a pixel.
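
A rough NumPy version of such an image might look like this. The blank image and the two pixel colors are made-up placeholders, not real image data.

```python
import numpy as np

# A blank 64x64 RGB image; uint8 holds channel values from 0 to 255.
image = np.zeros((64, 64, 3), dtype=np.uint8)

image[0, 0] = [255, 0, 0]    # top-left pixel: pure red
image[0, 63] = [0, 0, 255]   # top-right pixel: pure blue

print(image.shape)            # (64, 64, 3)
print(image[0, 0].tolist())   # [255, 0, 0]
print(int(image[0, 63, 2]))   # 255, the blue channel of the top-right pixel
```

Indexing with two values (row, column) returns a pixel. Adding a third index picks a single channel of that pixel.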

Step 2: A Batch of Images

You rarely feed just one image to a neural network. Typically, you use a batch of images.

Suppose you use a batch size of 32. Now, the input becomes a 4-dimensional tensor:

(32, 64, 64, 3)
  • 32 images
  • Each with height 64
  • Each with width 64
  • Each pixel with 3 RGB channels

This is the input tensor your model will receive in one forward pass.
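
One way to assemble such a batch in NumPy is to stack individual image tensors along a new leading axis. The sketch below uses randomly generated stand-in images, since no real dataset is assumed here.

```python
import numpy as np

rng = np.random.default_rng(0)

# 32 stand-in images with random pixel values (placeholders for real data).
images = [
    rng.integers(0, 256, size=(64, 64, 3), dtype=np.uint8)
    for _ in range(32)
]

# np.stack adds a new leading axis: the batch dimension.
batch = np.stack(images, axis=0)

print(batch.shape)     # (32, 64, 64, 3)
print(batch[0].shape)  # (64, 64, 3), the first image in the batch
```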

Step 3: Corresponding Labels

Let’s say your dataset has 3 classes: Cat, Dog, and Horse.

You might represent the labels as one-hot encoded vectors:

  • Cat: [1, 0, 0]
  • Dog: [0, 1, 0]
  • Horse: [0, 0, 1]

For a batch of 32 images, your label tensor will be:

(32, 3)
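
A common way to build one-hot labels in NumPy is to index the rows of an identity matrix with integer class IDs. The mapping 0 = Cat, 1 = Dog, 2 = Horse and the random IDs below are assumptions for illustration.

```python
import numpy as np

class_names = ["Cat", "Dog", "Horse"]

# Hypothetical integer labels for a batch of 32 images
# (0 = Cat, 1 = Dog, 2 = Horse).
rng = np.random.default_rng(0)
class_ids = rng.integers(0, 3, size=32)

# Row i of the identity matrix is the one-hot vector for class i.
labels = np.eye(len(class_names), dtype=np.float32)[class_ids]

print(labels.shape)  # (32, 3)
```

Each row of labels contains a single 1 in the column of its class, and np.argmax along axis 1 recovers the original class IDs.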

Step 4: Tensor Summary in the ML Model

Data Component      | Description             | Tensor Shape    | Rank
Single image        | RGB image               | (64, 64, 3)     | 3
Batch of images     | 32 RGB images           | (32, 64, 64, 3) | 4
Label for one image | One-hot vector          | (3,)            | 1
Labels for batch    | One-hot for 32 images   | (32, 3)         | 2
Output of model     | Predicted probabilities | (32, 3)         | 2
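
To see where an output of shape (32, 3) could come from, the sketch below pushes a stand-in batch through a single untrained linear layer followed by softmax. This is a deliberately simplified assumption: a real image classifier would typically use convolutional layers and trained weights.

```python
import numpy as np

rng = np.random.default_rng(0)

batch = rng.random((32, 64, 64, 3))   # stand-in input batch, values in [0, 1)
flat = batch.reshape(32, -1)          # (32, 12288): one row per image

# Untrained weights, one column per class (Cat, Dog, Horse).
weights = rng.normal(scale=0.01, size=(flat.shape[1], 3))

logits = flat @ weights               # (32, 3): one raw score per class

# Softmax: subtract the row maximum for numerical stability, then normalize.
shifted = logits - logits.max(axis=1, keepdims=True)
exp_scores = np.exp(shifted)
probs = exp_scores / exp_scores.sum(axis=1, keepdims=True)

print(probs.shape)  # (32, 3)
```

Each row of probs sums to 1, so it can be read as a probability distribution over the three classes for one image.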

Visual Representation (Shape Only)

Input Tensor:
┌────────────────────────────┐
│ 32 Images                  │
│ ┌────────────────────────┐ │
│ │64 x 64 x 3 (each image)│ │
│ └────────────────────────┘ │
└────────────────────────────┘
Shape: (32, 64, 64, 3)