In this tutorial, you’ll learn how to:
- Create a Hugging Face account and access token
- Configure your API token securely using environment variables
- Install and use the `sentence-transformers` library
- Use the `langchain_huggingface` integration to create embeddings
- Generate embeddings for a single query and for multiple documents
This is one of the key building blocks in a RAG (Retrieval-Augmented Generation) or search system: converting text into vectors (embeddings) that capture semantic meaning.
1. What Are Embeddings (Quick Recap)?
Embeddings are numeric representations of text (usually large vectors of floats). Texts that are semantically similar (e.g., “How to make pasta?” and “Steps to cook spaghetti”) will have embeddings that are close to each other in vector space.
We use embeddings for:
- Semantic search
- Document retrieval (RAG)
- Clustering and similarity analysis
- Recommendation systems
Hugging Face provides many open-source models for generating embeddings, and LangChain makes it easy to use them in your applications.
2. Create a Hugging Face Account and Access Token
- Open huggingface.co in your browser.
- Sign up or log in to your account.
- Click on your profile avatar → Settings.
- In the left menu, click Access Tokens.
- Click New token:
  - Name: e.g. `langchain-tutorial`
  - Type / Role: select Read (you only need read access to use models).
- Click Generate token, and copy the token somewhere safe.
⚠️ Treat this token like a password. Never hard-code it in your code or push it to GitHub.
3. Store the Token in .env (Environment Variable)
To keep your token secret, store it in a `.env` file and load it at runtime.
- In your project folder, create a file named `.env`.
- Add a line like this (replace with your real token):

```
HF_TOKEN=your_hugging_face_token_here
```

- Make sure `.env` is ignored by Git. Add this to your `.gitignore`:

```
.env
```
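If you prefer the command line, both steps above can be sketched like this (the token value is a placeholder you must replace):

```shell
# Create .env with a placeholder token (replace with your real one)
echo "HF_TOKEN=your_hugging_face_token_here" > .env

# Append .env to .gitignore so Git never tracks the token
echo ".env" >> .gitignore
```

Note that `>>` appends, so any existing `.gitignore` entries are preserved.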
4. Install Required Python Libraries
Create a requirements.txt file (or update your existing one) with the following entries:
```
python-dotenv
sentence-transformers
torch
langchain-huggingface
tqdm
```
Then install everything:
```
pip install -r requirements.txt
```
Notes:
- `sentence-transformers` contains the Sentence-BERT based models we’ll use for embeddings.
- `torch` is the backend deep learning library required by `sentence-transformers`.
- `langchain-huggingface` provides the `HuggingFaceEmbeddings` class for easy integration.
- `python-dotenv` loads environment variables from `.env`.
- `tqdm` (optional) shows progress bars when downloading models.
5. Load the Hugging Face Token in Your Code
Create a Python file, for example hf_embeddings_demo.py, and start with:
```python
import os
from dotenv import load_dotenv

# 1. Load variables from .env file
load_dotenv()

# 2. Read the token from the environment
hf_token = os.getenv("HF_TOKEN")
if hf_token is None:
    raise ValueError("HF_TOKEN is not set in .env")

# 3. (Optional but common) Expose it as the token expected by HF libraries
os.environ["HUGGINGFACEHUB_API_TOKEN"] = hf_token
```
Explanation:
- `load_dotenv()` loads the values from `.env` into the environment.
- `os.getenv("HF_TOKEN")` reads the token.
- Many Hugging Face integrations look for `HUGGINGFACEHUB_API_TOKEN`, so we set it for convenience.
6. Import and Configure HuggingFaceEmbeddings
Now we’ll use the `langchain_huggingface` integration to create an embeddings object.

```python
from langchain_huggingface import HuggingFaceEmbeddings
```
Choose an Embedding Model
In this tutorial, we’ll use the popular MiniLM model: `sentence-transformers/all-MiniLM-L6-v2`.
This is a fast, lightweight model that produces 384-dimensional embeddings and works well for many applications.
Add this to your code:
```python
# Model name from Hugging Face
model_name = "sentence-transformers/all-MiniLM-L6-v2"

# Create the embeddings object
embeddings = HuggingFaceEmbeddings(model_name=model_name)
```
On the first run, the model will be downloaded, which might take a little time. After that, it will be cached locally.
7. Generate an Embedding for a Single Query
Let’s embed a simple sentence.
```python
text = "This is a test document."

# Convert the text into an embedding (a list of floats)
query_result = embeddings.embed_query(text)

print("First 5 values of the embedding:", query_result[:5])
print("Embedding dimension:", len(query_result))
```
Sample output (the exact numbers will differ):
```
First 5 values of the embedding: [0.0123, -0.0456, 0.0789, ...]
Embedding dimension: 384
```
So the sentence "This is a test document." is represented as a 384-dimensional vector.
8. Generate Embeddings for Multiple Documents
Often you’ll have a list of texts (sentences, paragraphs, documents) and want embeddings for each.
```python
docs = [
    "This is a test document.",
    "This is not a test document.",
]

doc_results = embeddings.embed_documents(docs)

print(f"Number of documents: {len(doc_results)}")
print("Dimension of first document embedding:", len(doc_results[0]))
print("\nFirst document embedding (first 5 values):", doc_results[0][:5])
print("Second document embedding (first 5 values):", doc_results[1][:5])
```
Here:
- `doc_results` is a list of embeddings.
- Each element in `doc_results` is a vector (list of floats) of length 384.
- Two semantically similar documents will have embeddings that are closer together (you can measure this with cosine similarity later).
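To make “closer together” concrete, here is a minimal cosine-similarity sketch using only the standard library and toy 3-dimensional vectors (real embeddings from this model have 384 dimensions, but the formula is identical):

```python
import math

def cosine_similarity(a, b):
    # Dot product divided by the product of the vector norms
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy "embeddings" standing in for real 384-dimensional vectors
pasta = [0.9, 0.1, 0.3]
spaghetti = [0.8, 0.2, 0.35]
weather = [-0.2, 0.9, -0.4]

print(cosine_similarity(pasta, spaghetti))  # close to 1.0: similar meaning
print(cosine_similarity(pasta, weather))    # much lower: unrelated meaning
```

In a real pipeline you would pass `query_result` and each element of `doc_results` to a function like this to rank documents by how well they match the query.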
9. Full Example Code
Here is a compact script that puts everything together:
```python
import os

from dotenv import load_dotenv
from langchain_huggingface import HuggingFaceEmbeddings

# 1. Load environment variables
load_dotenv()
hf_token = os.getenv("HF_TOKEN")
if hf_token is None:
    raise ValueError("HF_TOKEN is not set in your .env file")

# Make token visible to Hugging Face integrations
os.environ["HUGGINGFACEHUB_API_TOKEN"] = hf_token

# 2. Create embeddings object
model_name = "sentence-transformers/all-MiniLM-L6-v2"
embeddings = HuggingFaceEmbeddings(model_name=model_name)

# 3. Single text embedding
text = "This is a test document."
query_embedding = embeddings.embed_query(text)
print("Single query embedding dimension:", len(query_embedding))

# 4. Multiple documents embedding
docs = [
    "This is a test document.",
    "This is not a test document.",
]
doc_embeddings = embeddings.embed_documents(docs)
print("Number of documents:", len(doc_embeddings))
print("Each document embedding dimension:", len(doc_embeddings[0]))

# Show a small preview
print("\nFirst 5 values of first embedding:", doc_embeddings[0][:5])
print("First 5 values of second embedding:", doc_embeddings[1][:5])
```
