In this tutorial, you’ll learn how to:
- Create a Hugging Face account and access token
- Configure your API token securely using environment variables
- Install and use the `sentence-transformers` library
- Use the `langchain_huggingface` integration to create embeddings
- Generate embeddings for a single query and for multiple documents
This is one of the key building blocks in a RAG (Retrieval-Augmented Generation) or search system: converting text into vectors (embeddings) that capture semantic meaning.
1. What Are Embeddings (Quick Recap)?
Embeddings are numeric representations of text (usually large vectors of floats). Texts that are semantically similar (e.g., “How to make pasta?” and “Steps to cook spaghetti”) will have embeddings that are close to each other in vector space.
We use embeddings for:
- Semantic search
- Document retrieval (RAG)
- Clustering and similarity analysis
- Recommendation systems
Hugging Face provides many open-source models for generating embeddings, and LangChain makes it easy to use them in your applications.
2. Create a Hugging Face Account and Access Token
- Open huggingface.co in your browser.
- Sign up or log in to your account.
- Click on your profile avatar → Settings.
- In the left menu, click Access Tokens.
- Click New token:
  - Name: e.g. `langchain-tutorial`
  - Type / Role: select Read (you only need read access to use models).
- Click Generate token, and copy the token somewhere safe.
⚠️ Treat this token like a password. Never hard-code it in your code or push it to GitHub.
3. Store the Token in .env (Environment Variable)
To keep your token secret, store it in a `.env` file and load it at runtime.
- In your project folder, create a file named `.env`.
- Add a line like this (replace with your real token):

```
HF_TOKEN=your_hugging_face_token_here
```

- Make sure `.env` is ignored by Git. Add this to your `.gitignore`:

```
.env
```
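If you prefer the command line, both steps above can be sketched like this (the token value is a placeholder you must replace):

```shell
# Create .env with a placeholder token (replace with your real one)
echo "HF_TOKEN=your_hugging_face_token_here" > .env

# Append .env to .gitignore so Git never tracks the token
echo ".env" >> .gitignore
```

Note that `>>` appends, so any existing `.gitignore` entries are preserved.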
4. Install Required Python Libraries
Create a requirements.txt file (or update your existing one) with the following entries:
```
python-dotenv
sentence-transformers
torch
langchain-huggingface
tqdm
```
Then install everything:
```
pip install -r requirements.txt
```
Notes:
- `sentence-transformers` contains the Sentence-BERT based models we’ll use for embeddings.
- `torch` is the backend deep learning library required by `sentence-transformers`.
- `langchain-huggingface` provides the `HuggingFaceEmbeddings` class for easy integration.
- `python-dotenv` loads environment variables from `.env`.
- `tqdm` (optional) shows progress bars when downloading models.
5. Load the Hugging Face Token in Your Code
Create a Python file, for example hf_embeddings_demo.py, and start with:
```python
import os
from dotenv import load_dotenv

# 1. Load variables from .env file
load_dotenv()

# 2. Read the token from the environment
hf_token = os.getenv("HF_TOKEN")
if hf_token is None:
    raise ValueError("HF_TOKEN is not set in .env")

# 3. (Optional but common) Expose it as the token expected by HF libraries
os.environ["HUGGINGFACEHUB_API_TOKEN"] = hf_token
```
Explanation:
- `load_dotenv()` loads the values from `.env` into the environment.
- `os.getenv("HF_TOKEN")` reads the token.
- Many Hugging Face integrations look for `HUGGINGFACEHUB_API_TOKEN`, so we set it for convenience.
6. Import and Configure HuggingFaceEmbeddings
Now we’ll use the `langchain_huggingface` integration to create an embeddings object.

```python
from langchain_huggingface import HuggingFaceEmbeddings
```
Choose an Embedding Model
In this tutorial, we’ll use the popular MiniLM model: `sentence-transformers/all-MiniLM-L6-v2`.
This is a fast, lightweight model that produces 384-dimensional embeddings and works well for many applications.
Add this to your code:
```python
# Model name from Hugging Face
model_name = "sentence-transformers/all-MiniLM-L6-v2"

# Create the embeddings object
embeddings = HuggingFaceEmbeddings(model_name=model_name)
```
On the first run, the model will be downloaded, which might take a little time. After that, it will be cached locally.
7. Generate an Embedding for a Single Query
Let’s embed a simple sentence.
```python
text = "This is a test document."

# Convert the text into an embedding (a list of floats)
query_result = embeddings.embed_query(text)

print("First 5 values of the embedding:", query_result[:5])
print("Embedding dimension:", len(query_result))
```
Sample output (the exact numbers will differ):
```
First 5 values of the embedding: [0.0123, -0.0456, 0.0789, ...]
Embedding dimension: 384
```
So the sentence "This is a test document." is represented as a 384-dimensional vector.
8. Generate Embeddings for Multiple Documents
Often you’ll have a list of texts (sentences, paragraphs, documents) and want embeddings for each.
```python
docs = [
    "This is a test document.",
    "This is not a test document.",
]

doc_results = embeddings.embed_documents(docs)

print(f"Number of documents: {len(doc_results)}")
print("Dimension of first document embedding:", len(doc_results[0]))
print("\nFirst document embedding (first 5 values):", doc_results[0][:5])
print("Second document embedding (first 5 values):", doc_results[1][:5])
```
Here:
- `doc_results` is a list of embeddings.
- Each element in `doc_results` is a vector (list of floats) of length 384.
- Two semantically similar documents will have embeddings that are closer together (you can measure this with cosine similarity later).
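To make “closer together” concrete, here is a minimal cosine-similarity sketch using only the standard library and toy 3-dimensional vectors (real embeddings from this model have 384 dimensions, but the formula is identical):

```python
import math

def cosine_similarity(a, b):
    # Dot product divided by the product of the vector norms
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy "embeddings" standing in for real 384-dimensional vectors
pasta = [0.9, 0.1, 0.3]
spaghetti = [0.8, 0.2, 0.35]
weather = [-0.2, 0.9, -0.4]

print(cosine_similarity(pasta, spaghetti))  # close to 1.0: similar meaning
print(cosine_similarity(pasta, weather))    # much lower: unrelated meaning
```

In a real pipeline you would pass `query_result` and each element of `doc_results` to a function like this to rank documents by how well they match the query.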
9. Full Example Code
Here is a compact script that puts everything together:
```python
import os

from dotenv import load_dotenv
from langchain_huggingface import HuggingFaceEmbeddings

# 1. Load environment variables
load_dotenv()
hf_token = os.getenv("HF_TOKEN")
if hf_token is None:
    raise ValueError("HF_TOKEN is not set in your .env file")

# Make token visible to Hugging Face integrations
os.environ["HUGGINGFACEHUB_API_TOKEN"] = hf_token

# 2. Create embeddings object
model_name = "sentence-transformers/all-MiniLM-L6-v2"
embeddings = HuggingFaceEmbeddings(model_name=model_name)

# 3. Single text embedding
text = "This is a test document."
query_embedding = embeddings.embed_query(text)
print("Single query embedding dimension:", len(query_embedding))

# 4. Multiple documents embedding
docs = [
    "This is a test document.",
    "This is not a test document.",
]
doc_embeddings = embeddings.embed_documents(docs)
print("Number of documents:", len(doc_embeddings))
print("Each document embedding dimension:", len(doc_embeddings[0]))

# Show a small preview
print("\nFirst 5 values of first embedding:", doc_embeddings[0][:5])
print("First 5 values of second embedding:", doc_embeddings[1][:5])
```
