Learnitweb

Building a Streamlit SQL Chatbot Using Gemini + LangChain + Oracle DB

1. Introduction

This tutorial walks you through a Python-based Streamlit application that uses Google Gemini (via LangChain) to:

  • Accept natural language questions from the user
  • Generate SQL queries automatically
  • Clean and format the queries
  • Run them on an Oracle Database
  • Display the results in a table

Following is the code for the SQL Chatbot.

# ✅ Add this import at the top
import streamlit as st

# Existing import
import google.generativeai as genai
from langchain.chains import create_sql_query_chain
from langchain_google_genai import ChatGoogleGenerativeAI
from langchain_experimental.sql import SQLDatabaseChain
from langchain.sql_database import SQLDatabase
from sqlalchemy import create_engine, text

def remove_sql_code_block(text):
    # Remove ```sql and ``` from the response if present
    return text.replace("```sql", "").replace("```", "").strip()

# ✅ Accept user input from Streamlit
st.title("SQL Chatbot powered by Gemini")
user_question = st.text_input("Enter your question about the database:")

# Load Gemini model
llm = ChatGoogleGenerativeAI(
    model="gemini-2.0-flash",
    temperature=0,
    google_api_key="<Place you API key here>"
)

# Setup Oracle DB
db_uri = "oracle+oracledb://myuser:admin@localhost:1521/?service_name=XEPDB1"
engine = create_engine(db_uri)
db = SQLDatabase(engine)

# ✅ Run only if question is provided
if user_question:
    try:
        # Create SQL query generation chain
        chain = create_sql_query_chain(llm, db)
        
        # ✅ Replace hardcoded question with user input
        response = chain.invoke({"question": user_question})

        # Display the generated SQL
        st.subheader("Generated SQL Query")
        st.code(response, language="sql")

        prompt = f"""
        You are given an LLM output.
        Following are the instructions to clean the SQL query:
        Database: Oracle
        Format of returned SQL: Plain text
        The SQL query should be executable in tool: SQL developer or Oracle SQL Plus.  
        
        Any specific formatting for SQL: No
        Check for double quotes: Yes 
        Column names should be in uppercase: Yes
        Include double quotes in column names: No
        Remove semicolon at the end of the query: Yes
        Remove leading ``` and trailing : yes
        Here is the text:
        {response}
        """
        print("Prompt for cleaning SQL:", prompt)
        cleaned_sql = remove_sql_code_block(llm.invoke(prompt).content)
        st.subheader("Generated SQL Query")
        st.code(cleaned_sql, language="sql")
        print("Cleaned SQL:", cleaned_sql)
        # ✅ Execute the SQL query

        with engine.connect() as conn:
            result = conn.execute(text(cleaned_sql))
            rows = result.fetchall()
            st.subheader("Query Result")
            st.dataframe(rows)

    except Exception as e:
        st.error(f"Error: {e}")


2. Prerequisites

Ensure you have the following Python packages installed:

pip install streamlit google-generativeai langchain langchain-google-genai sqlalchemy oracledb

Also, make sure:

3. Import Required Libraries

import streamlit as st
import google.generativeai as genai
from langchain.chains import create_sql_query_chain
from langchain_google_genai import ChatGoogleGenerativeAI
from langchain_experimental.sql import SQLDatabaseChain
from langchain.sql_database import SQLDatabase
from sqlalchemy import create_engine, text

Explanation:

  • streamlit: Builds the interactive web app UI
  • google.generativeai: For configuring Gemini
  • langchain_google_genai: LangChain’s wrapper for Gemini
  • create_sql_query_chain: LangChain utility to generate SQL from natural language
  • SQLDatabase: LangChain wrapper for DB connection
  • create_engine, text: SQLAlchemy for connecting to Oracle and executing raw SQL

4. Define a Helper Function to Clean LLM Output

def remove_sql_code_block(text):
    return text.replace("```sql", "").replace("```", "").strip()

Why it’s needed:
LLMs often return SQL queries wrapped in Markdown code blocks (e.g., sql ... ), which causes execution errors in SQL engines. This function strips those characters.

5. Streamlit UI — Accept User Question

st.title("SQL Chatbot powered by Gemini")
user_question = st.text_input("Enter your question about the database:")

Displays a title and a text box where the user can input questions.

6. Load Gemini Model with LangChain Wrapper

llm = ChatGoogleGenerativeAI(
    model="gemini-2.0-flash",  # ⚠️ use "gemini-1.5-flash" for actual support
    temperature=0,
    google_api_key="<Place you API key here>"
)

Details:

  • temperature=0: Makes the model deterministic.
  • model: Choose Gemini variant. As of now, valid options are "gemini-1.5-pro" or "gemini-1.5-flash".

7. Connect to Oracle Database

db_uri = "oracle+oracledb://myuser:admin@localhost:1521/?service_name=XEPDB1"
engine = create_engine(db_uri)
db = SQLDatabase(engine)

Explanation:

  • Uses SQLAlchemy’s create_engine to build a DB connection.
  • SQLDatabase is a LangChain wrapper for abstracting SQL execution.
  • Replace myuser, admin, and XEPDB1 with your actual credentials and DB service.

8. Generate and Execute SQL from User Question

if user_question:
    try:
        chain = create_sql_query_chain(llm, db)
        response = chain.invoke({"question": user_question})
  • create_sql_query_chain(...): Creates a LangChain pipeline that maps questions to SQL.
  • chain.invoke(...): Passes the user’s question and gets back an auto-generated SQL query (as a string).

9. Show the Generated SQL

        st.subheader("Generated SQL Query")
        st.code(response, language="sql")

Displays the raw SQL returned by Gemini.

10. Clean the SQL Query Using Gemini Again (Optional Step)

        prompt = f"""..."""
        cleaned_sql = remove_sql_code_block(llm.invoke(prompt).content)

Why use Gemini again?
To clean or reformat the raw SQL and ensure:

  • No semicolon at the end
  • No Markdown backticks
  • Column names in uppercase
  • Ready to run in SQL Developer

Then remove_sql_code_block() strips Markdown formatting.

11. Display Cleaned SQL and Execute It

        st.subheader("Generated SQL Query")
        st.code(cleaned_sql, language="sql")

        with engine.connect() as conn:
            result = conn.execute(text(cleaned_sql))
            rows = result.fetchall()
            st.subheader("Query Result")
            st.dataframe(rows)
  • Shows the final cleaned SQL
  • Runs it on the Oracle DB using SQLAlchemy
  • Fetches results and displays them in a Streamlit table

12. Catch Errors Gracefully

    except Exception as e:
        st.error(f"Error: {e}")

If anything fails (invalid SQL, DB error, LLM failure), the app shows a red error message instead of crashing.