In this guide, we go through several examples of how to use gr.ChatInterface with popular LLM libraries and API providers.
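We will cover the following libraries and API providers: Llama Index, LangChain, OpenAI, Hugging Face transformers, SambaNova, Hyperbolic, and Anthropic's Claude.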
For many LLM libraries and providers, there exist community-maintained integration libraries that make it even easier to spin up Gradio apps. We reference these libraries in the appropriate sections below.
Let's start by using llama-index on top of openai to build a RAG chatbot on any text or PDF files that you can demo and share in less than 30 lines of code. You'll need an OpenAI key for this example (keep reading for the free, open-source equivalent!).
# This is a simple RAG chatbot built on top of Llama Index and Gradio. It allows you to upload any text or PDF files and ask questions about them!
# Before running this, make sure you have exported your OpenAI API key as an environment variable:
# export OPENAI_API_KEY="your-openai-api-key"

from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
import gradio as gr

def answer(message, history):
    files = []
    # Files uploaded in earlier turns appear in the history as tuples
    for msg in history:
        if msg['role'] == "user" and isinstance(msg['content'], tuple):
            files.append(msg['content'][0])
    # Files attached to the current message
    for file in message["files"]:
        files.append(file)

    documents = SimpleDirectoryReader(input_files=files).load_data()
    index = VectorStoreIndex.from_documents(documents)
    query_engine = index.as_query_engine()
    return str(query_engine.query(message["text"]))

demo = gr.ChatInterface(
    answer,
    type="messages",
    title="Llama Index RAG Chatbot",
    description="Upload any text or pdf files and ask questions about them!",
    textbox=gr.MultimodalTextbox(file_types=[".pdf", ".txt"]),
    multimodal=True
)

demo.launch()
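The file-gathering step in `answer` can be looked at on its own. The sketch below assumes, as the example above does, that earlier uploads appear in `history` as user messages whose `content` is a tuple with the file path first, and that the current message is a dict with `text` and `files` keys. `collect_files` is a hypothetical helper name, not part of Gradio, and the sample history is made up.

```python
def collect_files(message, history):
    """Return every file path seen so far: earlier uploads first, then new ones."""
    files = []
    for msg in history:
        # Assumption: file uploads are stored as user messages with tuple content.
        if msg["role"] == "user" and isinstance(msg["content"], tuple):
            files.append(msg["content"][0])
    files.extend(message["files"])
    return files


# Made-up conversation: one earlier upload, one plain-text exchange.
history = [
    {"role": "user", "content": ("report.pdf",)},
    {"role": "user", "content": "What is in the report?"},
    {"role": "assistant", "content": "A summary of Q3 results."},
]
message = {"text": "Compare it with the notes.", "files": ["notes.txt"]}

print(collect_files(message, history))  # ['report.pdf', 'notes.txt']
```

Note that `answer` rebuilds the index from all of these files on every turn, which keeps the example short but repeats work in long conversations.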
Here's an example using langchain on top of openai to build a general-purpose chatbot. As before, you'll need an OpenAI key for this example.
# This is a simple general-purpose chatbot built on top of LangChain and Gradio.
# Before running this, make sure you have exported your OpenAI API key as an environment variable:
# export OPENAI_API_KEY="your-openai-api-key"

from langchain_openai import ChatOpenAI
# langchain_core is installed as a dependency of langchain_openai
from langchain_core.messages import AIMessage, HumanMessage
import gradio as gr

model = ChatOpenAI(model="gpt-4o-mini")

def predict(message, history):
    # Convert Gradio's message dicts into LangChain message objects
    history_langchain_format = []
    for msg in history:
        if msg['role'] == "user":
            history_langchain_format.append(HumanMessage(content=msg['content']))
        elif msg['role'] == "assistant":
            history_langchain_format.append(AIMessage(content=msg['content']))
    history_langchain_format.append(HumanMessage(content=message))
    gpt_response = model.invoke(history_langchain_format)
    return gpt_response.content

demo = gr.ChatInterface(
    predict,
    type="messages"
)

demo.launch()
Tip: For quick prototyping, the community-maintained langchain-gradio repo makes it even easier to build chatbots on top of LangChain.
Of course, we could also use the openai library directly. Here's a similar example to the LangChain one, but this time with streaming as well:
Tip: For quick prototyping, the openai-gradio library makes it even easier to build chatbots on top of OpenAI models.
transformers
Of course, in many cases you want to run a chatbot locally. Here's the equivalent example using the SmolLM2-135M-Instruct model with the Hugging Face transformers library.
from transformers import AutoModelForCausalLM, AutoTokenizer
import gradio as gr

checkpoint = "HuggingFaceTB/SmolLM2-135M-Instruct"
device = "cpu"  # "cuda" or "cpu"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint).to(device)

def predict(message, history):
    history.append({"role": "user", "content": message})
    # add_generation_prompt=True ends the prompt with the assistant header,
    # so the model starts writing its reply directly
    input_text = tokenizer.apply_chat_template(history, tokenize=False, add_generation_prompt=True)
    inputs = tokenizer.encode(input_text, return_tensors="pt").to(device)
    outputs = model.generate(inputs, max_new_tokens=100, temperature=0.2, top_p=0.9, do_sample=True)
    decoded = tokenizer.decode(outputs[0])
    # Keep only the text of the last assistant turn
    response = decoded.split("<|im_start|>assistant\n")[-1].split("<|im_end|>")[0]
    return response

demo = gr.ChatInterface(predict, type="messages")

demo.launch()
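The last step of `predict` recovers the reply from the decoded text, which contains the whole conversation wrapped in `<|im_start|>` and `<|im_end|>` markers. A minimal sketch of that step in isolation, where `extract_reply` is a hypothetical helper name and the sample string is made up to mimic those markers:

```python
def extract_reply(decoded):
    """Return the text of the last assistant turn in a marker-delimited string."""
    last_turn = decoded.split("<|im_start|>assistant\n")[-1]
    # If generation stopped before the end marker, the split leaves the text as is.
    return last_turn.split("<|im_end|>")[0]


decoded = (
    "<|im_start|>user\nWhat is Gradio?<|im_end|>\n"
    "<|im_start|>assistant\nA Python library for building demos.<|im_end|>"
)

print(extract_reply(decoded))  # A Python library for building demos.
```

This string-splitting approach depends on the model's chat template using exactly these markers, so it would need adjusting for checkpoints with a different template.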
The SambaNova Cloud API provides access to full-precision open-source models, such as the Llama family. Here's an example of how to build a Gradio app around the SambaNova API:
# This is a simple general-purpose chatbot built on top of SambaNova API.
# Before running this, make sure you have exported your SambaNova API key as an environment variable:
# export SAMBANOVA_API_KEY="your-sambanova-api-key"

import os
import gradio as gr
from openai import OpenAI

api_key = os.getenv("SAMBANOVA_API_KEY")

client = OpenAI(
    base_url="https://api.sambanova.ai/v1/",
    api_key=api_key,
)

def predict(message, history):
    history.append({"role": "user", "content": message})
    stream = client.chat.completions.create(messages=history, model="Meta-Llama-3.1-70B-Instruct-8k", stream=True)
    chunks = []
    for chunk in stream:
        chunks.append(chunk.choices[0].delta.content or "")
        # Yield the full reply so far so the UI updates as text arrives
        yield "".join(chunks)

demo = gr.ChatInterface(predict, type="messages")

demo.launch()
Tip: For quick prototyping, the sambanova-gradio library makes it even easier to build chatbots on top of SambaNova models.
The Hyperbolic AI API provides access to many open-source models, such as the Llama family. Here's an example of how to build a Gradio app around the Hyperbolic API:
# This is a simple general-purpose chatbot built on top of Hyperbolic API.
# Before running this, make sure you have exported your Hyperbolic API key as an environment variable:
# export HYPERBOLIC_API_KEY="your-hyperbolic-api-key"

import os
import gradio as gr
from openai import OpenAI

api_key = os.getenv("HYPERBOLIC_API_KEY")

client = OpenAI(
    base_url="https://api.hyperbolic.xyz/v1/",
    api_key=api_key,
)

def predict(message, history):
    history.append({"role": "user", "content": message})
    # Note: "gpt-4o-mini" is an OpenAI model name and may not be served here;
    # replace it with a model ID listed by Hyperbolic if the request fails.
    stream = client.chat.completions.create(messages=history, model="gpt-4o-mini", stream=True)
    chunks = []
    for chunk in stream:
        chunks.append(chunk.choices[0].delta.content or "")
        # Yield the full reply so far so the UI updates as text arrives
        yield "".join(chunks)

demo = gr.ChatInterface(predict, type="messages")

demo.launch()
Tip: For quick prototyping, the hyperbolic-gradio library makes it even easier to build chatbots on top of Hyperbolic models.
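The SambaNova and Hyperbolic examples differ only in base URL, environment variable, and model name, because both are reached through the `openai` client. One possible way to capture that, sketched with the hypothetical names `PROVIDERS` and `client_kwargs` (the URLs and variable names are the ones used in the examples above):

```python
import os

# Base URLs and environment variable names taken from the two provider examples.
PROVIDERS = {
    "sambanova": {
        "base_url": "https://api.sambanova.ai/v1/",
        "env_var": "SAMBANOVA_API_KEY",
    },
    "hyperbolic": {
        "base_url": "https://api.hyperbolic.xyz/v1/",
        "env_var": "HYPERBOLIC_API_KEY",
    },
}


def client_kwargs(provider):
    """Return the keyword arguments to pass to OpenAI(...) for a provider."""
    config = PROVIDERS[provider]
    api_key = os.getenv(config["env_var"])
    if api_key is None:
        # Fail early with a clear message instead of at the first request.
        raise RuntimeError(f"Please export {config['env_var']} first")
    return {"base_url": config["base_url"], "api_key": api_key}
```

With this in place, `client = OpenAI(**client_kwargs("sambanova"))` would build the same client as the SambaNova example, and the `predict` function can stay unchanged apart from the model name.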
Anthropic's Claude models can also be used via API. Here's a simple 20 questions-style game (capped at 10 questions in this version) built on top of the Anthropic API:
# This is a simple 20 questions-style game built on top of the Anthropic API.
# Before running this, make sure you have exported your Anthropic API key as an environment variable:
# export ANTHROPIC_API_KEY="your-anthropic-api-key"

import anthropic
import gradio as gr

client = anthropic.Anthropic()

def predict(message, history):
    # Drop any extra keys from the history before sending it to the API
    keys_to_keep = ["role", "content"]
    history = [{k: d[k] for k in keys_to_keep if k in d} for d in history]
    history.append({"role": "user", "content": message})
    # 10 questions and 10 answers make 20 messages; after that, end the game
    if len(history) > 20:
        history.append({"role": "user", "content": "DONE"})
    output = client.messages.create(
        messages=history,
        model="claude-3-5-sonnet-20241022",
        max_tokens=1000,
        system="You are guessing an object that the user is thinking of. You can ask 10 yes/no questions. Keep asking questions until the user says DONE"
    )
    return {
        "role": "assistant",
        "content": output.content[0].text,
        "options": [{"value": "Yes"}, {"value": "No"}]
    }

placeholder = """
<center><h1>10 Questions</h1><br>Think of a person, place, or thing. I'll ask you 10 yes/no questions to try and guess it.
</center>
"""

demo = gr.ChatInterface(
    predict,
    examples=["Start!"],
    chatbot=gr.Chatbot(placeholder=placeholder),
    type="messages"
)

demo.launch()
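The first two lines of `predict` matter because this function returns messages with an `options` key, so later history entries carry more than `role` and `content`. A small sketch of that filtering step on made-up data, assuming the history entries are plain dicts; `to_api_messages` is a hypothetical helper name and the `metadata` key is included only as an illustration of an extra field:

```python
def to_api_messages(history, keys_to_keep=("role", "content")):
    """Return copies of the history entries with only the listed keys."""
    return [{k: d[k] for k in keys_to_keep if k in d} for d in history]


# Made-up history with the kind of extra keys a chat UI might attach.
history = [
    {"role": "user", "content": "Start!", "metadata": None},
    {
        "role": "assistant",
        "content": "Is it alive?",
        "options": [{"value": "Yes"}, {"value": "No"}],
    },
]

print(to_api_messages(history))
# [{'role': 'user', 'content': 'Start!'}, {'role': 'assistant', 'content': 'Is it alive?'}]
```

Because the comprehension builds new dicts, the original `history` list is left untouched.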