The Hugging Face Hub is a central platform that has over 90,000 models, 14,000 datasets and 14,000 demos, also known as Spaces. From Natural Language Processing to Computer Vision and Speech, the Hub supports multiple domains. Although Hugging Face is famous for its 🤗 transformers and diffusers libraries, the Hub also supports dozens of ML libraries, such as PyTorch, TensorFlow, spaCy, and many others.
Gradio has multiple features that make it extremely easy to leverage existing models and Spaces on the Hub. This guide walks through these features.
pipeline
First, let's build a simple interface that translates text from English to Spanish. Between the over a thousand models shared by the University of Helsinki, there is an existing model, opus-mt-en-es
, that does precisely this!
The 🤗 transformers library has a very easy-to-use abstraction, pipeline()
that handles most of the complex code to offer a simple API for common tasks. By specifying the task and an (optional) model, you can use an existing model with few lines:
import gradio as gr
from transformers import pipeline
pipe = pipeline("translation", model="Helsinki-NLP/opus-mt-en-es")
def predict(text):
return pipe(text)[0]["translation_text"]
iface = gr.Interface(
fn=predict,
inputs='text',
outputs='text',
examples=[["Hello! My name is Omar"]]
)
iface.launch()
The previous code produces the following interface, which you can try right here in your browser:
This demo requires installing four libraries: gradio, torch, transformers, and sentencepiece. Apart from that, this is a Gradio with the structure you're used to! The demo is a usual Gradio Interface
with a prediction function, a specified input, and a specified output. The prediction function executes the pipeline
function with the given input, retrieves the first (and only) translation result, and returns the translation_text
field, which you're interested in.
Hugging Face has a free service called the Inference API, which allows you to send HTTP requests to models in the Hub. For transformers or diffusers-based models, the API can be 2 to 10 times faster than running the inference yourself. The API is free (rate limited), and you can switch to dedicated Inference Endpoints when you want to use it in production.
Let's try the same demo as above but using the Inference API instead of loading the model yourself. Given a Hugging Face model supported in the Inference API, Gradio can automatically infer the expected input and output and make the underlying server calls, so you don't have to worry about defining the prediction function. Here is what the code would look like!
import gradio as gr
iface = gr.Interface.load("huggingface/Helsinki-NLP/opus-mt-en-es",
examples=[["Hello! My name is Omar"]]
)
iface.launch()
Let's go over some of the key differences:
Interface.load()
is used instead of the usual Interface()
.Interface.load()
receives a string with the prefix huggingface/
, and then the model repository ID.title
, description
, and examples
).You might notice that the first inference takes about 20 seconds. This happens since the Inference API is loading the model in the server. You get some benefits afterward:
Hugging Face Spaces allows anyone to host their Gradio demos freely. The community shares oven 2,000 Spaces. Uploading your Gradio demos take a couple of minutes. You can head to hf.co/new-space, select the Gradio SDK, create an app.py
file, and voila! You have a demo you can share with anyone else.
You can use the existing Spaces to tweak the UI or combine multiple demos. Let's find how to do this! First, let's take a look at an existing demo that does background removal.
This is a Gradio demo already shared by a community member. You can load an existing demo using Interface
in a syntax similar to how it's done for the Inference API. It just takes two lines of code and with the prefix spaces
.
import gradio as gr
gr.Interface.load("spaces/eugenesiow/remove-bg").launch()
The code snippet above will load the same interface as the corresponding Space demo.
You can change UI elements, such as the title or theme, but also change the expected type. The previous Space expected users to upload images. What if you would like users to have their webcam and remove the background from there? You can load the Space but change the source of input as follows:
import gradio as gr
gr.Interface.load(
"spaces/eugenesiow/remove-bg",
inputs=[gr.Image(label="Input Image", source="webcam")]
).launch()
The code above generates the following demo.
As you can see, the demo looks the same, but it uses a webcam input instead of user-uploaded images.
You can learn more about this feature, and how to use it with the new Blocks API in the Using Gradio Blocks Like Functions guide
Sometimes a single model inference will not be enough: you might want to call multiple models by piping them (using the output of model A as the input of model B). Series
can achieve this. Other times, you might want to run two models in parallel to compare them. Parallel
can do this!
Let's combine the notion of running things in parallel with the Spaces integration. The GPT-J-6B Space demos a model that generates text using a model called GPT-J. The T0pp Space demos another generative model called T0pp. Let's see how to combine both into one.
import gradio as gr
iface1 = gr.Interface.load("spaces/mrm8488/GPT-J-6B")
iface2 = gr.Interface.load("spaces/akhaliq/T0pp")
iface3 = gr.mix.Parallel(
iface1, iface2,
examples = [
['Which country will win the 2002 World Cup?'],
["A is the son's of B's uncle. What is the family relationship between A and B?"],
["In 2030, "],
])
iface3.launch()
iface1
and iface2
are loading existing Spaces. Then, with Parallel
, you can run the interfaces parallelly. When you click submit, you will get the output for both interfaces. This is how the demo looks like:
Although both models are generative, you can see that the way both models behave is very different. That's a powerful application of Parallel
!
Making use of the huggingface_hub client library library you can create new Spaces or model repositories. You can do this even in a Gradio Space! You can find an example space here. This Space creates a new Space comparing different models or spaces with the support of Gradio load
and Parallel
. Now you can try creating cool spaces with all kinds of functionality 😎.
from huggingface_hub import (
create_repo,
get_full_repo_name,
upload_file,
)
create_repo(name=target_space_name, token=hf_token, repo_type="space", space_sdk="gradio")
repo_name = get_full_repo_name(model_id=target_space_name, token=hf_token)
file_url = upload_file(
path_or_fileobj="file.txt",
path_in_repo="app.py",
repo_id=repo_name,
repo_type="space",
token=hf_token,
)
Here, create_repo
creates a gradio repo with the target name under a specific account using that account's Write Token. repo_name
gets the full repo name of the related repo. Finally upload_file
uploads a file inside the repo with the name app.py
.
Throughout this guide, you've seen there are Gradio demos embedded. You can also do this on own website! The first step is to create a Space with the demo you want to showcase. You can embed it in your HTML code, as shown in the following self-contained example.
<iframe src="https://osanseviero-mix-match-gradio.hf.space" frameBorder="0" height="450" title="Gradio app" class="container p-0 flex-grow space-iframe" allow="accelerometer; ambient-light-sensor; autoplay; battery; camera; document-domain; encrypted-media; fullscreen; geolocation; gyroscope; layout-animations; legacy-image-formats; magnetometer; microphone; midi; oversized-images; payment; picture-in-picture; publickey-credentials-get; sync-xhr; usb; vr ; wake-lock; xr-spatial-tracking" sandbox="allow-forms allow-modals allow-popups allow-popups-to-escape-sandbox allow-same-origin allow-scripts allow-downloads"></iframe>
That's it! Let's recap what you can do:
🤗