Working With Machine Learning

We'll take a look at a few examples that dive into how Gradio applications can be built specifically for machine learning models. To run the code for any of these examples, simply click the "open in Colab" button next to the example.

Image Classification in TensorFlow / Keras (Colab link)

We'll start with the MobileNetV2 image classifier, which we'll load using TensorFlow! Since this is an image classification model, we will use the Image input interface. We'll output a dictionary of labels and their corresponding confidence scores with the Label output interface.

import gradio as gr
import tensorflow as tf
import requests

mobile_net = tf.keras.applications.MobileNetV2()  # load the pretrained MobileNetV2 model

# Download human-readable labels for ImageNet.
response = requests.get("https://git.io/JJkYN")
labels = response.text.split("\n")

def classify_image(inp):
  inp = inp.reshape((-1, 224, 224, 3))  # add a batch dimension
  inp = tf.keras.applications.mobilenet_v2.preprocess_input(inp)  # scale pixel values
  prediction = mobile_net.predict(inp).flatten()
  return {labels[i]: float(prediction[i]) for i in range(1000)}  # label -> confidence

image = gr.inputs.Image(shape=(224, 224))
label = gr.outputs.Label(num_top_classes=3)

gr.Interface(fn=classify_image, inputs=image, outputs=label, examples=[
  ["images/cheetah1.jpg"], ["images/lion.jpg"]]).launch()

This interface gives you a way to test MobileNetV2 by dragging and dropping images, and also allows you to naturally modify the input image using image editing tools that appear when you click the edit button. Notice here we provided actual gradio.inputs and gradio.outputs objects to the Interface function instead of using string shortcuts. This lets us use built-in preprocessing (e.g. image resizing) and postprocessing (e.g. choosing the number of labels to display) provided by these interfaces.

Try it out on your device or run it in a Colab notebook!
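
For comparison, here is a minimal sketch of the same interface built with string shortcuts (assuming the classify_image function from above). Note that the bare "image" shortcut applies no resizing, so the fixed 224x224 reshape inside classify_image would fail on images of other sizes:

# String shortcuts use each component's default settings: no resizing,
# and the Label output shows its default number of classes.
gr.Interface(fn=classify_image, inputs="image", outputs="label").launch()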

Add Interpretation

Gradio also lets you add interpretation to your interface, so users can see which parts of the input are responsible for the output. You can use the out-of-the-box interpretation function or provide your own. To use the basic out-of-the-box function, just specify "default" for the interpretation parameter:

gr.Interface(classify_image, image, label, interpretation="default").launch()
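
To plug in your own function instead, pass a callable as the interpretation argument. It takes the same inputs as the prediction function and returns scores in each input component's interpretation format (for a Textbox, a list of (token, score) pairs). Here is a minimal sketch using a hypothetical keyword-based sentiment classifier, since a hand-written image interpreter would be much longer:

import re
import gradio as gr

POSITIVE_WORDS = {"good", "great", "happy"}  # hypothetical keyword list

def sentiment(sentence):
  words = sentence.lower().split()
  score = sum(word in POSITIVE_WORDS for word in words) / max(len(words), 1)
  return {"positive": score, "negative": 1 - score}

def interpret_sentiment(sentence):
  # Return one (token, score) pair per token; Gradio shades each token
  # in the input according to its score.
  return [(token, 1.0 if token.lower() in POSITIVE_WORDS else 0.0)
          for token in re.split(r"( )", sentence)]

gr.Interface(sentiment, "text", "label",
             interpretation=interpret_sentiment).launch()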

Image Classification in PyTorch (Colab link)

Let's now wrap a very similar model, ResNet, except this time in PyTorch. We'll again use an Image to Label interface. (The original ResNet architecture was introduced in the paper "Deep Residual Learning for Image Recognition".)

import torch
import requests
import gradio as gr
from PIL import Image
from torchvision import transforms

# Load a pretrained ResNet-18 from the PyTorch hub and set it to inference mode.
model = torch.hub.load('pytorch/vision:v0.6.0', 'resnet18', pretrained=True).eval()

# Download human-readable labels for ImageNet.
response = requests.get("https://git.io/JJkYN")
labels = response.text.split("\n")

def predict(inp):
  inp = Image.fromarray(inp.astype('uint8'), 'RGB')  # numpy array -> PIL image
  inp = transforms.ToTensor()(inp).unsqueeze(0)      # to tensor, add batch dimension
  with torch.no_grad():
    prediction = torch.nn.functional.softmax(model(inp)[0], dim=0)
  return {labels[i]: float(prediction[i]) for i in range(1000)}  # label -> probability

inputs = gr.inputs.Image()
outputs = gr.outputs.Label(num_top_classes=3)
gr.Interface(fn=predict, inputs=inputs, outputs=outputs).launch()

Text Generation with Hugging Face Transformers (GPT-J) (Colab link)

Let's wrap a Text to Text interface around GPT-J, a text generation model that continues a provided text prompt. (See the EleutherAI model card on Hugging Face to learn more about GPT-J and similar language models.) We're loading the model directly from the Hugging Face model repo and providing a few example prompts.

import gradio as gr

title = "GPT-J-6B"

examples = [
    ['The tower is 324 metres (1,063 ft) tall,'],
    ["The Moon's orbit around Earth has"],
    ["The smooth Borealis basin in the Northern Hemisphere covers 40%"]
]

gr.Interface.load("huggingface/EleutherAI/gpt-j-6B", 
    inputs=gr.inputs.Textbox(lines=5, label="Input Text"),
    title=title, examples=examples).launch();

Answering Questions with RoBERTa (Colab link)

What if our model takes more than one input or returns more than one value? Let's wrap a 2-input to 2-output interface around RoBERTa, one model in a family of models that can be used to answer questions. As before, we will load the model from the Hugging Face model hub, but this time we will override the default inputs and outputs so that we can customize the interface (e.g., add placeholder text).

import gradio as gr

examples = [
    ["The Amazon rainforest is a moist broadleaf forest that covers most of the Amazon basin of South America", 
     "Which continent is the Amazon rainforest in?"]
]

gr.Interface.load("huggingface/deepset/roberta-base-squad2", 
                  inputs=[gr.inputs.Textbox(lines=5, label="Context", placeholder="Type a sentence or paragraph here."), 
                          gr.inputs.Textbox(lines=2, label="Question", placeholder="Ask a question based on the context.")],
                  outputs=[gr.outputs.Textbox(label="Answer"), 
                          gr.outputs.Label(label="Probability")],                  
                  examples=examples).launch()

As shown in the code, Gradio can wrap functions with multiple inputs or outputs simply by taking the list of components needed. The number of input components should match the number of parameters taken by the function or API (in this case 2: the context and the question). The number of output components should match the number of values returned by the function (in this case also 2: the answer and the probability that it is correct).
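
The same counting rule applies when wrapping a plain Python function rather than a hosted model. A minimal sketch with a hypothetical two-input, two-output function:

import gradio as gr

def greet_and_count(name, sentence):
    # Two input components -> two parameters; two outputs -> two return values.
    greeting = f"Hello, {name}!"
    word_count = str(len(sentence.split()))
    return greeting, word_count

gr.Interface(fn=greet_and_count,
             inputs=[gr.inputs.Textbox(label="Name"),
                     gr.inputs.Textbox(lines=3, label="Sentence")],
             outputs=[gr.outputs.Textbox(label="Greeting"),
                      gr.outputs.Textbox(label="Word count")]).launch()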

A Multilingual Speech Recognition Demo (Colab link)

Gradio can do more than just images, videos, and text. The Audio input component is popular for speech-to-text applications (and the analogous Audio output component is useful for text-to-speech applications). Click the "open in Colab" button to see the code needed for a complete speech recognition demo in multiple languages. The Gradio-relevant part of the code is very simple and provided below.

import gradio as gr

iface = gr.Interface(
    fn=transcribe, 
    inputs=[
        gr.inputs.Audio(source="microphone", type='filepath'),
        gr.inputs.Dropdown(target_language),
    ],
    outputs="text",
    layout="horizontal",
    theme="huggingface",
)

iface.launch()
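
The transcribe function and target_language list are defined in the full Colab notebook. Because the Audio component is created with type='filepath', Gradio passes the recording to the function as a path to an audio file on disk, and the Dropdown passes the selected string. A hypothetical stub that satisfies this contract:

target_language = ["English", "Spanish", "French"]  # hypothetical options

def transcribe(audio_filepath, language):
    # The real notebook loads the audio file and runs a multilingual
    # speech-recognition model; this stub only shows the expected signature.
    return f"(transcription of {audio_filepath} in {language})"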

Numerical Interfaces: Titanic Survival Model with Scikit-Learn (Colab link)

Many models have numeric or categorical inputs, which we support with a variety of components. Let's wrap a multiple-input to Label interface around a Titanic survival model. We also include interpretation for each input by simply adding interpretation="default". See the full code, including the training step, by clicking the "open in Colab" button. The Gradio-relevant part of the code is provided below.

iface = gr.Interface(
    predict_survival,
    [
        gr.inputs.Dropdown(["first", "second", "third"], type="index"),
        "checkbox",
        gr.inputs.Slider(0, 80),
        gr.inputs.CheckboxGroup(["Sibling", "Child"], label="Travelling with (select all)"),
        gr.inputs.Number(),
        gr.inputs.Radio(["S", "C", "Q"], type="index"),
    ],
    "label",
    examples=[
        ["first", True, 30, [], 50, "S"],
        ["second", False, 40, ["Sibling", "Child"], 10, "Q"],
        ["third", True, 30, ["Child"], 20, "S"],
    ],
    interpretation="default",
)

iface.launch()
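
The predict_survival function itself is defined in the full notebook. Note how the component settings above determine what the function receives: Dropdown and Radio with type="index" pass the index of the selected choice, the "checkbox" shortcut passes a boolean, Slider and Number pass numbers, and CheckboxGroup passes the list of selected strings. A hypothetical stub matching that contract:

def predict_survival(passenger_class, checkbox_value, age, travelling_with, fare, embark_point):
    # passenger_class: 0, 1, or 2 (Dropdown index); checkbox_value: bool;
    # age: float (Slider); travelling_with: list of selected strings (CheckboxGroup);
    # fare: float (Number); embark_point: 0, 1, or 2 (Radio index).
    # The real notebook feeds these features to a trained scikit-learn model;
    # this stub returns a placeholder label-to-probability dictionary.
    return {"Survives": 0.5, "Perishes": 0.5}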