Streaming outputs

In some cases, you may want to stream a sequence of outputs rather than show a single output at once. For example, you might have an image generation model and you want to show the image that is generated at each step, leading up to the final image. Or you might have a chatbot which streams its response one token at a time instead of returning it all at once.

In such cases, you can supply a generator function into Gradio instead of a regular function. Creating generators in Python is very simple: instead of a single return value, a function should yield a series of values instead. Usually the yield statement is put in some kind of loop. Here's an example of an generator that simply counts up to a given number:

def my_generator(x):
    for i in range(x):
        yield i

You supply a generator into Gradio the same way as you would a regular function. For example, here's a a (fake) image generation model that generates noise for several steps before outputting an image using the gr.Interface class:

import gradio as gr
import numpy as np
import time

def fake_diffusion(steps):
    rng = np.random.default_rng()
    for i in range(steps):
        time.sleep(1)
        image = rng.random(size=(600, 600, 3))
        yield image
    image = np.ones((1000,1000,3), np.uint8)
    image[:] = [255, 124, 0]
    yield image

demo = gr.Interface(fake_diffusion,
                    inputs=gr.Slider(1, 10, 3, step=1),
                    outputs="image",
                    api_name="predict")

demo.launch()

Note that we've added a time.sleep(1) in the iterator to create an artificial pause between steps so that you are able to observe the steps of the iterator (in a real image generation model, this probably wouldn't be necessary).

Similarly, Gradio can handle streaming inputs, e.g. an image generation model that reruns every time a user types a letter in a textbox. This is covered in more details in our guide on building reactive Interfaces.

Streaming Media

Gradio can stream audio and video directly from your generator function. This lets your user hear your audio or see your video nearly as soon as it's yielded by your function. All you have to do is

Set streaming=True in your gr.Audio or gr.Video output component.
Write a python generator that yields the next "chunk" of audio or video.
Set autoplay=True so that the media starts playing automatically.

For audio, the next "chunk" can be either an .mp3 or .wav file or a bytes sequence of audio. For video, the next "chunk" has to be either .mp4 file or a file with h.264 codec with a .ts extension. For smooth playback, make sure chunks are consistent lengths and larger than 1 second.

We'll finish with some simple examples illustrating these points.

Streaming Audio

import gradio as gr
from time import sleep

def keep_repeating(audio_file):
    for _ in range(10):
        sleep(0.5)
        yield audio_file

gr.Interface(keep_repeating,
             gr.Audio(sources=["microphone"], type="filepath"),
             gr.Audio(streaming=True, autoplay=True)
).launch()

Streaming Video

import gradio as gr
from time import sleep

def keep_repeating(video_file):
    for _ in range(10):
        sleep(0.5)
        yield video_file

gr.Interface(keep_repeating,
             gr.Video(sources=["webcam"], format="mp4"),
             gr.Video(streaming=True, autoplay=True)
).launch()

End-to-End Examples

For an end-to-end example of streaming media, see the object detection from video guide or the streaming AI-generated audio with transformers guide.

Getting Started

Building Interfaces

Building With Blocks

Additional Features

Chatbots

Data Science And Plots

Streaming

Custom Components

Gradio Clients And Lite

Mcp

Other Tutorials

Streaming outputs

Streaming Media

Streaming Audio

Streaming Video

End-to-End Examples