To install the Gradio Python Client from main, run the following command:
pip install 'gradio-client @ git+https://github.com/gradio-app/gradio@0c5cec175815344fcc6c8dfbdacf71b85fd3ad37#subdirectory=client/python'
Using ZeroGPU Spaces with the Clients
Hugging Face Spaces now offers a new hardware option called ZeroGPU. ZeroGPU is a “serverless” cluster of spaces that lets Gradio applications run on A100 GPUs for free. These kinds of spaces are a great foundation for building new applications with the Gradio Python Client, but you need to take care to avoid ZeroGPU’s rate limiting.
Explaining Rate Limits for ZeroGPU
ZeroGPU spaces are rate-limited to ensure that a single user does not hog all of the available GPUs.
The limit is controlled by a special token that the Hugging Face Hub infrastructure adds to all incoming requests to Spaces.
This token is a request header called X-IP-Token and its value changes depending on the user who makes a request to the ZeroGPU space.
Let’s say you want to create a space (Space A) that uses a ZeroGPU space (Space B) programmatically.
Normally, calling Space B from Space A with the Gradio Python client would quickly exhaust Space B’s rate limit, as all the requests to the ZeroGPU space
would be missing the X-IP-Token request header and would therefore be treated as unauthenticated.
In order to avoid this, we need to extract the X-IP-Token of the user using Space A before we call Space B programmatically.
Where possible, specifically in the case of functions that are passed into event listeners directly, Gradio automatically
extracts the X-IP-Token from the incoming request and passes it into the Gradio Client. But if the Client
is instantiated outside of such a function, then you may need to pass in the token manually.
The following section explains how to do this.
Avoiding Rate Limits by Manually Passing an IP Token
In the following hypothetical example, when a user presses enter in the textbox, the generate() function
is called, which in turn calls a second function, text_to_image(). Because the Gradio Client is
instantiated indirectly, inside text_to_image(), we need to extract the user's token from the X-IP-Token header of the incoming request
and pass that header along when constructing the Gradio Client.
import gradio as gr
from gradio_client import Client

def text_to_image(prompt, request: gr.Request):
    # Read the token of the user who triggered this request
    x_ip_token = request.headers['x-ip-token']
    # Forward the token so the ZeroGPU space attributes the call to that user
    client = Client("hysts/SDXL", headers={"x-ip-token": x_ip_token})
    img = client.predict(prompt, api_name="/predict")
    return img

def generate(prompt, request: gr.Request):
    # Truncate the prompt to at most 300 characters
    prompt = prompt[:300]
    return text_to_image(prompt, request)

with gr.Blocks() as demo:
    image = gr.Image()
    prompt = gr.Textbox(max_lines=1)
    # Pressing enter in the textbox calls generate()
    prompt.submit(generate, [prompt], [image])

demo.launch()
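The example above reads the header with request.headers['x-ip-token'], which raises an error if the header is absent. Since the token is added by the Hugging Face Hub infrastructure, it is probably missing when you run the same app outside of Spaces, for example on your own machine. The following is a minimal sketch of one way to handle that case. The helper name forwardable_headers and the constant IP_TOKEN_HEADER are hypothetical and not part of Gradio; the helper only assumes it receives a mapping of header names to values.

```python
from typing import Dict, Mapping, Optional

# Hypothetical constant: the lowercase header name used in the example above.
IP_TOKEN_HEADER = "x-ip-token"


def forwardable_headers(
    request_headers: Optional[Mapping[str, str]],
) -> Dict[str, str]:
    """Build the headers dict to hand to the Gradio Client.

    Returns {"x-ip-token": <token>} when the incoming request carries a
    non-empty token, and an empty dict otherwise, so the caller never hits
    a KeyError when the header is missing.
    """
    if request_headers is None:
        return {}
    # .get() returns None instead of raising when the key is absent.
    # Note: a plain dict is case-sensitive, so this looks up the
    # lowercase name only.
    token = request_headers.get(IP_TOKEN_HEADER)
    if not token:
        return {}
    return {IP_TOKEN_HEADER: token}
```

Inside text_to_image(), the client could then be constructed as Client("hysts/SDXL", headers=forwardable_headers(request.headers)). Without a token the call still goes through, but it is subject to the rate limiting described earlier.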