# cache

```python
@gradio.cache(···)
```

### Description

Decorator that automatically caches a function's results, keyed on a content hash of its inputs. Works with sync and async functions as well as sync and async generators. For generators, every yielded value is cached and replayed in order on a cache hit. Cache hits bypass the Gradio queue.

### Example Usage

```python
import gradio as gr

@gr.cache
def classify(image):
    return model.predict(image)

@gr.cache(max_size=256, per_session=True)
def generate(prompt):
    return llm(prompt)
```
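The generator replay behavior described above can be illustrated with a minimal pure-Python sketch. The `cached_gen` helper below is hypothetical, not part of the Gradio API: it exhausts the wrapped generator once, records the yielded values, and replays them on subsequent calls with the same arguments (the real decorator additionally content-hashes inputs, supports async, and enforces eviction limits):

```python
import functools

def cached_gen(fn):
    """Sketch: cache all values yielded by a generator and replay on a hit."""
    store = {}

    @functools.wraps(fn)
    def wrapper(*args):
        key = args  # simplistic key; gr.cache hashes input content instead
        if key not in store:
            # Exhaust the generator once and record every yielded value
            store[key] = list(fn(*args))
        # Replay the recorded values without re-running the function
        yield from store[key]
    return wrapper

calls = {"n": 0}

@cached_gen
def stream_tokens(prompt):
    calls["n"] += 1
    for word in prompt.split():
        yield word

first = list(stream_tokens("hello world"))
second = list(stream_tokens("hello world"))  # replayed from cache; fn not re-run
```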

### Initialization

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `fn` | `Callable \| None` | `None` | The function to cache. When used as `@gr.cache` without parentheses, this is the decorated function; when used as `@gr.cache(...)`, this is `None`. |
| `key` | `Callable \| None` | `None` | Optional function that receives the kwargs dict and returns a hashable cache key. For example, to cache based only on the prompt, pass `lambda kw: kw["prompt"]`. |
| `max_size` | `int` | `128` | Maximum number of cache entries; least-recently-used entries are evicted when the cache is full. Set to `0` for no limit. |
| `max_memory` | `str \| int \| None` | `None` | Maximum total memory usage before eviction. Accepts strings like `"512mb"` or `"2gb"`, or an integer number of bytes. When exceeded, least-recently-used entries are evicted. If `None`, no memory limit is applied. If both `max_size` and `max_memory` are set, entries are evicted when either limit is reached. |
| `per_session` | `bool` | `False` | When `True`, each user session gets an isolated cache namespace, preventing cached results from leaking between users. Per-session entries are cleared when the client session disconnects. The `max_size` and `max_memory` limits apply to the sum of all entries across all sessions. |
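To make the `key` and `max_size` parameters concrete, here is a minimal pure-Python sketch of a keyed LRU cache decorator. It only illustrates the idea behind `gr.cache(key=..., max_size=...)`; the function name `cache` and its internals here are hypothetical, and the real decorator additionally handles content hashing, memory limits, sessions, and async:

```python
import functools
from collections import OrderedDict

def cache(key=None, max_size=128):
    """Sketch: LRU cache keyed by a user-supplied key function over kwargs."""
    def deco(fn):
        entries = OrderedDict()

        @functools.wraps(fn)
        def wrapper(**kwargs):
            # Derive the cache key from kwargs, as described for `key` above
            k = key(kwargs) if key else tuple(sorted(kwargs.items()))
            if k in entries:
                entries.move_to_end(k)  # mark as most recently used
                return entries[k]
            result = fn(**kwargs)
            entries[k] = result
            if max_size and len(entries) > max_size:
                entries.popitem(last=False)  # evict least-recently-used entry
            return result
        return wrapper
    return deco

calls = {"n": 0}

# Cache based only on the prompt, so changing temperature still hits the cache
@cache(key=lambda kw: kw["prompt"])
def generate(prompt, temperature=0.7):
    calls["n"] += 1
    return f"response:{prompt}"

a = generate(prompt="hi", temperature=0.1)
b = generate(prompt="hi", temperature=0.9)  # cache hit: temperature is ignored
```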
### Guides

- [Caching](https://www.gradio.app/guides/caching)
