Resources/The Agent API Atlas/AI/Hugging Face

Everything an AI agent can do with the Hugging Face API.

A reference guide for building AI agents: every method, how to authenticate, and the permissions each one needs.

Endpoints32

API versionv1

Last updated23 June 2026

Orientation

How the Hugging Face API works.

The Hugging Face API is how an app or AI agent works with the Hugging Face Hub and the models on it: searching and reading models, datasets, and Spaces, creating and committing to a repository, and running a model for chat, embeddings, or image generation. Access is granted through a user access token, where a fine-grained token sets each scope to read or write on chosen repositories or organizations, and an agent is limited to what that token reaches. The Hub serves one continuously updated API, and it can push events to a webhook URL when a repository changes.

32Endpoints

8Capability groups

17Read

15Write

6Permissions

Authentication

Every write and every private read needs a user access token sent as 'Authorization: Bearer '. Three token roles exist: read, which reads repositories the user can see; write, which adds write access to repositories the user can write; and fine-grained, which sets individual scopes on chosen repositories or organizations. The same token authenticates the Hub API and Inference Providers. Hugging Face recommends fine-grained tokens for production, because a leaked token is limited to the scopes it was given.

Permissions

A fine-grained token carries named scopes that decide what each call can do. Repository scopes include repo.content.read for reading files and repo.write for creating, committing, moving, or deleting a repository; discussion.write covers discussions, Pull Requests, and comments. Inference is its own scope: inference.serverless.write calls models through Inference Providers, and inference.endpoints.write manages dedicated endpoints. Further scopes cover collections, organizations, and Jobs. A read or write token instead grants coarse access across every repository the account can reach.

Versioning

The Hub API has no dated version. It is a single, continuously updated API served under one path prefix, so there is no version header to pin and no version to migrate between. Notable changes are announced through dated release notes, and the OpenAPI specification, published at a well-known path, is the always-current machine reference. Inference Providers exposes an OpenAI-compatible surface for chat completions under its own path.

Data model

The Hub is organised around repositories, which are one of three types: a model, a dataset, or a Space. Each repository lives at a namespace and name, holds files under Git references such as branches, tags, and commits, and carries discussions and Pull Requests. Hub methods read and write these repositories and the account and organization data around them. A separate inference router runs the models themselves rather than touching repository contents.

Connect & authenticate

Connection & authentication methods.

How an app or AI agent connects to Hugging Face determines what it can reach. There are several routes, one for working with the Hub and its repositories, one for running models, and one for receiving events, each governed by the access token behind it and the scopes that token carries.

Ways to connect

Hub API

The Hub API answers at huggingface.co under the /api path. It reads and writes models, datasets, and Spaces, creates and commits to repositories, and manages the account and organization data around them. It is a single, continuously updated API with no version to pin.

Best forConnecting an app or AI agent to the Hugging Face Hub.

Governed byThe user access token and the scopes it carries.

Docs ↗

Inference Providers router

The inference router answers at router.huggingface.co and runs models across partner providers through one token. Its chat completions endpoint is OpenAI-compatible, so existing OpenAI client code can target it by swapping the base URL to the router's v1 path.

Best forRunning a model for chat, embeddings, or image generation.

Governed byThe user access token and the inference scope it carries.

Docs ↗

Webhooks

Webhooks deliver the chosen repository events to a receiver URL, and an optional secret on the X-Webhook-Secret header confirms each delivery came from Hugging Face. Webhooks are created and listed through the settings webhooks endpoints or the settings page.

Best forReacting to repository changes without polling.

Governed byThe user access token and the scopes it carries.

Docs ↗

MCP server (Model Context Protocol)

Hugging Face's first-party MCP server at huggingface.co/mcp lets an agent search and explore models, datasets, Spaces, and papers, search the documentation, run Jobs, and call community Gradio Spaces as tools. It authenticates with a Hugging Face token and supports streamable HTTP and server-sent-events transports.

Best forConnecting an MCP-compatible assistant to the Hub.

Governed byThe user access token and the scopes it carries.

Docs ↗

Authentication

Fine-grained access token

A fine-grained access token sets individual scopes, each read or write, on chosen repositories or a specific organization, such as repo.content.read on one model or inference.serverless.write for running models. It is the least-privilege choice and what Hugging Face recommends for production.

TokenFine-grained access token

Best forLeast-privilege access to specific repositories or scopes

Docs ↗

Read token

A read token grants read access to every repository the account can see, public and private, across the user and the organizations they belong to. It cannot write, which suits downloading models or running inference.

TokenRead access token

Best forDownloading content or running inference

Docs ↗

Write token

A write token adds write access to every repository the account can write, on top of read. It is coarse, all-or-nothing access, which suits a trusted local workflow more than a shared production agent.

TokenWrite access token

Best forPushing content from a trusted workflow

Docs ↗

Capability map

What an AI agent can do in Hugging Face.

The Hugging Face API is split into areas an agent can act on, such as models, datasets, Spaces, repository management, inference, and webhooks. Each area has its own methods and its own scopes, and some grant access to far more than others.

Models

3 endpoints

Search and list models, read a single model's details, and read its model tags.

These are read methods over public and accessible model repositories.

View endpoints →

Datasets

3 endpoints

Search and list datasets, read a single dataset's details, and read its dataset tags.

These are read methods over public and accessible dataset repositories.

View endpoints →

Spaces

2 endpoints

Search and list Spaces and read a single Space's details.

These are read methods over public and accessible Space repositories.

View endpoints →

Repository management

9 endpoints

Create, move, rename, and delete repositories, update their visibility, list commits and references, read the file tree, and commit files.

Writes here change real repository data, and deleting a repository removes it.

View endpoints →

Discussions & Pull Requests

3 endpoints

List a repository's discussions and Pull Requests, create a discussion, and add a comment.

Writes here are visible to everyone who can see the repository.

View endpoints →

Inference

4 endpoints

Run a model through the provider router for chat completions, feature extraction embeddings, and text-to-image generation, and list available models.

Inference calls run models and are metered for billing.

View endpoints →

Account & collections

3 endpoints

Read the authenticated user, and list and create collections.

Writes here change the account's own collections.

View endpoints →

Webhooks

5 endpoints

List, read, create, update, and delete the account's webhooks.

Writes here change which events are delivered and where.

View endpoints →

Endpoint reference

Every Hugging Face API method.

Filter by method, access, or permission, or search any path. Select a row for version detail, rate limits, the related webhook event, and the source.

Hide deprecated

Method	Endpoint	What it does	Access	Permission	Version
Models Search and list models, read a single model's details, and read its model tags.3
GET	`/api/models`	List and search models, filtered by search text, author, or tags, and sorted.	read	`repo.content.read`	Current
Public models are listed without a token. A read or fine-grained token with repo.content.read is needed to include private models the account can see. Paginated through the Link header, with search, author, filter, sort, limit, and full parameters. Acts onmodel Permission (capability)`repo.content.read` VersionAvailable since the API’s base version Webhook eventNone Rate limitStandard limits apply SourceOfficial documentation ↗
GET	`/api/models/{repo_id}`	Get all information for a single model, optionally at a specific revision.	read	`repo.content.read`	Current
Equivalent to model_info in the Python client. A revision can be appended as /revision/{revision}. Public models need no token; private ones need repo.content.read. Acts onmodel Permission (capability)`repo.content.read` VersionAvailable since the API’s base version Webhook eventNone Rate limitStandard limits apply SourceOfficial documentation ↗
GET	`/api/models-tags-by-type`	Get all the available model tags hosted on the Hub.	read	—	Current
A read-only catalogue of tags, returned without a token. Acts ontag Permission (capability)None required VersionAvailable since the API’s base version Webhook eventNone Rate limitStandard limits apply SourceOfficial documentation ↗
Datasets Search and list datasets, read a single dataset's details, and read its dataset tags.3
GET	`/api/datasets`	List and search datasets, filtered by search text, author, or tags, and sorted.	read	`repo.content.read`	Current
Public datasets are listed without a token; repo.content.read includes private ones. Paginated through the Link header with the same search, author, filter, sort, limit, and full parameters as models. Acts ondataset Permission (capability)`repo.content.read` VersionAvailable since the API’s base version Webhook eventNone Rate limitStandard limits apply SourceOfficial documentation ↗
GET	`/api/datasets/{repo_id}`	Get all information for a single dataset, optionally at a specific revision.	read	`repo.content.read`	Current
Equivalent to dataset_info in the Python client. A revision can be appended as /revision/{revision}. Acts ondataset Permission (capability)`repo.content.read` VersionAvailable since the API’s base version Webhook eventNone Rate limitStandard limits apply SourceOfficial documentation ↗
GET	`/api/datasets-tags-by-type`	Get all the available dataset tags hosted on the Hub.	read	—	Current
A read-only catalogue of tags, returned without a token. Acts ontag Permission (capability)None required VersionAvailable since the API’s base version Webhook eventNone Rate limitStandard limits apply SourceOfficial documentation ↗
Spaces Search and list Spaces and read a single Space's details.2
GET	`/api/spaces`	List and search Spaces, filtered by search text, author, or tags, and sorted.	read	`repo.content.read`	Current
Public Spaces are listed without a token; repo.content.read includes private ones. Paginated through the Link header. Acts onspace Permission (capability)`repo.content.read` VersionAvailable since the API’s base version Webhook eventNone Rate limitStandard limits apply SourceOfficial documentation ↗
GET	`/api/spaces/{repo_id}`	Get all information for a single Space, optionally at a specific revision.	read	`repo.content.read`	Current
Equivalent to space_info in the Python client. A revision can be appended as /revision/{revision}. Acts onspace Permission (capability)`repo.content.read` VersionAvailable since the API’s base version Webhook eventNone Rate limitStandard limits apply SourceOfficial documentation ↗
Repository management Create, move, rename, and delete repositories, update their visibility, list commits and references, read the file tree, and commit files.9
POST	`/api/repos/create`	Create a repository, a model by default, or a dataset or Space.	write	`repo.write`	Current
The body sets type, name, organization, private, and, for a Space, sdk. Needs repo.write on a fine-grained token, or a write token. Subject to a separate, undocumented repository-creation limit. Acts onrepository Permission (capability)`repo.write` VersionAvailable since the API’s base version Webhook event`repo` Rate limitStandard limits apply SourceOfficial documentation ↗
DELETE	`/api/repos/delete`	Delete a repository, a model by default, or a dataset or Space.	write	`repo.write`	Current
The body sets type, name, and organization. Deleting a repository removes it and its files. Needs repo.write or a write token. Acts onrepository Permission (capability)`repo.write` VersionAvailable since the API’s base version Webhook event`repo` Rate limitStandard limits apply SourceOfficial documentation ↗
POST	`/api/repos/move`	Move a repository: rename it or transfer it from a user to an organization.	write	`repo.write`	Current
The body sets fromRepo, toRepo, and type. Needs repo.write or a write token. Acts onrepository Permission (capability)`repo.write` VersionAvailable since the API’s base version Webhook event`repo` Rate limitStandard limits apply SourceOfficial documentation ↗
PUT	`/api/repos/{repo_type}/{repo_id}/settings`	Update a repository's settings, such as its visibility.	write	`repo.write`	Current
Changing visibility to or from private is recorded as a repo.config update event. Needs repo.write or a write token. Acts onrepository Permission (capability)`repo.write` VersionAvailable since the API’s base version Webhook event`repo.config` Rate limitStandard limits apply SourceOfficial documentation ↗
GET	`/api/models/{namespace}/{repo}/commits/{rev}`	List commits on a model repository at a given revision.	read	`repo.content.read`	Current
The same path shape exists under /api/datasets and /api/spaces. Public repositories need no token. Acts oncommit Permission (capability)`repo.content.read` VersionAvailable since the API’s base version Webhook eventNone Rate limitStandard limits apply SourceOfficial documentation ↗
GET	`/api/models/{namespace}/{repo}/refs`	List the references, branches and tags, of a model repository.	read	`repo.content.read`	Current
The same path shape exists under /api/datasets and /api/spaces. Public repositories need no token. Acts onreference Permission (capability)`repo.content.read` VersionAvailable since the API’s base version Webhook eventNone Rate limitStandard limits apply SourceOfficial documentation ↗
GET	`/api/models/{namespace}/{repo}/tree/{rev}/{path}`	List the files and folders of a model repository at a given revision and path.	read	`repo.content.read`	Current
The same path shape exists under /api/datasets and /api/spaces. Public repositories need no token. Acts onfile Permission (capability)`repo.content.read` VersionAvailable since the API’s base version Webhook eventNone Rate limitStandard limits apply SourceOfficial documentation ↗
POST	`/api/models/{namespace}/{repo}/preupload/{rev}`	Check the upload method for files before committing, deciding standard or Git LFS.	write	`repo.write`	Current
The first half of an upload: it tells the client whether content goes through Git LFS. Needs repo.write or a write token. Acts onfile Permission (capability)`repo.write` VersionAvailable since the API’s base version Webhook eventNone Rate limitStandard limits apply SourceOfficial documentation ↗
POST	`/api/models/{namespace}/{repo}/commit/{rev}`	Commit files to a model repository at a given revision.	write	`repo.write`	Current
Records the upload prepared by preupload as a commit, firing a repo.content update event. The same path shape exists under /api/datasets and /api/spaces. Subject to a separate, undocumented commit limit. Needs repo.write or a write token. Acts oncommit Permission (capability)`repo.write` VersionAvailable since the API’s base version Webhook event`repo.content` Rate limitStandard limits apply SourceOfficial documentation ↗
Discussions & Pull Requests List a repository's discussions and Pull Requests, create a discussion, and add a comment.3
GET	`/api/{repoType}/{namespace}/{repo}/discussions`	List the discussions and Pull Requests on a repository.	read	`repo.content.read`	Current
On the Hub, a Pull Request is a special type of discussion. Public repositories need no token. Acts ondiscussion Permission (capability)`repo.content.read` VersionAvailable since the API’s base version Webhook eventNone Rate limitStandard limits apply SourceOfficial documentation ↗
POST	`/api/{repoType}/{namespace}/{repo}/discussions`	Create a new discussion or Pull Request on a repository.	write	`discussion.write`	Current
Fires a discussion create event. Needs discussion.write on a fine-grained token, or a write token. Subject to a separate, undocumented discussion limit. Acts ondiscussion Permission (capability)`discussion.write` VersionAvailable since the API’s base version Webhook event`discussion` Rate limitStandard limits apply SourceOfficial documentation ↗
POST	`/api/{repoType}/{namespace}/{repo}/discussions/{num}/comment`	Add a comment to a discussion or Pull Request.	write	`discussion.write`	Current
Fires a discussion.comment create event. Needs discussion.write or a write token. Acts oncomment Permission (capability)`discussion.write` VersionAvailable since the API’s base version Webhook event`discussion.comment` Rate limitStandard limits apply SourceOfficial documentation ↗
Inference Run a model through the provider router for chat completions, feature extraction embeddings, and text-to-image generation, and list available models.4
POST	`/v1/chat/completions`	Run a model for chat completions through the provider router, OpenAI-compatible.	write	`inference.serverless.write`	Current
Served at router.huggingface.co, not the Hub host. Drop-in OpenAI-compatible: swap the base URL. A model id can carry a provider or policy suffix such as :fastest or :cheapest. Needs inference.serverless.write on a fine-grained token. Acts oncompletion Permission (capability)`inference.serverless.write` VersionAvailable since the API’s base version Webhook eventNone Rate limitStandard limits apply SourceOfficial documentation ↗
GET	`/v1/models`	List the models available across providers, with pricing, context length, and throughput.	read	`inference.serverless.write`	Current
Served at router.huggingface.co. Lists models reachable through the router, including per-provider pricing and performance where available. Acts onmodel Permission (capability)`inference.serverless.write` VersionAvailable since the API’s base version Webhook eventNone Rate limitStandard limits apply SourceOfficial documentation ↗
POST	`/models/{repo_id}/pipeline/feature-extraction`	Run a model for feature extraction, returning embeddings for text.	write	`inference.serverless.write`	Current
Served through the inference router. Feature extraction returns embeddings for semantic search, retrieval, and recommendation. Needs inference.serverless.write. Acts onembedding Permission (capability)`inference.serverless.write` VersionAvailable since the API’s base version Webhook eventNone Rate limitStandard limits apply SourceOfficial documentation ↗
POST	`/models/{repo_id}/pipeline/text-to-image`	Run a model to generate an image from a text prompt.	write	`inference.serverless.write`	Current
Served through the inference router. Returns the generated image bytes. Needs inference.serverless.write. Acts onimage Permission (capability)`inference.serverless.write` VersionAvailable since the API’s base version Webhook eventNone Rate limitStandard limits apply SourceOfficial documentation ↗
Account & collections Read the authenticated user, and list and create collections.3
GET	`/api/whoami-v2`	Get information about the user or organization behind the token.	read	—	Current
Identifies the account the token belongs to and returns its orgs and the token's permissions. Any valid token works. Acts onuser Permission (capability)None required VersionAvailable since the API’s base version Webhook eventNone Rate limitStandard limits apply SourceOfficial documentation ↗
GET	`/api/collections`	List collections, filtered by owner, item, or search text.	read	`collection.read`	Current
Public collections are returned without a token; collection.read includes private ones. Acts oncollection Permission (capability)`collection.read` VersionAvailable since the API’s base version Webhook eventNone Rate limitStandard limits apply SourceOfficial documentation ↗
POST	`/api/collections`	Create a collection of models, datasets, Spaces, or papers.	write	`collection.write`	Current
Needs collection.write on a fine-grained token, or a write token. Acts oncollection Permission (capability)`collection.write` VersionAvailable since the API’s base version Webhook eventNone Rate limitStandard limits apply SourceOfficial documentation ↗
Webhooks List, read, create, update, and delete the account's webhooks.5
GET	`/api/settings/webhooks`	List the webhooks configured on the account.	read	—	Current
Returns the account's webhooks, their watched repositories, and their target URLs. A valid token for the account is required. Acts onwebhook Permission (capability)None required VersionAvailable since the API’s base version Webhook eventNone Rate limitStandard limits apply SourceOfficial documentation ↗
POST	`/api/settings/webhooks`	Create a webhook that delivers chosen repository events to a URL.	write	—	Current
Sets the watched repositories or namespaces, the target URL, and an optional secret sent back as the X-Webhook-Secret header. Each webhook is limited to 1,000 triggers per 24 hours. Acts onwebhook Permission (capability)None required VersionAvailable since the API’s base version Webhook eventNone Rate limitStandard limits apply SourceOfficial documentation ↗
GET	`/api/settings/webhooks/{webhookId}`	Get a single webhook by id.	read	—	Current
Returns one webhook's watched repositories, target URL, and status. A valid token for the account is required. Acts onwebhook Permission (capability)None required VersionAvailable since the API’s base version Webhook eventNone Rate limitStandard limits apply SourceOfficial documentation ↗
POST	`/api/settings/webhooks/{webhookId}`	Update a webhook's watched repositories, target URL, or secret.	write	—	Current
Changes which events are delivered and where. A valid token for the account is required. Acts onwebhook Permission (capability)None required VersionAvailable since the API’s base version Webhook eventNone Rate limitStandard limits apply SourceOfficial documentation ↗
DELETE	`/api/settings/webhooks/{webhookId}`	Delete a webhook by id.	write	—	Current
Stops all delivery for that webhook. A valid token for the account is required. Acts onwebhook Permission (capability)None required VersionAvailable since the API’s base version Webhook eventNone Rate limitStandard limits apply SourceOfficial documentation ↗

No endpoints match those filters.

Webhooks

Webhook events.

Hugging Face can notify an app or AI agent when something happens to a repository, instead of the app repeatedly asking. Hugging Face posts the event payload to a webhook URL that has been registered for the chosen repositories and events.

Event	What it signals	Triggered by
`repo`	Global events on a repository. The action is one of create, delete, update, or move, fired when a repository is created, deleted, renamed, or has its details change.	`/api/repos/create` `/api/repos/delete` `/api/repos/move`
`repo.content`	Events on a repository's content, such as new commits or tags, including the commit created when a Pull Request opens. The action is always update, and the payload lists the references that changed.	`/api/models/{namespace}/{repo}/commit/{rev}`
`repo.config`	Events on a repository's config, such as updating Space secrets, settings, or visibility. The action is always update, and the payload carries the updated config keys.	`/api/repos/{repo_type}/{repo_id}/settings`
`discussion`	Creating a discussion or Pull Request, updating its title or status, or merging it. The action is one of create, delete, or update.	`/api/{repoType}/{namespace}/{repo}/discussions`
`discussion.comment`	Creating, updating, or hiding a comment on a discussion or Pull Request. The action is one of create or update.	`/api/{repoType}/{namespace}/{repo}/discussions/{num}/comment`

No events match that search.

Rate limits & pagination

Rate limits, pagination & request size.

Hugging Face limits how fast an app or AI agent can call, through a request quota counted over a rolling five-minute window that depends on the account tier behind the token, with separate, higher quotas for downloading repository files.

Request rate

Hugging Face counts requests in three buckets over a rolling five-minute window. The Hub APIs bucket covers calls like search, repository creation, and user management; the Resolvers bucket covers file downloads and carries a much higher quota; and the Pages bucket covers web pages. Quotas rise with the account tier behind the token: an anonymous caller gets about 500 Hub API requests per window per IP address, a free user 1,000, a PRO user 2,500, a Team organization 3,000, and Enterprise plans from 6,000 up to 100,000 when organization IP ranges are set. Going over returns a 429, and the RateLimit and RateLimit-Policy response headers report the remaining quota and the seconds until reset. Certain actions, such as repository creation, commits, and discussions, carry their own separate, undocumented limits. Each webhook is capped at 1,000 triggers per 24 hours.

Pagination

List endpoints, such as listing models, datasets, or Spaces, are paginated and return a Link header with a rel="next" URL, which should be followed rather than built by hand, until it is absent. A limit parameter caps the number of results fetched, and a full parameter requests the fuller record for each item. The Python client follows the Link header automatically.

Request size

Hub API requests and responses are JSON. Large files are not sent inline: an upload first calls the preupload check to decide whether content goes through Git LFS or the standard path, then a commit call records it, so file size is handled by the storage layer rather than a single request body limit. Listing endpoints return trimmed records by default and the fuller record only when full is requested.

Errors

Status codes & error handling.

The status codes an agent should handle, and what to do about each.

Status	Code	Meaning	What to do
401	`Unauthorized`	Authentication is missing, or the access token is invalid or has been deleted.	Send a valid token in the Authorization header as 'Bearer '.
403	`Forbidden`	The token is valid but lacks the scope for this call, or an organization has denied, revoked, or restricted it. A read or write token used where the organization requires a fine-grained token is also rejected here.	Grant the missing scope, or have an organization administrator approve the token.
404	`Not Found`	The repository does not exist, or the token cannot see a private repository. A private repository is returned as 404 rather than 403 so that its existence is not confirmed.	Confirm the repository id and that the token has access to it.
429	`Too Many Requests`	A rate limit was exceeded for the current five-minute window. The RateLimit and RateLimit-Policy response headers report the remaining quota and the seconds until it resets.	Wait until the window resets, spread requests out, pass a token, or upgrade the account tier.

Versioning & freshness

Version history.

The Hub API is served under a single, continuously updated version. There is no dated version to pin; changes ship through dated release notes rather than versioned endpoints.

Version history

What changed, and when

Latest versionv1

v1Current version

Unversioned Hub API, OpenAI-compatible inference router

The Hub API is served under a single, continuously updated version with no dated version header to pin. The machine-readable reference moved to an OpenAPI specification published at a well-known path and an OpenAPI Playground in late 2025. Inference Providers exposes an OpenAI-compatible router under a v1 path for chat completions, so existing OpenAI client code can target Hugging Face by swapping the base URL.

What changed

Hub API reference moved to an always-current OpenAPI specification and Playground
Inference Providers router is OpenAI-compatible for chat completions, with model listing at /v1/models
Rate limits standardised into Hub APIs, Resolvers, and Pages buckets over five-minute windows

2025-06-06Feature update

Official Hugging Face MCP server

Hugging Face launched a first-party MCP server at huggingface.co/mcp, letting an AI assistant search and explore models, datasets, Spaces, and papers, search the documentation, run Jobs, and call community Gradio Spaces as tools, authenticated with a Hugging Face token.

What changed

First-party hosted MCP server at huggingface.co/mcp
Built-in tools for model, dataset, Space, paper, and documentation search
Streamable HTTP and server-sent-events transports

2025-01-28Feature update

Inference Providers launched

Hugging Face launched Inference Providers, a unified router that runs hundreds of models across partner providers through one Hugging Face token, with an OpenAI-compatible chat completions endpoint and native Python and JavaScript clients.

What changed

Single token routes to partner providers through one proxy
OpenAI-compatible chat completions endpoint for drop-in migration
A free tier with extra credits for PRO, Team, and Enterprise accounts

Because the API is unversioned, an integration tracks behaviour through the release notes rather than pinning a version.

Hugging Face changelog ↗

Questions

Hugging Face API, answered.

Read token, write token, or fine-grained, which should I use?+

A fine-grained token is the better default for an agent or production app. A read token can read every repository the account can see and a write token can write every repository it can write, both all-or-nothing, while a fine-grained token can be limited to specific repositories or a specific organization with each scope set to read or write, such as only repo.content.read on one model. Fine-grained tokens are what Hugging Face recommends for production, since a leaked token is confined to the scopes it was granted.

Is the same token used for the Hub and for running models?+

Yes. One user access token authenticates both the Hub API, for working with repositories, and Inference Providers, for running models, each sent as a bearer token. The scopes differ: reading and writing repositories use the repo scopes, while calling a model through the router needs the inference.serverless.write scope on a fine-grained token. A single token can hold both.

What are the rate limits?+

Hugging Face counts requests over a rolling five-minute window in three buckets: Hub APIs, Resolvers for file downloads, and Pages. The Hub API quota depends on the account tier, from about 500 requests per window for an anonymous caller and 1,000 for a free user up to several thousand for PRO and Enterprise plans. Exceeding a quota returns a 429, and the RateLimit response header gives the seconds until reset. Passing a token, rather than calling anonymously, is the most common fix for being rate limited.

How do I receive repository events instead of polling?+

Webhooks deliver events without polling. A webhook is registered against chosen repositories or whole namespaces and a set of scopes, such as repo for repository changes, repo.content for new commits and tags, discussion for Pull Requests and discussions, and discussion.comment for comments. Hugging Face posts a JSON payload when each event fires, and an optional secret, sent as the X-Webhook-Secret header, confirms the payload came from Hugging Face. Each webhook is limited to 1,000 triggers per 24 hours.

How does the API handle versions?+

There is no dated version to pin. The Hub serves a single, continuously updated API, so there is no version header and no migration between versions. Notable changes are announced through dated release notes, and the OpenAPI specification at the well-known path is the always-current machine reference. An integration tracks behaviour through the release notes rather than pinning a version.

Does Hugging Face have an official MCP server?+

Yes. The Hugging Face MCP server, launched in June 2025, is a first-party hosted server at huggingface.co/mcp that lets an AI assistant search and explore models, datasets, Spaces, and papers, search the documentation, run Jobs, and call community Gradio Spaces as tools. It authenticates with a Hugging Face token and supports the streamable HTTP and server-sent-events transports.

What is Bollard AI?

Control what every AI agent can do in Hugging Face.

Bollard AI sits between a team's AI agents and Hugging Face. Grant each agent exactly the access it needs, read or write, area by area, and every call is checked and logged.

Set read, write, or full access per agent, never a shared Hugging Face token.
Denied by default, so an agent reaches only what has been explicitly allowed.
Every call recorded in plain English: who, what, where, and the decision.

Control Hugging Face access in Bollard Browse all APIs →

Hugging Face

Model Ops Agent

Search models and datasets ResourceOffReadFull use

Run inference ActionOffReadFull use

Commit files to a repository ActionOffReadFull use

Delete repositories ResourceOffReadFull use

Per-agent access, set in Bollard AI, not in Hugging Face

How the Hugging Face API works.

Connection & authentication methods.

Hub API

Inference Providers router

Webhooks

MCP server (Model Context Protocol)

Fine-grained access token

Read token

Write token

What an AI agent can do in Hugging Face.

Models

Datasets

Spaces

Repository management

Discussions & Pull Requests

Inference

Account & collections

Webhooks

Every Hugging Face API method.

Models

Datasets

Spaces

Repository management

Discussions & Pull Requests

Inference

Account & collections

Webhooks

Webhook events.

Rate limits, pagination & request size.

Request rate

Pagination

Request size

Status codes & error handling.

Version history.

What changed, and when

Hugging Face API, answered.

More ai API guides for agents

ElevenLabs

Replicate

Google Gemini

OpenAI

Anthropic

Cohere

Control what every AI agent can do in Hugging Face.