[Blog post] - How Hugging Face Scaled Secrets Management for AI Infrastructure #2657

Merged
merged 25 commits into from
Apr 3, 2025
Commits
363ebd2
Add case study on scaling secrets management for AI infrastructure
thomas-infisical Feb 7, 2025
96d080a
thumbnail
thomas-infisical Feb 7, 2025
7433c6e
blog entry
thomas-infisical Feb 7, 2025
1bc44ba
Update organization
thomas-infisical Feb 10, 2025
094e768
Update organization
thomas-infisical Feb 10, 2025
4e8717e
Merge branch 'main' into main
thomas-infisical Feb 10, 2025
06905ff
reworked blogpost
thomas-infisical Feb 20, 2025
0bbcdc0
Merge remote-tracking branch 'upstream/main'
thomas-infisical Feb 21, 2025
1ece503
Merge branch 'main' into main
thomas-infisical Mar 4, 2025
a59157b
Merge branch 'main' into main
thomas-infisical Mar 4, 2025
fb8e6e5
fix typo
thomas-infisical Mar 4, 2025
93b8b6b
Merge branch 'main' into main
thomas-infisical Mar 10, 2025
3a73f31
Update _blog.yml
thomas-infisical Mar 11, 2025
f407b95
Merge branch 'main' into main
thomas-infisical Mar 11, 2025
24bbe8a
Update _blog.yml
thomas-infisical Mar 11, 2025
6d85470
Update _blog.yml
julien-c Mar 11, 2025
99172bb
Update _blog.yml
julien-c Mar 11, 2025
666cd4a
Apply suggestions from code review
julien-c Mar 11, 2025
99022a2
Update _blog.yml
julien-c Mar 11, 2025
8be43a5
update _blog.yml
thomas-infisical Mar 13, 2025
a73b983
Merge branch 'main' into main
thomas-infisical Mar 13, 2025
413d78b
Merge branch 'main' into main
pcuenca Mar 18, 2025
1ed6cdd
final update
thomas-infisical Mar 31, 2025
0b8955a
Merge upstream/main and resolve conflicts in _blog.yml
thomas-infisical Mar 31, 2025
666183b
Merge branch 'main' into pr/thomas-infisical/2657
thomas-infisical Mar 31, 2025
14 changes: 14 additions & 0 deletions _blog.yml
@@ -5472,6 +5472,19 @@
    - research
    - smolagents

- local: scaling-secrets-management
  title: "How Hugging Face Scaled Secrets Management for AI Infrastructure"
  thumbnail: /blog/assets/infisical/thumbnail.png
  author: segudev
  guest: true
  date: Feb 10, 2025
  tags:
    - secrets
    - security
    - shift-left
    - infrastructure
    - open-source

- local: leaderboard-arabic-v2
  title: "The Open Arabic LLM Leaderboard 2"
  thumbnail: /blog/assets/leaderboards-on-the-hub/thumbnail_arabic.png
@@ -5484,3 +5497,4 @@
    - leaderboard
    - LLM
    - arabic

Binary file added assets/infisical/thumbnail.png
103 changes: 103 additions & 0 deletions scaling-secrets-management.md
@@ -0,0 +1,103 @@
---
title: "How Hugging Face Scaled Secrets Management for AI Infrastructure"
thumbnail: /blog/assets/infisical/thumbnail.png
authors:
- user: segudev
  org: Infisical
---

# How Hugging Face Scaled Secrets Management for AI Infrastructure
Managing secrets at scale becomes more complex as infrastructure grows. For Hugging Face, the challenge intensified as the platform grew to support over 4 million AI builders deploying models on the Hub. This case study explores how the team approached secrets management to support that growth.

## Technical Challenge
As Hugging Face's infrastructure scaled to support millions of model deployments, its infrastructure and engineering teams identified security and operational challenges in three areas.

### Security Risk Management
Being at the forefront of AI development, Hugging Face needed to ensure their security infrastructure exceeded industry standards. This included:
- Maintaining tight access controls across their infrastructure
- Implementing a "Security Shift Left" approach
- Establishing comprehensive audit capabilities

### Secret Sprawl
With increasing infrastructure complexity and new engineering projects, [secret sprawl](https://infisical.com/blog/what-is-secret-sprawl) became a significant concern. The team needed to:
- Automate secrets management processes
- Streamline secret deployment workflows
- Establish a single source of truth for credentials

### Developer Experience
Supporting a large engineering team required maintaining developer productivity through:
- Self-serve secret management workflows
- Efficient developer onboarding processes
- Streamlined local development setup

## Solution
To address these challenges, Hugging Face partnered with [Infisical](https://infisical.com/) to centralize its secrets management workflows and establish a single source of truth for infrastructure credentials. The implementation has three key technical components:

### Kubernetes Integration
With Kubernetes being central to Hugging Face's infrastructure, they implemented Infisical's [Kubernetes Operator](https://infisical.com/docs/integrations/platforms/kubernetes) to:
- Automatically propagate secrets to containers
- Handle application redeployments based on secret updates
- Maintain consistent secret management across clusters
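
Redeploying on secret updates can be pictured as a checksum comparison: a workload records a digest of the secrets it was deployed with, and a redeploy is triggered when the current digest differs. The following is a minimal Python sketch of that general pattern, not the Infisical operator's actual implementation; the function names and secret values are hypothetical.

```python
import hashlib
import json
from typing import Dict, Optional


def secret_checksum(secrets: Dict[str, str]) -> str:
    """Return a stable SHA-256 digest of a secret mapping."""
    # Sort keys so the digest does not depend on insertion order.
    canonical = json.dumps(secrets, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def needs_redeploy(deployed_checksum: Optional[str], secrets: Dict[str, str]) -> bool:
    """True when the workload was deployed with different secret values."""
    return deployed_checksum != secret_checksum(secrets)


# Hypothetical values, for illustration only.
current = {"DATABASE_URL": "postgres://db.internal/app", "API_KEY": "v1"}
deployed = secret_checksum(current)

print(needs_redeploy(deployed, current))                       # False
print(needs_redeploy(deployed, {**current, "API_KEY": "v2"}))  # True
```

Storing only the digest alongside the workload means the comparison never needs to expose the secret values themselves.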

### Local Development Workflow
For local development environments, the team utilized the [Infisical CLI](https://infisical.com/docs/cli/usage) to:

- [Inject secrets](https://infisical.com/docs/cli/commands/run) into local application environments
- Eliminate the need for local [.env files](https://infisical.com/blog/stop-using-env-files)
- Reduce security risks from secrets on local machines
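
On the application side, injection means the process reads secrets from its environment instead of parsing a local file. The sketch below assumes a hypothetical `app.py` started with something like `infisical run -- python app.py`; the helper name and the `DATABASE_URL` variable are illustrative assumptions, not part of Hugging Face's setup.

```python
import os
from typing import Dict, Iterable, Mapping


class MissingSecretError(RuntimeError):
    """Raised when a required secret was not injected into the environment."""


def load_required_secrets(
    names: Iterable[str], environ: Mapping[str, str] = os.environ
) -> Dict[str, str]:
    """Read the named secrets from the environment, failing fast if any is absent."""
    wanted = list(names)
    missing = [name for name in wanted if not environ.get(name)]
    if missing:
        raise MissingSecretError("missing secrets: " + ", ".join(sorted(missing)))
    return {name: environ[name] for name in wanted}


# A stand-in environment so the example runs without any real secrets.
fake_env = {"DATABASE_URL": "postgres://localhost/dev"}
secrets = load_required_secrets(["DATABASE_URL"], environ=fake_env)
print(sorted(secrets))  # ['DATABASE_URL']
```

Failing at startup when a secret is missing makes a misconfigured local environment obvious, and nothing is ever written to disk.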

### Centralized Management
The team established a central secrets management system using:

- A [web dashboard](https://infisical.com/docs/documentation/platform/project) enabling self-serve secrets management
- [Role-based access controls](https://infisical.com/docs/documentation/platform/access-controls/role-based-access-controls#role-based-access-controls) for different teams
- [Secret referencing and importing](https://infisical.com/docs/documentation/platform/secret-reference) capabilities for maintaining a single source of truth across infrastructure
- [Secret Sharing](https://infisical.com/docs/documentation/platform/secret-sharing) for generating encrypted links to share secrets within the team or with stakeholders outside the organization
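
Secret referencing supports a single source of truth because a composite value, such as a connection string, is defined once in terms of other secrets rather than copied. The Python sketch below illustrates the idea with a `${NAME}` placeholder syntax, which is an assumption made for this example; the linked documentation describes the actual syntax and scoping rules. All secret names and values are hypothetical.

```python
import re
from typing import Dict, Tuple

# Assumed placeholder form for this sketch: ${SECRET_NAME}
_REFERENCE = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def resolve(name: str, secrets: Dict[str, str], _seen: Tuple[str, ...] = ()) -> str:
    """Expand references in secrets[name], recursively, detecting cycles."""
    if name in _seen:
        raise ValueError("circular reference: " + " -> ".join(_seen + (name,)))
    if name not in secrets:
        raise KeyError(name)
    return _REFERENCE.sub(
        lambda match: resolve(match.group(1), secrets, _seen + (name,)),
        secrets[name],
    )


secrets = {
    "DB_USER": "app",
    "DB_PASSWORD": "s3cret",
    "DB_HOST": "db.internal",
    "DATABASE_URL": "postgres://${DB_USER}:${DB_PASSWORD}@${DB_HOST}/app",
}
print(resolve("DATABASE_URL", secrets))
# postgres://app:s3cret@db.internal/app
```

Rotating `DB_PASSWORD` in one place changes every value that references it, which is the property the single-source-of-truth approach relies on.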

## Results and Impact

With the help of Infisical, Hugging Face improved both its operational efficiency and its security posture through centralized secrets management.

### Developer Workflow Efficiency
The new system improved development workflows through:

- Self-serve secrets management based on permissions: developers save time and iterate faster.
- Faster developer onboarding: new engineers can get up and running immediately with access to the environments they need.
- Synchronized secrets across team environments: engineers check out the right environment and start their applications locally.
- Automated application redeployments: applications are redeployed automatically when secrets change in a given environment.

### Security Improvements

Security is often a matter of making the secure path the easiest path. Beyond all the points mentioned above, the following measures helped strengthen Hugging Face's security posture regarding secrets management:

- Implemented tight and granular access controls
- Established comprehensive audit logging
- Integrated secure authentication methods
- Enhanced security through centralized management
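
Granular access controls and audit logging fit together naturally: every permission check can leave a record, whether it was allowed or denied. The following Python sketch shows one hypothetical way to model that; the roles, environments, and class name are assumptions for illustration and do not describe Infisical's internals or Hugging Face's actual policies.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict, List, Set, Tuple


@dataclass
class AccessPolicy:
    """Hypothetical role-to-permission mapping with a built-in audit trail."""

    # role -> set of (environment, action) pairs the role may perform
    grants: Dict[str, Set[Tuple[str, str]]]
    audit_log: List[Dict[str, object]] = field(default_factory=list)

    def check(self, user: str, role: str, environment: str, action: str) -> bool:
        allowed = (environment, action) in self.grants.get(role, set())
        # Record every decision, including denials, so access can be reviewed later.
        self.audit_log.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "user": user,
            "role": role,
            "environment": environment,
            "action": action,
            "allowed": allowed,
        })
        return allowed


policy = AccessPolicy(grants={
    "developer": {("dev", "read"), ("dev", "write"), ("staging", "read")},
    "sre": {("dev", "read"), ("staging", "read"), ("prod", "read"), ("prod", "write")},
})

print(policy.check("alice", "developer", "dev", "write"))  # True
print(policy.check("alice", "developer", "prod", "read"))  # False
print(len(policy.audit_log))                               # 2
```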

### Security Culture Enhancement

Finally, the implementation helped foster better security practices by:

- Enabling secure secret sharing via encrypted channels
- Promoting responsible coding practices
- Implementing permission-based access controls

## Technical Insights
As noted by **Adrien Carreira**, Head of Infrastructure at Hugging Face:

> "Infisical provided all the functionality and security settings we needed to boost our security posture and save engineering time. Whether you're working locally, running kubernetes clusters in production, or operating secrets within CI/CD pipelines, Infisical has a seamless prebuilt workflow."

The implementation demonstrated that proper secrets management can enhance security and developer productivity at the same time, a rare combination in infrastructure tooling.

## Resources
For teams looking to implement similar solutions:

- [Platform Documentation](https://infisical.com/docs/documentation/platform/organization)
- [CLI Reference](https://infisical.com/docs/cli/overview)
- [Kubernetes Integration Guide](https://infisical.com/docs/integrations/platforms/kubernetes/overview)
- [Secret Reference Documentation](https://infisical.com/docs/documentation/platform/secret-reference)

---

*This technical case study was adapted from the original customer story published at [infisical.com/customers/hugging-face](https://infisical.com/customers/hugging-face).*