How we built it: a zero-trust architecture for cloud development environments

Christian Weichel / Co-Founder, CTO at Gitpod / Oct 2, 2024

On October 1, 2024, we launched Gitpod Flex, the first automation platform for zero-trust development environments. Zero trust means that no user, device, or network is automatically trusted, whether a request comes from inside or outside our architectural perimeter. Every request is authenticated, authorized, and continuously validated before access to applications and data is granted or maintained.

In this article, we go into technical detail on how we built the system to adhere to these principles, including:

Building around the ‘principals concept’ for users, runners, environments, and accounts.
Why our management plane is the only entity that can attest identities through JWTs.
How the system attaches identities to every action through Go contexts.
How and why we manage a multi-tenancy model using organizations.

All using Go, Postgres, gRPC/connect, ent, and testcontainers.

Gitpod architecture overview: runners and management plane

Before we get into the details of how our zero-trust and identity models work, let’s recap how the Gitpod architecture works. Gitpod Flex is a self-hosted cloud development environment and has two main components: the management plane and the development environment runners. This architecture guarantees isolation of sensitive assets like source code, secrets, or internal network access.

Gitpod Flex Architecture

Runners are single-tenant, self-hosted, flexible orchestrators of remote development environments. They are self-hosted in order to compartmentalize any sensitive information related to your development environments. Runners are responsible for the following operational tasks, both for environments and for themselves:

Scaling
Backup
Caching
Version updating

To reduce operational overhead, runners offload all non-sensitive administration responsibilities to the management plane.

An organization may have as many runners as needed and can deploy them in any region or availability zone to support remote teams in different timezones and to meet data sovereignty and other compliance requirements.

The management plane is hosted by Gitpod and serves as the central administrative hub. It:

Provides the management interface UI
Handles user authentication (including SSO and OAuth configurations)
Allows for setting policies and viewing system logs

Runners must connect to the management plane to receive commands like starting or stopping a development environment.

Building a zero-trust identity model for Gitpod Flex

The system recognizes four types of principals: users, runners, environments, and accounts. Each can act independently within the system. Environments have their own identity, allowing for proper attribution of their actions. Users are created when an account joins an organization (more on how organizations are handled later). A subject is composed of a principal type and an ID; combined with an organization ID, it becomes the identity that every action in the system is associated with.
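To make the vocabulary concrete, here is a minimal sketch of how principals, subjects, and identities could be modeled in Go. The type names, the validation rules, and the string format are assumptions made for illustration; they are not taken from the Gitpod codebase. The one rule borrowed from the article is that accounts live outside organizations while every other principal belongs to one.

```go
package main

import (
	"errors"
	"fmt"
)

// PrincipalType enumerates the four kinds of principals named in the article.
type PrincipalType string

const (
	PrincipalUser        PrincipalType = "user"
	PrincipalRunner      PrincipalType = "runner"
	PrincipalEnvironment PrincipalType = "environment"
	PrincipalAccount     PrincipalType = "account"
)

// Subject is a principal type plus an ID.
type Subject struct {
	Type PrincipalType
	ID   string
}

// Identity is a subject combined with an organization ID.
// Assumption: OrganizationID is empty only for accounts.
type Identity struct {
	Subject        Subject
	OrganizationID string
}

// NewIdentity builds an identity and rejects inconsistent combinations.
// The validation rules are assumptions for this sketch.
func NewIdentity(t PrincipalType, id, orgID string) (Identity, error) {
	if id == "" {
		return Identity{}, errors.New("principal ID must not be empty")
	}
	switch t {
	case PrincipalUser, PrincipalRunner, PrincipalEnvironment:
		if orgID == "" {
			return Identity{}, fmt.Errorf("%s requires an organization ID", t)
		}
	case PrincipalAccount:
		if orgID != "" {
			return Identity{}, errors.New("accounts exist outside organizations")
		}
	default:
		return Identity{}, fmt.Errorf("unknown principal type %q", t)
	}
	return Identity{Subject: Subject{Type: t, ID: id}, OrganizationID: orgID}, nil
}

// String renders the identity for logs; the format is hypothetical.
func (i Identity) String() string {
	return fmt.Sprintf("%s/%s/%s", i.OrganizationID, i.Subject.Type, i.Subject.ID)
}

func main() {
	env, err := NewIdentity(PrincipalEnvironment, "env-123", "org-a")
	if err != nil {
		panic(err)
	}
	fmt.Println(env)

	// A runner without an organization is rejected.
	_, err = NewIdentity(PrincipalRunner, "runner-1", "")
	fmt.Println(err)
}
```

Keeping the organization ID inside the identity value, rather than passing it separately, means any code that holds an identity also knows which tenant it belongs to.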

This model forms the foundation for implementing zero-trust principles by associating every action with a specific, granular identity. The system can enforce the core tenet of zero-trust: ‘never trust, always verify’. Each request must be authenticated and authorized based on its unique identity, eliminating implicit trust based on network location or principal type. Access decisions are made on a per-request basis with full context of who is making the request, from where, and for what resource, enabling us to implement fine-grained access controls, comprehensive audit logging, and least-privilege access policies across all interactions within the platform.
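One common way to associate every action with an identity in Go is to carry the identity on the request's context. The sketch below shows that pattern under stated assumptions: the Identity fields, the helper names (WithIdentity, IdentityFrom, handleRequest), and the stopEnvironment operation are all hypothetical and only illustrate the idea of attaching identities through Go contexts.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// Identity is a simplified stand-in: principal type, ID, and organization ID.
type Identity struct {
	PrincipalType  string
	PrincipalID    string
	OrganizationID string
}

// identityKey is an unexported key type so other packages cannot collide with it.
type identityKey struct{}

// WithIdentity returns a child context that carries the caller's identity.
func WithIdentity(ctx context.Context, id Identity) context.Context {
	return context.WithValue(ctx, identityKey{}, id)
}

// IdentityFrom extracts the identity, reporting whether one was attached.
func IdentityFrom(ctx context.Context) (Identity, bool) {
	id, ok := ctx.Value(identityKey{}).(Identity)
	return id, ok
}

// ErrUnauthenticated is returned when an operation runs without an identity.
var ErrUnauthenticated = errors.New("unauthenticated: no identity on context")

// stopEnvironment is a hypothetical operation. It refuses to run anonymously
// and attributes the action to the caller found on the context.
func stopEnvironment(ctx context.Context, envID string) (string, error) {
	caller, ok := IdentityFrom(ctx)
	if !ok {
		return "", ErrUnauthenticated
	}
	return fmt.Sprintf("%s %s (org %s) stopped %s",
		caller.PrincipalType, caller.PrincipalID, caller.OrganizationID, envID), nil
}

// handleRequest simulates the boundary where middleware would attach the
// verified identity before calling into business logic. A nil identity
// models a request that failed authentication.
func handleRequest(id *Identity, envID string) (string, error) {
	ctx := context.Background()
	if id != nil {
		ctx = WithIdentity(ctx, *id)
	}
	return stopEnvironment(ctx, envID)
}

func main() {
	user := Identity{PrincipalType: "user", PrincipalID: "u-42", OrganizationID: "org-a"}
	out, err := handleRequest(&user, "env-123")
	fmt.Println(out, err)

	_, err = handleRequest(nil, "env-123")
	fmt.Println(err)
}
```

Because the lookup fails closed, an operation that is reached without an identity returns an error instead of silently running with implicit trust.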

How are authentication and authorization handled?

The authentication system in Gitpod’s architecture relies on JSON Web Tokens (JWTs) as bearer tokens. For authenticated requests, the system expects an ‘Authorization’ header carrying a bearer token, and every token the system issues is a JWT. These tokens are signed but not encrypted, as the information within them is not secret. The management plane is the only entity in the system authorized to issue tokens, and this centralization of token issuance helps maintain security and consistency. Tokens contain identity information, including principal type, ID, and organization ID, but no PII such as emails or names. For users logged into multiple organizations, an account-level access token is used, which we’ll cover in more detail later.
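The following sketch illustrates what "signed but not encrypted, issued only by the management plane" can look like using nothing but the Go standard library. It assumes an Ed25519 key pair (JWT algorithm EdDSA), a fixed token lifetime, and hypothetical claim names; the article does not say which signing algorithm, library, or claim layout Gitpod actually uses. The private key never leaves the ManagementPlane type, so any other component can verify a token but cannot mint one.

```go
package main

import (
	"crypto/ed25519"
	"crypto/rand"
	"encoding/base64"
	"encoding/json"
	"errors"
	"fmt"
	"strings"
	"time"
)

// Claims carries identity only, no PII. Field names are assumptions.
type Claims struct {
	PrincipalType  string `json:"principal_type"`
	PrincipalID    string `json:"sub"`
	OrganizationID string `json:"org_id,omitempty"`
	ExpiresAt      int64  `json:"exp"`
}

const tokenLifetime = 15 * time.Minute // assumed lifetime

var (
	enc    = base64.RawURLEncoding
	header = enc.EncodeToString([]byte(`{"alg":"EdDSA","typ":"JWT"}`))
)

// ManagementPlane is the only holder of the private signing key.
type ManagementPlane struct {
	priv ed25519.PrivateKey
	pub  ed25519.PublicKey
}

func NewManagementPlane() (*ManagementPlane, error) {
	pub, priv, err := ed25519.GenerateKey(rand.Reader)
	if err != nil {
		return nil, err
	}
	return &ManagementPlane{priv: priv, pub: pub}, nil
}

// PublicKey is what runners and other verifiers would be given.
func (m *ManagementPlane) PublicKey() ed25519.PublicKey { return m.pub }

// Issue signs the claims. The payload is base64url-encoded, not encrypted.
func (m *ManagementPlane) Issue(c Claims) (string, error) {
	c.ExpiresAt = time.Now().Add(tokenLifetime).Unix()
	payload, err := json.Marshal(c)
	if err != nil {
		return "", err
	}
	signingInput := header + "." + enc.EncodeToString(payload)
	sig := ed25519.Sign(m.priv, []byte(signingInput))
	return signingInput + "." + enc.EncodeToString(sig), nil
}

// Verify checks header, signature, and expiry before trusting any claim.
func Verify(pub ed25519.PublicKey, token string) (Claims, error) {
	var c Claims
	parts := strings.Split(token, ".")
	if len(parts) != 3 || parts[0] != header {
		return c, errors.New("malformed token or unexpected algorithm")
	}
	sig, err := enc.DecodeString(parts[2])
	if err != nil || !ed25519.Verify(pub, []byte(parts[0]+"."+parts[1]), sig) {
		return c, errors.New("invalid signature")
	}
	payload, err := enc.DecodeString(parts[1])
	if err != nil {
		return c, errors.New("malformed payload")
	}
	if err := json.Unmarshal(payload, &c); err != nil {
		return c, err
	}
	if time.Now().Unix() >= c.ExpiresAt {
		return c, errors.New("token expired")
	}
	return c, nil
}

func main() {
	mp, err := NewManagementPlane()
	if err != nil {
		panic(err)
	}
	token, err := mp.Issue(Claims{PrincipalType: "runner", PrincipalID: "r-1", OrganizationID: "org-a"})
	if err != nil {
		panic(err)
	}
	claims, err := Verify(mp.PublicKey(), "Bearer-less token check: "[:0]+token)
	fmt.Println(claims.PrincipalType, claims.PrincipalID, claims.OrganizationID, err)
}
```

In a real service the token would arrive in the ‘Authorization’ header after the "Bearer " prefix, and a maintained JWT library would normally replace the hand-rolled encoding shown here.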

This authentication model supports a zero-trust approach by ensuring that every request, regardless of its origin within the system, must present valid credentials. The use of JWTs allows for stateless authentication, while the centralization of token issuance to the management plane provides a single point of control for managing authentication across the system.

Building multi-tenancy into organizations

Our multi-tenancy approach allows Gitpod to securely support multiple organizations on the same infrastructure while maintaining strict isolation between them, and providing flexibility for users who need to work across multiple organizations. This design balances security and isolation with usability and flexibility for cross-organization workflows.

AWS is a good example of a system where you can only be logged into one account at a time. Those who have worked with AWS know how challenging this approach can be, though it does bring a stricter separation between accounts, as users are required to explicitly switch between them. Google, on the other hand, is an example of a system where you can be logged into multiple accounts simultaneously, allowing users to easily work across different workspaces or accounts without needing to log out and log back in. We chose to implement a system more similar to Google’s approach, allowing users to be logged into multiple organizations simultaneously and prioritizing user experience.

Multi-tenancy in Gitpod is implemented through a concept called ‘organizations’, which allows users to be logged into multiple organizations simultaneously. Each organization represents a separate tenant in the system. Everything in the system exists within the context of an organization, with one exception: accounts. Accounts are created when a person first authenticates with the system through a third-party login. An account represents an identity attested by a trusted system (e.g., Google login, GitHub login, email verification). Once an account joins an organization, a user entity is created for that account within the organization. Users are then accounts within the context of a specific organization. To prevent privilege escalation, a user cannot turn itself back into an account.

An account is similar to the root user in Linux, and a user in Gitpod is analogous to a regular Linux user. Like root, which has access to the entire system, an account has broader privileges and potential access to multiple organizations. Users have limited permissions, confined to their specific organization. In Linux, regular users can become root through controlled mechanisms like ‘sudo’, which requires additional authentication. In Gitpod’s system, however, there is no direct mechanism for a user to escalate to an account. This ensures that actions within an organization remain contained and can’t unexpectedly gain higher privileges.

Being logged into multiple organizations at once is achieved by using an account-level access token together with an additional header that specifies which organization (and therefore which user) the caller wants to act as. We use an ent mixin to ensure, at the DB query level, that organizations are properly isolated and identities cannot query information from other organizations.
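The two mechanisms in this paragraph can be sketched in plain Go. This is an illustration only: it uses an in-memory store as a stand-in for the ent mixin, and the Membership shape, ResolveUser, and Store names are hypothetical. The first function maps an account plus the organization named in the extra header to the user it may act as; the second shows a query path that is always scoped to the caller's organization.

```go
package main

import (
	"errors"
	"fmt"
)

// Membership maps accountID -> organizationID -> userID.
// The storage shape is an assumption for this sketch.
type Membership map[string]map[string]string

// Identity is the user an account acts as inside one organization.
type Identity struct {
	UserID         string
	OrganizationID string
}

var ErrNotMember = errors.New("account is not a member of the requested organization")

// ResolveUser takes the account from the (already verified) account-level
// token and the organization ID from the additional request header, and
// returns the organization-scoped user. It never returns the account itself.
func ResolveUser(m Membership, accountID, requestedOrg string) (Identity, error) {
	if requestedOrg == "" {
		return Identity{}, errors.New("missing organization header")
	}
	userID, ok := m[accountID][requestedOrg]
	if !ok {
		return Identity{}, ErrNotMember
	}
	return Identity{UserID: userID, OrganizationID: requestedOrg}, nil
}

// Environment is a tenant-owned record.
type Environment struct {
	ID             string
	OrganizationID string
}

// Store stands in for the database layer.
type Store struct {
	envs []Environment
}

// ListEnvironments filters by the caller's organization on every call.
// There is deliberately no unscoped variant of this query.
func (s *Store) ListEnvironments(caller Identity) []Environment {
	var out []Environment
	for _, e := range s.envs {
		if e.OrganizationID == caller.OrganizationID {
			out = append(out, e)
		}
	}
	return out
}

func main() {
	members := Membership{"acct-1": {"org-a": "user-a", "org-b": "user-b"}}
	store := &Store{envs: []Environment{
		{ID: "env-1", OrganizationID: "org-a"},
		{ID: "env-2", OrganizationID: "org-b"},
	}}

	caller, err := ResolveUser(members, "acct-1", "org-a")
	if err != nil {
		panic(err)
	}
	fmt.Println(caller, store.ListEnvironments(caller))

	_, err = ResolveUser(members, "acct-1", "org-c")
	fmt.Println(err)
}
```

Putting the organization filter inside the data-access layer, rather than in each handler, is what makes the isolation hard to forget; an ent mixin would apply the same idea to every generated query.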

The future of secure development is zero-trust environments

With thoughtful system design and clearly defined boundaries from the management plane to the runner architecture, we’ve created an infrastructure that embodies the core principles of ‘zero trust’. Every request, regardless of its origin, is subject to authentication and authorization, ensuring security is built into the system’s DNA. This foundation is a solution for today, but it’s also a springboard for future features and extensions of Gitpod that inherit the security-first approach that defines the system. Our ambition here is to significantly raise the bar on what it means to do secure yet collaborative software development.
