How Palantir Secures Source Control (Software Supply Chain Security Series, #3)

Editor’s Note: This is the third post in a series that shares insights from our journey improving our software supply chain security story across Palantir. This post focuses on our source control risk considerations and mitigations.

Zero-Trust Development

Software developers have many workflows for developing code. They may work locally on an endpoint, remotely on hosted developer machines, in test/dev environments, or on a myriad of other platforms. As the attack surface of build and development tools increases, so does the risk of source code being compromised. An attacker with access to a developer’s browser session, SSH key, or API key could introduce malicious code to circumvent security controls, exfiltrate data, or introduce bugs for later exploitation. The same attacks could be intentionally carried out by a malicious insider.

Protecting against these and other attacks against source control requires a layered, defense-in-depth approach.

Commit Signing

There are two common access patterns for committing and pushing code to our GitHub Enterprise (GHE) servers: via secure shell (SSH) keys and via the browser user interface. A third anti-pattern is also available: GHE access tokens, which we strongly discourage but do not outright prevent. If any of these credential materials are leaked or compromised, an adversary will have the opportunity to commit code as that user.

Our first layer of defense is git commit signing, which is a process for creating detached cryptographic signatures of stored git objects. We take commit signing several steps further by requiring that all commits are cryptographically attested by a trusted hardware key that requires physical presence. While not a panacea, this ultimately raises the bar for an attacker as they can no longer simply compromise a developer’s credentials or session. It also protects the commits at rest on the source control appliance by making the git tree cryptographically verifiable. The commits are also protected against man-in-the-middle (MitM) attacks in transit from the developer’s workstation to the remote repository.
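A signature over a single commit protects the whole tree beneath it because git object IDs are content-addressed: each commit names its tree and parent by hash, so tampering with any ancestor changes the signed tip. A minimal sketch of how those IDs are computed (this mirrors the classic SHA-1 `git hash-object` scheme; the example data is arbitrary):

```python
import hashlib

def git_object_id(obj_type: str, content: bytes) -> str:
    """Compute a git object ID: SHA-1 over '<type> <size>\\0<content>'."""
    header = f"{obj_type} {len(content)}".encode() + b"\x00"
    return hashlib.sha1(header + content).hexdigest()

# A blob's ID depends only on its bytes, matching `git hash-object`.
blob_id = git_object_id("blob", b"hello\n")

# A commit object embeds its tree and parent IDs in its content, so a
# detached signature over the tip commit transitively covers all history.
commit_body = (
    b"tree 4b825dc642cb6eb9a060e54bf8d69288fbee4904\n"
    b"author A Developer <dev@example.com> 0 +0000\n\n"
    b"initial commit\n"
)
commit_id = git_object_id("commit", commit_body)
```

Changing a single byte anywhere in an ancestor object changes every descendant ID, which is what makes the signed tree tamper-evident at rest.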

The hardware-backed signing key mitigates the risk of key exfiltration. As an additional measure, we require the hardware key to be physically touched to initiate a signing process. Even if an attacker somehow gained a foothold on a local developer machine and wanted to create their own commit, without the user touching the YubiKey the signing operation would not occur. We require that users register their commit signing key via a registration process that ensures several properties of the hardware and its configuration are true:

  1. The private key associated with the public key they wish to register is in fact stored on a YubiKey and cannot be exported.
  2. The YubiKey must be a specific FIPS-compliant model.
  3. The touch policy for signing operations must be enabled.

We do this by leveraging attestations generated by the YubiKey and requiring the user to upload them along with the associated public key for the signing key they wish to register to our custom key server. The key server requires users to authenticate via our AD passwordless infrastructure even if they already have a valid SSO session active. This ensures users must also attest to their identity via a YubiKey FIDO authentication flow before they can upload a trusted commit signing key. (See our Passwordless Authentication Series for more on our journey enforcing FIDO2 authentication via hardware authenticators.)

In order to enforce commit signing, we built our own pull request (PR) status check that pulls the trusted keys from the key server and uses that to check each commit signature. If all commit signatures are valid and signed by the users’ expected keys, then the check passes and allows the PR to be merged. This check also solves the issue of validating commits made by users via the GitHub UI. Since the GitHub appliance automatically signs these commits with its own key and there is no way for a user to sign them through the browser, our check requires users to manually review each UI commit and then go through a flow to re-authenticate using our passwordless authentication, which requires a YubiKey FIDO authentication flow. By doing this, we ensure non-repudiation by forcing a developer to acknowledge the UI-based changes attributed to them.
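The core logic of such a status check can be sketched as follows. The commit fields and the key-server lookup are illustrative assumptions, not Palantir's actual API:

```python
# Hedged sketch of a signature-checking PR status check: every commit must
# carry a valid signature made by a key registered to that commit's author.
def signatures_check_passes(commits, trusted_keys):
    """trusted_keys maps username -> set of registered key fingerprints,
    as pulled from the key server."""
    for commit in commits:
        registered = trusted_keys.get(commit["author"], set())
        if not commit["signature_valid"]:
            return False
        if commit["key_fingerprint"] not in registered:
            return False  # valid signature, but not by the author's trusted key
    return True

# Hardcoded for illustration; the real check queries the key server.
trusted_keys = {"alice": {"A1B2C3"}}
```

Note the second condition: a cryptographically valid signature is rejected if it was made by a key the key server has not bound to that author's identity, which is what closes the "attacker uploads their own key" gap.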

GitHub does have a built-in feature that enforces that all commits on a branch are signed. However, after testing, we found it wasn’t flexible or secure enough. There were several downsides. It requires that all commits on both the source and target branch be signed, meaning you have to retroactively sign every commit in a repository, which can be very painful for older repositories. It also allows users to upload arbitrary public keys to trust to their profiles, which does not guarantee the key is stored on a trusted, correctly configured hardware device or that an attacker cannot compromise the user’s credentials and upload a valid key. Finally, UI-based commits would always be valid, and we could not enforce additional review or authentication via a hardware-backed authenticator.

Code Review Enforcement

While commit signing makes it much harder to tamper with commits or meaningfully compromise a user’s source control account, it does not mitigate other scenarios in which insiders or advanced threat actors could introduce malicious code. Code review is an industry standard for a reason; the eyes of a strongly authenticated, trusted reviewer are one of the most robust ways to catch all types of vulnerable code, irrespective of the author’s intentions.

We enforce code reviews via another required PR status check in GHE called policy bot. Policy bot is a custom GitHub app developed to provide more flexible +1 policies, including the following features:

  • Require reviews from specific users, organizations, or teams.
  • Apply rules based on the files, authors, or branches involved in a pull request.
  • Combine multiple approval rules with AND and OR conditions.

One such rule is that we require all non-human committers and tag creators to sign their operations via KMS-backed keys with the public key IDs associated with their identities in the policy bot policies. This ensures that an attacker is unable to spoof the identity of any automated committer and helps us further strengthen our commit signing guarantees.
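The rule-combination features above can be sketched as small composable predicates. This is illustrative Python, not policy bot's actual policy language, and the names are hypothetical:

```python
# Each rule is a predicate over (approvers, changed_files).
def team_approval(team_members):
    """Require at least one approval from the given team."""
    return lambda approvers, files: bool(approvers & team_members)

def path_rule(prefix, inner):
    """Apply `inner` only when the PR touches files under `prefix`."""
    return lambda approvers, files: (
        inner(approvers, files)
        if any(f.startswith(prefix) for f in files)
        else True
    )

def all_of(*rules):  # AND-combination
    return lambda approvers, files: all(r(approvers, files) for r in rules)

def any_of(*rules):  # OR-combination
    return lambda approvers, files: any(r(approvers, files) for r in rules)

# Example policy: a core reviewer must always approve, and changes under
# security/ additionally need a security lead's +1.
policy = all_of(
    team_approval({"alice", "bob"}),
    path_rule("security/", team_approval({"sec-lead"})),
)
```

Real policy bot policies express the same ideas declaratively in a checked-in configuration file rather than in code.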

Scaling policy bot policies and ensuring that every repository is using a secure, vetted policy was difficult. Our current implementation stores all policy bot policies in a central repository, ensuring that policies are vetted and transparent, that changes require multiple +1s from developer leadership, and that any incorrect policy can be fixed in a single location rather than on a per-repository basis. Previously, each code repository had its own policy, which had several security deficiencies:

  • Developers were able to define arbitrary policies, including ones that did not require any reviews for changes.
  • There was no way to audit every policy or enforce a standard template.
  • If there was a common typo, error, or required change, it had to be made in thousands of policy files across our repositories.

GHE Permission Management

GitHub uses a role-based access control model with several predefined roles. The least permissive role, “reader,” allows users to view code, issues, PRs, and other repository metadata. The most permissive role, “owner,” grants full repository control, including the ability to enable or disable all security settings such as branch protection rules, secret scanning, and static code analysis. To maintain security, we explicitly prohibit developers from holding owner roles, as this would enable them to bypass security controls by disabling or overriding them, such as force merging PRs that have not passed all required status checks. Instead, developers are granted “maintainer” permissions, which allow them to approve PRs as defined by policy bot, view relevant metadata, and view security scan results, but not edit any security settings.
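The role split described above can be summarized as a capability map; this is a hypothetical simplification for illustration, not GitHub's permission model:

```python
# Hypothetical capability map reflecting the reader/maintainer/owner split.
ROLE_CAPS = {
    "reader":     {"view_code"},
    "maintainer": {"view_code", "approve_pr", "view_security_scans"},
    "owner":      {"view_code", "approve_pr", "view_security_scans",
                   "edit_security_settings", "force_merge"},
}

def can(role: str, action: str) -> bool:
    """Check whether a role grants a capability; unknown roles get nothing."""
    return action in ROLE_CAPS.get(role, set())
```

The security-relevant invariant is that no developer-held role includes `edit_security_settings` or `force_merge`.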

Development leadership and our developer tools team have admin permissions via special service admin accounts, which are separate from their normal user accounts. Accessing these accounts requires logging into secure, hardened, ephemeral bastions with MFA. This significantly reduces the risk that a compromised ordinary account could lead to the degradation of our source control security configuration.

Additionally, our GHE instance sets repositories to private by default. This means that even with a valid GHE login, users do not have automatic access to view all repositories. Broad view permissions are carefully managed via our onboarding and off-boarding processes to ensure that contractors or other non-developer employees do not receive unnecessary access to our source code.

Static Code Analysis

One of the most scalable ways to identify security-related bugs, whether introduced intentionally or unintentionally, is static code analysis. We use CodeQL to scan each PR and each release to identify vulnerabilities. Our rule sets are a combination of standard and custom rules, which are tuned to optimize build time performance while maintaining low false positive rates. We have invested extensive time in developing metric pipelines to measure these performance indicators and manage the various tradeoffs based on these measurements.

Deriving value from static code analysis is extremely challenging, especially when dealing with thousands of repositories and a broad set of languages and ecosystems. Issues such as false positives, slow builds, unnecessarily blocked PR merges, nonsensical findings, and a host of other problems can lead to developers losing trust in the tooling and results. Further, writing high-signal, performant custom rules requires significant experience. We have multiple Application Security (AppSec) engineers dedicated to maintaining our rules, responding to bugs, and monitoring the metrics. In a future post, we will dig into more detail about our static analysis implementation, but for now, it’s sufficient to mention it as part of our defense-in-depth approach to code and source control security.

Secure Release Flow

We use an internal tool called Autorelease to securely tag and release code. Autorelease is the only system that can create tags and trigger release builds, which we enforce via pre-receive Git hooks. Autorelease also enforces a set of rules dictating which branches can be tagged and which users can trigger the tag and release actions. This is critical to separating which builds can end up in production and which cannot. Here’s a detailed look at how we delineate production versus dev builds.

Within Artifactory, we maintain repositories for internal software production releases, which our deployment system, Apollo, uses to pull software for deployment. The enforcement of how these repositories are used within our deployment system, while a separate topic, involves internal controls ensuring compliance. Separately, we have pre-release and dev build repositories, installable only on non-production environments such as dev cloud instances or development and test environments that emulate production. To ensure code integrity and security, it’s critical we place guard rails around how software arrives in these different repositories. These guard rails ensure that only code satisfying our security controls ends up in production.

Autorelease serves as the first guard rail by ensuring that only protected branches can be tagged as production releases. These tags are then picked up by the build system and published to the production release repository in Artifactory. We also enable developers to ask Autorelease to generate a pre-release tag, which the build system will build and publish to our designated pre-release Artifactory repository. Autorelease also enforces permission checks on who can request a production release tag, limiting this ability to repository maintainers rather than any user with read or write access. Finally, Autorelease signs the tags it creates to protect the integrity of the tags and guarantee their authenticity, making the tags always auditable, such as by the build system.
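The gating above can be sketched as two small decision functions: a pre-receive rule restricting who may push tags, and a routing rule deciding which Artifactory repository a build may publish to. Identities, branch names, and repository names here are illustrative assumptions:

```python
# Hedged sketch of Autorelease-style tag gating; not Palantir's actual code.
AUTORELEASE_IDENTITY = "autorelease-bot"       # hypothetical service identity
PROTECTED_BRANCHES = {"main", "release"}       # illustrative branch set

def tag_push_allowed(pusher):
    """Pre-receive hook rule: only Autorelease may create tags."""
    return pusher == AUTORELEASE_IDENTITY

def release_repo_for(branch, requester_is_maintainer, prerelease):
    """Route a requested build to a repository, or reject it (None)."""
    if prerelease:
        return "pre-release"                   # installable only off-production
    if branch in PROTECTED_BRANCHES and requester_is_maintainer:
        return "production"                    # protected branch + maintainer
    return None                                # request rejected
```

Everything else (tag signing, build triggering) hangs off these decisions: because only the Autorelease identity can create tags, a signature check on a tag is equivalent to proof that these rules were applied.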

GitHub has historically had inflexible rules and weak protections around tags, which define release points in source control. By using Autorelease, we have established highly granular controls and custom logic to ensure that tag creation is auditable, secure, and matches our build and artifact storage system expectations for what is allowed to go to production.

What Risk Are We NOT Mitigating?

Commit signing is part of our defense-in-depth solution for protecting our code repositories and preventing unauthorized code introduction. Its efficacy is a perennially controversial topic, and it’s important to acknowledge what it cannot do. It is not a silver bullet and there are specific risks it cannot mitigate. Commit signing does not prevent malicious code from being committed. This can occur if an insider intentionally introduces malicious code or if a persistent attacker with access to a developer’s machine adds malicious code that the developer unknowingly commits as part of their normal changes. Commit signing only ensures that a Palantirian was physically present and did, in fact, perform the commit operation with their hardware key. To mitigate the risk of malicious code introduction, we rely on other controls such as code reviews and static analysis.

Additionally, tag signing is only valuable if the signature is checked and used to gate-keep the release or building of that tag. Signing alone guarantees nothing, so it is crucial to adapt your build, release, or deployment system to check for valid commit and tag signatures. It’s also critical to secure the paths for introducing trusted keys and where those keys live.

Conclusion

Source control security requires a multilayered approach that robustly authenticates code authors, ensures the integrity of the source, prevents unauthorized access, and identifies vulnerable code through both human and automated reviewers. Even with these advanced and strong controls, there remains a risk that a determined threat actor could introduce unauthorized code. We continuously test, validate, and improve our controls as part of our broader software supply chain security program. In our next post, we will delve into how we guarantee the integrity and authenticity of our software using in-toto and attestations.


How Palantir Secures Source Control was originally published in the Palantir Blog on Medium.