New Hires, Lost Keys & Lessons Learned (Passwordless Authentication Series, #3)

Editor’s Note: This is the third and final post in the Passwordless Authentication Series, which shares insights from our journey on enforcing FIDO2 authentication via hardware authenticators (YubiKeys) across all of Palantir. The first post focuses on hardware selection and logistics; the second post covers technical controls, rollout, and edge cases. While Palantir has enforced mandatory strong multi-factor authentication for well over a decade, hardware-backed authentication using FIDO2 represents the strongest form of modern authentication available.

New Hire Onboarding

One of the first issues that organizations face after moving to passwordless authentication is creating a seamless new hire experience. Regardless of which identity provider your organization chooses, you’ll need one that enables new users to enroll quickly and begin using their FIDO2 device for authentication without having to rely on a traditional password. You’ll find that this quickly leads to a classic “chicken and egg” problem, where a new user requires a registered FIDO2 authenticator in order to add a new FIDO2 authenticator to their account.

There’s a lot of variance in how different identity providers handle this problem. For instance, Okta addresses this by giving admins a console where they can log in and manually add a FIDO2 authenticator to a specific user’s account. While this may initially appear to be a seamless solution, it quickly runs into scalability concerns, since an admin must manually enroll a FIDO2 authenticator for each user in every incoming new hire cohort. For some organizations, these onboarding groups can be hundreds of users in size! There’s also the problem of remote work becoming more prevalent across the industry. Who’s on the hook for shipping these FIDO2 devices to each remote user? How do you handle the potential for lag time or shipments getting lost in the mail prior to onboarding?

At Palantir, we use Azure Active Directory (Azure AD) as our primary identity provider for all user authentication. Although there are multiple ways to address this registration problem via automation, our guidance on rolling out FIDO2 authentication across your organization emphasizes low barrier to entry. As a result, this post primarily focuses on how the new hire registration problem is addressed within the context of Azure’s built-in authentication features. This integrated solution that Azure provides today is called a temporary access pass (TAP) code.

TAP codes

TAP codes act as a time-limited token that can allow users in Azure to bypass authentication requirements, such as FIDO2 and Multi-Factor Authentication (MFA). Furthermore, these codes can be configured in the Azure Portal to vary in length, duration, and whether they’re single or multi-use.

TAP codes give users a way to register a FIDO2 device without needing an existing device registered to their account, thus solving the “chicken and egg” problem described earlier. When your support administrators can generate TAP codes for users during onboarding, new hires can use these bypass tokens for an initial sign-in to their account’s security page, where they can then register their FIDO2 security key. A limitation of the one-time use variation, however, is that the session token generated post-authentication is only valid for 10 minutes. As a result, once a TAP code has been used, the user has only 10 minutes to get their FIDO2 key set up with their account before they need a new TAP code. Good new user onboarding documentation and adequate on-site support can mitigate this concern and help the process run more smoothly and quickly.
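For teams that would rather script TAP issuance than click through the portal, the sketch below shows roughly what such a request could look like. It only builds the request and sends nothing. The endpoint path and the lifetimeInMinutes, isUsableOnce, and startDateTime fields reflect our understanding of the Microsoft Graph temporary access pass API and should be verified against current Microsoft documentation. The function name, the defaults, and the example user are hypothetical and do not describe Palantir’s tooling.

```python
from datetime import datetime, timezone
from typing import Optional

# Assumed Microsoft Graph base URL; verify the API version for your tenant.
GRAPH_BASE = "https://graph.microsoft.com/v1.0"


def build_tap_request(
    user_id: str,
    lifetime_minutes: int = 60,
    one_time: bool = True,
    start: Optional[datetime] = None,
) -> dict:
    """Describe the HTTP request that would create a TAP for one user.

    Nothing is sent; the caller is expected to add authentication and
    submit the request with whatever HTTP client they already use.
    """
    if lifetime_minutes <= 0:
        raise ValueError("lifetime_minutes must be positive")
    body = {"lifetimeInMinutes": lifetime_minutes, "isUsableOnce": one_time}
    if start is not None:
        if start.tzinfo is None:
            raise ValueError("start must be timezone-aware")
        # Timestamps are written as ISO 8601 in UTC.
        body["startDateTime"] = start.astimezone(timezone.utc).strftime(
            "%Y-%m-%dT%H:%M:%SZ"
        )
    return {
        "method": "POST",
        "url": f"{GRAPH_BASE}/users/{user_id}/authentication/temporaryAccessPassMethods",
        "body": body,
    }


# Hypothetical new hire: a one-time code valid for an hour.
request = build_tap_request("new.hire@example.com", lifetime_minutes=60)
```

Defaulting to one-time use and a short lifetime keeps the exposure window small. The code returned by the real API would still need to reach the new hire over a secure channel after their identity has been verified.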

While these TAP codes can be incredibly helpful for enforcing a passwordless authentication policy at your organization, the feature is not without its drawbacks. Failing to limit the scope of the TAP code configuration, lacking a Standard Operating Procedure (SOP) for your support teams to follow when verifying user identities, or lacking a secure channel for delivering these bypass tokens to end users can quickly lead to a security incident. It cannot be overstated that these TAP codes provide an authentication method that bypasses other security controls that you configure in Azure AD, including MFA, and should be handled with extreme care.

Configuration

To begin, navigate to the TAP feature within the Azure tenant.

Azure Active Directory > Security > Authentication Methods > Policies > Temporary Access Pass

You’ll then need to enable and scope TAP codes to the groups of users that should have access to the feature. For this example, the target is set to apply to all users in the Azure AD tenant.

Azure Active Directory > Security > Authentication Methods > Policies > Temporary Access Pass > Enable and Target

Lastly, and probably most importantly, you’ll need to decide on the configuration of your TAP codes. This will vary greatly based on the security posture of your organization and the workflows you’re trying to support with this feature. Due to the security authentication bypass features that TAP codes provide, we encourage restricting these to one-time use and setting the maximum lifetime (validity period of the TAP code once generated) to be as restrictive as possible.

Azure Active Directory > Security > Authentication Methods > Policies > Temporary Access Pass > Configure
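The same settings can also be captured in code so they can be reviewed before anyone applies them. The sketch below is one way to do that, under stated assumptions: the property names and the @odata.type value reflect our understanding of the Microsoft Graph authentication methods policy for Temporary Access Pass and should be checked against current documentation. The TapPolicy class, its default values, and the 8-hour review threshold are hypothetical choices for illustration, not recommended values.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class TapPolicy:
    """Illustrative container for tenant-wide TAP settings (hypothetical)."""

    one_time_use: bool = True
    min_lifetime_minutes: int = 60
    default_lifetime_minutes: int = 60
    max_lifetime_minutes: int = 480
    code_length: int = 8

    def __post_init__(self) -> None:
        # Only internal consistency is checked here; Azure enforces its own limits.
        if not (
            0
            < self.min_lifetime_minutes
            <= self.default_lifetime_minutes
            <= self.max_lifetime_minutes
        ):
            raise ValueError("lifetimes must satisfy 0 < min <= default <= max")

    def to_graph_body(self) -> dict:
        """Body for a policy update request (assumed property names)."""
        return {
            "@odata.type": "#microsoft.graph.temporaryAccessPassAuthenticationMethodConfiguration",
            "state": "enabled",
            "isUsableOnce": self.one_time_use,
            "minimumLifetimeInMinutes": self.min_lifetime_minutes,
            "defaultLifetimeInMinutes": self.default_lifetime_minutes,
            "maximumLifetimeInMinutes": self.max_lifetime_minutes,
            "defaultLength": self.code_length,
        }

    def review_notes(self, max_ok_minutes: int = 480) -> List[str]:
        """Flag settings that loosen the restrictive posture encouraged above."""
        notes = []
        if not self.one_time_use:
            notes.append("multi-use codes are allowed")
        if self.max_lifetime_minutes > max_ok_minutes:
            notes.append(f"maximum lifetime exceeds {max_ok_minutes} minutes")
        return notes


strict = TapPolicy()
loose = TapPolicy(one_time_use=False, max_lifetime_minutes=1440)
```

Keeping the policy in a reviewable form like this makes it harder for a loosened setting, such as multi-use codes or a long maximum lifetime, to slip in unnoticed.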

Lost or Broken Authenticators

Another common issue organizations run into when relying on hardware tokens for authentication is users losing their authenticator. There are many scenarios that can lead to a user being locked out of their account, including leaving their authenticator at home, being in a location where they can’t have their authenticator physically on them, or having the authenticator break completely. It’s important to have a strong SOP for addressing each of these scenarios in order to get users back online as quickly and securely as possible.

Having a way to selectively exclude users from your FIDO2 requirement policy is going to be really key (no pun intended) for unblocking your users in the event they lose their FIDO2 device. One way to do this is via rollback groups: after verifying a user’s identity, your support admins can place users in a group that reverts them to traditional username and password with strong MFA. We suggest having some level of automation that removes users from this exception group after a specific amount of time (e.g., after 7 days) so you don’t have users perpetually excluded from your FIDO2 authentication policies. You can even take it a step further and set up a chat bot that pings users when they’re added to the group and notifies them when their FIDO2 exception is set to expire.
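The expiry logic behind such automation can be quite small. The following is a minimal sketch, assuming a 7-day window measured from when each user was added and a reminder one day before expiry. The ExceptionEntry record and the sweep function are hypothetical names. Fetching group membership from your identity provider, removing users, and sending chat notifications are deliberately left out.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class ExceptionEntry:
    """One user in the FIDO2 exception group (hypothetical record shape)."""

    user: str
    added_at: datetime  # timezone-aware time the exception was granted


def sweep(
    entries: Iterable[ExceptionEntry],
    now: datetime,
    max_age: timedelta = timedelta(days=7),
    warn_before: timedelta = timedelta(days=1),
) -> Tuple[List[str], List[str]]:
    """Return (users to remove, users to remind) as of `now`."""
    to_remove: List[str] = []
    to_warn: List[str] = []
    for entry in entries:
        expires_at = entry.added_at + max_age
        if now >= expires_at:
            to_remove.append(entry.user)  # exception has run out
        elif now >= expires_at - warn_before:
            to_warn.append(entry.user)  # expiring within the warning window
    return to_remove, to_warn


# Made-up users and dates for illustration.
now = datetime(2024, 1, 10, 12, 0, tzinfo=timezone.utc)
group = [
    ExceptionEntry("alice", datetime(2024, 1, 2, tzinfo=timezone.utc)),
    ExceptionEntry("bob", datetime(2024, 1, 4, tzinfo=timezone.utc)),
    ExceptionEntry("carol", datetime(2024, 1, 9, tzinfo=timezone.utc)),
]
expired, expiring_soon = sweep(group, now)
```

Running a sweep like this on a schedule, and feeding the two lists to your group-management and chat tooling, gives each user a fixed exception window regardless of when the job happens to run.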

Another way to address lost or broken FIDO2 authenticators is via multi-use TAP codes. Multi-use TAP codes provide users with a randomly generated code that can be used for the lifetime of the token to bypass FIDO2 and MFA controls to log into their account. As the administrator, you can configure the length of the code and how long the token for bypassing authentication requirements should remain valid. As previously mentioned, these TAP codes bypass all authentication policies, and are therefore inherently less secure than a traditional “password with MFA” authentication method.

If you want to enable multi-use TAP codes, you’ll need to adjust your TAP settings slightly by removing the “Require one-time use” restriction.

You should carefully consider the security pros and cons of whatever strategy you decide to implement for FIDO2 bypass. Whether you go with TAP codes, an exclusion group, or some other method, you’ll need a way to verify the identity of the users requesting the FIDO2 authentication exception. If you have a large enough support team, then you may be able to get away with manually verifying the identity of each user. However, there are also products such as Berbix that can automate ID validation and take it off your admins’ hands entirely, drastically decreasing the manual effort required of your support team each time a user in your organization needs a FIDO2 exception.

While this may seem like it reduces the effectiveness of your FIDO2 authentication policy, it’s quite the contrary. Users will lose their authentication key, and without a method to unblock users who lose their security keys, your FIDO2 rollout will likely fail. It’s imperative that you have a solid game plan that addresses what to do when users lose their authentication key and get locked out of their accounts.

Another aspect of passwordless rollout you should consider implementing is providing your users with multiple FIDO2 security keys. Palantir encourages all users to procure two security keys so they have a backup in case one breaks or is lost or stolen. You can also suggest that users get two different form-factors of FIDO2 devices (e.g., one USB-C and one USB-A) to mitigate any compatibility issues between their ports and the FIDO2 security key.

It’s also a good idea to keep a stock of FIDO2 security keys in your office locations so users can more easily obtain a new key, as opposed to users needing to order and wait for one to arrive via shipping.

Make sure your instructions are very clear on how your users can obtain new FIDO2 security keys and what to do if a key is lost or stolen. FIDO2 bypass requests should be a last resort when users are not in an office and otherwise have no way to quickly obtain an approved FIDO2 security key without waiting for a new device to be shipped to them.

User Education

User education is one of the most easily overlooked aspects of a passwordless authentication policy — and probably the most important when determining the overall success of a FIDO2 rollout. Without comprehensive and accurate instructions, users will flounder with this new technology and you’re likely to run into significant roadblocks during passwordless enforcement, with the potential of the friction halting your rollout in its entirety!

When developing user guidance, make sure to cover the basics of how to actually operate the FIDO2 key selected for implementation in your organization’s rollout. In the context of YubiKeys, this should involve graphics and (preferably) short video clips on how to insert the key into your device and where to touch the metal contacts on different key form-factors. While it may sound simple, it’s critical that you think through friction with this technology from the start. Unlike passwords, this is an entirely new form of authentication that users are not likely to understand. As a result, your users may be confused about how a FIDO2 device with a PIN is more secure than a password with MFA, so it’s a good idea to provide some high-level documents on the security enhancements of this technology. Providing users with basic knowledge about FIDO2 and how it works has the added benefit of promoting buy-in and should help adoption across your organization.

Finally, we recommend providing documentation that differs in scope and complexity for users of varying technical ability. For example, one set of documentation can revolve around the basic end user by providing a base set of instructions covering how to order a key, usage of the key for authentication, and replacement of the security key in the case of loss. For admins and more technical users, this can be coupled with advanced documentation that covers how the FIDO2 technology works and other features that can be used, such as OpenPGP code signing for developers. To better support users for whom FIDO2 may be more of a struggle, we suggest having office hours or walk-in meetings where users can ask questions and get clarification.

Lessons Learned

The road to passwordless authentication at Palantir was not easy. It required our users to change their thinking about how authentication works and — with the technology being relatively new — we had to respond to seemingly never-ending edge cases associated with moving to full passwordless enforcement. While we had A LOT of lessons learned from this process, below are some of the largest gaps, “gotchas,” and gripes we encountered along the way.

OATH tokens. If your organization has users that operate within a Sensitive Compartmented Information Facility (SCIF), then you’ll have to provide a workflow that allows users to bypass FIDO2 authentication, as most SCIFs do not allow users to bring in FIDO2 hardware devices and are even less likely to allow users to insert a security key into a computer on-site. As a result, you may attempt to bridge that authentication gap by relying on FIDO2 exception workflows with traditional password + OATH tokens for MFA. Depending on your identity provider, your experience with supporting OATH token workflows will vary. Azure, for example, doesn’t have great support features for provisioning OATH tokens and, as of writing, each OATH token has to be manually loaded into Azure and associated with an end user. Microsoft is working on some new APIs to expand the functionality of OATH token registrations to user accounts, but until those features exist you have to manually convert Portable Symmetric Key Container (PSKC) files into .csv format and upload them to Azure, which is very tedious. You’ll also be hard pressed to find highly reputable OATH token brands outside of major parts suppliers, and you’re likely to find they’re incredibly difficult to source. The best suggestion we have is not to support OATH token workflows unless absolutely necessary.
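If you do end up supporting OATH tokens, the PSKC-to-CSV conversion is a natural candidate for a small script. The sketch below illustrates the idea under several assumptions, and is not the process we used. It assumes the PSKC file follows RFC 6030 and stores secrets as unencrypted Base64 plain values (many vendors ship encrypted files, which this does not handle). It assumes the upload expects the columns upn, serial number, secret key, time interval, manufacturer, model with a Base32 secret, which should be confirmed against current Azure documentation. It also assumes you maintain your own mapping from token serial number to user. The function names are hypothetical.

```python
import base64
import csv
import io
import xml.etree.ElementTree as ET
from typing import Dict

# PSKC XML namespace defined by RFC 6030.
NS = {"p": "urn:ietf:params:xml:ns:keyprov:pskc"}
# Assumed upload header; confirm against current Azure documentation.
CSV_HEADER = ["upn", "serial number", "secret key", "time interval", "manufacturer", "model"]


def _text(package: ET.Element, path: str, default: str = "") -> str:
    """Read a child element's text, tolerating missing or empty elements."""
    return (package.findtext(path, default=default, namespaces=NS) or "").strip()


def pskc_to_csv(pskc_xml: str, serial_to_upn: Dict[str, str]) -> str:
    """Convert plain-value PSKC key packages into CSV text."""
    root = ET.fromstring(pskc_xml)
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(CSV_HEADER)
    for package in root.findall("p:KeyPackage", NS):
        serial = _text(package, "p:DeviceInfo/p:SerialNo")
        if serial not in serial_to_upn:
            raise ValueError(f"no user assigned to token {serial!r}")
        secret_b64 = _text(package, "p:Key/p:Data/p:Secret/p:PlainValue")
        if not secret_b64:
            raise ValueError(f"token {serial!r} has no plain-value secret (encrypted?)")
        # PSKC stores the seed as Base64; the CSV is assumed to want Base32.
        secret_b32 = base64.b32encode(base64.b64decode(secret_b64)).decode("ascii")
        writer.writerow([
            serial_to_upn[serial],
            serial,
            secret_b32,
            _text(package, "p:Key/p:Data/p:TimeInterval/p:PlainValue", "30"),
            _text(package, "p:DeviceInfo/p:Manufacturer"),
            _text(package, "p:DeviceInfo/p:Model"),
        ])
    return out.getvalue()
```

Both the input and the output contain token seeds, so a script like this should run somewhere you would be comfortable storing credentials, and the resulting file should be deleted once the upload is complete.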

Shipping logistics. Getting a YubiKey in the hands of every employee, spread out around the world, during a global pandemic was incredibly challenging. It was not uncommon for keys to be confiscated by border officers when shipping to other countries or for packages to get lost in the mail entirely. Unfortunately, there’s not a lot within your control when shipping keys to your users. Our best advice is to offer keys in your offices and give users a lot of leeway when obtaining their FIDO2 security key prior to enforcement.

Edge cases. In the process of the YubiKey rollout, we encountered challenging edge cases, discussed in the second blog post in this series. Unfortunately, there’s no easy way to detect all of your organization’s applications that don’t support FIDO2 authentication until you’ve expanded to enough users for those edge cases to crop up. We recommend keeping your FIDO2 rollout groups to a reasonable size and building in lag time between each group enforcement so that issues can be identified.

Documentation. Palantir uses YubiKey modules, such as the OpenPGP module for code signing, in addition to the FIDO2 module for authentication. This caused a lot of friction for users, as they weren’t sure how the modules were different or the nuances between them. You can reduce this type of friction for end users by providing very thorough documentation and debugging instructions for each module.

Usability. When deciding which form factors of FIDO2 devices to offer your users, consider avoiding anything that’s non-nano. During rollout at Palantir, we found that non-nano form factors tended to break (e.g., a user snapping the key while it was plugged into their device) more frequently than their smaller form-factor counterparts. Also, providing USB-C devices to users with a laptop that has only one USB-C port leads to lost keys when users remove the FIDO2 device to insert their other peripherals. Lastly, if users aren’t required to use their FIDO2 key right away, they’re more likely to forget their unique PIN. This caused a lot of pain when YubiKey enforcement began and users needed to reset the PIN, which had the effect of nuking all FIDO2 key credential information and requiring re-registration of the device.

Final Thoughts

FIDO2 is still a relatively new technology, and as such a lot of the challenges faced during rollout were exacerbated by a lack of support for passwordless authentication across different operating systems and thick apps. Luckily, most of the users at Palantir were incredibly supportive and patient throughout the transition from traditional passwords to strong, FIDO2-backed authentication.

While moving to passwordless authentication has drastically reduced the risk of Account Takeover (ATO) at Palantir, it’s critical to keep in mind that, like all things in security, there is no silver bullet. Additional security controls, such as strong endpoint management, telemetry, and logging, need to be layered in with FIDO2 authentication in order to have a coherent and strong security posture in your organization.

Our hope is that this blog series not only emphasizes the importance of transitioning away from traditional, password-based authentication, but also brings to light some of the imperfections of FIDO2. While passwordless rollout requires an immense amount of company resources and time, it cannot be overstated that, if done correctly, FIDO2 offers the single greatest control for reducing ATO risk that most organizations face today.

Authors

Chris Dunn and Kimmy Richardson, Palantir Information Security (PalSec)


New Hires, Lost Keys & Lessons Learned (Passwordless Authentication Series, #3) was originally published in Palantir Blog on Medium.