In the previous blog post, we discussed how to protect against the compromise of highly privileged human users. In this post, we discuss how to protect against the compromise of highly privileged non-human users, or workloads. As we discussed in the first blog post in this series, attackers are able to move freely across our complex software ecosystem. They often do this by compromising the credentials used for inter-workload or inter-service communication. The problem is made worse by the fact that these access credentials often carry excessive permissions.
One of the biggest problems here is that when attackers steal service-to-service access credentials, they can operate in complete stealth. If an attacker has to maintain a presence on a workload for a long time, the chances of being detected are high. But if they steal an access credential and use it to access the target service, we may have no way of detecting it, because we have no easy way of knowing whether the access comes from the legitimate workload or from an attacker.
In this blog post we will outline how the Procyon solution provides non-compromisable workload identities and eliminates the need for vulnerable secrets and other credentials. We will also discuss how to prevent the privilege sprawl associated with the workloads and services.
Types of workload credentials
Before we discuss how to prevent workload access credentials from getting compromised, let us identify different types of workload credentials used between workloads and services:
- Secrets like database passwords.
- Private keys like SSH .pem files.
- Simple bearer tokens like API access keys used between services (for example an AWS access key, a Stripe key or a Twilio access key). This remains the most popular form of access key used between services owned by different organizations.
- Service accounts like Azure or GCP service accounts used by a CI/CD service.
- Private keys used as workload identities. These are mostly used between microservices owned by the same organization, most commonly in the form of a service mesh.
- OAuth access tokens used between services. This form is increasingly used between services owned by different organizations. For example, a CI/CD service like GitHub Actions might use OAuth/OIDC for AWS or GCP API calls.
- Individual users authorizing a third-party application to access data in a SaaS application. This is used between SaaS services, for example when Dropbox gets access to your Microsoft 365 account.
When developers or devops teams implement a secret management strategy, they typically go in this order:
- Centralize management, rotation and access control of secrets using a secrets manager like HashiCorp Vault or a cloud provider’s secrets manager service. But this leads to the problem that accessing the secrets manager requires yet another access credential.
- Consolidate and centralize service account management so that the privileges and credentials associated with each service account are managed from a central console.
- Create workload identities, like service-mesh workload identities or the workload identity federation services from cloud providers, and grant specific workloads access to secrets instead of storing yet another access key in the workload. This leads to the problem that workload identities like private keys stored on the filesystem are compromisable, just like access keys.
- Use non-compromisable workload identities like cloud-provider instance identities or server-side TPMs, and grant these identities specific access. This form of access is the most secure and the most resistant to workload access credential theft.
In an ideal world, we would use non-compromisable workload identities for all access, without passing around secrets or compromisable credentials. In reality, simple bearer tokens and service account credentials are used everywhere because they offer a simpler developer experience. It’s much simpler to copy a token from the AWS web console and paste it into your code than to worry about workload identities and how to secure them.
Service account and API key lifecycle management
Centrally managing the lifecycle of service accounts and API keys is the first step in getting control of all the API keys in use. You need an inventory of all the API credentials being used. If any of them is compromised, you need a way to rotate it and propagate the new value to the workloads that were using it.
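As a rough illustration of the inventory-and-rotate idea, here is a minimal Python sketch. It is not Procyon's implementation: the `Credential` and `CredentialInventory` classes and all of their fields are hypothetical names chosen for this example, and the secret values are simply random strings held in memory.

```python
import secrets
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict, List


@dataclass
class Credential:
    """One inventoried API credential (all field names are illustrative)."""
    name: str
    workloads: List[str]  # workloads that consume this credential
    value: str            # the secret itself; a real system would not keep it in plain memory
    rotated_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


class CredentialInventory:
    """Hypothetical in-memory inventory of API credentials."""

    def __init__(self) -> None:
        self._items: Dict[str, Credential] = {}

    def register(self, name: str, workloads: List[str]) -> Credential:
        cred = Credential(name, list(workloads), secrets.token_urlsafe(32))
        self._items[name] = cred
        return cred

    def get(self, name: str) -> Credential:
        return self._items[name]

    def rotate(self, name: str) -> List[str]:
        """Replace the secret and return the workloads that need the new value."""
        cred = self._items[name]
        cred.value = secrets.token_urlsafe(32)
        cred.rotated_at = datetime.now(timezone.utc)
        return list(cred.workloads)
```

Calling `rotate("stripe-key")` on such an inventory would return the list of workloads registered for that key, which is the set a rotation workflow has to update.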
Procyon provides a way to manage the lifecycle of service accounts and API keys. It provides a self-service portal experience for developers to create and manage these service accounts across multiple cloud providers. Procyon’s identity analyser shows whether the privileges associated with a service account are excessive. It lets you visualize which privileges are being used versus which are provisioned, which allows administrators to right-size the permissions.
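The used-versus-provisioned comparison can be thought of as a set difference. The sketch below assumes you already have the two sets of permission names for a service account (for example from a policy document and from access logs). The `right_size` function is a hypothetical helper for illustration and says nothing about how Procyon's analyser works internally.

```python
from typing import Dict, Set


def right_size(provisioned: Set[str], used: Set[str]) -> Dict[str, Set[str]]:
    """Split a service account's permissions into those to keep and those to drop."""
    return {
        "keep": provisioned & used,    # granted and actually exercised
        "remove": provisioned - used,  # granted but never exercised
    }


# Example input: AWS-style action names, chosen purely for illustration.
report = right_size(
    provisioned={"s3:GetObject", "s3:PutObject", "s3:DeleteObject", "iam:PassRole"},
    used={"s3:GetObject", "s3:PutObject"},
)
print(sorted(report["remove"]))
```

Here the two permissions that were granted but never used end up in the `remove` set, which is the candidate list for an administrator to review.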
Workload identity federation
Workload identity federation is a better and more secure method than using long-lived API keys. In this model, every workload instance (be it a VM, container, bare-metal server or lambda function) has a unique identity. An identity provider or a certificate authority attests to the identity of the workload instances. When a workload wants to communicate with a service, it uses a protocol like OAuth or OIDC to get a short-lived access token from an authorization service. This has certain benefits:
- There is an inventory of all the workloads and who is accessing which service.
- There is an audit trail maintained of which workload requested access tokens for which service.
- The target service is aware of each workload instance’s identity. It can choose to grant it access or not based on that identity. It does not rely on a bearer token that could be shared by many workloads.
- The workload identity is associated with something like a private key, and this key never leaves the local filesystem.
- We can collaborate between workloads owned by different organizations in a standards-based way.
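To make the short-lived token idea concrete, here is a simplified Python sketch of an authorization service issuing a token for one workload and one audience, and a target service checking it. It deliberately uses only the standard library, so an HMAC with a shared demo key stands in for the asymmetric signatures a real OAuth/OIDC deployment would use. The function names, the `SIGNING_KEY` and the claim layout are assumptions made for this example, not a real token format.

```python
import base64
import hashlib
import hmac
import json
import time

SIGNING_KEY = b"demo-only-signing-key"  # hypothetical; real systems sign with a private key
AUDIT_LOG = []  # records (workload_id, audience, issued_at) for every token issued


def _b64(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).decode()


def issue_token(workload_id, audience, ttl_seconds=300, now=None):
    """Issue a token bound to one workload and one target service."""
    now = time.time() if now is None else now
    claims = {"sub": workload_id, "aud": audience, "exp": now + ttl_seconds}
    payload = json.dumps(claims, sort_keys=True).encode()
    signature = hmac.new(SIGNING_KEY, payload, hashlib.sha256).digest()
    AUDIT_LOG.append((workload_id, audience, now))
    return _b64(payload) + "." + _b64(signature)


def verify_token(token, audience, now=None):
    """Return the workload id if the token is valid for this audience, else None."""
    now = time.time() if now is None else now
    try:
        payload_b64, signature_b64 = token.split(".")
        payload = base64.urlsafe_b64decode(payload_b64)
        signature = base64.urlsafe_b64decode(signature_b64)
    except ValueError:  # wrong number of parts or invalid base64
        return None
    expected = hmac.new(SIGNING_KEY, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(signature, expected):
        return None
    claims = json.loads(payload)
    if claims["aud"] != audience or claims["exp"] <= now:
        return None
    return claims["sub"]
```

Because each token names a single workload and a single audience and expires after a few minutes, a stolen token is far less useful than a long-lived API key, and the audit log shows which workload asked for access to which service.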
Procyon’s workload identity solution normalizes workload identity federation across many different providers, including the big three cloud providers. This is a great way for on-prem CI/CD systems, or workloads on one cloud provider, to access the API of another cloud. The Procyon solution takes care of the details involved in creating workload identities, establishing trust relationships, and managing and rotating keys. It also provides the same self-service workflow for managing permissions and lifecycle.
Non-compromisable workload identities
The only way to prevent theft of workload secrets is to use non-compromisable workload identities and to grant access based on these identities. One example of a non-compromisable workload identity is AWS NitroTPM on EC2, which allows the workload to create non-extractable private keys in Nitro hardware. Another example is the instance identity provided by the cloud provider. This does require us to trust the hypervisors operated by the cloud provider, but that is a reasonable compromise. However, these types of non-compromisable workload identities are not available everywhere.
Procyon’s workload identity solution tries to solve this problem so that non-compromisable workload identities are available for all types of workloads, whether on-prem or in a cloud. A Procyon workload identity is created using a virtual-TPM technology. It leverages multi-party computation (MPC) so that an attacker cannot compromise the private key. One of the challenges of naively using MPC is that if attackers have access to the entire memory of a workload instance, they can simply clone the workload. They don’t necessarily have to steal the secret. We prevent this by running a consensus protocol that can detect workload cloning and invalidate the workload identity.
We make use of nodes called identity oracles. These can be hosted in the Procyon SaaS cloud or in the customer’s environment. The workload identity keys are split between the identity oracle nodes and the workload instance itself. Let’s say Alice is the oracle and Bob is the workload. If the attacker Eve wants to steal Bob’s identity, she needs to clone Bob’s state and hijack the consensus protocol between Alice and Bob. Think of it like hijacking a TCP connection between Alice and Bob. Eve can try to hijack the connection, but if she does, Bob will notice very quickly because his session will appear to be out of sync with Alice. From Alice’s perspective, it will look like there are two endpoints for the same session. As soon as we detect this, we can revoke the identity associated with Bob. So the guarantees we provide are:
- When the workload instance is compromised, the attacker cannot read the keys.
- When a workload identity is cloned, we detect it quickly and revoke it.
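The clone-detection guarantee can be illustrated with a toy model in Python. This is not the Procyon consensus protocol, whose details are not described here. It is a sketch under simple assumptions: the oracle (Alice) and the workload (Bob) share a bootstrap seed, both advance a hash ratchet on every heartbeat, and the oracle revokes the identity as soon as it sees a heartbeat from a stale position. The class names and the `seed` value are hypothetical.

```python
import copy
import hashlib
import hmac


def _ratchet(state: bytes) -> bytes:
    """Advance the shared state one step; old states cannot be derived from new ones."""
    return hashlib.sha256(b"ratchet" + state).digest()


def _proof(state: bytes) -> bytes:
    return hashlib.sha256(b"proof" + state).digest()


class Workload:
    """Bob: holds the rolling state and sends heartbeats."""

    def __init__(self, seed: bytes) -> None:
        self.state = seed
        self.step = 0

    def heartbeat(self):
        message = (self.step, _proof(self.state))
        self.state = _ratchet(self.state)
        self.step += 1
        return message


class Oracle:
    """Alice: tracks the expected step and revokes on any out-of-sync heartbeat."""

    def __init__(self, seed: bytes) -> None:
        self.state = seed
        self.step = 0
        self.revoked = False

    def receive(self, step: int, proof: bytes) -> bool:
        if self.revoked:
            return False
        in_sync = step == self.step and hmac.compare_digest(proof, _proof(self.state))
        if not in_sync:
            self.revoked = True  # two endpoints appear to share one session
            return False
        self.state = _ratchet(self.state)
        self.step += 1
        return True


seed = b"shared-bootstrap-secret"  # hypothetical bootstrap value
alice, bob = Oracle(seed), Workload(seed)
alice.receive(*bob.heartbeat())        # normal operation
eve = copy.deepcopy(bob)               # Eve clones Bob's full memory
alice.receive(*bob.heartbeat())        # Bob keeps running
print(alice.receive(*eve.heartbeat())) # Eve replays a stale step
print(alice.revoked)
```

In this toy model a lost heartbeat would also trigger revocation, so a production protocol would need to tolerate message loss and retries. The sketch only shows why a clone and the original cannot both stay in sync with the oracle.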
Bootstrapping workload identities
We need to make bootstrapping workload identities really easy for developers so that this secure solution can be used everywhere. From a security point of view, one additional step is to verify the integrity of the workload deployment and of the workload image before we grant the workload access. In the next blog post we will discuss how to verify the integrity of the deployment. A good practice is to maintain a chain of trust between the person or team that performed the workload deployment and the workload identity.
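One simple way to picture an image integrity check is a digest comparison before an identity is issued. The Python sketch below assumes a hypothetical allowlist that maps approved SHA-256 image digests to the team that deployed them. It is only an illustration of the chain-of-trust idea, not the mechanism covered in the next post.

```python
import hashlib
from typing import Dict, Optional


def image_digest(image_bytes: bytes) -> str:
    """SHA-256 digest of the workload image contents, as a hex string."""
    return hashlib.sha256(image_bytes).hexdigest()


def deployer_for_image(image_bytes: bytes, approved: Dict[str, str]) -> Optional[str]:
    """Return the team recorded for this exact image, or None if it is not approved.

    `approved` is a hypothetical allowlist: digest -> deploying team.
    """
    return approved.get(image_digest(image_bytes))


# Example: only the unmodified image maps back to a deploying team.
image = b"example image contents"
approved = {image_digest(image): "team-payments"}
print(deployer_for_image(image, approved))
print(deployer_for_image(image + b" tampered", approved))
```

A bootstrap service following this pattern would issue a workload identity only when the lookup returns a team, which ties the identity back to whoever deployed the image.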
Conclusion
Now that we have outlined how to securely establish workload identities, we can use them for all service-to-service communication.
We have outlined how to protect the workload credentials from getting compromised. We showed how to limit the excessive privileges associated with workloads. It is worth noting that the workload identity protection does not prevent the attacker from getting into the workload and creating a web shell into it. You would need the workload to be behind an IDWall and you would need a workload detection/protection solution for detecting a compromise.
The Procyon workload identity solution is still in beta trials. Please reach out to us if you want to participate in the early trials.
Please click here to read the next post in the blog series.