Last week, CISA quietly released its Cloud Security Technical Reference Architecture. Not too many people noticed, and the thing is 70 pages long.
There are quite a few different frameworks being released lately. So I thought I’d do you a service and write a quick summary of cloud infrastructure security lessons that regular (non-government) folk can all take from the report.
TL;DR: the document describes ways that federal agencies can move to the public cloud while adopting a modern security posture that includes centralized IAM, cloud security posture management (CSPM), and zero-trust architectures. There are a few interesting nuggets that I’ve picked out for your perusal.
Let’s get started. First: the government is moving to the public cloud? Gasp! Here’s why CISA says a federal cloud migration is needed:
“Migrating to the cloud can help agencies keep pace with the evolving technology landscape by improving both their operations and their security.”
In other words, if the government builds its own parallel clouds, the way it tried to build its own parallel internet, it will fall behind and eventually get hacked. (We actually have the SIPRNet and the NIPRNet, which are parallel internets to the regular Internet. I always found the existence of these things super odd…)
Bottom line — the government should complete a public cloud migration in order to gain the benefits of modern commercial technologies.
(This is also a good argument for why the government shouldn’t ask for backdoors to be placed in commercial technologies: the government ends up using those same technologies, and it probably doesn’t want to backdoor itself. But I digress.)
SaaS vs. Self-Hosted Solutions?
The meat of the document starts by covering the definition of SaaS, PaaS and IaaS models for delivering cloud services.
It’s interesting that the document doesn’t rule out using SaaS services. My knee-jerk reaction is to expect the government to want to self-host everything when it comes to cloud infrastructure security. But the document is not actually pushing agencies in that direction. Instead it says:
Agencies must clearly identify and understand the delineation of responsibilities between themselves and their [Cloud Service Provider] CSP.
Agencies should carefully set up service level agreements (SLA) to define expectations and responsibilities with each of their CSPs. Agencies may find that they need to change their security posture to stay current with their CSP(s) as they update service offerings. Agencies should ensure that they properly understand the security posture of their elected CSP(s) both initially and continuously over time.
Federal Cloud Migration Leans Towards Cloud Agnostic Tools
Cloud security teams are often torn about whether to build or buy a tool that works across all of their cloud providers vs. one that works only with a specific cloud. Think, for instance, about using a cloud-specific service like KMS instead of HashiCorp Vault, or AWS SSM instead of SSH. CISA says:
Where possible, agencies should use security tools that can work across multiple CSPs. … It is important to find parity in the security information between the different cloud offerings an agency uses. … When operating in a multi-cloud environment, agencies should be cognizant of the potential for vendor lock-in.
In other words, a cloud agnostic approach is best.
Use SSO Wherever Possible, But It Can Be Self Hosted.
If you’re following these government memos, you’ll have noticed by now that the Cloud Security Technical Reference Architecture document is far from the first call for the use of a centralized identity provider and SSO. Interestingly, this document is neutral on whether to use a SaaS-based solution or a self-hosted one.
[Agencies] must consider the implications associated with where their identity provider will reside (e.g., on-premises, in a CSP — if they have more than one, which CSP will host the identity provider). Agencies should implement the strongest security features wherever possible such as implementing phishing-resistant multi-factor authentication (MFA), and they should consider when to use convenience features like single sign-on.
Why is this interesting? Self-hosted SSOs like AD are a very popular target for attackers. In fact, CISA says so itself in another document it released on Advanced Persistent Threats (APTs). Basically, it points to APTs that frequently hack into a victim’s environment, compromise the AD server, steal the SAML signing key, and use that key to issue themselves credentials for whatever they want to access. Here’s what CISA says in that other report:
… the adversary is … compromising the SAML signing certificate using their escalated Active Directory privileges. Once this is accomplished, the adversary creates unauthorized but valid tokens and presents them to services that trust SAML tokens from the environment. These tokens can then be used to access resources in hosted environments, such as email, for data exfiltration via authorized APIs. During the persistence phase, the additional credentials being attached to service principals obfuscates the activity of user objects, because they appear to be accessed by the individual, and such individual access is normal and not logged in all M365 licensing levels.
The current best solution for this problem is to move from self-hosted AD to Azure AD.
So it would have been nice to see CISA advocating for cloud-hosted SSO to avoid these sorts of attacks, potentially with a second independent root of trust for authentication, like what we do at BastionZero. The idea is that if one root of trust is compromised, the other is still there to protect you. But they aren’t pushing for that yet!
Developing a DevSecOps Mentality
This part of the report basically covers “what most modern cloud teams know they should be doing” (but maybe aren’t actually doing).
Specifically, CISA recommends using CI/CD, Infrastructure as Code (IaC), automated security testing, deployment to staging before pushing to production, and least privilege access for both human and machine accounts. Yes. Agree.
That said, let me pick out a few interesting nuggets:
“Source code management software can also enforce procedures for code review and code check in that further reduce human errors and add non-repudiation into the system.”
In other words, we should be able to attribute any changes in the code to an individual person, in a way that cannot be denied by that person. We are again tying back into the theme of software supply chain security and SBOM (software bill of materials) that we’ve seen all over the industry lately.
Here’s another interesting one:
Traditionally, separation of duties has been used to deter insider threats and catch innocent mistakes by requiring more than one individual to perform important tasks. An example is the team that does development and coding is separate from the team that does production deployment. This approach is in tension with DevSecOps since these responsibilities are now shared within a team. A replacement process is a two-person integrity check approach through code reviews.
In other words, the old-fashioned separation of duties will no longer work in a DevSecOps world.
The document describes pairwise code reviews as a solution, but in practice, there are other ways this can be done. One example is multi-party authorization to provide access to a resource or to launch a build process (that is, Bob asks Alice to give him permission to launch a build process or access a service). You can find this in various commercial tools, including soon in BastionZero.
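To make the Bob-and-Alice idea concrete, here is a minimal sketch of a multi-party authorization check. It is not taken from the CISA document or from any particular product; the `AccessRequest` class, its fields, and the `launch-build` action name are all made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class AccessRequest:
    """Hypothetical request that needs sign-off from someone other than the requester."""
    requester: str
    action: str
    approvals: set[str] = field(default_factory=set)

    def approve(self, approver: str) -> None:
        # A requester approving their own request would defeat the two-person check.
        if approver == self.requester:
            raise PermissionError("requester cannot approve their own request")
        self.approvals.add(approver)

    def is_authorized(self, required_approvals: int = 1) -> bool:
        # Authorized once enough distinct other people have approved.
        return len(self.approvals) >= required_approvals


# Bob asks for permission to launch a build; Alice signs off.
request = AccessRequest(requester="bob", action="launch-build")
print(request.is_authorized())  # False
request.approve("alice")
print(request.is_authorized())  # True
```

The key property is that no single person can both ask for and grant the access, which is the same goal the old separation of duties was serving.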
How Can Cloud Infrastructure Access Be More Secure?
Infrastructure access for developers, a topic dear to my heart, does get some airtime in the report. CISA says:
Agencies should ensure that each DevSecOps team member has sufficient privileges to do his or her job but no more privileges than what that user needs. The principle of least privilege right-sizes the scope and duration of access for each person to perform the duties of their tasks and roles. This helps minimize the risk of misuse by a malicious actor (internal or external) by limiting how they can elevate privileges or restricting possible movement.
So this means both just-in-time access (i.e., access for a short period of time) and granular access control (i.e., which role or account a given user can access on a given machine). Yes. Agree.
The risk of “privilege creep”, where developers gain access to some resources and then never lose it, is also covered:
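As a rough illustration of how those two ideas combine, here is a small sketch of a grant that is scoped to one user, one machine, and one role, and that expires on its own. The `Grant` class, the `issue_grant` and `is_allowed` helpers, and the example names (`prod-db-1`, `readonly`) are hypothetical; the report does not prescribe any implementation.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Grant:
    """Hypothetical access grant: granular (user, target, role) and time-bound."""
    user: str
    target: str          # machine or service being accessed
    role: str            # role or account on that target
    expires_at: datetime


def issue_grant(user, target, role, duration, now=None):
    # Just-in-time: the grant carries its own expiry instead of living forever.
    if now is None:
        now = datetime.now(timezone.utc)
    return Grant(user, target, role, now + duration)


def is_allowed(grants, user, target, role, now=None):
    # Granular: user, target, and role must all match, and the grant must be unexpired.
    if now is None:
        now = datetime.now(timezone.utc)
    return any(
        g.user == user and g.target == target and g.role == role and now < g.expires_at
        for g in grants
    )


start = datetime(2024, 1, 1, 9, 0, tzinfo=timezone.utc)
grants = [issue_grant("bob", "prod-db-1", "readonly", timedelta(hours=1), now=start)]

print(is_allowed(grants, "bob", "prod-db-1", "readonly", now=start + timedelta(minutes=30)))  # True
print(is_allowed(grants, "bob", "prod-db-1", "root", now=start + timedelta(minutes=30)))      # False
print(is_allowed(grants, "bob", "prod-db-1", "readonly", now=start + timedelta(hours=2)))     # False
```

Because access lapses by default, nobody has to remember to revoke it, which also helps with the privilege creep problem.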
The risk of ever-expanding roles can also be mitigated with other security best practices, like setting more granular access permissions across the team, and enforcing regular revocations of unneeded access.
Procedures for removing access when an employee leaves the team are also critical.
How should teams actually do this? The document doesn’t say, but a great way to manage offboarding when people leave a team is through integration with an identity provider (IdP) and the use of IdP groups. That is, add the user to a group, and then give the group permissions. Once the user leaves the team, remove them from the group so that they automatically lose all their permissions.
So yet another way that integrating with an SSO provider helps with cloud security!
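Here is a toy sketch of the group-based approach, assuming an in-memory stand-in for the IdP. The `Directory` class, the `platform-team` group, and the permission strings are all invented for illustration; a real setup would query your actual identity provider instead.

```python
class Directory:
    """Hypothetical in-memory stand-in for an IdP's groups and group permissions."""

    def __init__(self):
        self.members = {}      # group name -> set of users
        self.permissions = {}  # group name -> set of permission strings

    def add_to_group(self, user, group):
        self.members.setdefault(group, set()).add(user)

    def remove_from_group(self, user, group):
        self.members.get(group, set()).discard(user)

    def grant_to_group(self, group, permission):
        # Permissions attach to the group, never directly to a user.
        self.permissions.setdefault(group, set()).add(permission)

    def permissions_for(self, user):
        # A user's permissions are derived entirely from current group membership.
        result = set()
        for group, users in self.members.items():
            if user in users:
                result |= self.permissions.get(group, set())
        return result


idp = Directory()
idp.grant_to_group("platform-team", "prod-db:read")
idp.grant_to_group("platform-team", "build:launch")
idp.add_to_group("bob", "platform-team")
print(sorted(idp.permissions_for("bob")))  # ['build:launch', 'prod-db:read']

# Bob leaves the team: one group change removes everything.
idp.remove_from_group("bob", "platform-team")
print(sorted(idp.permissions_for("bob")))  # []
```

Since nothing is ever granted to Bob directly, there is no per-user cleanup to forget when he moves on.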
Cloud Security Posture Management (CSPM)
The document also embraces a new buzzword, CSPM, for which it provides a workable definition. CSPM is
“a continuous process of monitoring a cloud environment by identifying, alerting on, and mitigating cloud vulnerabilities; reducing risk; and improving cloud security.”
I have to confess that we’re currently at page 32 of the report and I’m starting to run out of steam, so let me just call your attention to a few points. If anyone is writing more extensively about this section, please let me know! I would love to read it.
First off, the memo states that CSPM covers what you would expect, like continuous monitoring and alerting when cloud resources or data protection mechanisms are out of compliance or have vulnerabilities, but also things like privilege and identity management.
Second, according to the memo, segmenting networks is an important security practice that falls under CSPM. Specifically:
Agencies should carefully manage the different authentication realms that they will use in their environments. An authentication realm is any unique form of authentication that allows a user, process, or system to access another process or system.
Authentication realms are important because, as I’ve noted before, the federal government is deprecating VPNs.
As agencies move into the cloud, their assets cannot be protected by this castle and moat paradigm. Agencies will likely operate in a multi-cloud environment where they have varied levels of control over perimeters. In IaaS environments, agencies will probably be able to emulate traditional network defenses and add action-based defenses that provide adversary detection through misinformation and redirection; such protections are likely unavailable in a SaaS environment.
Finally, logging and auditing. The document recommends collecting logs in a centralized location, putting them through a SIEM, and alerting on anomalous behaviors.
Summarizing the CISA Document
The document is, in some ways, gently nudging the federal government into the cloud. Agencies should use cloud agnostic tooling where possible, as well as centralized IAM and SSO that segments different environments. However, it’s still OK to self-host your identity provider (even though that is a known attack vector), and while SaaS solutions are OK to use, agencies should decide on a case-by-case basis whether to self-host commercial solutions. (This last point is so interesting that I wonder whether it will change in coming years.)
Finally, the old-fashioned separation of duties between backend developers and production engineers has disappeared; instead, agencies need to find more modern ways to build separation of duties into their processes (e.g., code reviews, multiparty authorization), to ensure code robustness, the ability to stop insider attacks, and cloud infrastructure security.