Rego 102: Combining queries with AND/OR and custom messages

November 9, 2023

This blog post series offers a gentle introduction to Rego, the policy language from the creators of the Open Policy Agent (OPA) engine. If you’re a beginner and want to get started with writing Rego policy as code, you’re in the right place.

In this three-part series, we’ll go over the following:

As a reminder, Rego is a declarative query language from the makers of the Open Policy Agent (OPA) framework. The Cloud Native Computing Foundation (CNCF) accepted OPA as an incubation-level hosted project in April 2019, and OPA graduated from incubating status in 2021.

Rego is used to write policy as code, which applies programming practices such as version control and modular design to evaluate cloud and infrastructure as code (IaC) resources. OPA is the engine that evaluates policy as code written in Rego. And Snyk uses the Rego language for custom rules.

Part 1 recap

In Part 1 of this blog post series, we explained that a Rego rule is a conditional assignment. A rule queries the input to find a match for a condition, and if a match is found, a value is assigned to a variable.

You can read a rule like this:

THIS VARIABLE := HAS THIS VALUE {
    IF THESE CONDITIONS ARE MET
}

Here's the example we used, which represents a corporate policy that only Alice, a network administrator, should have permission to create and delete virtual networks in the prod account:

allow := true {
  input.user == "alice"
}

OPA evaluates a JSON or YAML input document against a rule to produce a policy judgment. The input document below represents the currently logged-in user:

{
  "user": "alice"
}

If you use OPA to evaluate the input against this rule, it finds a match for the query input.user == "alice". Therefore, the variable in the rule head, allow, is assigned the value in the rule head, true. Here's the output proving this:

{
  "allow": true
}

OPA has delivered the decision that Alice, the currently logged-in user, is allowed to create and delete virtual networks in the prod account. The input is compliant with the rule.

AND and OR

So far, we've only shown rules with a single query. A rule can also contain multiple queries. If it does, the queries represent multiple conditions that must all be met in order for a variable to be assigned. There's an implicit AND — "This condition must be met AND this condition must be met."

For example, in the rule below, both input.user == "alice" AND input.environment == "prod" must be true in order for the variable allow to be assigned the value true:

allow := true {
  input.user == "alice"
  input.environment == "prod"
}
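
To see the implicit AND at work, here is a minimal sketch of the rule above as a complete policy file. The package name rules.check_user is an assumption inferred from the opa eval query path used in this post, and the sample inputs in the comments are hypothetical.

```rego
# Pre-1.0 Rego syntax, matching this post; OPA 1.0+ needs the --v0-compatible flag.
package rules.check_user # assumed package name

# Both expressions must hold for the same input, so the body acts as an implicit AND.
allow := true {
  input.user == "alice"
  input.environment == "prod"
}

# Hypothetical inputs and the value of allow for each:
#   {"user": "alice", "environment": "prod"}    -> true
#   {"user": "alice", "environment": "staging"} -> undefined (second expression fails)
#   {"user": "bob",   "environment": "prod"}    -> undefined (first expression fails)
```

A single failing expression is enough to leave allow unassigned, which is why only the first input yields a result.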

In some cases, OR might be more appropriate. You can represent OR by using the same head in multiple rules:

allow = true {
  input.user == "alice"
}

allow = true {
  input.user == "bob"
}

This set of rules can be read like so:

allow is true if user is "alice" OR if user is "bob".

Technically, the set of rules forms a single rule because the head is the same for both. Because you're defining this rule in multiple steps, it's called an incremental rule. 

If you like, you can get rid of the second head and put the bodies together. A more succinct way of writing the above is:

allow = true {
  input.user == "alice"
} {
  input.user == "bob"
}

You might be wondering why we've used the unification operator = rather than the assignment operator :=. That's because variables are immutable in Rego. Even though rules with the same head are treated as a single incremental rule, if you try to use the assignment operator, you're effectively "assigning" the same variable multiple times, and that isn't allowed in Rego. Instead, we use the unification operator in the rule head because it unifies multiple rules with the same name.
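
As a rough illustration of how the OR form evaluates, the sketch below repeats the two-rule version as a complete file. The package name is an assumption taken from the query path used in this post, and the inputs in the comments (including "carlotta") are hypothetical.

```rego
# Pre-1.0 Rego syntax, matching this post; OPA 1.0+ needs the --v0-compatible flag.
package rules.check_user # assumed package name

# First definition: matches when the user is alice.
allow = true {
  input.user == "alice"
}

# Second definition with the same head: matches when the user is bob.
allow = true {
  input.user == "bob"
}

# Hypothetical inputs and the value of allow for each:
#   {"user": "alice"}    -> true (first body matches)
#   {"user": "bob"}      -> true (second body matches)
#   {"user": "carlotta"} -> undefined (neither body matches)
```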

If it's confusing to remember when to use which operator in the rule head, there's a simpler way, thanks to default values and a bit of syntactic sugar.

Default values in rule heads

The default value given to a variable in the head of a rule is true. So, Rego offers some syntactic sugar here: When a rule assigns the value true to the variable, you can omit the := true from the rule head.

That means this AND rule…

allow := true {
  input.user == "alice"
  input.environment == "prod"
}

…is the same as this AND rule:

allow {
  input.user == "alice"
  input.environment == "prod"
}

And likewise, this OR rule…

allow = true {
  input.user == "alice"
} {
  input.user == "bob"
}

...is the same as this OR rule:

allow {
  input.user == "alice"
} {
  input.user == "bob"
}

"Sweet" indeed!

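The shorthand also makes it easier to mix AND and OR in one rule. The following is only a sketch: the package name is assumed from the query path used in this post, and the second body (Bob in a staging environment) is a hypothetical policy invented for illustration.

```rego
# Pre-1.0 Rego syntax, matching this post; OPA 1.0+ needs the --v0-compatible flag.
package rules.check_user # assumed package name

# Each body is an AND of its expressions; the two bodies are ORed together.
allow {
  input.user == "alice"
  input.environment == "prod"
} {
  input.user == "bob"
  input.environment == "staging" # hypothetical second condition
}

# Hypothetical inputs and the value of allow for each:
#   {"user": "alice", "environment": "prod"}    -> true
#   {"user": "bob",   "environment": "staging"} -> true
#   {"user": "bob",   "environment": "prod"}    -> undefined
```
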
default keyword

As we discussed in Part 1, if there are no matches in the input for a rule query, the variable in the rule head is not assigned the value in the head.

To demonstrate this, let's return to our example rule, which says that only Alice, a network administrator, should have permission to create and delete virtual networks in the prod account:

allow := true {
  input.user == "alice"
}

And we'll say we have an input document where the currently logged-in user is Bob:

{
  "user": "bob"
}

Since input.user is not "alice", when we evaluate the rule against the input, OPA does not find a match in the input. Therefore, allow is not assigned the value true, and the result of the evaluation is an empty object:

{}

We say in this case that the value of allow is undefined. Whenever OPA queries input to evaluate a rule, it only returns values that match. If there is no matching value, there's nothing to return, which is why the result is empty.

What if we want OPA to return false if allow is not explicitly true? We can use the default keyword to set a default value. This means if a rule evaluation isn't explicitly true, it returns a specific value (in this case, false) instead of returning an empty set of results. To do this, we write an additional rule that also uses allow:

default allow = false

Now, if OPA determines that input.user is not "alice", allow is no longer undefined. Instead, it takes on the default value, which we've declared is false:

{
  "allow": false
}

Note that when you specify the default keyword, you use the unification operator = instead of the assignment operator := in both the rule where you define the default value and the rule where you define the conditional assignment. Again, that's because variables are immutable. If you try to use the assignment operator, you're "assigning" the same variable multiple times, which Rego doesn't allow. We use the unification operator instead:

default allow = false

allow = true {
  input.user == "alice"
}

You can, of course, take advantage of the syntactic sugar we described earlier and leave out the = true in the rule with the conditional assignment. This is perfectly acceptable and perhaps easier to use because you don't need to remember which operator to use in the conditional assignment:

default allow = false

allow {
  input.user == "alice"
}
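
To make the effect of the default concrete, here is a small sketch of the same policy as a full file, with the results we would expect for a few inputs. The package name is an assumption based on the query path used in this post, and the empty input is a hypothetical extra case.

```rego
# Pre-1.0 Rego syntax, matching this post; OPA 1.0+ needs the --v0-compatible flag.
package rules.check_user # assumed package name

# Value of allow whenever no body below matches.
default allow = false

allow {
  input.user == "alice"
}

# Hypothetical inputs and the result for each:
#   {"user": "alice"} -> {"allow": true}
#   {"user": "bob"}   -> {"allow": false}
#   {}                -> {"allow": false} (input.user is undefined, so the body fails)
```

With the default in place, a caller always gets an explicit true or false back and never has to interpret a missing key.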

Custom messages

Sometimes, you want to return a series of messages rather than a simple pass or fail, true or false/undefined result. You can do so by using the rule head deny[msg] and by assigning the desired message to the variable msg. The rule below checks if the user is not Alice, and if that's the case, it assigns the string "User is denied access" to msg, which is then added to the deny set (we'll talk more about sets in Part 3):

deny[msg] {
  input.user != "alice"
  msg := "User is denied access"
}

To test this out, let's suppose our input document contains the name of the currently logged-in user:

{
  "user": "bob"
}

When we evaluate the rule against the input above using OPA's Rego Playground or the opa eval -i input.json -d check_user.rego "data.rules.check_user" --format pretty command (for instructions, see our Part 1 blog post), the result looks like this:

{
  "deny": [
    "User is denied access"
  ]
}

What's happening here? We're actually creating a set rule, a concept we'll return to in our next blog post in this series. For now, just understand that rather than a single true or false/undefined result, we're returning a set of messages assigned to the deny variable. In Rego, a set is an unordered collection of unique elements, such as integers { 1, 2, 3 } or strings { "alice", "bob", "carlotta" } or even other sets { { 1, 2 }, { 3, 4 } }. You can make a set out of any supported Rego type, or even mix and match types within a single set.

In the case of our example rule, each element in the deny set is a string containing a message. There's only one element in the set for this particular input document, but there can be more, and we'll show you an example later in this blog post.

What if we want to return additional information in the message? We can use the sprintf built-in function to display the value of the input.user field that caused a deny result:

deny[msg] {
  input.user != "alice"
  msg := sprintf("User %v is denied access", [input.user])
}

The sprintf function takes two arguments — a string and an array of values. In this case, the only element in the array is a string represented by input.user. We use %v as a placeholder in the first argument, and the value in the array takes its place when the rule is evaluated.

Now, if we evaluate the rule using the following input…

{
  "user": "bob"
}

…we see this result:

{
  "deny": [
    "User bob is denied access"
  ]
}
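
Because the second argument to sprintf is an array, a message can carry more than one value. The sketch below is an illustrative variation, not a rule from this series: it assumes the input also has an environment field, and the package name is inferred from the query path used in this post.

```rego
# Pre-1.0 Rego syntax, matching this post; OPA 1.0+ needs the --v0-compatible flag.
package rules.check_user # assumed package name

deny[msg] {
  input.user != "alice"
  # Each %v is filled, in order, by the matching element of the array.
  msg := sprintf("User %v is denied access to %v", [input.user, input.environment])
}

# Hypothetical inputs and the contents of deny for each:
#   {"user": "bob", "environment": "prod"} -> ["User bob is denied access to prod"]
#   {"user": "bob"}                        -> [] (input.environment is undefined,
#                                                so the body fails and no message is added)
```

The second case is worth keeping in mind: a missing field referenced in the message silently suppresses the whole deny entry.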

The not keyword

You can negate an expression by prefacing it with the not keyword so that it means the opposite. Most of the time, you'll want to use this in a query to specify the absence of a property from the input. So, for example, this query:

input.tags.environment

…means "The input document has a tags.environment property, and the value is not false," and this query:

not input.tags.environment

…means "The input document does not have a tags.environment property or tags.environment is set to false." There's no overlap or middle ground — an expression and its inverse are mutually exclusive.

Here's an example rule that assigns true to deny if the input does not have a department tag:

deny {
  not input.tags.department
}

Let's use this input:

{
  "tags": {
    "environment": "staging"
  }
}

If we were to evaluate this input against the rule above, we'd see that deny returns true because the input is missing the required department property:

{
  "deny": true
}
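
One consequence of this behavior is that not and != are not interchangeable when a property may be absent. The sketch below contrasts the two; the rule names deny_ne and deny_not, the messages, and the package name are all hypothetical and chosen only for this comparison.

```rego
# Pre-1.0 Rego syntax, matching this post; OPA 1.0+ needs the --v0-compatible flag.
package rules.check_user # assumed package name

# Fires only when input.user exists and differs from "alice".
deny_ne[msg] {
  input.user != "alice"
  msg := "user is not alice"
}

# Also fires when input.user is missing, because the negated
# expression is undefined and not turns undefined into true.
deny_not[msg] {
  not input.user == "alice"
  msg := "user is not alice, or no user was given"
}

# Hypothetical inputs and results:
#   {"user": "bob"}   -> both sets contain their message
#   {}                -> deny_ne is empty; deny_not contains its message
#   {"user": "alice"} -> both sets are empty
```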

Evaluating an example rule with OPA

Let's experiment with the concepts we've discussed in this blog post by evaluating an example rule. As in Part 1, we will focus on two ways of interacting with OPA:

  • Using the Rego Playground

  • Using OPA’s command line tool

For instructions on using these interfaces, see Part 1.

This time, we're using more of a real-world example involving a Kubernetes pod. Here's the JSON manifest we will use as input:

{
  "apiVersion": "v1",
  "kind": "Pod",
  "metadata": {
    "name": "nginx-demo",
    "labels": {
      "release": "stable"
    }
  },
  "spec": {
    "containers": [
      {
        "name": "nginx",
        "image": "nginx:1.14.2",
        "ports": [
          {
            "containerPort": 80
          }
        ]
      }
    ]
  }
}

And here's the rule we'll be evaluating it against, which we've written to enforce the company policy "Kubernetes pods must be labeled with release and environment":

deny[msg] {
  input.kind == "Pod"
  not input.metadata.labels.release
  msg := sprintf("Pod %v is missing release label", [input.metadata.name])
} {
  input.kind == "Pod"
  not input.metadata.labels.environment
  msg := sprintf("Pod %v is missing environment label", [input.metadata.name])
}

This rule demonstrates some concepts we've discussed in this blog post:

  • deny[msg] to return a set of custom messages instead of true or false/undefined

  • Both AND and OR rule structure:

    • Deny if the Kubernetes object is a pod AND it's missing the release label, OR:

    • Deny if the Kubernetes object is a pod AND it's missing the environment label

  • The not keyword to check for the absence of a property

  • The sprintf function to return a message that lists the name of the noncompliant pod

For your convenience, we've created a playground with this content already: https://play.openpolicyagent.org/p/KNVK9kEvIT 

If you evaluate the rule by selecting the Evaluate button in the playground or by executing a command such as opa eval -i input.json -d check_pod.rego "data.rules.check_pod" --format pretty if running OPA locally, you'll see this output:

{
  "deny": [
    "Pod nginx-demo is missing environment label"
  ]
}

As we can see, the Kubernetes pod we're checking is noncompliant with our rule because the input does not contain a labels.environment property.

Now, let's remove the labels.release property. The labels section of the input should look like this:

    "labels": {
    }

If you evaluate the rule now, you'll see that the deny set contains two messages:

{
  "deny": [
    "Pod nginx-demo is missing environment label",
    "Pod nginx-demo is missing release label"
  ]
}

Finally, let's add both a labels.release and labels.environment property to the input, so it looks like this:

    "labels": {
      "release": "stable",
      "environment": "prod"
    }

What happens if we evaluate the rule again? We see that the deny set is empty:

{
  "deny": []
}

This means our pod is compliant because OPA did not add any messages to the deny set. Hooray!
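
The three manual runs above could also be captured as automated checks. The following is a sketch under a few assumptions: it relies on OPA's test runner, which executes rules whose names start with test_, and on the with keyword, which substitutes a value for input during a single expression. Neither is covered in this series so far. The package name is inferred from the query path data.rules.check_pod, and the trimmed-down inputs keep only the fields the rule reads.

```rego
# Pre-1.0 Rego syntax, matching this post; OPA 1.0+ needs the --v0-compatible flag.
package rules.check_pod # assumed package name

# The rule from this post, repeated so the file is self-contained.
deny[msg] {
  input.kind == "Pod"
  not input.metadata.labels.release
  msg := sprintf("Pod %v is missing release label", [input.metadata.name])
} {
  input.kind == "Pod"
  not input.metadata.labels.environment
  msg := sprintf("Pod %v is missing environment label", [input.metadata.name])
}

# Only the environment label is missing: exactly one message.
test_missing_environment {
  deny == {"Pod nginx-demo is missing environment label"} with input as {
    "kind": "Pod",
    "metadata": {"name": "nginx-demo", "labels": {"release": "stable"}}
  }
}

# Both labels are missing: two messages (sets are unordered).
test_missing_both {
  deny == {
    "Pod nginx-demo is missing release label",
    "Pod nginx-demo is missing environment label"
  } with input as {"kind": "Pod", "metadata": {"name": "nginx-demo", "labels": {}}}
}

# Both labels are present: the deny set is empty.
test_compliant {
  count(deny) == 0 with input as {
    "kind": "Pod",
    "metadata": {"name": "nginx-demo", "labels": {"release": "stable", "environment": "prod"}}
  }
}
```

If this were saved as check_pod.rego (a hypothetical file name), running opa test check_pod.rego -v should report each test_ rule as passing or failing.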

What’s next?

Be sure to return to our blog to read Rego for Beginners Part 3, where we’ll explore set rules, object rules, functions, and iteration.

In the meantime, here are some useful resources:

If you’re interested in using Rego to write custom rules for Snyk IaC, check out our documentation. In addition to Snyk’s built-in security and compliance-mapped rulesets, IaC+ custom rules enable you to set customized security controls across your SDLC.

IaC+ gives you a single view and controls for your configuration issues from code to cloud with an issues UI, ruleset, and policy engine spanning IDE, SCM, CLI, CI/CD, Terraform Cloud, and deployed cloud environments such as AWS, Azure, and Google Cloud.