Turn reactive audit logs into proactive alerts

Security engineering is an ever-evolving space. While prevention and planning are still very important, we also need to invest in detection and response. Audit logs (timestamped records that capture the entity that performed an action in a system) are very useful for retroactive analysis following a security incident, but what if they could also be used to proactively alert before a security incident occurs?

Humans accessing secrets

As a concrete example, suppose we have a Google Cloud project with secrets stored in Secret Manager. After their creation, these secrets are accessed exclusively by service accounts. Except in very rare situations (e.g. production outage), these secrets should never be accessed by humans. Even if you have configured your infrastructure such that humans have no permissions to access secrets, it's still wise to create an alert, since it could catch a misconfiguration. A human accessing a secret should trigger a proactive security alert.

First, we need to enable Data Access audit logging for Secret Manager, which is not on by default. Once enabled, successful requests to access a secret will generate an audit log entry. The entry contains the name of the secret, the caller's identity, and other useful metadata.

Enable audit logging for Google Secret Manager
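If you manage the project's IAM policy as data rather than through the console, the same setting can be expressed as an auditConfigs entry. The following is a minimal sketch, assuming the access call is classed as a DATA_READ operation and that you fetch and write back the policy yourself; the function name and the sample policy are hypothetical.

```python
import copy
import json

SERVICE = "secretmanager.googleapis.com"


def enable_data_read_audit_logs(policy: dict, service: str = SERVICE) -> dict:
    """Return a copy of an IAM policy dict with DATA_READ audit logs on for `service`."""
    updated = copy.deepcopy(policy)
    audit_configs = updated.setdefault("auditConfigs", [])

    # Reuse the existing entry for this service if there is one.
    for config in audit_configs:
        if config.get("service") == service:
            break
    else:
        config = {"service": service, "auditLogConfigs": []}
        audit_configs.append(config)

    log_configs = config.setdefault("auditLogConfigs", [])
    # Only add DATA_READ once, so the helper is safe to call repeatedly.
    if not any(c.get("logType") == "DATA_READ" for c in log_configs):
        log_configs.append({"logType": "DATA_READ"})
    return updated


if __name__ == "__main__":
    # Hypothetical starting policy; a real one comes from the project's IAM policy.
    policy = {"bindings": []}
    print(json.dumps(enable_data_read_audit_logs(policy), indent=2))
```

The helper only edits a dictionary. Reading and writing the real policy is left to whatever tooling you already use.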

With audit logging enabled, when an entity accesses a secret, Google Cloud will generate an audit log entry like the following:

{
  "protoPayload": {
    "@type": "type.googleapis.com/google.cloud.audit.AuditLog",
    "authenticationInfo": {
      "principalEmail": "my-account@my-project.iam.gserviceaccount.com",
      "serviceAccountDelegationInfo": [],
      "principalSubject": "serviceAccount:my-account@my-project.iam.gserviceaccount.com"
    },
    "requestMetadata": {
      "callerIp": "<redacted>",
      "requestAttributes": {
        "time": "2021-04-29T21:23:01.056050299Z",
        "auth": {}
      },
      "destinationAttributes": {}
    },
    "serviceName": "secretmanager.googleapis.com",
    "methodName": "google.cloud.secretmanager.v1.SecretManagerService.AccessSecretVersion",
    "authorizationInfo": [
      {
        "permission": "secretmanager.versions.access",
        "granted": true,
        "resourceAttributes": {
          "service": "secretmanager.googleapis.com",
          "name": "projects/my-project/secrets/my-secret/versions/1",
          "type": "secretmanager.googleapis.com/SecretVersion"
        }
      }
    ],
    "resourceName": "projects/my-project/secrets/my-secret/versions/1",
    "request": {
      "@type": "type.googleapis.com/google.cloud.secretmanager.v1.AccessSecretVersionRequest",
      "name": "projects/my-project/secrets/my-secret/versions/1"
    }
  },
  "timestamp": "2021-04-29T21:23:01.049494965Z",
  "severity": "INFO",
  "logName": "projects/my-project/logs/cloudaudit.googleapis.com%2Fdata_access",
  "receiveTimestamp": "2021-04-29T21:23:01.257247531Z"
}

In this example, the entity was a service account, because the principalEmail field ends with .gserviceaccount.com:

"principalEmail": "my-account@my-project.iam.gserviceaccount.com"

When a human accesses a secret, the principalEmail field is that person's email address:

"principalEmail": "them@example.com"

Thus, if you were retroactively investigating an incident, you could use a Cloud Logging query like the following to collect all log entries in which a human accessed a secret:

protoPayload.@type = "type.googleapis.com/google.cloud.audit.AuditLog"
protoPayload.serviceName = "secretmanager.googleapis.com"
protoPayload.methodName =~ "AccessSecretVersion$"
protoPayload.authenticationInfo.principalEmail !~ "gserviceaccount.com$"
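To sanity-check the filter logic against exported log entries before relying on it, the four conditions can be mirrored in a few lines of Python. This is a sketch, not a reimplementation of the Cloud Logging query language: the helper names are hypothetical, the dot in the email pattern is escaped (slightly stricter than the query above), and entries with no principalEmail are treated as non-matches by assumption.

```python
import re

AUDIT_LOG_TYPE = "type.googleapis.com/google.cloud.audit.AuditLog"


def is_human_secret_access(entry: dict) -> bool:
    """Mirror the four filter lines against one audit log entry (a parsed JSON dict)."""
    payload = entry.get("protoPayload", {})
    email = payload.get("authenticationInfo", {}).get("principalEmail", "")
    method = payload.get("methodName", "")
    return (
        payload.get("@type") == AUDIT_LOG_TYPE
        and payload.get("serviceName") == "secretmanager.googleapis.com"
        and re.search(r"AccessSecretVersion$", method) is not None
        # Assumption of this sketch: a missing email is not counted as a human.
        and bool(email)
        and re.search(r"gserviceaccount\.com$", email) is None
    )


def sample_entry(email: str) -> dict:
    """Hypothetical, trimmed-down entry shaped like the audit log shown earlier."""
    return {
        "protoPayload": {
            "@type": AUDIT_LOG_TYPE,
            "authenticationInfo": {"principalEmail": email},
            "serviceName": "secretmanager.googleapis.com",
            "methodName": "google.cloud.secretmanager.v1.SecretManagerService.AccessSecretVersion",
        }
    }


if __name__ == "__main__":
    for who in ("them@example.com", "my-account@my-project.iam.gserviceaccount.com"):
        print(who, is_human_secret_access(sample_entry(who)))
```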

To proactively alert when a human accesses a secret, we need to convert this query to a logs-based metric:

Create logs-based metric

  • Metric type: counter
  • Metric name: human_accessed_secret
  • Units: 1
  • Filter: (same log query as above)
  • (Optional) label extractor: protoPayload.resourceName
  • (Optional) label extractor: protoPayload.authenticationInfo.principalEmail
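The settings above map onto the fields of a logs-based metric resource, so they can also be kept in code. Here is a rough sketch that builds such a body as a plain dictionary; the label keys "secret" and "principal" are names chosen for illustration, and submitting the body to the Logging API is left out.

```python
import json

SECRET_ACCESS_FILTER = "\n".join([
    'protoPayload.@type = "type.googleapis.com/google.cloud.audit.AuditLog"',
    'protoPayload.serviceName = "secretmanager.googleapis.com"',
    'protoPayload.methodName =~ "AccessSecretVersion$"',
    'protoPayload.authenticationInfo.principalEmail !~ "gserviceaccount.com$"',
])


def build_log_metric(name: str, log_filter: str, label_fields: dict) -> dict:
    """Build a counter-style logs-based metric body.

    `label_fields` maps a label key (hypothetical names) to the log field
    whose value should be extracted into that label.
    """
    return {
        "name": name,
        "filter": log_filter,
        "metricDescriptor": {
            "metricKind": "DELTA",   # counter metrics record per-interval counts
            "valueType": "INT64",
            "unit": "1",
            "labels": [{"key": k, "valueType": "STRING"} for k in label_fields],
        },
        "labelExtractors": {k: f"EXTRACT({f})" for k, f in label_fields.items()},
    }


if __name__ == "__main__":
    body = build_log_metric(
        "human_accessed_secret",
        SECRET_ACCESS_FILTER,
        {
            "secret": "protoPayload.resourceName",
            "principal": "protoPayload.authenticationInfo.principalEmail",
        },
    )
    print(json.dumps(body, indent=2))
```

Keeping the filter next to the metric definition makes it easier to review changes to either one.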

Now when a human accesses a secret, the generated audit log entry will increment the human_accessed_secret metric. All that's left is to alert on that metric! We can create an alerting policy that fires any time the metric is present:

Create alerting policy

Or, if you prefer to express your queries in MQL:

fetch generic_task
| metric 'logging.googleapis.com/user/human_accessed_secret'
| group_by 1m,
    [value_human_accessed_secret_aggregate:
       aggregate(value.human_accessed_secret)]
| every 1m
| group_by [],
    [value_human_accessed_secret_aggregate_aggregate:
       aggregate(value_human_accessed_secret_aggregate)]
| condition val() > 0 '1'
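To see what the query computes, it helps to model it on a handful of data points. The sketch below is a simplified stand-in, not MQL itself: it uses fixed one-minute buckets, and because both aggregation stages are sums it collapses them into a single per-minute total across all time series. The function name and sample point are hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timezone


def minutes_over_threshold(points, threshold=0):
    """Return the one-minute buckets whose summed count exceeds `threshold`.

    `points` is an iterable of (timestamp, count) pairs pooled from every
    time series of the metric.
    """
    totals = defaultdict(int)
    for ts, count in points:
        bucket = ts.replace(second=0, microsecond=0)  # align to the minute
        totals[bucket] += count
    return sorted(b for b, total in totals.items() if total > threshold)


if __name__ == "__main__":
    one_access = [(datetime(2021, 4, 29, 21, 23, 1, tzinfo=timezone.utc), 1)]
    print(minutes_over_threshold(one_access, threshold=0))
    print(minutes_over_threshold(one_access, threshold=1))
```

The threshold matters: with 0, a single access in a minute is enough to fire, which matches alerting any time the metric is present. With 1, a lone access would go unnoticed.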

Save the alerting policy, then access a secret using your own (human) account. You will get an alert!

Humans decrypting data

This pattern can be extended to other sensitive operations. For example, suppose you leverage Cloud KMS to encrypt and decrypt sensitive data. With a few modifications, you can generate alerts when a human decrypts a value.

The Cloud Logging query and logs-based metric filter are slightly different:

protoPayload.@type = "type.googleapis.com/google.cloud.audit.AuditLog"
protoPayload.serviceName = "cloudkms.googleapis.com"
protoPayload.methodName = "Decrypt"
protoPayload.authenticationInfo.principalEmail !~ "gserviceaccount.com$"
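Since the two filters differ only in the service name and the method clause, they can be generated from one template. A small sketch, with a hypothetical helper name:

```python
def human_access_filter(service: str, method_clause: str) -> str:
    """Build a 'human called this method' filter for one service.

    `method_clause` is the operator plus quoted value, for example
    '= "Decrypt"' or '=~ "AccessSecretVersion$"'.
    """
    lines = [
        'protoPayload.@type = "type.googleapis.com/google.cloud.audit.AuditLog"',
        f'protoPayload.serviceName = "{service}"',
        f"protoPayload.methodName {method_clause}",
        'protoPayload.authenticationInfo.principalEmail !~ "gserviceaccount.com$"',
    ]
    return "\n".join(lines)


SECRET_FILTER = human_access_filter(
    "secretmanager.googleapis.com", '=~ "AccessSecretVersion$"'
)
KMS_FILTER = human_access_filter("cloudkms.googleapis.com", '= "Decrypt"')

if __name__ == "__main__":
    print(KMS_FILTER)
```

The same helper would extend to other sensitive operations, provided you confirm the method name that the service actually writes to its audit logs.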

The other properties of the logs-based metric, including the metric type and label extractors, are the same. Only the metric's name changes (for example, human_decrypted_value). The alerting policy is also the same, except that it must reference the new metric name. For example:

fetch generic_task
| metric 'logging.googleapis.com/user/human_decrypted_value'
| group_by 1m,
    [value_human_decrypted_value_aggregate:
       aggregate(value.human_decrypted_value)]
| every 1m
| group_by [],
    [value_human_decrypted_value_aggregate_aggregate:
       aggregate(value_human_decrypted_value_aggregate)]
| condition val() > 0 '1'

Wrapping up

Audit logs are very powerful, and you should have them enabled on all your production services. Beyond retroactive analysis, they can also proactively alert you to anomalous behavior, allowing you to get ahead of a potential security incident.

About Seth

Seth Vargo is an engineer at Google. Previously he worked at HashiCorp, Chef Software, CustomInk, and some Pittsburgh-based startups. He is the author of Learning Chef and is passionate about reducing inequality in technology. When he is not writing, working on open source, teaching, or speaking at conferences, Seth advises non-profits.