
Participating in the GitHub token scanning program

Secrets (including API access tokens) occasionally get accidentally committed to public GitHub repositories, with various less-than-ideal results such as unauthorized access to data and cloud account abuse. GitHub provides a mechanism by which providers can proactively reduce the associated risk/impact for their customers. GitHub's token scanning covers all commits to all GitHub-hosted public (not private) repositories in close to real time, as well as repositories that are transitioned from private to public. Program participation is free of charge.

There's a relatively recent status update at https://github.blog/2019-08-19-github-token-scanning-one-billion-tokens-identified-and-five-new-partners/, and a slightly dated overview at https://github.blog/2018-10-17-behind-the-scenes-of-github-token-scanning/.

Providers register regular expressions that match their token formats with GitHub, then stand up a webhook. When the GitHub token scanner (built around Intel's Hyperscan) detects commits with strings matching those regexes, it sends an alert to that webhook. Alerts coming from GitHub are signed and should be validated before taking action.
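As a rough illustration of the registration step, here's what a pattern for a made-up token format might look like. The `acme_` prefix, the 32-hex-character body, and the `find_tokens` helper are all hypothetical, and Python's `re` module is only a stand-in for local testing: check any real pattern against whatever syntax GitHub's scanner accepts.

```python
import re

# Hypothetical token format (an assumption for illustration only):
# the literal prefix "acme_" followed by 32 lowercase hex characters.
TOKEN_PATTERN = re.compile(r'\bacme_[0-9a-f]{32}\b')


def find_tokens(text):
    """Return every substring of text that matches the hypothetical format."""
    return TOKEN_PATTERN.findall(text)


sample = 'API_KEY = "acme_0123456789abcdef0123456789abcdef"  # oops'
print(find_tokens(sample))
```

A distinctive prefix plus a fixed-length body keeps false positives down, which matters when the pattern runs against every public commit.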

The documentation at https://developer.github.com/partnerships/token-scanning/ has validation examples for Go and Ruby, but not Python. Here's roughly where I landed with that (it's a bit happy-pathish, but it works...):

#!/usr/bin/env python3

import json
import hashlib
import base64
import urllib.request
from ecdsa import BadSignatureError, VerifyingKey  # pip install ecdsa
from ecdsa.util import sigdecode_der

mock_event = {
    'headers': {
        'Github-Public-Key-Identifier': '3ec68716d6df3f7cd532ac97e55420cb1c143752450d65bf916d41ca25d9dfc4',
        'Github-Public-Key-Signature': 'MEUCICop4nvIgmcY4+mB+i5GlUYGNL20Qrlrx3RvrysilFHeAiEAg7yJ8KqEUcUadBaxepp3COhTUrk4feZ9TTb/xdBG6Ek=',
    },
    'body': '[{"token": "some_token", "type": "some_type", "url": "some_url"}]',
}


def handle_alert(event):
    # get the GitHub key ID, signature, and published public keys
    key, verified = None, False
    headers = event.get('headers', {})
    github_pk_id = headers.get('Github-Public-Key-Identifier', '')
    github_pk_sig = headers.get('Github-Public-Key-Signature', '')
    github_pk_url = 'https://api.github.com/meta/public_keys/token_scanning'
    with urllib.request.urlopen(github_pk_url) as f:
        keys = json.loads(f.read().decode('utf-8')).get('public_keys', [])
    for k in keys:
        if k['key_identifier'] == github_pk_id:
            key = k['key']
    if key is None:
        print('ERROR: GitHub public key not found for provided identifier {}'.format(github_pk_id))
    # verify signature against the matching public key
    else:
        vk = VerifyingKey.from_pem(key)
        try:
            # verify() needs the body as bytes, and raises BadSignatureError
            # on a mismatch instead of returning False
            verified = vk.verify(
                base64.b64decode(github_pk_sig), event.get('body', '').encode('utf-8'),
                hashlib.sha256, sigdecode=sigdecode_der
            )
        except BadSignatureError:
            print('ERROR: signature for GitHub alert could not be verified')
    if not verified:
        return  # don't act on alerts that failed validation

    # drop the alert into a queue for async handling, & return a quick 200-ok!  then:
    # - validate the token referenced in the alert
    # - automatically warn the user, or disable the token?
    # - create a ticket in <system of your choice>?
    # - page an admin in <system of your choice>?
    # - drop a message in <Slack?  IRC?  I miss IRC...>?


handle_alert(mock_event)
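
The follow-up steps listed in the closing comments all start from the alert body. Here's a minimal sketch of pulling it apart, assuming the body is always a JSON list of objects with `token`, `type`, and `url` fields, as in the mock event. The `Alert` class and `parse_alerts` helper are hypothetical names of mine, not part of any GitHub library.

```python
import json
from dataclasses import dataclass


@dataclass
class Alert:
    # Field names are taken from the mock event body; treat them as an assumption.
    token: str
    type: str
    url: str


def parse_alerts(body):
    """Parse an alert body (a JSON list of objects) into Alert records."""
    return [Alert(a['token'], a['type'], a['url']) for a in json.loads(body)]


body = '[{"token": "some_token", "type": "some_type", "url": "some_url"}]'
for alert in parse_alerts(body):
    # Avoid logging alert.token itself; it may still be a live credential.
    print(alert.type, alert.url)
```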

Setting up a 'serverless' handler (API Gateway endpoint + Lambda function) makes this a pretty lightweight / low-cost program to engage in with real tangible benefits for your customers.
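
For the Lambda side, something along these lines might serve as a starting point. It's a sketch, not a tested deployment: it assumes an API Gateway proxy-style event (`headers`, `body`, `isBase64Encoded`), and `make_handler`, `verify`, and `enqueue` are hypothetical names. `verify` would wrap the signature check from the script above, and `enqueue` would push to whatever queue you use.

```python
import base64


def make_handler(verify, enqueue):
    """Build a Lambda handler from two hypothetical callables:
    verify(key_id, signature, body) -> bool, and enqueue(body) -> None."""

    def handler(event, context):
        # Header names may arrive in any case, so normalise them first.
        headers = {k.lower(): v for k, v in (event.get('headers') or {}).items()}
        body = event.get('body') or ''
        if event.get('isBase64Encoded'):
            body = base64.b64decode(body).decode('utf-8')
        key_id = headers.get('github-public-key-identifier', '')
        signature = headers.get('github-public-key-signature', '')
        if not verify(key_id, signature, body):
            return {'statusCode': 400, 'body': 'invalid signature'}
        enqueue(body)  # hand off for async processing, then answer quickly
        return {'statusCode': 200, 'body': ''}

    return handler


# Local smoke test with stand-in callables (no AWS or network involved).
queue = []
handler = make_handler(lambda key_id, sig, body: True, queue.append)
print(handler({'headers': {'Github-Public-Key-Identifier': 'abc'}, 'body': '[]'}, None))
```

Passing `verify` and `enqueue` in as arguments keeps the handler easy to exercise locally with fakes before wiring it up to real infrastructure.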

© Jamie Finnigan. Built using Pelican. Modified from theme by Giulio Fidente on github.