General impressions, and a little Python to validate the signature on incoming alerts.
Secrets (including API access tokens) occasionally get committed to public GitHub repositories by accident, with various less-than-ideal results such as unauthorized access to data and cloud account abuse. GitHub provides a mechanism by which the service providers that issue those tokens can proactively reduce the associated risk and impact for their customers. GitHub's token scanning covers, in close to real time, all commits to all GitHub-hosted public (not private) repositories, as well as repositories that are transitioned from private to public. Program participation is free of charge.
There's a relatively recent status update at https://github.blog/2019-08-19-github-token-scanning-one-billion-tokens-identified-and-five-new-partners/, and a slightly dated overview at https://github.blog/2018-10-17-behind-the-scenes-of-github-token-scanning/.
Providers register regular expressions that match their token formats with GitHub, then stand up a webhook. When the GitHub token scanner (built around Intel's Hyperscan) detects commits with strings matching those regexes, it sends an alert to that webhook. Alerts coming from GitHub are signed and should be validated before taking action.
The documentation at https://developer.github.com/partnerships/token-scanning/ has validation examples for Go and Ruby, but not Python. Here's roughly where I landed with that (it's a bit happy-pathish, but it works...):
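A minimal sketch of that validation, not a definitive implementation. It assumes the details described in GitHub's partner documentation: the alert carries `GITHUB-PUBLIC-KEY-IDENTIFIER` and `GITHUB-PUBLIC-KEY-SIGNATURE` headers, the signature is a base64-encoded ECDSA (P-256, SHA-256) signature over the raw request body, and the matching public keys are published at the `KEYS_URL` endpoint below. The function names are my own, and `verify_signature` depends on the third-party `cryptography` package.

```python
import base64
import json
import urllib.request

# Assumption: key endpoint as described in GitHub's token scanning
# partner documentation; adjust if the documentation says otherwise.
KEYS_URL = "https://api.github.com/meta/public_keys/token_scanning"


def fetch_public_keys(url=KEYS_URL):
    """Download GitHub's list of signing keys (a list of dicts)."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)["public_keys"]


def select_public_key(public_keys, key_id):
    """Return the PEM string of the key whose identifier matches key_id."""
    for entry in public_keys:
        if entry.get("key_identifier") == key_id:
            return entry["key"]
    raise KeyError(f"no public key with identifier {key_id!r}")


def verify_signature(payload, signature_b64, public_key_pem):
    """Return True if signature_b64 is a valid ECDSA/SHA-256 signature of payload.

    payload must be the raw request body as bytes, exactly as received.
    """
    # Third-party dependency (pip install cryptography), imported here so
    # the helpers above remain usable without it.
    from cryptography.exceptions import InvalidSignature
    from cryptography.hazmat.primitives import hashes
    from cryptography.hazmat.primitives.asymmetric import ec
    from cryptography.hazmat.primitives.serialization import load_pem_public_key

    public_key = load_pem_public_key(public_key_pem.encode("ascii"))
    try:
        signature = base64.b64decode(signature_b64, validate=True)
        public_key.verify(signature, payload, ec.ECDSA(hashes.SHA256()))
    except (InvalidSignature, ValueError):
        # Bad base64 or a signature that does not match the payload.
        return False
    return True
```

In a webhook, the flow would be: read the two headers, call `select_public_key(fetch_public_keys(), key_id)`, then pass the untouched request body to `verify_signature` and reject the alert if it returns `False`. Verifying the raw bytes rather than re-serialized JSON matters, since any change in whitespace invalidates the signature.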
Setting up a 'serverless' handler (an API Gateway endpoint plus a Lambda function) makes this a lightweight, low-cost program to participate in, with tangible benefits for your customers.
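For illustration, a Lambda handler behind an API Gateway proxy integration could be sketched roughly as follows. This is an outline under assumptions rather than a finished implementation: `make_handler`, `verify`, and `revoke` are hypothetical names, the header names and the `token` / `url` alert fields are taken to match GitHub's partner documentation, and the event shape (`headers`, `body`, `isBase64Encoded`) is the standard proxy-integration format.

```python
import base64
import json


def make_handler(verify, revoke):
    """Build a Lambda handler from two caller-supplied (hypothetical) callables.

    verify(body: bytes, key_id: str, signature: str) -> bool
    revoke(token: str, url: str) -> None
    """

    def handler(event, context):
        # Header casing can vary, so normalise to lowercase before lookup.
        headers = {k.lower(): v for k, v in (event.get("headers") or {}).items()}
        body = event.get("body") or ""
        if event.get("isBase64Encoded"):
            raw = base64.b64decode(body)
        else:
            raw = body.encode("utf-8")

        key_id = headers.get("github-public-key-identifier", "")
        signature = headers.get("github-public-key-signature", "")
        # Validate the signature over the raw bytes before parsing anything.
        if not (key_id and signature and verify(raw, key_id, signature)):
            return {"statusCode": 401, "body": "invalid signature"}

        try:
            alerts = json.loads(raw)
            # Assumption: the payload is a JSON array of objects with
            # "token" and "url" fields.
            pairs = [(a["token"], a.get("url", "")) for a in alerts]
        except (ValueError, TypeError, KeyError):
            return {"statusCode": 400, "body": "malformed payload"}

        for token, url in pairs:
            revoke(token, url)
        return {"statusCode": 200, "body": json.dumps({"processed": len(pairs)})}

    return handler
```

Passing `verify` and `revoke` in as callables keeps the handler testable without network access: a deployment would wire in real signature validation and whatever revocation or customer-notification logic fits the provider.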