How to download a private GitHub repository ZIP via API
GitHub Python API

As you may know from one of my recent blog posts, the blog you’re reading is a collection of Markdown and Hugo artifacts that, when ‘compiled’ with Hugo, creates a wonderfully lightweight website hosted out of AWS S3. My biggest gripe with my documented deployment from GitHub to S3 was the dependency on the GitHub repo being public. Anyone could see my published and unpublished content. As I’ve been spending quite a bit of time writing new posts, I wanted to protect the repo but still deploy the site automatically as I do now in AWS Lambda.

To download the zipball of a branch programmatically, you first need a “Personal Access Token” generated under your GitHub account settings. The minimum required scope for the Personal Access Token is repo, which includes:

  • repo - Full control of private repositories
    • repo:status - Access commit status
    • repo_deployment - Access deployment status
    • public_repo - Access public repositories
    • repo:invite - Access repository invitations
    • security_events - Read and write security events

I’ve tried deselecting every combination of these sub-scopes and couldn’t cut it down any further, so this will have to do. Once you have the token, store it securely: GitHub only shows it once, so you will need to regenerate it if you misplace it. In the real world, you should also be rotating the token regularly.
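To confirm which scopes a token actually carries, one option is to read the X-OAuth-Scopes response header that GitHub returns for classic Personal Access Tokens. The sketch below is a minimal illustration under that assumption: the function names and the User-Agent value are made up for this example, and fine-grained tokens may not return the header at all.

```python
import urllib.request


def parse_scopes(header_value):
    """Split an X-OAuth-Scopes value such as 'repo, gist' into a set of names."""
    if not header_value:
        return set()
    return {scope.strip() for scope in header_value.split(",") if scope.strip()}


def fetch_token_scopes(token):
    """Ask the API who we are and return the scopes reported for the token.

    Hypothetical helper: it makes a real network call, so only run it
    with a token you are happy to test.
    """
    req = urllib.request.Request(
        "https://api.github.com/user",
        headers={
            "Authorization": "token " + token,
            "User-Agent": "scope-check-sketch",  # GitHub's API requires a User-Agent
        },
    )
    with urllib.request.urlopen(req) as resp:
        return parse_scopes(resp.headers.get("X-OAuth-Scopes"))
```

Calling fetch_token_scopes(token) and checking that "repo" is in the returned set is a quick sanity check before wiring the token into a deployment.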

With the token now available to us, we can start testing it using curl to fetch the ZIP file:

curl -H "Authorization: token $token" -L https://api.github.com/repos/$org/$repo/zipball/master > master.zip

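Or, if you’re in my shoes and need it in Python (org, repo and token are placeholders to replace with your own values):

import urllib3

# Placeholders: replace with your own organisation, repository and token.
org, repo, token = "my-org", "my-repo", "<personal access token>"

url = "https://api.github.com/repos/" + org + "/" + repo + "/zipball/master"
http = urllib3.PoolManager()
# preload_content=False streams the body; GitHub's API requires a User-Agent header.
r = http.request('GET', url, preload_content=False, headers={'Authorization': "token " + token, 'User-Agent': "zipball-download"})
with open("/tmp/master.zip", 'wb') as out:
    while True:
        data = r.read(65536)  # read in 64 KiB chunks
        if not data:
            break
        out.write(data)
r.release_conn()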

That’s it! The above Python excerpt was from my Lambda function (minus a few tweaks and replacement of variables). I’ll be putting up another post on using this in a Lambda function to build Hugo sites to complement my previous posts.
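For reuse outside a one-off script, the download could also be wrapped in a small function. This is a minimal sketch, not the Lambda code referred to above. It swaps urllib3 for the standard library's urllib.request so it has no dependencies, and it assumes the token lives in an environment variable named GITHUB_TOKEN. That variable name, the function names and the User-Agent value are all hypothetical.

```python
import os
import shutil
import urllib.request


def zipball_url(org, repo, ref="master"):
    """Build the zipball endpoint URL for a branch, tag or commit."""
    return "https://api.github.com/repos/{}/{}/zipball/{}".format(org, repo, ref)


def download_zipball(org, repo, dest, ref="master", token=None):
    """Download the archive for `ref` to `dest` and return the path.

    The token is read from the (assumed) GITHUB_TOKEN environment
    variable when it is not passed in explicitly.
    """
    token = token or os.environ["GITHUB_TOKEN"]
    req = urllib.request.Request(
        zipball_url(org, repo, ref),
        headers={"User-Agent": "zipball-sketch"},  # GitHub's API requires a User-Agent
    )
    # Send the token to api.github.com only, not to the host the API redirects to.
    req.add_unredirected_header("Authorization", "token " + token)
    # urlopen follows the redirect and raises HTTPError on a 4xx/5xx response.
    with urllib.request.urlopen(req) as resp, open(dest, "wb") as out:
        shutil.copyfileobj(resp, out)
    return dest
```

A call such as download_zipball("my-org", "my-repo", "/tmp/master.zip") would then mirror the excerpt above, with the token kept out of the source code.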

About Stellios Williams
Senior Cloud Solutions Architect - Service Providers VMware
This is my personal tech related blog for anything private and public cloud - including homelabs! My postings are my own and don’t necessarily represent VMware’s positions, strategies or opinions. Any technical guidance or advice is given without warranty or consideration for your unique issues or circumstances.