Download a private GitHub Repo using AWS Lambda

As mentioned in this post, I needed to download a private GitHub repo as a ZIP file. The GitHub repo containing this blog is now set to private (mainly to protect my backlog of drafts), which broke my AWS Lambda function (see here and here).

The working code needed only minimal changes, but a lot of testing. The key changes were:

  • Setting the GitHub Personal Access Token as a Lambda environment variable
  • Modifying the file download function to send custom headers
  • Consuming the token from within Lambda to access and download the file
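As a rough illustration of the first and last points, here is a minimal sketch of reading the token from the Lambda environment and turning it into request headers. The variable names GITHUB_ACCESS_KEY and GITHUB_ACCOUNT match the function further down; the helper name build_github_headers and the fallback User-Agent string are made up for this example.

```python
import os


def build_github_headers(env=os.environ):
    """Build GitHub API request headers from environment variables.

    build_github_headers is a hypothetical helper, not part of the
    original function. It fails early if the token is missing.
    """
    token = env.get("GITHUB_ACCESS_KEY")
    if not token:
        raise RuntimeError("GITHUB_ACCESS_KEY is not set")
    return {
        # Personal Access Token, sent with the "token" scheme
        "Authorization": "token " + token,
        # GitHub's API rejects requests that have no User-Agent header;
        # the fallback value here is an arbitrary placeholder
        "User-Agent": env.get("GITHUB_ACCOUNT", "lambda-downloader"),
    }
```

Passing a plain dict instead of os.environ makes the helper easy to try locally without touching the Lambda configuration.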

The new function to download the repo:

import logging
import os

import urllib3

logger = logging.getLogger()
logger.setLevel(logging.INFO)

def downloadSite(account, repo):
    logger.info("Downloading master zip of " + repo + " from GitHub")

    url = 'https://api.github.com/repos/' + account + '/' + repo + '/zipball/master'
    siteZip = "/tmp/master.zip"

    # The token goes in the Authorization header; GitHub also requires a User-Agent
    headers = {
        'Authorization': "token " + os.environ['GITHUB_ACCESS_KEY'],
        'User-Agent': os.environ['GITHUB_ACCOUNT'],
    }

    # Download the master repo as a ZIP file, streaming it to /tmp
    http = urllib3.PoolManager()
    r = http.request('GET', url, preload_content=False, headers=headers)
    with open(siteZip, 'wb') as out:
        while True:
            data = r.read(64 * 1024)  # read in 64 KB chunks
            if not data:
                break
            out.write(data)
    r.release_conn()
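The read-and-write loop could also be pulled out into a small helper, which makes it testable without calling GitHub. This is only a sketch: save_stream is a hypothetical name, and it assumes nothing more than an object with a read(n) method, which a urllib3 response opened with preload_content=False provides.

```python
def save_stream(source, dest_path, chunk_size=64 * 1024):
    """Copy a readable stream to dest_path in chunks.

    save_stream is a hypothetical helper. `source` only needs a
    read(n) method that returns bytes and b"" at end of stream.
    Returns the number of bytes written.
    """
    written = 0
    with open(dest_path, "wb") as out:
        while True:
            data = source.read(chunk_size)
            if not data:
                break
            out.write(data)
            written += len(data)
    return written
```

Inside downloadSite this would be called as save_stream(r, siteZip). It is probably worth checking r.status == 200 first, since a bad token would otherwise leave a JSON error body saved as master.zip.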