Downloading Imgur albums with Python

I have a fair few wallpapers. As of writing, I’ve got 240+ different pictures in a folder that my OS of choice cycles (poorly) through. Mostly landscapes and space photos, with some from Star Wars and the like thrown in. My main issue is that I often find myself browsing /r/wallpapers and wanting a way to download the albums I find. I could download them as a zip from Imgur, but that adds the extra step of unzipping them. I already have an IFTTT rule to download single images, but I wanted something simple to download whole albums!

Time to bust out some Python. The Imgur API looked pretty good and there had to be a wrapper. The API isn’t quite as open as I was hoping (given most of their content is public!): everything querying it must be a registered app, and there is no anonymous access at all. It also returns all links as plain HTTP instead of HTTPS. It’s 2017, wtf Imgur?

Basically it all boils down to this:

  • Python wrappers around wget are awful
  • Parsing Imgur album pages will break whenever they change their HTML, and it won’t be as accurate
  • The Imgur API returns everything by default with HTTP links
  • Zero anonymous access to the API at all

So, after throwing some functions together, we get a working Python script that’ll dump the images into a folder for us. Note: we’ll be using the Imgur Python API wrapper (imgurpython), since scraping the HTML will be balls. You’ll need to register an application with Imgur to get a client ID and secret. Again, balls.

First, start off with a basic function to initialise the Python API client:

from imgurpython import ImgurClient

def main(album_id):

    # Initialise Imgur client (swap in your own app's ID and secret)
    client = ImgurClient('client_id', 'secret_id')

And immediately we hit a roadblock. Unfortunately Imgur doesn’t just give you a list of all the images in an album, but it’s got a nice little function (get_album_images) that will do it for us. So let’s throw that in there.

def main(album_id):

    # Initialise Imgur client
    client = ImgurClient('client_id', 'secret_id')

    for album in album_id:
        # Get our album images
        try:
            images = client.get_album_images(album)
        except ImgurClientError as e:
            print('ERROR: {}'.format(e.error_message))
            print('Status code {}'.format(e.status_code))
            # Skip this album; images was never assigned
            continue
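
The loop above expects bare album IDs, but what you usually have in hand while browsing is an album URL. Here is a minimal sketch of a helper that pulls the ID out of one. The name album_id_from_url is hypothetical, and the URL shapes it accepts (imgur.com/a/<id> and imgur.com/gallery/<id>) are an assumption on my part rather than something the API wrapper provides.

```python
from urllib.parse import urlparse


def album_id_from_url(url_or_id):
    """Return the album ID from an Imgur album URL, or the input if it is a bare ID.

    Assumes album URLs look like https://imgur.com/a/<id> or
    https://imgur.com/gallery/<id> (an assumption, not taken from the API docs).
    """
    parsed = urlparse(url_or_id)
    if not parsed.netloc:
        # No host part, so treat the input as a bare album ID
        return url_or_id.strip("/")
    # Keep only the non-empty path segments, e.g. ['a', 'abc123']
    segments = [part for part in parsed.path.split("/") if part]
    if len(segments) >= 2 and segments[0] in ("a", "gallery"):
        return segments[1]
    raise ValueError("Not an Imgur album URL: {}".format(url_or_id))
```

With something like this in place, each entry in album_id could be passed through the helper before being handed to get_album_images.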

Now we need a function to do the actual download and save the image to disk. We’ll use urllib since it’s nice and easy.

import urllib.error
import urllib.request

def download_image(image_url, download_path):
    """
    Download image and save it to file path
    """
    try:
        # urllib.URLopener is Python 2 only; urlretrieve is the Python 3 equivalent
        urllib.request.urlretrieve(image_url, download_path)
        print("File {} downloaded to {}".format(image_url, download_path))

    # HTTPError is a subclass of URLError, so it has to be caught first
    except urllib.error.HTTPError as e:
        print("HTTP Error downloading image '{}': {!s}".format(image_url, e))
    except urllib.error.URLError as e:
        print("Error downloading image '{}': {}".format(image_url, e))
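
One thing the simple version doesn’t handle: if a download dies halfway, a truncated file is left on disk, and a later "does the file exist?" check would happily skip it. Below is a rough sketch of one way around that, assuming you are fine with writing to a temporary file first and renaming it on success. The function name download_image_atomic and the 30 second timeout are my own choices for illustration, not part of the original script.

```python
import os
import shutil
import tempfile
import urllib.error
import urllib.request


def download_image_atomic(image_url, download_path, timeout=30):
    """Download image_url to download_path; return True on success, False on failure."""
    directory = os.path.dirname(os.path.abspath(download_path))
    # Temp file lives in the target directory so the final rename stays on one filesystem
    tmp_fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".part")
    try:
        # Open the temp file first so its descriptor is closed even if urlopen fails
        with os.fdopen(tmp_fd, "wb") as tmp_file, \
                urllib.request.urlopen(image_url, timeout=timeout) as response:
            shutil.copyfileobj(response, tmp_file)
        # Only a complete download ever appears under the final name
        os.replace(tmp_path, download_path)
        return True
    except OSError as e:
        # URLError and HTTPError are both subclasses of OSError
        print("Error downloading image '{}': {}".format(image_url, e))
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        return False
```

The return value makes it easy for a caller to count failures instead of only printing them.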

Finally, chuck a wrapper around it all, add some basic checks to see if the file we’re trying to save already exists, rewrite our HTTP link to HTTPS and work out what file name we want to give the image:

def main(album_id):

    # Initialise Imgur client
    client = ImgurClient('client_id', 'secret_id')

    for album in album_id:
        # Get our album images
        try:
            images = client.get_album_images(album)
        except ImgurClientError as e:
            print('ERROR: {}'.format(e.error_message))
            print('Status code {}'.format(e.status_code))
            # Skip this album; images was never assigned
            continue

        print("Downloading album {} ({!s} images)".format(album, len(images)))

        # Download each image
        for image in images:
            # Turn our link into HTTPS
            link      = image.link.replace("http://","https://")
            # Get our file name that we'll save to disk
            file_name = link.replace("https://i.imgur.com/","")
            # args holds the parsed command-line options (see the full script below)
            download_path = os.path.join(args.directory, file_name)
            if os.path.isfile(download_path):
                print("File exists for image {}, skipping...".format(file_name))
            else:
                download_image(link, download_path)
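
The two string replaces above do the job as long as every link starts with exactly http://i.imgur.com/. As a slightly more defensive sketch, the same two steps can be done by parsing the URL instead. The helper names to_https and file_name_from_link are made up for this example, and the sample links in the usage lines are placeholders rather than real images.

```python
import posixpath
from urllib.parse import urlsplit, urlunsplit


def to_https(link):
    """Return link with its scheme forced to https."""
    parts = urlsplit(link)
    return urlunsplit(parts._replace(scheme="https"))


def file_name_from_link(link):
    """Return the last path segment of link, ignoring any query string."""
    return posixpath.basename(urlsplit(link).path)


print(to_https("http://i.imgur.com/abc123.jpg"))               # https://i.imgur.com/abc123.jpg
print(file_name_from_link("https://i.imgur.com/abc123.jpg?1"))  # abc123.jpg
```

This keeps working if the host in the link ever changes, since the file name comes from the path rather than from stripping a fixed prefix.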

Boom! You’ve now got a nice ghetto album downloader! Throw it all together with a few config functions and you end up with this beautiful bastard.

#!/usr/bin/env python
import argparse, os, sys, yaml
import urllib.error
import urllib.request
from imgurpython import ImgurClient
from imgurpython.helpers.error import ImgurClientError

def parse_args():
    parser = argparse.ArgumentParser(description='Download an Imgur album/gallery into a folder.')
    parser.add_argument('-c', '--config', type=argparse.FileType('r'), default=os.path.expanduser('~/.config/imgur_downloader/config.yaml'),
            help='config file to load settings from')
    parser.add_argument('-a', '--album', help='album ID to download from Imgur (can be specified multiple times)', required=True, action='append')
    parser.add_argument('-d', '--directory', default=os.path.dirname(os.path.realpath(__file__)), help='directory to save images into')

    return parser.parse_args()

def load_config(config_file):
    print('Loading config file {}'.format(config_file.name))
    try:
        config = yaml.load(config_file, yaml.BaseLoader)
    except yaml.YAMLError as e:
        # Bail out: without a config there are no API credentials to use
        print('Error loading YAML: {}'.format(e))
        sys.exit(1)

    return config

def download_image(image_url, download_path):
    """
    Download image and save it to file path
    """
    try:
        urllib.request.urlretrieve(image_url, download_path)
        print("File {} downloaded to {}".format(image_url, download_path))

    # HTTPError is a subclass of URLError, so it has to be caught first
    except urllib.error.HTTPError as e:
        print("HTTP Error downloading image '{}': {!s}".format(image_url, e))
    except urllib.error.URLError as e:
        print("Error downloading image '{}': {}".format(image_url, e))

def main():
    global config
    global args

    args   = parse_args()
    config = load_config(args.config)

    # Initialise Imgur client
    client = ImgurClient(config['imgur_client_id'], config['imgur_client_secret'])

    for album in args.album:
        # Get our album images
        try:
            images = client.get_album_images(album)
        except ImgurClientError as e:
            print('ERROR: {}'.format(e.error_message))
            print('Status code {}'.format(e.status_code))
            # Skip this album; images was never assigned
            continue

        print("Downloading album {} ({!s} images)".format(album, len(images)))

        # Download each image
        for image in images:
            # Turn our link into HTTPS
            link      = image.link.replace("http://","https://")
            # Get our file name that we'll save to disk
            file_name = link.replace("https://i.imgur.com/","")
            download_path = os.path.join(args.directory, file_name)
            if os.path.isfile(download_path):
                print("File exists for image {}, skipping...".format(file_name))
            else:
                download_image(link, download_path)

if __name__ == '__main__':
    main()
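
To see how the -a flag behaves when it is given more than once, here is a cut-down sketch of the argument parser on its own. It leaves out the config option so it runs without a config file, defaults the directory to "." for simplicity (the script above defaults to its own folder), and the album IDs are made-up placeholders.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(description="Download Imgur albums into a folder.")
    # action='append' collects every -a value into one list
    parser.add_argument("-a", "--album", action="append", required=True,
                        help="album ID to download (can be specified multiple times)")
    parser.add_argument("-d", "--directory", default=".",
                        help="directory to save images into")
    return parser


# Equivalent to passing: -a abc123 -a def456 -d wallpapers
args = build_parser().parse_args(["-a", "abc123", "-a", "def456", "-d", "wallpapers"])
print(args.album)      # ['abc123', 'def456']
print(args.directory)  # wallpapers
```

That list is what the for album in args.album loop in main() walks over.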

Not the prettiest code in the world, and it’s a shame it requires outside dependencies, but hey, it works. Please open up your API a little bit, Imgur. Having to register for read-only access is a PITA.

Check it out on GitHub and don't forget to star it!