Bing of the Day revisited

Back in 2014 I wrote an article about borrowing the Bing wallpaper to use on my MacBook. Not long after that, I found that I was missing the odd “Bing of the Day”, so I decided to run it in the cloud.

This post covers the steps involved, so you can “borrow” the Bing of the Day into your own S3 bucket.

Prerequisite

You will need an Amazon Web Services account. The Lambda function that runs this will fit within your free tier, and it will take some time before the images make a dent in your S3 free tier.

I’m not going to cover creating an account, but if you don’t have one then you can go to the AWS account creation page to get started.

Steps

Creating the S3 Bucket for your images

The first thing you’re going to need is somewhere to store the images. I’m going to create a new bucket in S3 called owensbingoftheday. As bucket names must be globally unique, you’ll need to come up with your own.

In the AWS Management Console, go to the S3 service and create the new bucket. The create page should look like this;

Create the S3 Bucket for storing bing images
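
If you’d rather create the bucket from code than through the console, a rough boto3 sketch is below; the region here is just an assumption for the example, so use whichever suits you.

import boto3

# the region is an assumption for this example; pick your own
s3 = boto3.client('s3', region_name='eu-west-1')

# bucket names are globally unique, so substitute your own name
s3.create_bucket(
    Bucket='owensbingoftheday',
    CreateBucketConfiguration={'LocationConstraint': 'eu-west-1'}
)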

Creating the Lambda Role

The next thing you’re going to need is an IAM Role which the Lambda function can execute as. First we will create the locked down policy the role is going to use to access the S3 bucket.

Creating the policy

In the IAM service of the AWS Management console, create a new policy and switch to the JSON tab then add the following JSON. You need to change the bucket name in the Resource section to match the name of the bucket you specified in the previous step.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "WriteBingImages",
            "Effect": "Allow",
            "Action": "s3:PutObject",
            "Resource": "arn:aws:s3:::owensbingoftheday/*"
        }
    ]
}

The policy should look something like this (you can ignore the error at the top of the screenshot);

Create a new policy to write to S3

Creating the role

Now that the policy has been created, we need the role that Lambda will execute as. This role will be created to trust Lambda. There is no need to explicitly specify the trust policy for the role; we just select Lambda when creating the role.
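
For reference, selecting Lambda as the trusted service generates a trust policy for the role roughly equivalent to the following (you don’t need to enter this yourself);

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "Service": "lambda.amazonaws.com"
            },
            "Action": "sts:AssumeRole"
        }
    ]
}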

Create a new role in the IAM service of the AWS Management Console. Ensure you select Lambda as the service so that Lambda can assume the role.

Create the new role

Next, attach the policy we created earlier; you can filter the list to find it.

Create the new role

Finally, save the role with a unique name.

Create the new role

At this point, all of the plumbing is in place to create the new Lambda function. Go to the Lambda service in the AWS Management Console.

Creating Lambda Function

For this function we’re going to be starting from scratch. As the code to download the Bing of the Day wallpaper is written in Python, we’ll be using that for the runtime.

Create a new function and the first page should look something like this;

Create the new Lambda function

Specify the role that you’ve just created; it will allow the function to PUT our images into the specified bucket.

On the next screen, paste the code below into the code box and Save. You will also need to add an environment variable called BUCKET_NAME to the function configuration, set to the name of the bucket you created earlier, as the code uses it to know where to upload.

import boto3
import json
import os
import urllib3

from datetime import datetime

IMAGE_ARCHIVE_URL = 'http://www.bing.com/HPImageArchive.aspx?format=js&idx=0&n=1&mkt=en-US'
http = urllib3.PoolManager()

def get_image_details(payload):
    # pull the date stamp and url base for today's image from the archive payload
    image = payload.get('images')[0]
    urlbase = image.get('urlbase')
    name = image.get('fullstartdate')
    return name, 'http://www.bing.com' + urlbase + '_1920x1080.jpg'


def save_to_s3(name, path):
    # upload the downloaded image to the bucket named in the BUCKET_NAME environment variable
    s3 = boto3.client('s3')
    year = datetime.now().year
    bucket_name = os.environ['BUCKET_NAME']
    object_name = '%s/%s.jpg' % (year, name)
    print("uploading {}".format(object_name))
    s3.upload_file(path, bucket_name, object_name)


def get_image(payload):
    # download the full HD image to /tmp, the only writable path in Lambda
    name, image_url = get_image_details(payload)
    path = '/tmp/%s.jpg' % name
    with open(path, 'wb') as f:
        pic = http.request('GET', image_url)
        f.write(pic.data)
    return name, path


def lambda_handler(event, context):
    # fetch the image archive metadata, then download today's image and upload it to S3
    req = http.request('GET', IMAGE_ARCHIVE_URL)
    if req.status != 200:
        print("Could not get the image archive data")
    else:
        payload = json.loads(req.data.decode('utf-8'))
        name, path = get_image(payload)
        save_to_s3(name, path)
        

The Lambda function can now be tested. We’ll need an event to send, so use the drop-down next to the Test button to create a new test event.

Fill out the details for an empty event so it looks like this (the event body can just be an empty JSON object, {}, since the handler ignores it);

Create the test event

Now, when you press the Test button, it will send the empty event to trigger the Lambda function. All being well, you’ll find today’s image in the specified S3 bucket.

Scheduling the function

The intention was to move this from the laptop so that it runs automatically within AWS, so we now need to trigger the function once a day. To achieve this, we’re going to use a CloudWatch Events trigger with a 24-hour schedule.

Creating the CloudWatch rule

Give the rule a name and save it. Our Lambda function will now be triggered every 24 hours to download the Bing of the Day.
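
If you’d rather wire the schedule up programmatically instead of through the console, a rough boto3 sketch is below; the rule name, target id and function ARN are placeholders, so substitute your own.

import boto3

events = boto3.client('events')
awslambda = boto3.client('lambda')

# placeholder names and ARN; replace with your own
RULE_NAME = 'bing-of-the-day-daily'
FUNCTION_ARN = 'arn:aws:lambda:eu-west-1:123456789012:function:bing-of-the-day'

# create (or update) a rule that fires once a day
rule = events.put_rule(
    Name=RULE_NAME,
    ScheduleExpression='rate(1 day)',
    State='ENABLED'
)

# allow CloudWatch Events to invoke the function
awslambda.add_permission(
    FunctionName=FUNCTION_ARN,
    StatementId='bing-of-the-day-daily-trigger',
    Action='lambda:InvokeFunction',
    Principal='events.amazonaws.com',
    SourceArn=rule['RuleArn']
)

# point the rule at the Lambda function
events.put_targets(
    Rule=RULE_NAME,
    Targets=[{'Id': 'bing-of-the-day-lambda', 'Arn': FUNCTION_ARN}]
)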

Downloading to your machine

The last thing to do is the routine download of the latest files. To do this I use the AWS CLI and the sync command, which will download anything not already on my machine;

aws s3 sync s3://owensbingoftheday/2019 ~/Pictures/bing-wallpapers/2019
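
To make that download itself routine, the sync command can be dropped into cron; a hypothetical crontab entry running every morning at 09:00 would look like this (you may need the full path to the aws binary in cron);

0 9 * * * aws s3 sync s3://owensbingoftheday/2019 ~/Pictures/bing-wallpapers/2019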

New Year, New Intentions

Last year I managed 3 blog posts, so this year my aim is to beat that by at the very least doubling it.

At the end of 2018 I completed my AWS DevOps Engineer - Professional certification, so I’m planning to blog a lot more about AWS and abstractions of what I’m working on day to day.

Introducing gitsearch-cli

The first version of gitsearch-cli is now available. This command line interface allows you to search GitHub repositories and users using keywords and (currently) a handful of additional criteria.

Installation

To install gitsearch-cli you can use pip3 with the following command;

pip3 install gitsearch-cli

Usage

By default the search will be scoped to repositories; however, you can change the scope to look specifically for users.

For additional help, just use;

git-search --help

Searching for Users

git-search --scope=users owen rumney

or

git-search --scope=users owenrumney

This will yield the following results;

username     url
owenrumney   https://github.com/owenrumney

Searching for repositories

When searching for repositories you can create a general search by keyword or focus the search by including the language and/or user.

git-search -l=scala -u=apache spark

This will give the following result;

name           owner   url
fluo-muchos    apache  https://github.com/apache/fluo-muchos
predictionio   apache  https://github.com/apache/predictionio
spark          apache  https://github.com/apache/spark
spark-website  apache  https://github.com/apache/spark-website

If you want to only return results where the keyword is in the name, you can use the --nameonly flag

git-search -l=scala -u=apache spark --nameonly

This will give the following result;

name           owner   url
spark          apache  https://github.com/apache/spark
spark-website  apache  https://github.com/apache/spark-website

TODO

  • Add date based options for search criteria
  • Refactor the code to be more pythonic

Allow connection to dockerised elasticsearch other than localhost

We need to access Elasticsearch in a namespace within minikube, and the other Pods can’t connect on port 9200. It turns out that out of the box it’s limited to localhost, and the network.host property needs updating.

Setting network.host in the elasticsearch.yml configuration file on a Docker container will put the instance into “production” mode, which triggers a load of bootstrap limit checks including, but not limited to, the number of threads allocated for a user.

To my knowledge, setting ulimits in Docker isn’t trivial, so another way to expose Elasticsearch to the other Pods is required.

The answer appears to be to set http.host: 0.0.0.0 so that it’s listening on all interfaces. This allows the instance to stay in development mode, avoiding the ulimit issues stopping startup, while still being accessible from outside the Pod.
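
As a minimal sketch, the relevant part of elasticsearch.yml would look something like this, leaving network.host unset so the production bootstrap checks aren’t enforced;

# elasticsearch.yml
# bind the HTTP API to all interfaces without setting network.host,
# which keeps the node in development mode
http.host: 0.0.0.0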

Argument defaults in shell scripts

When writing a shell script, I regularly find that I want to be able to pass an argument into the script, but only sometimes. For example, I might want the script to write to the /tmp folder for the most part, but with the option to choose the output path myself.

Default arguments can be used in scripts using the following simple syntax

#!/bin/sh

# example script to write to output folder

OUTPUT_PATH=${1:-/tmp/output}

echo "some arbitrary process" > ${OUTPUT_PATH}/arbitrary_output.output

This will either use the first parameter passed in as the output path, or fall back to the default value of /tmp/output if one isn’t provided;

sh example_script.sh # outputs to /tmp/output

sh example_script.sh /var/tmp/special # outputs to /var/tmp/special