Introducing Changed.Page

I’m excited to announce something I’ve been working on for a little while: Changed.Page. This is a platform that notifies you when web pages change. Every day this checks pages to see if they’ve changed and send you an email if they have. It’s that simple.

I’ve been using this myself for a little while, and now you can too.

The website is at: https://changed.page

I’m going to try to cover as many questions about this as possible.

Why did you build this?

This was born of frustration. In my day job we use a number of APIs that change, but there was no way to be notified of the upcoming change. While there might be a blog post or a release notes page, there was no way to get notified they had changed. This was and still is extremely frustrating.

While the companies that provide the APIs could simply notify people of the changes in advance, they didn’t. Or they force you to log into their platform to get updates. Either way it meant manually checking pages to see if there was an update.

It seemed like such a waste of time to manually check the pages. It was crying out to be automated. So I built this.

How does it work?

In a word: serverless. I saw this as a real opportunity to build something interesting using serverless computing. Serverless provides the promise of allowing you to write code without worrying as much about how it gets hosted. It means you no longer have to deal with building servers, whether they’re cloud based VMs or physical servers. It means no more patching required.

Serverless also provides a much better way to scale out based on load. Under most models of hosting, as load increases you scale out with more servers, typically based on things like increased CPU load. However it takes time to provision new servers, deploy the software and start handling the load. Often it also takes time before the servers are fully operational and “warm” enough to perform well.

Serverless allows new resources to be added in smaller increments. Rather than adding a server, you add just what you need, when you need it.

It’s also a very cost effective way of hosting a platform. It means you only pay for the resources you use. If the load isn’t constant, then serverless is a great way to manage that load. And in almost all cases, load is not constant.

I’m going to write a few follow up posts about how this has been built.

Other questions

What are you doing with my email?

We’ll send you emails when pages change or updates about improvements to the platform. That’s it.

Why is there a limited set of pages to monitor?

I’m not ready to open this up to monitor any and every web page. I’m not 100% sure how people might use this at this point if it’s totally open. That could change in future.

How is this different from archive.org?

They have a different focus. Archive.org aims to provide a comprehensive digital library. Changed.Page just looks to send a notification when something changes.

How will you make money from this?

I won’t. This a free service because I think it’s valuable for other people. For reasons I’ll go into later blog posts, this should cost close to zero.

Where will this go?

I’ve built this because it solves a problem for me. At the same time, I think it might be helpful to other people. If I had the problem, other people might also.

So where it goes from here? I’m really not sure but I definitely expect to keep adding more pages to be monitored.

I’m happy to keep improving this to continue to make it useful. If you’ve got any suggestions, feel free to get in touch at [email protected].

Managing Patches on AWS

During a quiet time over the Christmas break, I spent a little time looking for a better way to manage patching EC2 instances on AWS. It took a little while to put it all together so I thought this might help a few other people if I wrote it up.

We’re running EC2 instances (AWS Linux) as hosts for docker containers, so there isn’t a huge amount to patch. All the same, even a minimal install is vulnerable to attack through unpatched software (eg ssh).

Under the circumstances we want:

  • automatic patching with no human intervention
  • installing patches out of hours

AWS Patch Manager

Google led me to AWS Patch Manager, which looked promising.

I was far less impressed when I read the details of how this was implemented for AWS Linux. The 7 step process had a handy summary at the bottom:

Note

The equivalent yum command for this workflow is:

sudo yum update-minimal --security --bugfix 

So, essentially it’s just a wrapper around yum update. It’s really not clear how much benefit AWS patch manager brings for a smaller scale installation.

Installing the updates

Installing the updates was pretty simple:

yum update-minimal --security --bugfix -y

This installs just security patches, answering yes to all questions.

In some cases updates require a reboot. While researching this, I found this great discussion, and added this:

needs-restarting -r || shutdown -r

To put it all together into a shell script:

#!/bin/bash
yum update-minimal --security --bugfix -y
needs-restarting -r || shutdown -r now

Now we’ve got a script to install updates and handle reboots if required.

Running the update script

The scripts isn’t much help if we don’t have a way to trigger it. The most obvious thing to do is to add this as a cron task. Easy to do if you’re logged into the machine, but you really want this built into your infrastructure configuration.

For EC2, the best way to do this is through UserData scripts, configured through Cloud Formation templates. There are some great examples of this here (see the section “Parameters Section with Parameter Value Based on Pseudo Parameter”), the YAML syntax is easier to work with.

I want to drop the update script into cron.weekly, which I can do using echo.

To build the update script, this becomes:

echo "#!/bin/bash" > /etc/cron.weekly/install_updates.sh
echo "yum update-minimal --security --bugfix -y" >> /etc/cron.weekly/install_updates.sh
echo "needs-restarting -r || shutdown -r now" >> /etc/cron.weekly/install_updates.sh

There are a couple of other adjustments to make:

  • needs-restarting isn’t installed by default, so I need to install that: yum install -y yum-utils
  • The script needs permissions to be executed:
    chmod +x /etc/cron.weekly/install_updates.sh

The full script is:

yum install -y yum-utils
echo "#!/bin/bash" > /etc/cron.weekly/install_updates.sh
echo "yum update-minimal --security --bugfix -y" >> /etc/cron.weekly/install_updates.sh
echo "needs-restarting -r || shutdown -r now" >> /etc/cron.weekly/install_updates.sh
chmod +x /etc/cron.weekly/install_updates.sh

Controlling when it executes

One last issue, we don’t really want this installing updates at the wrong time. The simplest way to do this to adjust the hours when cron (actually anacron) runs. This means adjusting START_HOURS_RANGE and I also wanted to reduce the random interval range when the job runs within this range (RANDOM_DELAY).

It’s relatively simple to adjust the config using sed:

sed -i 's/START_HOURS_RANGE=.*/START_HOURS_RANGE=14-19/' /etc/anacrontab
sed -i 's/RANDOM_DELAY=.*/RANDOM_DELAY=20/' /etc/anacrontab

Putting it all together

The final UserData script is:

#!/bin/bash -xe
sed -i 's/START_HOURS_RANGE=.*/START_HOURS_RANGE=14-19/' /etc/anacrontab
sed -i 's/RANDOM_DELAY=.*/RANDOM_DELAY=20/' /etc/anacrontab

yum install -y yum-utils
echo "#!/bin/bash" > /etc/cron.weekly/install_updates.sh
echo "yum update-minimal --security --bugfix -y" >> /etc/cron.weekly/install_updates.sh
echo "needs-restarting -r || shutdown -r now" >> /etc/cron.weekly/install_updates.sh
chmod +x /etc/cron.weekly/install_updates.sh

Putting this in context, with the YAML template:

UserData:
  Fn::Base64: !Sub |
    #!/bin/bash -xe
    sed -i 's/START_HOURS_RANGE=.*/START_HOURS_RANGE=14-19/' /etc/anacrontab
    sed -i 's/RANDOM_DELAY=.*/RANDOM_DELAY=20/' /etc/anacrontab

    yum install -y yum-utils
    echo "#!/bin/bash" > /etc/cron.weekly/install_updates.sh
    echo "yum update-minimal --security --bugfix -y" >> /etc/cron.weekly/install_updates.sh
    echo "needs-restarting -r || shutdown -r now" >> /etc/cron.weekly/install_updates.sh
    chmod +x /etc/cron.weekly/install_updates.sh

This works fine for either individual EC2 instances or Autoscale Launch Configurations.

Final Thoughts

It’s a very long time since I’ve had to look at anything like this. The last time I looked at something like this was the early 2000s when I was hosting my own mailserver and fileserver on debian Linux.

At the time I used to add a script in cron.daily:

apt-get update
apt-get upgrade

What I find amazing is how much this hasn’t changed in over 15 years.

Why would someone hack your ‘thing’

This is a follow up to an earlier blog post (Internet of things heading for a trainwreck), where I commented on the security of the internet of thing. An interesting follow up article on this is: ‘Things’ on the Internet-of-things have 25 vulnerabilities apiece. It’s highly unlikely that any of those 25 vulnerabilities will be patched.

But I digress, the question is, why would someone hack your ‘thing’?

Reason 1: Because it is there

Someone might not be targeting your thing or even your class of thing when they take control. Your thing might be share vulnerabilities with someone that is being targeted. For example the underlying operating system of the device or hosted applications might share vulnerabilities that are common in web servers running on the internet. Many devices would provide a web interface to manage them, and to do this they are likely to use commonly available webservers, eg ngix or apache.
However simply being available on the internet makes something a target. Someone scanning the internet for something interesting to break into would not necessarily know that the device responding on port 80 is a thermostat rather than a webserver.

Reason 2: For the capabilities

In many ways a thing is mostly a less powerful computer with extra capabilities. It’s unlikely that someone would hack your thing for the computing power (although people have produced bincoin mining malware for android). However your thing is every bit as capable as a computer in every other way. It could be used to gain a foothold in your network. It could be used as a spam relay. It could be used in DDoS attacks. It could be used to host malware. The list goes on…
Your thing is generally different though, as it is a computer + something. That something could provide a rich set of capabilities that computers don’t currently have. For example, a Nest has sensors to identify whether someone is home or not. Imagine if someone were able to hack into your Nest, prior to breaking into your house to check if you are home. Cameras on ‘things’ could well be used in the same way that RAT trojans are used for voyeurism and blackmail. Consider that you might not have a laptop with a webcam in every room, but you might well put a ‘thing’ with a camera each room, including the bathroom.
As the internet of things starts to take off, people are going to work out new ways to exploit the new sensors that it brings to the table.

Reason 3: For the data

Not all ‘things’ have sensors that would be useful in real time but many of them collect some very interesting data. Something like a fitbit can track your activity. We want that information to run our lives, but it is just as interesting to a third party. Things can provide historical data on heart rate, location or any other information that might be collected.

Conclusion

Security is hard and typically is an afterthought. This should get rather interesting when your lightbulbs and door locks get hacked.

Internet of things heading for a trainwreck

Many years ago I read The inmates are running the asylum. One of the early chapters pointed out that when you cross a computer with anything you get … a computer. The intention of the chapter was to highlight terrible user interfaces for computers. 10 years later, I wonder what Alan Cooper would think of “The Internet of Things”.

The Internet of Things is basically thing + computer + connection to the internet. Now with lower power chips, it’s never been easier.  A Rasberry Pi is more powerful than the first computer I built. Slap on a customised linux distro, marry it to your ‘thing’ and you are away.

Is it a ‘thing’ or a computer?

The internet of things will add internet and computing power to things we know now. Thinks like lights, fridges, watches, power points … well pretty much anything. We are used to interacting with these things in the same way that we interact with appliances. You plug them in, switch them on and they just work.
Appliances often have quite long life cycles. For example the fridge we own now used to belong to my wife’s grandfather and must be over 20 years old.
This is very different to how we treat computers.
It’s worth reviewing how computers have been used.

Computers – a brief review

Computers were once like appliances. You could buy something like a TRS-80, plug it in and use it. It didn’t have any persistent storage. Programs were stored on cassettes or later on floppy disks.

Viruses

Early viruses would infect programs on a disk or the disk itself. The vector for infection was generally sneaker net. Someone would be infected with the virus from someone else when it came into contact with their infected disk. Infection far easier once computers started getting hard drives as the virus could infect any floppy disks that were inserted into the computer.
However the speed that viruses could spread was pretty limited by the way that they spread.

Networked – appliances meet Metcalf’s Law

Metcalf’s Law says “the value of a telecommunications network is proportional to the square of the number of connected users of the system”. The short version is that computers get much more valuable when they are connected together. this is one of the great benefits that the internet of things promises.
What it does mean is that when all the computers are connected together, a virus (or any other sort of malware) can spread far faster. The most spectacular example of this was SQL Slammer, where it is estimated that almost all of the vulnerable systems were infected within 10 minutes of its release.
This has exposed the reality the all computer systems have bugs. Networked computer systems are exposed to all the malware and bad actors on that network. And the internet is a very, very large network.

Obsolete – appliances meet Moore’s Law 

Moore’s law (better stated as Moore’s curves) is generally understood to say that computing performance doubles every 18 months. This is a phenomenally rapid rate of improvement. Imagine if kettles could boil water twice as fast every 18 months.
One impact of this is that computers have a relatively short lifespan when compared to other items. Most computers would be replaced within 5 years (by which time their replacement would be 8 times as fast).

Lifecycle of a computer

While computers started out as close to appliances, they now have a very different life cyle in two very key ways:
  1. They get updates to fix vulnerabilities to protect them from malware
  2. They live for less than 5 years

Back to the internet of things

My fear is that in the end these are all computers yet they are not being treated like computers. The problem here is computers are not like appliances.

Does your ‘thing’ get updates?

Will the manufacturer commit to providing software updates for the life of the ‘thing’ or just the warranty period? The company producing has a primary interest in selling the thing rather than the software driving the thing. Often that is their primary area of expertise. Typically the software is an afterthought. It’s very likely that the ‘thing’ you buy will never receive updates.

It will be left running the same software it shipped with. Through the internet it will be exposed to all of hackers, crackers, tinkers, malware writers and cyber criminals. They will find holes that need patching and nobody will be there to patch them.

I’m not the only one worried about this sort of thing.

Conclusion

The internet of things is going to provide a stack of new devices that can get hacked.
It took 20 years to create the security lifecycle that we have today. How long will it take for the internet of things to catch up?