Remembering dozens of passwords

You’ll never forget your password ever again

In recent weeks, there have been claims that usernames and passwords for Dropbox accounts have been leaked online. While Dropbox has denied that any passwords were leaked, their advice was for “users not to reuse passwords across services”. For people who don’t use second-factor authentication or password manager services, this is good advice.

In fact, I’ve moved away from the approach I described previously of how to choose a strong password. There is no such thing as a strong password once it’s leaked. Sadly, even well regarded sites like Evernote and LinkedIn have had their passwords stolen, and no service can be considered immune to hacks.

Previously, I simply remembered passwords relating to different tiers of service: a password for my most secure service, another for secure but less important services, another for services I use regularly but don’t need to be secure, and another for services that I don’t really use. This way I just needed to remember a handful of passwords across many sites. Unfortunately, this method is not proof against hacks.

However, remembering a different password for every site is infeasible for most people (including me!). Still, there is a way to have a large number of different passwords across different sites while needing to remember only two things: a password stub and a password algorithm. When logging in, a user just needs to apply the name of the service and the stub to the algorithm, and out should pop a (relatively) unique password. Different stubs might be used for different accounts, e.g. if the same service is used for both work and personal purposes.

Here’s an example of how this might be used. Take the password stub “pa55word” and the algorithm “insert the second and third letter of the site name in the third position”. If this user was logging in to “”, the second and third letters would be “ro” and the unique password would be “paro55word”. (Let me just say that this is neither a stub nor an algorithm that I use, and now that it’s documented here, not one that you should use either.)

Since there are potentially 676 (26 x 26) combinations of second and third letters, this algorithm can generate hundreds of passwords without needing to remember more than two things. It’s easier than my previous approach where I needed to remember at least four things.
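For the programmers out there, the example algorithm above can be sketched in a few lines of Python. This is only an illustration of the idea: the function name `site_password` and the site name “example” are made up for the sketch, and the same caveat applies that a published algorithm is not one you should actually use.

```python
def site_password(stub: str, site: str) -> str:
    """Insert the 2nd and 3rd letters of the site name at the 3rd position of the stub.

    Hypothetical helper for illustration only; do not use this scheme for real passwords.
    """
    if len(site) < 3:
        raise ValueError("site name needs at least three letters")
    letters = site[1:3]                 # second and third letters of the site name
    return stub[:2] + letters + stub[2:]  # the third position is index 2


# "example" is a made-up site name; its second and third letters are "xa".
print(site_password("pa55word", "example"))  # prints paxa55word
```

Any site whose second and third letters are “ro” would give “paro55word”, matching the example above.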

In choosing a stub, it’s helpful to include the sorts of things that password strength tests look for, e.g. some punctuation, a number and both upper and lower case letters. In choosing an algorithm, you want it to be pretty simple so that it will work for many different site names, so don’t go overboard.

So this will let you follow Dropbox’s advice and avoid reusing passwords, but when (!) a service has its passwords hacked and you need to change the password, it’s not going to work. So you probably need to remember a third thing – how many times a given service has been hacked (hopefully there aren’t too many). Then you would have a modification of the algorithm that incorporates this information as well, e.g. the letters inserted for the second iteration of a password would be “rro” instead of “ro”, for the third iteration “rrro”, etc. This does expose the main weakness of the method, in my opinion, so I’m hopeful of coming across a better approach at some point.
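As a rough sketch of that modification, assuming the “iteration” simply repeats the first inserted letter once more each time the password has to be changed. The function name `rotated_password` and the site name “example” are hypothetical, chosen only for this illustration.

```python
def rotated_password(stub: str, site: str, iteration: int = 1) -> str:
    """Build a per-site password that also encodes how many times it has been changed.

    iteration=1 is the original password, iteration=2 the first replacement, and so on.
    Hypothetical helper for illustration only.
    """
    if len(site) < 3:
        raise ValueError("site name needs at least three letters")
    if iteration < 1:
        raise ValueError("iteration starts at 1")
    # Repeat the site's second letter once per iteration, then add the third letter.
    inserted = site[1] * iteration + site[2]
    return stub[:2] + inserted + stub[2:]


# "example" is a made-up site name; its second and third letters are "xa".
for n in (1, 2, 3):
    print(n, rotated_password("pa55word", "example", n))
```

The catch is the same one noted above: the count of hacks per service is a third thing to remember, and the code can’t remember it for you.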

As I mentioned at the top, second-factor authentication and password manager services are also approaches that can be considered, but have their own downsides. I’m more hopeful that these services will improve in usability and utility over time so that I can make more use of them, before I need to remember the details of too many website hacks.

Lessons from NYT on innovation

Whatever the circumstances that led someone at The New York Times to leak their report on Innovation, I am thankful. Published (internally) in March, it is the fruit of a six-month-long deep-dive into the business of journalism within a company that has been a leader in that industry for over a century, and provides an intimate and honest study into how an incumbent can be disrupted. It is 97 pages long, and worth reading for anyone who is interested in innovation or the future of media.

The report was leaked in full in May, and I’ve been reading bits of it in my spare time. Just recently I completed it, and felt it was worth summarising some of the lessons that are highlighted by the people at the Times. As it is with such things, my summary is going to be subjective and – by nature – highly selective, so if this piques your interest, I encourage you to read the whole thing.

(My summary ended up being longer than I’d originally intended, so apologies in advance.)

Organisational Division

Because of the principle of editorial independence, the Times has clear boundaries between the journalists in the newsroom and those who operate “the business” part of the newspaper, which has been traditionally about selling advertising. This separation is even known as “church and state” within the organisation, and affects everything from who is allowed to meet with whom (even during brown-bag lunch style meetings) to the language used to communicate concepts. This has worked well in the past, allowing the journalism to be kept at the highest quality, without fear of being compromised by commercial considerations.

However, the part of the organisation that has been developing new software tools and reader applications is within “the business” (not being journalists), and has hence been disconnected from the newsroom. Hence new software is not developed to support the changing style of journalism, and where it is, it is done as one-off projects. Other media organisations are utilising developers more strategically, resulting in better tools for the journalists and a better experience for the readers.

Lesson: Technology capability needs to be at the heart of an innovation organisation, rather than kept at arm’s length.

Changing Customers

For a very long time, the main customer of the Times has been advertisers. However, print media is facing a future where advertisers will not pay enough to keep the organisation running. Online advertising pays less than print advertising, and mobile advertising even less again. Coupled with declining circulation due to increased digital readership, the advertising business looks pretty sick. But there’s a new type of customer for the digital editions that is growing in importance: the reader.

While advertising revenues had the potential to severely compromise journalism, it’s not so clear that the same threat exists from reader revenues. In theory there is a good alignment: high quality journalism results in more readers. But if consideration of attracting readers is explicitly kept away from the newsroom as part of the “church and state” division, readers may end up being attracted by other media organisations. In fact, this is what is happening at the Times, with declines in most online reader metrics, and none increasing.

In the print world, it was enough to produce a high quality newspaper and it would attract readers. However, in the digital world this strategy is not currently working. Digital readers don’t select a publication and then read the stories in it, they discover individual articles from a variety of sources and then select whether to read them or not. The authors of articles need to take a bigger role in ensuring those articles are discovered.

Lesson: When customers radically change, the business needs to radically change too (many truisms may be true no longer).


The rules for success in digital are different from those of traditional print journalism, although no-one really knows what they are yet. That said, the Times newsroom has an ingrained dislike of risk-taking. Again this made sense for a newsroom that didn’t want to print an incorrect story, and so everything had to be checked before it went public. However, this culture inhibits innovation if applied outside of the news itself.

Not only does a culture of avoiding risks prevent them from experimenting and slow their ability to launch new things, but smart people within the organisation risk getting good at the wrong things. A great quote from the report: “When it takes 20 months to build one thing, your skill set becomes less about innovation and more about navigating bureaucracy.”

Also, the newsroom lacks a dedicated strategy and operations team, so doesn’t know how well readers are responding to experiments, or what is working well for competitors. Given that competitors are no longer only other daily newspapers, it’s not enough to just read the morning’s papers to get insight into the competition. BuzzFeed reformatted stories from the Times and managed to get greater reader numbers than the Times was able to for the same stories.

Lesson: If experimentation is being avoided due to risk, then business risks are not being managed effectively.

Acquiring Talent

It turns out that people experienced in traditional journalism don’t automatically have all the skills to meet the requirements of digital readers. However, the Times has a bias for hiring and promoting people in digital roles based on their achievements as journalists. While this likely worked in the past to create a high quality newspaper, it isn’t working in digital. In general, the New York Times appears to be a print newspaper first, and a digital business second. The daily tempo of article submission and review is oriented around a daily publication to be read in the mornings, rather than supporting the release of stories digitally when they are ready to be published. Performance metrics are still oriented around the number of front page stories published – a measure declining in importance as digital readers cease to discover articles via the home page.

The lack of appreciation for the digital world and digital people in general has resulted in the departure of a number of skilled employees, according to the report. Hiring digital talent is also difficult to justify to management given that demand has pushed salaries higher for skilled people even if those people are relatively young. What could be a virtuous circle, with talent attracting talent, is working in the opposite direction with what appears to be a cultural bias against the very talent that would help the Times.

Lesson: An organisation pays for the talent either by paying market rates for capable people or paying the cost in lost opportunities.

Final words

When I first came across the NYT Innovation report, I expected to read about another example of the innovator’s dilemma, where rational business decisions kept them from moving into a new market. Instead, the report is the tale of how the organisational structure, culture and processes that made The New York Times great in the past are actively inhibiting its success in the present. Some of these seem to have become sacred cows and it is difficult for the organisation to get rid of them. It will require courage – and a dedication to innovation – to change the organisation into one that is able to compete effectively.

Hackathon Tips

Last week, I participated in my first Hackathon. It was an internal one for Telstra employees, but there were around 40-50 people involved, and my team ended up winning – which was awesome. However, the experience of taking part was a reward in itself, with the collective energy creating a real buzz, and there was a huge amount of satisfaction in being part of something so productive.

Since it was all internal development, I’m not going to share the details of the idea. However, I was one of the two developers on the team (we were also joined by an awesome interface designer and a fabulous digital sales person) and I wrote a back-end server in Node.js that had to implement a web server, IMAP to Gmail, and OAuth. I’d been doing some serious Node.js development in a previous project, so I didn’t have to learn that, and the IMAP stuff wasn’t too different from a hobby project I’ve discussed before (although that was in Python). Getting OAuth to work was the main hurdle, but the advantage of picking popular frameworks and services is that others are likely to have solved the major problems before me, and Stack Overflow was a good source of solutions.

In any case, I thought it might be worthwhile to share a couple of the things that I think I did well, and which might help others going into their own Hackathons. Putting aside the strength of the idea and the talent possessed by the team – which would have been the principal things that helped us win the top prize – I think there were three things that put us in the best position to pull it off.

1. Networking prior to pitching

The start of the Hackathon was to build a team on the strength of a one-minute pitch, and around half the participants pitched an idea. So, it was a pretty competitive way to start things off, and one minute isn’t much time to sell yourself and your idea. However, before the pitching began, there was about a half hour of social drinks (the Hackathon started after work had finished for the day).

I decided to use the social drinks time to be social, rather than just chatting to people that I already knew. As it turned out, this was a good thing to do, since a natural ice-breaker was to ask if someone was planning to pitch an idea, and to share the idea I was planning to pitch. This meant that I got to speak to several people for longer than one minute about my idea, and one of those people ended up deciding to join my team.

This was a lucky break, since once the one-minute pitches were all done, there were now two of us going around selling the project idea to others. I doubt I would’ve gotten a project team together without this, since I hadn’t pre-arranged a team to work on the idea.

2. Knowing ahead of time how to achieve the idea

Luckily, the idea was one that I’d done some initial work on with others in Telstra. Also, I’d done a bit of research to see how a useful version of it might be implemented in the time available. One of the rules was that we had to use a partner API, so for example I had a quick look to see that the APIs would do what was needed.

As a result, I was able to explain clearly at the start how I proposed that we would go about building something. Also, I was able to respond to a variety of objections and arguments that were put to us by mentors, peers and judges during the Hackathon.

That’s not to say that I was stubborn or unmoving when it came to the idea (at least, I’d like to think I wasn’t). It’s just that I wasn’t making decisions or coming up with responses from a position of ignorance. We did explore a couple of variants of the idea as we went along and there were additional features that were built that I hadn’t originally thought of. However, we were very focussed, and I think this helped in realising the idea.

3. Progressing through Tuckman’s Stages ASAP

If you haven’t heard of Tuckman’s Forming, Storming, Norming and Performing stages of team development, add it to your to-do list to read up. (Or do it now – I’ll wait here if you want.) I was conscious that the team had only a limited time to complete the project, and a major risk was consuming valuable time in internal team politics. We needed to get to the Performing stage as quickly as possible.

Rather than detail exactly how the team evolved, I’ll just mention a few things that I think helped us progress:

  • Forming the team in a social setting was a good way to start with some of the barriers broken down.
  • The pre-work mentioned in point #2 above helped us stay in sync. Also, the first thing I did was answer questions from the team on the idea and its implementation, so we began heading in the same direction.
  • The next thing I did was ask everyone for their thoughts and plans on how to begin, so we had a collective plan.
  • As the idea evolved, we wrote up the specifics on one of the walls of the office we were in so that everyone could see it.
  • Everyone had largely independent activities, so we weren’t held up waiting on each other.
  • I was team “captain” but I spent much of my time contributing to the final outputs, i.e. was part of the team rather than the manager of the team.

That said, the team was made up of easy-going people, so it was probably less likely we’d have a big falling-out. However, since I didn’t know any of them in advance, I didn’t know this.


We also spent a couple of hours prior to the final three-minute presentation going over (and over) the demo and presentation. This was worthwhile, but an obvious thing to do.

So, I think I ended up with a winning team through a combination of good luck and good planning. However, while I can’t help with the luck, I hope the above tips would aid you if you’re entering a Hackathon. I hope you enjoy it as much as I did.

Android on Xperia

One of my handsets is a Sony Xperia E C1504 (code-name Nanhu) which was a low-end-ish Android handset when it launched in early 2013, and was apparently relatively popular in India. One of its claims to fame is that it was also one of the first handsets to have a version of Firefox OS available for it. But why I’m writing about it here is that Sony has been hassling me to upgrade it to the most recent version of firmware (11.3.A.2.23), and recently I gave in, but since I was mucking about with the firmware I thought I’d “root” it as well. And therein lies the tale.

Although, as is often the case with Android when wandering off the well-trod path, it’s more of a cautionary tale.

“Rooting” an Android device means to gain complete control over the operating system through installing a superuser tool in the system partition. When Android is running, the system partition is read-only, so this step has to be done outside of Android itself (unless an “exploit” is used, utilising a bug in an Android implementation to achieve this). The usual process for achieving root is: 1) unlock the bootloader, 2) install a custom recovery partition, 3) copy a superuser tool onto the device’s filesystem, and 4) install the superuser tool into the system partition from within recovery. Oh, if only it was that easy.

Step 0 – Install USB Drivers

Before you can do anything, you need to get the right USB drivers set up on your PC, which is a world of pain itself. Complications come from whether the PC is running a 32-bit or 64-bit operating system, whether the drivers are 32-bit or 64-bit, whether the drivers support ADB, fastboot, or (for Sony) flash modes, and which particular USB port the cable is plugged into (and for which mode). I’m running Windows 8.1 64-bit, which seems to have limited driver support for this sort of thing.

I had to:

  • Install the Android SDK from Google, so that the fastboot and adb tools were installed on my PC
  • Install the ClockworkMod universal USB drivers
  • Before going any further, make sure the drivers are working. Use “adb devices” or “fastboot devices” from the command line to list the devices that can be seen. To put the Sony Xperia E into fastboot mode: turn off the handset, ensure it is not connected to the PC via the USB cable, hold down the volume up button, then connect it to the PC via the USB cable.
  • When I connected the Xperia E in fastboot mode, the laptop reported it as an unknown device “S1 Boot”. I opened the Device Manager (press Windows-X and select it from the menu, for a quick way to get to that), right-clicked on the unknown device, selected “Update Driver Software”, then “Browse my computer for driver software”, then “Let me pick from a list” and I chose the Nexus bootloader driver from the ClockworkMod set of drivers.
  • I used my laptop’s powered USB port for ADB and fastboot modes, but an unpowered USB port for the flash mode.
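If you end up checking the driver setup over and over (across many USB ports), the “adb devices” check can be scripted. Below is a minimal Python sketch that assumes `adb` is on your PATH and that its output is the usual header line followed by one tab-separated “serial, state” line per device. The function names and the serial numbers in the sample are made up for illustration.

```python
import subprocess
from typing import Dict


def parse_adb_devices(output: str) -> Dict[str, str]:
    """Turn the text printed by `adb devices` into a {serial: state} mapping."""
    devices = {}
    for line in output.splitlines():
        line = line.strip()
        # Skip blanks, the "List of devices attached" header and "* daemon ..." notices.
        if not line or line.startswith("List of devices") or line.startswith("*"):
            continue
        parts = line.split()
        if len(parts) >= 2:
            devices[parts[0]] = parts[1]
    return devices


def list_adb_devices() -> Dict[str, str]:
    """Run `adb devices` (requires the Android SDK tools) and parse the result."""
    result = subprocess.run(["adb", "devices"], capture_output=True, text=True, check=True)
    return parse_adb_devices(result.stdout)


# Sample output with made-up serial numbers, so the parser can be tried without a handset.
sample = "List of devices attached\nABC123\tdevice\nXYZ789\tunauthorized\n"
print(parse_adb_devices(sample))
```

An empty result from `list_adb_devices()` is the cue to try another USB port or revisit the driver installation.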

Pro-tip: If something’s not working, try another USB port. Try all of them.

Step 1 – Unlock the bootloader

Sony provides an official way to unlock the bootloader. Be warned that although it’s “official”, it will still void your warranty, and potentially brick your device. Ensure everything that is valuable on the device has been backed up somewhere else. At the very least, all apps and configuration on the device will be lost anyway.

I followed the unlocking instructions from Sony Developer World. However, my device immediately stopped working afterwards. It would go into a boot loop, showing the Sony logo, then the boot animation, then back to the Sony logo, and so on. The solution was to reflash the operating system.

I downloaded Flashtool (I used Flashtool64 since I have 64-bit Windows). It’s the one that’s commonly used in the Sony Android custom ROM community.

I followed the instructions at XDA Developers to flash the 11.3.A.2.23 version of the C1504 firmware onto the device (the newest version currently available). This happened to be an Indian build of the firmware, so I’ve ended up with some links to Shahrukh Khan videos on my device as a result. :)

It was pretty hard to find these instructions. Most Sony Xperia E firmware flashing instructions refer to the C1505 version, which has support for 3G at 900MHz instead of 850MHz. Since Telstra’s 3G network is 850MHz, I need this capability of the C1504, and didn’t trust that C1505 firmware would give me what I wanted.

Now my handset booted again, but aside from the Shahrukh Khan videos, I hadn’t gained anything new yet.

Step 2 – Install a custom Recovery partition

The Android fastboot mode gives access to the bootloader. Now that it was unlocked, the bootloader allowed a special partition called the recovery partition to be flashed with new firmware. The recovery partition is an alternate boot partition to the default system partition.

The two most popular Android recovery partitions are from CWM (ClockworkMod) and TWRP (Team Win Recovery Project). I’ve used both in the past, but this time I chose CWM since it seemed to be more widely tested on the Xperia E. Unfortunately, I found that most of the instructions for installing the CWM recovery resulted in Wi-Fi ceasing to work on my device. I also tried to build a new version of CWM for my device, but the CWM build tools didn’t support it.

However, I found someone had built a version of the ZEUS kernel (replacing the default Sony Xperia kernel) that included CWM recovery, and this wouldn’t have the Wi-Fi issue. I followed the instructions at XDA Developers to flash that onto my device.

Step 3 – Superuser tool

Now when I turned on the device, it booted as normal, but a blue light appeared at the base of the device at the beginning when the Sony logo is shown. When the blue light is shown, the device can be diverted to boot into CWM recovery rather than the default Android system by pressing a volume key. However, I needed to put a superuser tool onto the device’s SD card before this new feature would be useful.

The superuser tool is made up of two files (a system application called “su” and an Android app called “Superuser.apk”), but both are stored within a zip file for easy installation. I got the zip file from the AndroidSU site (the one labelled

I installed the zip file onto the sdcard simply by enabling the Developer Options on the device (under Settings) and ticking the USB Debugging option, then attaching the device to my PC via USB, and using the command on my PC:
adb push /sdcard/

Step 4 – Install the Superuser tool

I disconnected and rebooted the device, and when the Sony logo appeared (and the blue light), pressed a volume key. The device booted into the recovery partition. I followed the instructions at XDA Developers (starting at step 5).

And that’s all (!). Now I can boot my unlocked, rooted Xperia E (C1504). Remember, if you’ve followed along with these instructions, you’ve voided your warranty. But at least now you can install whatever you want on the device, or change any of the configuration settings.

One thing you might want to install straight up is a build.prop editor, such as the mysteriously named Build Prop Editor, to change configurations. For example, tweaking the network settings for the Australian mobile operators seems to improve performance. I haven’t tried these myself yet, but it’s an example of the sort of thing that can be done.

A protest

I joined today’s Palm Sunday Walk (a.k.a. Walk for Justice for Refugees), and I was asked why I bothered, since governments have a long tradition of ignoring protest marches. This post is meant to briefly explain why.

Firstly, while some protests have a long list of vague things that those involved are against, this march at least had a clear message about what it was for: justice for refugees.

I have to accept that the current government was elected fairly under a reasonable democratic system. Also, I have to accept that they are strong believers in a policy of deterrent to reduce the number of refugees arriving here by boat.

However, I cannot accept that any type of deterrent be used in the implementation of this policy. Breaches of human rights, including conditions akin to torture, cross a line. Australia is a civilised nation, we follow the rule of law, and we treat people humanely.

While I wasn’t happy with the policies of either our previous Prime Minister or the current PM, I wouldn’t have marched in protest against those policies. (I rarely march in protest against anything.) However, the recent events on Manus Island, including the circumstances that led to the death of a refugee, are not consistent with how we treat people in this country.

I was there today to be counted. I don’t expect our current government to immediately change anything, but I don’t want to let the current implementation of their policies be silently accepted. They will likely do what they feel they have the power to do, but they shouldn’t think that it’s acceptable to the wider community.

Australian institutions are accountable to the standards of the community that they operate in.

The Amazing Bitcoin

I am impressed with the novelty and cleverness behind the online phenomenon known as Bitcoin. For those who came in late, bitcoins could be described as digital commodities. People can trade them for actual currency and sometimes real goods. However, while it’s true that we’ve been using something called money for this purpose already, and so you may ask why we need it, Bitcoin has a couple of interesting properties:

  • Trustless: If I engage in a Bitcoin transaction with you, I don’t need to trust you, your bank, your government, or anyone specifically. Once a transaction has completed, it can be verified to have happened as I expected, removing counter-party risk that exists in many markets (for example, a fraudster may pay me in counterfeit bills).
  • Resilient: There is no central operator of the Bitcoin infrastructure, so no-one needs to worry about a particular company staying solvent, or a particular government staying in power or true to its promises, in order for the system to keep working.

Up until Bitcoin, no-one had been able to come up with a system with these properties. Either counter-party risk was removed because there was an operator regulating the market (and the market wasn’t resilient in the face of that operator collapsing) or there were markets without central control that required a lot of trust when dealing with others. If the inventors of Bitcoin had not been hiding their identities, I wouldn’t be surprised if they would be in the running for a future Nobel Prize in Economics. Bitcoin is no less than a completely decentralised technology for financial contracts allowing for value to be transferred over any means – physical or virtual.

However, I’ve found the way that Bitcoin operates to be a little surprising. It’s not like other systems that I’m used to. Since I haven’t seen these points noted down clearly in the one place, I thought others may be interested as well. (Unless you’re already very familiar with Bitcoin, in which case it’s likely to be old hat.)

1. Miners are both the source of new bitcoins and responsible for documenting all transactions

A miner is just the name for a computing node that works to discover the next block in the Bitcoin blockchain. Every ten minutes (on average), a new block containing all as-yet-undocumented transactions is generated. The first node to generate this block (which requires discovering the solution to a particular computing problem using trial-and-error approaches) also gets 25 bitcoins (BTC) for its trouble. The “winner” here is in part due to luck, and in part due to how much computing power the miner has dedicated to this. The blockchain is the ongoing record of each of these blocks, collectively forming something of a global ledger of all known transactions to date.
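The trial-and-error search described above can be sketched as a toy proof-of-work loop in Python. This is a simplified illustration rather than the real Bitcoin implementation: it does use double SHA-256, as Bitcoin does, but real miners hash a structured block header and the difficulty target is encoded differently. The names `mine`, `block_data` and `difficulty_bits` are assumptions made for this sketch.

```python
import hashlib
from typing import Tuple


def mine(block_data: bytes, difficulty_bits: int) -> Tuple[int, str]:
    """Try nonces one by one until the block's hash falls below the target.

    Toy example: a higher difficulty_bits means a smaller target and more tries on average.
    """
    target = 1 << (256 - difficulty_bits)
    nonce = 0
    while True:
        candidate = block_data + nonce.to_bytes(8, "little")
        digest = hashlib.sha256(hashlib.sha256(candidate).digest()).digest()
        if int.from_bytes(digest, "big") < target:
            return nonce, digest.hex()
        nonce += 1  # no shortcut exists, so just try the next value


# 12 bits of difficulty needs about 4,096 attempts on average, so this runs quickly.
nonce, digest = mine(b"toy block of transactions", difficulty_bits=12)
print(nonce, digest)
```

Checking a claimed solution takes a single hash, while finding one takes many, which is why luck and raw computing power both matter to who “wins” a block.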

In theory, transactions can contain something akin to a tip, representing a fee to the (winning) miner, and these are in addition to the 25 BTC for each ten minutes work (with a single BTC worth something between US$40 and US$1140 over the last year, and currently around US$580). However, such transaction fees are relatively minor at the moment, with miners currently earning less than 20 BTC per day in total. The 25 BTC figure used to be 50 BTC in the early days, and reduces predictably over time with it halving again to 12.5 BTC by about the year 2017.
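The predictable reduction in the block reward can be written down in a few lines. This sketch assumes the standard Bitcoin schedule of halving every 210,000 blocks, and works in satoshis (100,000,000 per bitcoin) to avoid fractions. The function name `block_subsidy` is just a label chosen for the illustration.

```python
SATOSHIS_PER_BTC = 100_000_000
HALVING_INTERVAL = 210_000            # blocks between each halving of the reward
INITIAL_SUBSIDY = 50 * SATOSHIS_PER_BTC


def block_subsidy(height: int) -> int:
    """Return the new-coin reward (in satoshis) for the block at the given height."""
    halvings = height // HALVING_INTERVAL
    if halvings >= 64:
        return 0                      # the reward has long since shrunk to nothing
    return INITIAL_SUBSIDY >> halvings  # halve once per completed interval


for height in (0, 210_000, 420_000):
    print(height, block_subsidy(height) / SATOSHIS_PER_BTC, "BTC")
```

At one block every ten minutes or so, 210,000 blocks works out to roughly four years between halvings.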

2. Transactions are not real-time and take around an hour before they are considered certain

Prospective transactions are broadcast over a peer-to-peer network to the various miners, who each check them for validity before including them in the current block that’s being worked on. Since a new block comes along every ten minutes (on average), there may be a wait of up to ten minutes for a new transaction to appear in the blockchain, at which point the receiver of BTC can read it and will know that they are going to get some coins.

Except miners may not include your transaction in the next block because there were already too many transactions in it, or perhaps the miner that “won” the block that time decided not to include any transactions at all, so you will need to wait for the next block. And even then it appears that there is a risk that a Bitcoin sender could “double spend” the BTC if two conflicting transactions were sent to different miners, so it’s considered prudent to wait until six blocks have been generated (including the first one with the relevant transaction) to get transaction certainty.
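The arithmetic behind the “around an hour” figure is simple enough to sketch. The helper names and block heights below are hypothetical; the only inputs taken from the text are the six-block rule of thumb (counting the block containing the transaction) and the ten-minute average block time.

```python
def confirmations(tx_block_height: int, chain_tip_height: int) -> int:
    """Count blocks confirming a transaction, including the block that contains it."""
    if chain_tip_height < tx_block_height:
        return 0  # the transaction's block is not part of the chain yet
    return chain_tip_height - tx_block_height + 1


def expected_wait_minutes(required_confirmations: int = 6, block_interval: int = 10) -> int:
    """Average wait for the required confirmations, assuming steady block times."""
    return required_confirmations * block_interval


# Made-up heights: a transaction in block 300000 with the chain now at block 300005.
print(confirmations(300_000, 300_005))  # prints 6
print(expected_wait_minutes())          # prints 60
```

This is an average only; individual blocks can arrive much faster or slower than ten minutes apart.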

While this is fine for some types of transactions, such as a book order, it is not so fine for other types of transactions where goods are delivered immediately such as an app download or when at a Bitcoin ATM dispensing hard currency. Any solutions to this problem will sit outside of the standard Bitcoin infrastructure, e.g. merchant insurance, but in a world where transaction times are getting shorter and shorter, this may limit Bitcoin’s long term use in the general economy.

3. Bitcoins are not held in Bitcoin wallets

A Bitcoin wallet is technically just a public-private key pair (or multiple such pairs). This provides the means of generating a public address (from the public key, for others to send bitcoins to your wallet) and for generating new transactions (using the private key, when sending bitcoins to other people’s wallets). The bitcoins themselves are not held anywhere, but proof of ownership of them can be established from the records in the blockchain.

Given that everyone can see exactly how many bitcoins belong to every Bitcoin wallet, it’s considered good practice to use a different public address (and hence public-private key pair) for each transaction. A single transaction can take bitcoins from multiple wallets and send them out to multiple wallets, making this all a bit easier to manage.

4. Bitcoin transactions can be complex contracts

Since bitcoins themselves are not actually moved around and bitcoin balances are not kept within the Bitcoin infrastructure, each transaction sending some bitcoins refers to previous transactions where those bitcoins were “received”. At a minimum a single sending transaction needs to refer back to a single receiving transaction. As part of validating that this pair of transactions should be allowed, miners actually run a small script embedded within the sending transaction followed by another one embedded in the receiving transaction. The scripting language is pretty extensive.
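To make the "no balances, only references to earlier transactions" idea concrete, here is a toy sketch. It is an illustration only: the `Tx` structure and the owner names are invented, and real Bitcoin checks scripts and signatures rather than names.

```python
# Toy model only: real Bitcoin validates signatures and scripts, not names.
from collections import namedtuple

# Each transaction spends earlier outputs (inputs) and creates new outputs.
# inputs is a list of (txid, output index) references to earlier transactions.
Tx = namedtuple("Tx", ["txid", "inputs", "outputs"])

def unspent_outputs(transactions):
    """Return {(txid, index): (owner, amount)} for outputs not yet spent."""
    unspent = {}
    for tx in transactions:  # assumed to be in blockchain order
        for ref in tx.inputs:
            del unspent[ref]  # a KeyError here means a missing or double spend
        for index, (owner, amount) in enumerate(tx.outputs):
            unspent[(tx.txid, index)] = (owner, amount)
    return unspent

def balance(transactions, owner):
    """No balance is stored anywhere; it is summed from unspent outputs."""
    return sum(amount
               for who, amount in unspent_outputs(transactions).values()
               if who == owner)

chain = [
    Tx("t1", [], [("alice", 50)]),                        # newly mined coins
    Tx("t2", [("t1", 0)], [("bob", 20), ("alice", 30)]),  # pay bob, keep change
]
print(balance(chain, "alice"))  # 30
print(balance(chain, "bob"))    # 20
```

Trying to spend the output of `t1` a second time fails in this model because that output is no longer in the unspent set, which is roughly the check that miners perform.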

Also, because Bitcoin transactions are just a series of bytes, they can be sent directly to others (e.g. over email) instead of being broadcast to the miners, and this allows complex contracts to be created. You can use Bitcoin to pay someone, but only if a third party also approves the transaction. Or you can use Bitcoin to pay a deposit or bond, where the money comes back to you after an agreed period but the other party can’t spend it in the meantime. Or you can use Bitcoin to contribute towards a transaction that will go ahead only if enough other people contribute towards it for it to reach a specified sum. Some are using Bitcoin to run a provably-fair lottery. Some are even looking to use Bitcoin to allow for electronic voting.

Concluding remarks

Bitcoin is still relatively new for a payment technology, and I would not pretend that using it is risk-free. Regulation of Bitcoin is still nascent and inconsistent between geographies. It operates in a legally grey area, with perhaps half of all Bitcoin transactions being made with gambling services, and Bitcoin-based marketplaces seem to collapse regularly.

Even if Bitcoin itself is replaced by one of the other newer “cryptocurrencies” such as Litecoin, Ripple or Dogecoin, I suspect that its invention has opened the door for amazing new ways to transact online.

New Knowledge

I still have the book that tracked my infant development after I was born. It’s a little time-capsule of medical opinion from a different age.

At the three months mark, my mum was advised to provide me with fresh fruit juice. These days, the Australian government offers a different medical recommendation:

exclusive breastfeeding for 6 months is the optimal way of feeding infants.

Although, perhaps things will change again soon. The Australian Society of Clinical Immunology and Allergy publishes a fact sheet on infant feeding with a different view again:

Based on the currently available evidence, many experts across Europe, Australia and North America recommend introducing complementary solid foods from around 4-6 months.

Similarly, I was told by a colleague that when they were a parent some decades ago, the sleeping recommendation was to place babies on their stomach (I guess to minimise any inadvertent shaping of their heads while they are still soft?). However, the Australian government offers different guidance these days:

Put your baby to sleep on its back and use light cotton blankets.

Well, aside from being confusing when parents get differing advice from their own parents compared with the government, this is actually a good thing: when the medical facts change, the medical community changes its collective mind. And while there’s a good chance that the recommendations aren’t perfect and may change again, at least the new recommendations are better than the old recommendations.

Of course, this is old hat for anyone familiar with the scientific method.

Unfortunately, there doesn’t appear to be a similar evolution of knowledge when it comes to recommendations for best managing people. When I have taken courses designed to impart the best new thinking around management, a less useful approach is followed.

There is often an unwillingness to state that prior management recommendations are wrong and should be replaced with better ones. In fact, new techniques have typically been presented to me as new “tools” that can be added to my “toolkit”. This toolkit apparently can grow without limit, and it is largely up to my discretion as to when, where and to whom I should apply a given technique.

I grant that there is difficulty in running experiments needed to show that a particular technique is better than another, and dealing with people is a messier problem-space than dealing with germs or injuries. Also, sure, management science is a relatively new discipline. Still, it feels like a cop out.

I hope that one day, looking at today’s management courseware will seem as quaint as looking at my old baby book.

Pi, Python and I (part 2)

In my previous post, I talked about how I’m using a Raspberry Pi to run a Facebook backup service and provided the Python code needed to get (and maintain) a valid Facebook token to do this. This post discusses the actual Facebook backup service and the Python code to do that. It is my second Python program ever (the first was in the previous post), so there will likely be better ways to do what I’ve done, although you’ll see it’s still a pretty simple exercise. (I’m happy to hear about possible improvements.)

The first thing I need to do is pull in all the Python modules that will be useful. The extra packages should’ve been installed from before. Also, because the Facebook messages will be backed-up to Gmail using its IMAP interface, the Google credentials are specified here, too. Given that those credentials are likely to be something you want to keep secret at all costs, all the more reason to run this on a home server rather than on a publicly hosted server.

from facepy import GraphAPI
import urlparse
import dateutil.parser
from crontab import CronTab
import imaplib
import time

# How many status updates to go back in time (first time, and between runs)
MAX_ITEMS = 5000
# How many items to ask for in each request
REQUEST_ITEMS = 25 # An assumed value; adjust to suit
# Default recipient
DEFAULT_TO = "" # Replace with yours
# Suffix to turn Facebook message IDs into email IDs
ID_SUFFIX = "" # Replace with yours
# Gmail account
GMAIL_USER = "" # Replace with yours
# and its secret password
GMAIL_PASS = "S3CR3TC0D3" # Replace with yours
# Gmail folder to use (will be created if necessary)
GMAIL_FOLDER = "Facebook"

Before we get into the guts of the backup service, I first need to create a few basic functions to simplify the code that comes later. Initially, there’s a function that is used to make it easy to pull a value from the results of a call to the Facebook Graph API:

def lookupkey(the_list, the_key, the_default):
  try:
    return the_list[the_key]
  except KeyError:
    return the_default
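For what it's worth, Python dictionaries have a built-in `get` method that does the same job. The sketch below assumes the Graph API results behave like ordinary dictionaries; the sample data is made up.

```python
def lookupkey(the_list, the_key, the_default):
    # dict.get returns the default when the key is missing, no try/except needed
    return the_list.get(the_key, the_default)

post = {"id": "123", "type": "status"}  # made-up sample data
print(lookupkey(post, "type", ""))                 # status
print(lookupkey(post, "message", "(no message)"))  # (no message)
```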

Next a function to retrieve the Facebook username for a given Facebook user. Given that we want to back-up messages into Gmail, we have to make them look like email. So, each message will have to appear to come from a unique email address belonging to the relevant Facebook user. Since Facebook generally provides all their users with email addresses at the domain based on their usernames, I’ve used these. However, to make it a bit more efficient, I cache the usernames in a list so that I don’t have to query Facebook again when the same person appears in the feed multiple times.

def getusername(id, friendlist):
  uname = lookupkey(friendlist, id, '')
  if '' == uname:
    uname = lookupkey(graph.get(str(id)), 'username', id)
    friendlist[id] = uname # Add the entry to the dictionary for next time
  return uname
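The caching idea can be seen in isolation with a small sketch. Here `fake_fetch` is a stand-in I've invented for the real Graph API request, so that the number of lookups can be counted.

```python
def make_cached_lookup(fetch):
    """Wrap fetch(id) so that each id is only fetched once."""
    cache = {}
    def lookup(id):
        if id not in cache:
            cache[id] = fetch(id)  # only reached the first time an id is seen
        return cache[id]
    return lookup

calls = []
def fake_fetch(id):  # hypothetical stand-in for a Graph API request
    calls.append(id)
    return "user" + str(id)

getname = make_cached_lookup(fake_fetch)
getname(42)
getname(42)  # answered from the cache, no second fetch
getname(7)
print(calls)  # [42, 7]
```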

The email standards expect times and dates to appear in particular formats, so now a function to achieve this based on whatever date format Facebook gives us:

def getnormaldate(funnydate):
  dt = dateutil.parser.parse(funnydate)
  tz = long(dt.utcoffset().total_seconds()) / 60 # offset from UTC in minutes
  if 0 <= tz:
    sign = '+'
  else:
    sign = '-'
  tzHH = str(abs(tz) / 60).zfill(2)
  tzMM = str(abs(tz) % 60).zfill(2)
  # Email dates use the 24-hour clock, so %H rather than %I
  return dt.strftime("%a, %d %b %Y %H:%M:%S") + ' ' + sign + tzHH + tzMM
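As a possible alternative, Python 3's standard library can produce the email date format directly. This sketch is not the code from this post (which is Python 2), and it assumes the timestamps look like "2013-05-01T10:00:00+0000".

```python
from datetime import datetime
from email.utils import format_datetime

def getnormaldate(funnydate):
    # Assumes timestamps shaped like "2013-05-01T10:00:00+0000"
    dt = datetime.strptime(funnydate, "%Y-%m-%dT%H:%M:%S%z")
    return format_datetime(dt)  # e.g. "Wed, 01 May 2013 10:00:00 +0000"

print(getnormaldate("2013-05-01T10:00:00+0000"))
print(getnormaldate("2013-05-01T10:00:00-0530"))
```

The second call shows a negative, non-whole-hour offset, which is the case most likely to trip up hand-rolled formatting.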

Next, a function to find the relevant bit of a URL to help travel back and forth in the Facebook feed. Given that the feed is returned to us from the Graph API in small chunks, we need to know how to query the next or previous chunk in order to get it all. Facebook uses a URL format to give us this information, but I want to unpack it to allow for more targeted navigation.

def getpagingpart(urlstring, part):
  url = urlparse.urlsplit(urlstring)
  qs = urlparse.parse_qs(url.query)
  return qs[part][0]
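To see what this does, here is the same function written for Python 3 (where `urlparse` has become `urllib.parse`) and run against a made-up paging link. The URL is hypothetical and only shaped like the ones described above.

```python
from urllib.parse import urlsplit, parse_qs

def getpagingpart(urlstring, part):
    # parse_qs maps each query parameter to a list of values
    query = parse_qs(urlsplit(urlstring).query)
    return query[part][0]

# Hypothetical paging link, for illustration only
sample = "https://graph.example.com/me/home?limit=25&until=1367400000"
print(getpagingpart(sample, "until"))  # 1367400000
print(getpagingpart(sample, "limit"))  # 25
```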

Now a function to construct the headers and body of the email from a range of information gleaned from processing the Facebook Graph API results.

def message2str(fromname, fromaddr, toname, toaddr, date, subj1, subj2, msgid, msg1, msg2, inreplyto=''):
  if '' == inreplyto:
    header = ''
  else:
    header = 'In-Reply-To: <' + inreplyto + '>\n'
  # Timestamp for the leading "From " line, expressed in UTC
  utcdate = dateutil.parser.parse(date).astimezone(dateutil.tz.tzutc()).strftime("%a %b %d %H:%M:%S %Y")
  # The HTML wrapper around the two message parts is a minimal assumption
  return "From nobody {}\nFrom: {} <{}>\nTo: {} <{}>\nDate: {}\nSubject: {} - {}\nMessage-ID: <{}>\n{}Content-Type: text/html\n\n<html><body>{}{}</body></html>\n".format(utcdate, fromname, fromaddr, toname, toaddr, date, subj1, subj2, msgid, header, msg1, msg2)

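A message like this can also be assembled with Python's standard `email` package rather than by formatting a string by hand. The sketch below is for Python 3 and is only an illustration; `build_message` and the example addresses are invented here.

```python
from email.message import EmailMessage
from email.utils import formataddr

def build_message(fromname, fromaddr, toname, toaddr, date, subject,
                  msgid, html, inreplyto=""):
    msg = EmailMessage()
    msg["From"] = formataddr((fromname, fromaddr))
    msg["To"] = formataddr((toname, toaddr))
    msg["Date"] = date
    msg["Subject"] = subject
    msg["Message-ID"] = "<" + msgid + ">"
    if inreplyto:  # only replies (comments) carry this header
        msg["In-Reply-To"] = "<" + inreplyto + ">"
    msg.set_content(html, subtype="html")  # sets Content-Type: text/html
    return msg

example = build_message("Alice", "alice@example.com", "Bob", "bob@example.com",
                        "Wed, 01 May 2013 10:00:00 +0000", "status - 123",
                        "123@example.com", "<p>Hello</p>")
print(example.as_string())
```

The library takes care of header encoding, which also sidesteps some of the character problems that come up with non-ASCII names and text.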
Okay, now we've gotten all that out of the way, here's the main function to process a message obtained from the Graph API and place it in an IMAP message folder. The Facebook message is in the form of a dictionary, so we can look up the relevant parts by using keys. In particular, any comments to a message will appear in the same format, so we recurse over those as well using the same function.

Note that in a couple of places I call encode("ascii", "ignore"). This is an ugly hack that strips out all of the Unicode goodness that was in the original Facebook message (which allows foreign language characters and symbols), dropping anything exotic to leave plain ASCII characters behind. However, for some reason, the Python installation on my Raspberry Pi would crash the program whenever it came across unusual characters. To keep everything running smoothly, I make sure that these aren't present when the text is processed later.
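Here's a quick illustration of what that call does to a string. The sample text is made up, and the "replace" variant is shown only for comparison.

```python
text = u"Caf\u00e9 \u2615 time"  # "Café ☕ time"

# "ignore" silently drops anything that is not plain ASCII
print(text.encode("ascii", "ignore"))   # the é and the cup symbol vanish

# "replace" keeps a visible marker where a character was lost
print(text.encode("ascii", "replace"))
```

Note that the accented letter is removed outright rather than turned into a plain "e", so "Café" becomes "Caf".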

def printdata(data, friendlist, replytoid='', replytosub='', max=MAX_ITEMS, conn=None):
  c = 0
  for d in data:
    id = lookupkey(d, 'id', '') # get the id of the post
    msgid = id + ID_SUFFIX
    try: # get the name (and id) of the friend who posted it
      f = d['from']
      n = f['name'].encode("ascii", "ignore")
      fid = f['id']
      uname = getusername(fid, friendlist) + ""
    except KeyError:
      n = ''
      fid = ''
      uname = ''
    try: # get the recipient (eg. if a wall post)
      dest = d['to']
      destn = dest['name']
      destid = dest['id']
      destname = getusername(destid, friendlist) + ""
    except KeyError:
      destn = ''
      destid = ''
      destname = DEFAULT_TO
    t = lookupkey(d, 'type', '') # get the type of this post
    try:
      st = d['status_type']
      t += " " + st
    except KeyError:
      pass
    try: # get the message they posted
      msg = d['message'].encode("ascii", "ignore")
    except KeyError:
      msg = ''
    try: # there may also be a description
      desc = d['description'].encode("ascii", "ignore")
      if '' == msg:
        msg = desc
      else:
        msg = msg + "<br />\n" + desc
    except KeyError:
      pass
    try: # get an associated image
      img = d['picture']
      msg = msg + '<br />\n<img src="' + img + '" />'
    except KeyError:
      img = ''
    try: # get link details if they exist
      ln = d['link']
      ln = '<br />\n<a href="' + ln + '">link</a>'
    except KeyError:
      ln = ''
    try: # get the date
      date = d['created_time']
      date = getnormaldate(date)
    except KeyError:
      date = ''
    if '' == msg:
      continue
    if '' == replytoid:
      email = message2str(n, uname, destn, destname, date, t, id, msgid, msg, ln)
    else:
      email = message2str(n, uname, destn, destname, date, 'Re: ' + replytosub, replytoid, msgid, msg, ln, replytoid + ID_SUFFIX)
    if conn:
      conn.append(GMAIL_FOLDER, "", time.time(), email)
    else:
      print email
      print "----------"
    try: # process comments if there are any
      comments = d['comments']
      commentdata = comments['data']
      printdata(commentdata, friendlist, replytoid=id, replytosub=t, conn=conn)
    except KeyError:
      pass
    c += 1
    if c == max:
      break
  return c

The last bit of the program uses these functions to perform the backup and to set up a cron job to run the program again every hour. Here's how it works.

First, I grab the Facebook Graph API token that the program from the previous post provided, and initialise the module that will be used to query it.

# Initialise the Graph API with a valid access token
try:
  with open("fbtoken.txt", "r") as f:
    oauth_access_token = f.readline().strip()
except IOError:
  print 'Run first'
  exit(1) # nothing can be done without a token

# See
graph = GraphAPI(oauth_access_token)

Next, I set up the connection to Gmail that will be used to store the messages using the credentials from before.

# Setup mail connection
mailconnection = imaplib.IMAP4_SSL('')
mailconnection.login(GMAIL_USER, GMAIL_PASS)

Now we just need to initialise some things that will be used in the main loop: the cache of the Facebook usernames, the count of the number of status updates to read, and the timestamp that marks the point in time to begin reading status from. This last one is to ensure that we don't keep uploading the same messages again and again, and the timestamp is kept in the file fbtimestamp.txt.

friendlist = {}

countdown = MAX_ITEMS
try:
  with open("fbtimestamp.txt", "r") as f:
    since = '&since=' + f.readline().strip()
except IOError:
  since = '' # no timestamp yet, so this must be the first run

Now we do the actual work, reading the status feed and processing the items in it:

stream = graph.get('me/home?limit=' + str(REQUEST_ITEMS) + since)
newsince = ''
while stream and 0 < countdown:
  streamdata = stream['data']
  numitems = printdata(streamdata, friendlist, max=countdown, conn=mailconnection)
  if 0 == numitems:
    break # nothing new in this chunk, so we're done
  countdown -= numitems
  try: # get the link to ask for next (going back in time another step)
    p = stream['paging']
    next = p['next']
    if '' == newsince:
      try:
        prev = p['previous']
        newsince = getpagingpart(prev, 'since')
      except KeyError:
        pass
  except KeyError:
    break # no further chunks to ask for
  until = '&until=' + getpagingpart(next, 'until')
  stream = graph.get('me/home?limit=' + str(REQUEST_ITEMS) + since + until)
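The shape of that loop (ask for a chunk, process it, follow the "next" link, stop when out of budget or out of chunks) can be tried without Facebook at all. In this sketch the pages are made-up data and `fetch` is a hypothetical stand-in for the Graph API call.

```python
# Fake pages standing in for Graph API responses (made-up data)
PAGES = {
    None: {"data": [1, 2, 3], "paging": {"next": "p2"}},
    "p2": {"data": [4, 5, 6], "paging": {"next": "p3"}},
    "p3": {"data": [7], "paging": {}},
}

def fetch(cursor=None):  # hypothetical stand-in for graph.get(...)
    return PAGES[cursor]

def read_feed(max_items):
    items = []
    page = fetch()
    while page and len(items) < max_items:
        remaining = max_items - len(items)
        batch = page["data"][:remaining]
        if not batch:
            break  # an empty chunk means we have caught up
        items.extend(batch)
        cursor = page.get("paging", {}).get("next")
        if cursor is None:
            break  # no older chunk to ask for
        page = fetch(cursor)
    return items

print(read_feed(5))    # [1, 2, 3, 4, 5]
print(read_feed(100))  # [1, 2, 3, 4, 5, 6, 7]
```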

Now we clean things up: record the new timestamp and close the connection to Gmail.

if '' != newsince:
  with open("fbtimestamp.txt", "w") as f:
    f.write(newsince) # Record the new timestamp for next time

mailconnection.logout() # Close the connection to Gmail

Finally, we set up a cron job to keep the status updates flowing. As you can probably guess from this code snippet, this all is meant to be saved in a file called

cron = CronTab() # get crontab for the current user
if not list(cron.find_comment("exportfbfeed")): # only add the job once
  job = cron.new(command="python ~/", comment="exportfbfeed")
  job.minute.on(0) # run this script @hourly, on the hour
  cron.write() # save the updated crontab

Alright. Well, that was a little longer than I thought it would be. However, the bit that does the actual work is not very big. (No sniggering, people. This is a family show.)

It's been interesting to see how stable the Raspberry Pi has been. While it wasn't designed to be a home server, it's been running fine for me for weeks.

There was an additional benefit to this backup service that I hadn't expected. Since all my email and Facebook messages are now in the one place, I can easily search the lot of them from a single query. In fact, the Facebook search feature isn't very extensive, so it's great that I can now do Google searches to look for things people have sent me via Facebook. It's been a pretty successful project for me and I'm glad I got the chance to play with a Raspberry Pi.

For those that want the original source code files, rather than cut-and-pasting from this blog, you can download them here:

If you end up using this for something, let me know!

Pi, Python and I (part 1)

I’ve been on Facebook for almost six years now, and active for almost five. This is a long time in Internet time.

Facebook has, captured within it, the majority of my interactions with my friends. Many of them have stopped blogging and just share via Facebook, now. (Although, at least two have started blogging actively in the last year or so, and perhaps all is not lost.) At the start, I wasn’t completely convinced it would still be around – these things tended to grow and then fade within just a few years. So, I wasn’t too concerned about all the *stuff* that Facebook would accumulate and control. I don’t expect them to do anything nefarious with it, but I don’t expect them to look after it, either.

However, I’ve had a slowly building sense that I should do something about it. What if Facebook glitched, and accidentally deleted everything? There’s nothing essential in there, but there are plenty of memories I’d like to preserve. I really wanted my own backup of my interactions with my friends, in the same way I have my own copies of emails that I’ve exchanged with people over the years. (Although, fewer people seem to email these days, and again they just share via Facebook.)

The trigger to finally do something about this was when every geek I knew seemed to have got themselves a Raspberry Pi. I tried to think of an excuse to get one myself, and didn’t have to think too hard. I could finally sort out this Facebook backup issue.

Part of the terms of my web host are that I can’t run any “robots” – it’s purely meant to handle incoming web requests. Also, none of the computers at home are on all the time, as we only have tablets, laptops and phones. I didn’t have a server that I could run backup software on, but a Raspberry Pi could be that server.

For those who came in late, the Raspberry Pi is a tiny, single-board computer that came out last year, is designed and built in the UK, and (above all) is really, really cheap. I ordered mine from the local distributor, Element14, whose prices start at just under $30 for the Model A. To make it work, you need to at least provide a micro-USB power supply ($5 if you just want to plug it into your car, but more like $20 if you want to plug it into the wall) and a Micro SD card ($5-$10) to provide the disk, so it’s close to $60, unless you already have those to hand. You can get the Model B, which is about $12 more and gets you both more memory and an Ethernet port, which is what I did. You’ll need to find an Ethernet cable as well, in that case ($4).

When a computer comes that cheap, you can afford to get one for projects that would otherwise be too expensive to justify. You can give them to kids to tinker with and there’s no huge financial loss if they brick them. Also, while cheap, they can do decent graphics through an HDMI port, and have been compared to a Microsoft Xbox. No wonder they managed to sell a million units in their first year. Really, I’m a bit slow on the uptake with the Raspberry Pi, but I got there in the end.

While you can run other operating systems on it, if you get a pre-configured SD card, it comes with a form of Linux called Raspbian and has a programming language called Python set up ready to go. Hence, I figured as well as getting my Facebook backup going, I could use this as an excuse to teach myself Python. I’d looked at it briefly a few years back, but this would be the first time I’d used it in anger. I’ll document here the steps I went through to implement my project, in case anyone else wants to do something similar or just wants to learn from this (if only to learn how simple it is).

The first thing to do is to head over to Facebook’s developer site and create a new “App” that will have the permissions that I’ll use to read my Facebook feed. Once I logged in, I chose “Apps” from the toolbar at the top and clicked on “Create New App”. I gave my app a cool name (like “Awesome Backup Thing”) and clicked on “Continue”, passed the security check to keep out robots, and the app was created. The App ID and App secret are important and should be recorded somewhere for later.

Now I just needed to give it the right permissions. Under the Settings menu, I clicked on “Permissions”, then added in the ones needed into the relevant fields. For what I want, I needed: user_about_me, user_status, friends_about_me, friends_status, and read_stream. “Save Changes” and this step is done. Actually, I’m not sure if this is technically needed, given the next step.

Now I needed to get a token that can be used by the software on the server to query Facebook from time to time. The easiest way is to go to the Graph API Explorer, accessible under the “Tools” menu in the toolbar.

I changed the Application specified in the top right corner to Awesome Backup Thing (insert your name here), then clicked on “Get access token”. Now I need to specify the same permissions as before, across the three tabs of User Data Permissions (user_about_me, user_status), Friends Data Permissions (friends_about_me, friends_status) and Extended Permissions (read_stream). Lastly, I clicked on “Get Access Token”, clicked “OK” to the Facebook confirmation page that appeared, and returned to the Graph API explorer where there was a new token waiting for me in the “Access token” textbox. It’ll be needed later, but it’s valid for about two hours. If you need to generate another one, just click “Get access token” again.

Now it’s time to return to the Pi. Once I logged in, I needed to set up some additional Python packages like this:

$ sudo pip install facepy
$ sudo pip install python-dateutil
$ sudo pip install python-crontab

And then I was ready to write some code. The first thing was to write the code that will keep my access token valid. The one that Facebook provides via the Graph API Explorer expires too quickly and can’t be renewed, so it needs to be turned into a renewable access token with a longer life. This new token then needs to be recorded somewhere so that we can use it for the backing-up. Luckily, this is pretty easy to do with those Python packages. The code looks like this (you’ll need to put in the App ID, App Secret, and Access Token that Facebook gave you):

# Write a long-lived Facebook token to a file and setup cron job to maintain it
import facepy
from crontab import CronTab
import datetime

APP_ID = '1234567890' # Replace with yours
APP_SECRET = 'abcdef123456' # Replace with yours

try:
  with open("fbtoken.txt", "r") as f:
    old_token = f.readline().strip()
except IOError:
  old_token = ''
if '' == old_token:
  # Need to get old_token from
  old_token = 'FooBarBaz' # Replace with yours

new_token, expires_on = facepy.utils.get_extended_access_token(old_token, APP_ID, APP_SECRET)

with open("fbtoken.txt", "w") as f:
  f.write(new_token) # Record the long-lived token for the backup program

cron = CronTab() # get crontab for the current user
for oldjob in cron.find_comment("fbtokenrenew"):
  cron.remove(oldjob) # drop the previous renewal job
job = cron.new(command="python ~/", comment="fbtokenrenew")
renew_date = expires_on - datetime.timedelta(1)
job.minute.on(0)
job.hour.on(1) # 1:00am
job.dom.on(renew_date.day) # on the day before it's meant to expire
job.month.on(renew_date.month)
cron.write() # save the updated crontab
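The renewal date arithmetic is easy to check on its own. This sketch works out the five cron fields (minute, hour, day, month, weekday) for 1:00am on the day before a given expiry date; `renewal_schedule` is a name made up for the example, and the expiry date is invented.

```python
import datetime

def renewal_schedule(expires_on):
    """Cron fields for 1:00am on the day before the token expires."""
    renew_date = expires_on - datetime.timedelta(days=1)
    return "0 1 {} {} *".format(renew_date.day, renew_date.month)

# An expiry on 1 July means renewing on 30 June
print(renewal_schedule(datetime.datetime(2013, 7, 1, 12, 0)))  # 0 1 30 6 *
```

Subtracting a `timedelta` handles the month boundary for us, which is why the day and the month both need to come from `renew_date` rather than from the expiry date itself.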

Apologies for the pretty rudimentary Python coding, but it was my first program. The only other things to explain are that the program sits in the home directory as the file “” and when it runs, it writes the long-lived token to “fbtoken.txt” then sets up a cron-job to refresh the token before it expires, by running itself again.

I’ll finish off the rest of the code in the next post.

Technology, Finance and Education


I have been trying out iTunes U by doing the Open Yale subject ECON252 Financial Markets. What attracted me to the subject was that the lecturer was Robert Shiller, one of the people responsible for the main residential property index in the US and an innovator in that area. Also, it was free. :)

I was interested in seeing what the iTunes U learning experience was like, and I was encouraged by what I found. While it was free, given the amount of enjoyment I got out of doing the subject, I think I’d happily have paid around the cost of a paperback book for it. I could see video recordings of all the lectures, or alternatively, read transcripts of them, plus access reading lists and assessment tasks.

The experience wasn’t exactly what you’d get if you sat the subject as a real student at Yale. Aside from the general campus experience, also missing were the tutorial sessions, professional grading of the assessments (available as self-assessment in iTunes U), an ability to borrow set texts from the library, and an official statement of grading and completion at the end. Also, the material dated from April 2011, so wasn’t as current as if I’d been doing the real subject today.

Of these, the only thing I really missed was access to the texts. I suppose I could’ve bought my own copies, but given I was trying this because it was free, I wasn’t really inclined to. Also, for this subject, the main text (priced at over $180) was actually a complementary learning experience with seemingly little overlap with the lectures.

While I tried both the video and transcript forms of the lectures, and while the video recordings were professionally done, in the end I greatly preferred the transcripts. The transcripts didn’t capture blackboard writing/diagrams well, and I sometimes went back and watched the videos to see them, but the lecturer had checked over the transcripts and they had additions and corrections in them that went beyond what was in the video. Also, I could get through a 1hr lecture in a lot less than an hour if I was reading the transcript.

Putting aside the form of delivery, the content of the subject turned out to be much more interesting than I expected at the beginning. Shiller provided a social context for developments in finance through history, explained the relationships between the major American financial organisations, and provided persuasive arguments for the civilising force of financial innovations (e.g. for resource allocation, risk management and incentive creation), positioning finance as an engineering discipline rather than (say) a tool for clever individuals to make buckets of cash under sometimes somewhat dubious circumstances. I’ll never think of tax or financial markets or insurance in quite the same way again.

I will quote a chunk from one of his lectures (Lecture 22) that illustrates his approach, but also talks about how technology changes resulted in the creation of government pension schemes. I like the idea that technology shifts have resulted in the creation of many things that we wouldn’t ordinarily associate with “technology”. By copying his words in here, I’ll be able to find them more easily in the future (since this is a theme I’d like to pick up again).

In any case, while I didn’t find the iTunes U technology to be a good alternative for university education, I think it’s a good alternative to reading a typical e-book on the subject. Of course, both e-books and online education will continue to evolve, and maybe there won’t be a clear distinction in the future. But for now, it’s an enjoyable way to access some non-fiction material in areas of interest.

The German government set up a plan, whereby people would contribute over their working lives to a social security system, and the system would then years later, 30, 40 years later, keep a tab, about how much they’ve contributed, and then pay them a pension for the rest of their lives. So, the Times wondered aloud, are they going to mess this up? They’ve got to keep records for 40 years. They were talking about the government keeping records, and they thought, nobody can really manage to do this, and that it will collapse in ruin. But it didn’t. The Germans managed to do this in the 1880s for the first time, and actually it was an idea that was copied all over the world.

So, why is it that Germany was able to do something like this in the 1880s, when it was not doable anywhere else? It had never been done until that time. I think this has to do ultimately with technology. Technology, particularly information technology, was advancing rapidly in the 19th century. Not as rapidly as in the 20th, but rapidly advancing.

So, what happened in Europe that made it possible to institute these radical new ideas? I just give a list of some things.

Paper. This is information technology, but you don’t think – in the 18th century, paper, ordinary paper was very expensive, because it was made from cloth in those days. They didn’t know how to make paper from wood, and it had to be hand-made. As a result, if you bought a newspaper in, say, 1790, it would be just one page, and it would be printed on the smallest print, because it was just so expensive. It would cost you like $20 in today’s prices to buy one newspaper. Then, they invented the paper machine that made it mechanically, and they made it out of wood pulp, and suddenly the cost of paper went down. …

There was a fundamental economic difference, and so, paper was one of the things.

And you never got a receipt for anything, when you bought something. You go to the store and buy something, you think you get a receipt? Absolutely not, because it’s too – well, they wouldn’t know why, but that’s the ultimate reason – too expensive. And so, they invented paper.

Two, carbon paper. Do you people even know what this is? Anyone here heard of carbon paper? Maybe, I don’t know. It used to be, that, when you wanted to make a copy of something, you didn’t have any copying machines. You would buy this special paper, which was – do you know what – do I have to explain this to you? You know what carbon paper is? You put it between two sheets of paper, and you write on the upper one, and it comes through on the lower one.

This was never invented until the 19th century. Nobody had carbon paper. You couldn’t make copies of anything. There was no way to make a copy. They hadn’t invented photography, yet. They had no way to make a copy. You had to just hand-copy everything. The first copying machine – maybe I mentioned that – didn’t come until the 20th century, and they were photographic.

And the typewriter. That was invented in the 1870s. Now, it may seem like a small thing, but it was a very important thing, because you could make accurate documents, and they were not subject to misinterpretation because of sloppy handwriting. … And you could also make many copies. You could make six copies at once with carbon paper. And they’re all exactly the same. You can file each one in a different filing cabinet.

Four, standardized forms. These were forms that had fill-in-the-blank with a typewriter.

They had filing cabinets.

And finally, bureaucracy developed. They had management school. Particularly in Germany, it was famous for its management schools and its business schools.

Oh, I should add, also, postal service. If you wanted to mail a letter in 1790, you’d have trouble, and it would cost you a lot. Most people in 1790 got maybe one letter a year, or two letters a year. That was it. But in the 19th century, they started setting up post offices all over the world, and the Germans were particularly good at this kind of bureaucratic thing. So, there were post offices in every town, and the social security system operated through the post offices. Because once you have post offices in every town, you would go to make your payments on social security at the post office, and they would give you stamps, and you’d paste them on a card, and that’s how you could show that you had paid.

– Robert Shiller, ECON252 Financial Markets, 2011