Impress Your CTO (1)

There are plenty of developers who will happily build the functionality described in a set of requirements.

That is fine as far as it goes. We need requirements and we need to deliver working software to satisfy requirements. But it takes a LOT more than just this to build decent software products. Not only are there myriad ways to satisfy a set of requirements but, even more exciting, in today’s agile (small a) world developers have a great opportunity to co-create products with their product owners. And, in the process, to become great developers, not just good developers.

So I want to share some personal views on what you should be thinking of to become a great developer. I’m taking for granted that you know your framework, you know how to use Stack Overflow and that you write good unit tests. I’m going to be talking about the other stuff. The stuff that, in my view, distinguishes great developers from, well, just developers. The stuff that, when I see it in my team, helps me sleep more soundly, knowing that the codebase is in good hands.

Here’s the first one.

Learn your database

Your framework is great. It has a lot of really cool abstractions so you can program away without having to (amongst others) handle all the tedious details of writing SQL [1].

Time for a story.

A product owner was launching a new product and wanted to give the impression that it had been around for a long while. A reasonable request: I mean, who wants to know they are being billed with Invoice #1? The solution was to use a higher number, so the first invoice would be #225678 rather than #1, for example.

This is trivial to do with any RDBMS I’ve seen: you just set the auto-increment to the appropriate number, rather than start counting from 1.

Not in this case. The developer didn’t want to get involved in proprietary database commands. He decided to use a parameter, let’s call it INVOICE_OFFSET, to adjust what someone would see on the UI. Sounds simple enough – set INVOICE_OFFSET to 225677 and every time you show an invoice number on the UI or in an email add INVOICE_OFFSET to the database id. Likewise, subtract before you do any CRUD operations on the database.

Can you see where this is going?

It was fine while there was only one developer involved in the product. It soon turned into a bit of a mess when new developers came on board and weren’t expecting this INVOICE_OFFSET idea. For example in a REST situation, would you use the database ID in the URL? Or use the ID that the user would recognise? And then when you were interpreting the ID in the URL, is it a database ID or an adjusted ID?

True story [2].
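
For comparison, here is a minimal sketch of the "let the database do it" approach from the story. It uses SQLite purely because it ships with Python; the table name, column names and customer names are made up for illustration, and other databases have their own commands for setting where an auto-increment starts.

```python
import sqlite3

# Hypothetical starting number, taken from the story above.
FIRST_INVOICE_NUMBER = 225678


def create_invoice_table(conn, first_number=FIRST_INVOICE_NUMBER):
    """Create an invoices table whose ids start at first_number."""
    conn.execute(
        "CREATE TABLE invoices ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "customer TEXT NOT NULL)"
    )
    # SQLite tracks the last id used by each AUTOINCREMENT table in
    # sqlite_sequence; seeding it makes the next insert use first_number.
    conn.execute(
        "INSERT INTO sqlite_sequence (name, seq) VALUES ('invoices', ?)",
        (first_number - 1,),
    )


def add_invoice(conn, customer):
    """Insert an invoice and return the id the database assigned."""
    cursor = conn.execute(
        "INSERT INTO invoices (customer) VALUES (?)", (customer,)
    )
    return cursor.lastrowid


conn = sqlite3.connect(":memory:")
create_invoice_table(conn)
first_id = add_invoice(conn, "Acme Ltd")
second_id = add_invoice(conn, "Globex")
print(first_id, second_id)
```

With this in place the id in the database, the id in the URL and the id the customer sees are all the same number, so there is no offset to add or subtract anywhere.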

I fully accept that writing SQL by hand is horrible and so I am very grateful for ORMs for abstracting that away. But too many developers have gone too far the other way and treat the database as a black box. Or they go only as far as learning about indexes.

Databases are pretty awesome pieces of software. See what else they can do to help you build better applications.

[1] Hell, you could even write an application that you can then easily port from one database to another should you want to. Though if there ever was a case of YAGNI, having an app that you can port between databases should be a top contender.
[2] To be fair, I don’t think there was ever a production issue where one person could inadvertently see someone else’s invoices, but it was certainly a PITA during QA time.

I Wrote a Software: My Tech Stack

I’ve been writing recently about a side-project of mine that I’ve been doing to scratch a couple of itches I have: Using Excel as a Collaboration Tool in an Enterprise Setting and Seeing what all the fuss is about Haskell.

I started off, reasonably enough, by trying to write a web app. But getting started with anything in Haskell is hard enough: getting started with a website was a nightmare [1]. So I made a virtue of necessity. I figured if I’m going to make it easier for people to collaborate with Excel then what is the most natural way for people to collaborate with Excel in real life? The answer is simple: by sending email attachments.

So I forgot about any web interface (for now) and just concentrated on working with Excel Attachments on Emails. It meant I could just focus on learning Haskell and also forced me to work in a way that is going to be more natural for anyone who might use the service. And also it turns out that I’m bang on trend with this whole “No UI” movement [2].

In this post I’m going to get into what I’m using on this project and why:

  1. Ubuntu. I run this on the server simply because I’m used to using Ubuntu on the desktop. I’ve run Ubuntu in production for a long time and it’s fine for me. (I’ve also run CentOS and Windows and, going back far enough, HP-UX and Solaris, but Ubuntu just seems… nice).
  2. Linode. Not all that much thought on this one. I’ve used AWS in the past and find the interface unbearably complex. I’ve used Rackspace Cloud with a lot of success. But I’ve also been curious as to how Linode and Digital Ocean stack up. I asked Twitter and I got an answer from someone in the Linode community. And so here I am.
  3. Postfix/Dovecot. I never imagined running my own mail server. I originally wanted to run all the email through Gmail. But I kept getting blocked for having a script running against the Gmail servers. In the end I figured I may as well run my own. Remains to be seen how well that works out. And (perhaps coincidentally) the best guide I could find to setting up Postfix/Dovecot was from Linode.
  4. MongoDB. I know that Mongo is no longer the cool kid on the block. But given that my use case for data storage is pretty straightforward, and that I expect to need to store large Excel files, Mongo seemed neat. I’m using Mongo as a queueing system as much as a persistent data store.
  5. Haskell. For the heavy lifting. Learning it has been a long, hard slog. I must have read Learn You a Haskell 15 times (I even bought a copy). And also Real World Haskell several times. But you only really learn by doing and for me the best doing came from trying to follow Write Yourself a Scheme and then by trying to figure out how the HaExcel guys did it. Hat tips must also go to HaskellLive for getting an environment set up (though this now seems to be superseded by Stephen Diehl’s excellent write-up).
  6. Plain vanilla Ruby. Because with the wealth of gems in the community you can do a lot very easily.

[1] My brother who also works in tech bought me a copy of Building Web Applications with Haskell and Yesod. He thought that Haskell and Yesod were two individuals who were doing the teaching. This is in stark contrast to Ruby which I learnt from Alan Bradburne’s excellent Practical Rails Social Networking Sites.

[2] A cynic may say that mailing lists have been doing this for donkey’s years. To which my reply is that surely this validates the use of email for sending instructions around.

I Wrote a Software Part III: A Solution

In parts 1 and 2 I outlined the challenges I see on a daily basis with people trying to collaborate with Excel and how I’ve written some software with a view to improving things a bit.

Here is what I have working right now. You send an email with two Excel files attached that people have been working on and you get a result emailed back to you that could be:

  • The differences between the files
  • The two files merged together
  • The two files stitched together

It works with XLSX or XLSM files.

Diffs

If you send two files to “diffs” it will send you a file showing the differences between the two Excels.
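
To make the idea concrete, here is a rough sketch of a cell-by-cell diff. It is not the service's actual code: it assumes each sheet has already been read into a plain dictionary mapping a cell reference to its value, and the function name diff_sheets is made up for illustration.

```python
def diff_sheets(first, second):
    """Return {cell: (first_value, second_value)} for every cell that differs.

    A cell that is missing from one sheet is reported with None on that side.
    """
    differences = {}
    for cell in sorted(set(first) | set(second)):
        old, new = first.get(cell), second.get(cell)
        if old != new:
            differences[cell] = (old, new)
    return differences


# Two hypothetical versions of the same small sheet.
first = {"A1": "Question", "B1": "Answer", "A2": "Budget?", "B2": "10k"}
second = {"A1": "Question", "B1": "Answer", "A2": "Budget?", "B2": "12k", "B3": "TBC"}

changes = diff_sheets(first, second)
for cell, (old, new) in changes.items():
    print(cell, old, "->", new)
```

Here B2 is reported because its value changed and B3 because it only exists in the second sheet.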

Merges

If you send two files to “merges” then it will combine the cell values from the second spreadsheet into the first spreadsheet. This is useful where you have, for example, one Excel file with loads of questions in it and different people answering different questions within that one file. You can use this tool to merge together all their changes.

  • In the body of the email you should include one line that says Concat or Replace. Concat will concatenate (join together) the values in two differing cells. Replace will replace the value in the first sheet with any different values from the second sheet. (Any values that are present only in the first sheet will be untouched).
  • In the body of the email you can also include one line with a cell reference in e.g. Sheet1!A6. If you include this then the style of that cell will be used to colour code any changes that were made.
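
The two merge modes above can be sketched roughly as follows. This is an illustration rather than the real implementation: sheets are again modelled as dictionaries of cell reference to value, and the function name, the space used as the Concat separator and the rule that empty cells in the second sheet never overwrite anything are all assumptions of mine.

```python
def merge_sheets(first, second, mode="Replace", separator=" "):
    """Merge the cell values of second into first.

    Returns the merged sheet and the list of cells that changed, which is
    the set of cells a real tool would colour code.
    """
    if mode not in ("Concat", "Replace"):
        raise ValueError("mode must be 'Concat' or 'Replace'")
    merged = dict(first)  # values only present in first stay untouched
    changed = []
    for cell, new in second.items():
        old = first.get(cell)
        # Assumption: blank cells in the second sheet never overwrite.
        if new is None or new == "" or new == old:
            continue
        if mode == "Concat" and old not in (None, ""):
            merged[cell] = f"{old}{separator}{new}"
        else:
            merged[cell] = new
        changed.append(cell)
    return merged, changed


# Hypothetical question sheet and one person's answers.
first = {"A1": "Who owns this?", "B1": "", "A2": "Deadline?", "B2": "June"}
second = {"A1": "Who owns this?", "B1": "Alan", "A2": "Deadline?", "B2": "July"}

replaced, changed_cells = merge_sheets(first, second, "Replace")
concatenated, _ = merge_sheets(first, second, "Concat")
print(replaced["B2"], "|", concatenated["B2"], "|", changed_cells)
```

Returning the list of changed cells alongside the merged sheet keeps the colour coding separate from the merge itself.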

Stitches

If you send to “stitches” then the system will take new rows from the second sheet and stitch them to the bottom of the first sheet. This is useful if you have a spreadsheet with a different company on each row and you have different people contacting different companies for updates. Any information in additional columns will get stitched into a sensible place.
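
A stitch could look something like the sketch below. The post does not say how a "new" row is recognised or what a "sensible place" for extra columns is, so this assumes that the first row is a header, that the first column identifies a row, and that columns are matched by header name with any new ones added on the right. All names here are hypothetical.

```python
def stitch_sheets(first, second):
    """Append rows from second whose key (first column) is not in first."""
    header = list(first[0])
    # Columns that exist only in the second sheet are added on the right.
    for name in second[0]:
        if name not in header:
            header.append(name)

    def realign(row, source_header):
        # Place each value under the matching column of the combined header.
        by_name = dict(zip(source_header, row))
        return [by_name.get(name, "") for name in header]

    stitched = [header] + [realign(row, first[0]) for row in first[1:]]
    seen = {row[0] for row in first[1:]}
    for row in second[1:]:
        if row[0] not in seen:
            stitched.append(realign(row, second[0]))
            seen.add(row[0])
    return stitched


# Hypothetical contact sheets kept by two different people.
first = [["Company", "Contact"], ["Acme", "Ann"], ["Globex", "Bob"]]
second = [
    ["Company", "Contact", "Status"],
    ["Globex", "Bob", "Called"],
    ["Initech", "Cat", "Emailed"],
]

stitched = stitch_sheets(first, second)
for row in stitched:
    print(row)
```

Note that the Globex row is left exactly as it was in the first sheet. In this sketch, updating rows that already exist is the job of a merge, not a stitch.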

What about future plans?

I have some thoughts around an opinionated way of enabling collaboration across different Excel users using what I’ve called XF functions. Essentially these are special functions that are processed on the server side based on the values in a range of other spreadsheets. Watch this space for more updates on these features.

Assuming there is sufficient interest then the commercial model will be based on a monthly subscription which allows you a certain number of credits per month. A diff costs 1 credit, merges and stitches cost more.

I’m opening this up as an alpha right now so if you would like to give it a go please mail alpha at elevenfortyfive dot com and I will add you.

A digression on Software Patents

I wrote some software to make collaboration with Excel easier and have been writing up a few posts here about how it works. I’ve been struggling recently with how much detail to describe on this public forum. Essentially I have this little niggling voice in the back of my head telling me: “Protect your IP, file a patent, protect your IP, file a patent”. Any sensible person would surely file a patent before going public with anything.

Yet I loathe software patents. And perhaps I’m not all that sensible.

Patents take a lot of time, are a distraction and cost lots of $$. The patent process moves at such a glacial pace that patents are all but meaningless when you’re trying to get an idea off the ground. Interestingly it wasn’t always as bad as this. From “The Victorian Internet” by Tom Standage:

[Alexander Graham] Bell worked for several months to build a working prototype [of a “harmonic telegraph”, i.e. telephone]. On 14 February 1876, when it became clear that [a competitor] was pursuing the same goal, Bell filed for a patent, even though he had yet successfully to transmit speech. He was granted the patent on 3 March, and made the vital breakthrough a week later, when he succeeded in transmitting intelligible speech for the first time.

Now if I read that right it took less than a month to get a patent granted. Whereas now it takes years during which time who knows what could happen to your business. How many businesses do we take for granted today that barely existed five years ago?

If patents could be decided and genuinely only granted in non-obvious cases in a few weeks then I might see some value around the certainty they could provide. But they take forever and they seem to be as likely to be about something trivially obvious as they are about something genuinely ground-breaking.

And anyway. Even owning a patent doesn’t mean much unless you have the $$ to take someone to court over infringement.

So when it comes to the real world, at least where software is concerned, patents are a tax on innovation. They divert resources away from where they are best used. They are a brake on innovation.

Arguments I’ve heard in favour of patents include:

  1. The revenue stream argument: If you own a patent you can license the IP.
  2. The risk mitigation argument: If you are lucky enough to be successful and are about to IPO or similar then some fucker (technical term) somewhere will try to scupper you by claiming you are infringing a patent somewhere along the line. So better to have some patents that say you are allowed to do what you are doing, or failing that let you counter-sue the fucker by claiming that he is infringing on your patents.
  3. The ego-trip argument: The idea that you must be cleverer than the other guys if you have more patents than them. Never mind that (as I am reliably informed) the number of patents you own is irrelevant – what is important is the number of claims across the patents. You could have one patent application with 1,000 claims in it or you could file 10 patents each with 100 claims and you would be in an equivalent IP position. So it is surprising that people insist on counting patents. Still, they do.

Those are all valid arguments to a degree. If you want to build a business around selling some IP then you need to license it. If someone is going to value you based on a not very meaningful metric then, whatever, don’t hate the player, hate the game and all that. Just a bit of a depressing game.

So to cut a long story short I have now excised from my psyche any thinking about software patents. Not interested. I do hate the game and so will do my best not to play in it.

BTW – in case any lawyery types are reading – I don’t have a cavalier attitude to IP protection in general. I just think that in the world of software, copyright is the correct approach, not patents.

Lots of love

Alan

Think software archaeology sucks? Try hardware archaeology.

This post is for anyone who has had to try to bash through someone else’s shitty source code with all its confusion, obsolete logic and lack of documentation. In other words every software developer, everywhere. Spare a thought for Lieutenant Commander Rupert T. Gould.

From Longitude by Dava Sobel

H-1, H-2, H-3, H-4 are the model names for the various generations of clock that were created by genius inventor John Harrison for reliably telling the time at sea. The devices languished unused and uncared for until Lieutenant Commander Rupert T. Gould of the Royal Navy took an interest in restoring them. It took him 12 years to restore all four. This restoration is very much a hardware story but software people will empathise:

So he set to right away with an ordinary hat brush, removing two full ounces of dirt and verdigris from H-1 …

It seems only proper that more than half of Gould’s repair work – seven years by his count – fell to H-3, which had taken Harrison the longest time to build. Indeed, Harrison’s problems begat Gould’s:

“No 3 is not merely complicated, like No. 2,” Gould told a gathering of the Society for Nautical Research in 1935, “it is abstruse. It embodies several devices which are entirely unique – devices which no clockmaker has ever thought of using …” In more than one instance, Gould found to his chagrin that “remains of some device which Harrison had tried and subsequently discarded had been left in situ.” He had to pick through these red herrings to find the devices deserving of salvage.

I feel for the guy, working through someone else’s undocumented code. Struggling to clean out the accumulated cruft of (what seems like) centuries of neglect. Discovering tons of methods that aren’t just commented with HACK or TODO, but methods that it turns out never get called in production.

I Wrote a Software II: The Problem (aka Opportunity)

We keep collaborating with Excel despite the deficiencies

Despite all the collaboration opportunities provided for next to no cost by the likes of Trello, Office 365, Google Apps, Zoho, Box and Huddle, people keep finding a use case for emailing an Excel document around. And then they struggle when they send out the wrong version or someone wants to make an update etc etc.

Office 365 has an online version of Excel. It’s sort of somewhere between “real” Excel and Google Sheets. But without much consideration for usability. Here are a few things that I have been enjoying recently:

  1. When I first open a spreadsheet I have to press a button to be able to edit it (unlike Google Docs where I can type straight away)
  2. Before I can sort or filter an online spreadsheet I have to click a mysterious button called “Format as Table”. Given that sort and filter is one of the main use cases for Excel it’s a bit bizarre to make users make additional clicks and expend additional cognitive effort to unlock that feature.
  3. There’s no freaking strikethrough.
  4. Where is the version history?

If only they had called it something different so people didn’t think it was equivalent to Excel. The surveys feature looks neat though it’s just a me-too version of Google Forms. But there are plenty of irritations that just add up.

To be fair to (real) Excel, Spreadsheet Inquire looks pretty neat. It lets you compare different spreadsheets to find the differences. This should be really useful where you have different people updating different copies of the same spreadsheet.

I have yet to see it in real life as apparently the licensing restrictions are pretty tight. But my instinct is that it’s going to be really hard to use and that will put people off.

Excel alone just isn’t practical enough to justify the amount of work we do with it in a collaborative business setting.

Browsers considered dangerous

The traditional solution to this is to build database-backed systems that are accessed through a browser. It is perfectly standard to think it is a good thing for an application to be “web-based” rather than to require a custom fat client. That makes sense because the costs of supporting a piece of custom software on any kind of PC/tablet configuration can be huge. Why not use a web browser, which is a more or less standardised piece of software already on everyone’s PC?

The problem here is that web browsers are weak when it comes to handling tabular data. The TABLE element in HTML is pretty much unused. Everyone who wants a table ends up having to define their own approach rather than using something standard. And many business applications are about manipulating a table of data.

Opinionated Excel?

Hmmm, I wonder if there is another piece of widely-installed and widely-supported software that we can expect business people to have access to. One that is good at laying out tabular data? What happens if you start thinking of Excel as the UI layer in your system? Of course Excel has an enormous amount of power, so to use it effectively as a front-end that doesn’t then screw up data integrity you would need to take a leaf out of the Rails framework’s book, and how it transformed web development by defining an opinionated way of using the Ruby programming language. I wonder how much appetite there is to work to an opinionated way of producing Excel.

I wrote a software

It was born out of a couple of itches I needed to scratch:

  1. I wanted to learn Haskell [1]
  2. I wanted to make collaboration with Excel less painful

Excel is heavily used in organisations. It’s clunky, prone to user error and makes it next to impossible to incorporate someone else’s changes. Especially once you start emailing Excel attachments to your co-workers. Yet people can’t give up on Excel. Apparently because it is a convenient way to lay out a tabular structure:

He typed the current date in the top of the spreadsheet, printed a copy, put it in a three-ring binder, and that was pretty much his whole, entire job. It was kind of sad. He took two lunch breaks a day. I would too, if that was my whole job.

Over the next two weeks we visited dozens of Excel customers, and did not see anyone using Excel to actually perform what you would call “calculations.” Almost all of them were using Excel because it was a convenient way to create a table.

Felienne Hermans of Delft University of Technology has some great research on the subject. In A modern day Pompeii: Spreadsheets at Enron she talks about digging around subpoenaed Enron emails to find that:

Over the 15 months that the email set spans, we counted 100 emails per day (!) involving spreadsheets. Some emails occurred double in the set, as both the sender and receiver were in the mailboxes acquired, so it would be more fair to say there were 100 spreadsheet email – interactions a day. But still! Talking about errors in the spreadsheets was also pretty common, 6% of all spreadsheet related emails contained word such as error or fault

We just can’t wean ourselves off Excel, despite all its shortcomings.

People have tried to solve this problem before. One of the key principles of the Enterprise Software industry is that it will replace your crappy, inefficient Excel-based processes with accurate, real-time information. Yet each new enterprise system is sufficiently inflexible in the real world that it spawns a whole set of Excel spreadsheets at either the front-end (forms or tables to collect data to feed into the machine) or at the back-end (download data to Excel to run your pivot tables for analysis).

If replacing Excel with a web-accessible database system isn’t going to do the trick, what if there were an easier way to address some of the issues people have with using Excel?

[1] Consider it “personal development”. Some people go on training courses. Some people try to pick up new programming languages and techniques. Functional programming has been completely eye-opening for me. Also it’s been great to remind myself of what developers have to go through on a daily basis even without pressure being exerted from an uncomprehending management layer.