Software Development

Rip it up and start again, without ripping it up and starting again

Time for another update on my side project, Syracuse: http://syracuse-1145.herokuapp.com/

I got some feedback that the structure of the relationships was a bit unintuitive. Fair enough, let’s update it to make more sense.

Previously code was doing the following:

  1. Use RoBERTa-based LLM for entity extraction (the code is pretty old but works well)
  2. Use Benepar dependency parsing to link up relevant entities with each other
  3. Use FlanT5 LLM to dig out some more useful content from the text
  4. Build up an RDF representation of the data
  5. Clean up the RDF

Step 5 had got quite complex.

Also, I had a look at import/export of RDF in graphs – specifically Neo4J, but couldn’t see much excitement about RDF. I even made a PR to update some of the Neo4J / RDF documentation. It’s been stuck for 2+ months.

I wondered if a better approach would be to start again using a different set of technologies. Specifically,

  1. Falcon7B instead of FlanT5
  2. Just building the representation in a graph rather than using RDF

Falcon7B was very exciting to get a chance to try out. But in my use case it wasn’t any more useful than FlanT5.

Going down the graph route was a bit of fun for a while. I’ve used networkx quite a bit in the past so thought I’d try with that first. But, guess what, it turned out more complicated than I needed. Also I do like the simplicity and elegance of RDF, even if it makes me seem a bit, old.

So the final choice was to rip up all my post-processing and turn it into pre-processing, and then generate the RDF. It was heart-breaking to throw away a lot of code, but, as programmers, I think we know when we’ve built something that is just too brittle and needs some heavy refactoring. It worked well in the end, see the git stats below:

  • code: 6 files change: 449 insertions, 729 deletions
  • tests: 81 files changed, 3618 insertions, 1734 deletions

Yes, a lot of tests. It’s a data-heavy application so there are a lot of tests to make sure that data is transformed as expected. Whenever it doesn’t work, I add that data (or enough of it) as a test case and then fix it. Most of this test data was just changed with global find/replace so it’s not a big overhead to maintain. But having all those tests was crucial for doing any meaningful refactoring.

On the code side, it was very satisfying to remove more code than I was adding. It just showed how brittle and convoluted the codebase had become. As I discovered more edge cases I added more logic to deal with them. Eventually this ended up as lots of complexity. The new code is “cleaner”. I put clean in quotes because there is still a lot of copy/paste in there and similar functions doing similar things. This is because I like to follow “Make it work, make it right, make it fast“. Code that works but isn’t super elegant is going to be easier to maintain/fix/re-factor later than code that is super-abstracted.

Some observations on the above:

  1. Tests are your friend (obviously)
  2. Expect to need major refactoring in the future. However well you capture all the requirements now, there will be plenty that have not yet been captured, and plenty of need for change
  3. Shiny new toys aren’t always going to help – approach with caution
  4. Sometimes the simple old-fashioned technologies are just fine
  5. However bad you think an app is, there is probably still 80% in there that is good, so beware completely starting from scratch.

See below for the RDF as it stands now compared to before:

Current version

@prefix ns1: <http://example.org/test/> .
@prefix org: <http://www.w3.org/ns/org#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

<http://example.org/test/abc/Core_Scientific> a org:Organization ;
    ns1:basedInRawLow "USA" ;
    ns1:buyer <http://example.org/test/abc/Stax_Digital_Assets_Acquisition> ;
    ns1:description "Artificial Intelligence and Blockchain technologies" ;
    ns1:foundName "Core Scientific" ;
    ns1:industry "Artificial Intelligence and Blockchain technologies" ;
    ns1:name "Core Scientific" .

<http://example.org/test/abc/Stax_Digital> a org:Organization ;
    ns1:basedInRawLow "LLC" ;
    ns1:description "blockchain mining" ;
    ns1:foundName "Stax Digital, LLC",
        "Stax Digital." ;
    ns1:industry "blockchain mining" ;
    ns1:name "Stax Digital" .

<http://example.org/test/abc/Stax_Digital_Assets_Acquisition> a ns1:CorporateFinanceActivity ;
    ns1:activityType "acquisition" ;
    ns1:documentDate "2022-11-28T05:06:07.000008"^^xsd:dateTime ;
    ns1:documentExtract "Core Scientific (www.corescientific.com) has acquired assets of Stax Digital, LLC, a specialist blockchain mining company with extensive product development experience and a strong track record of developing enterprise mining solutions for GPUs.",
        "Core Scientific completes acquisition of Stax Digital." ;
    ns1:foundName "acquired",
        "acquisition" ;
    ns1:name "acquired",
        "acquisition" ;
    ns1:status "completed" ;
    ns1:targetDetails "assets" ;
    ns1:targetEntity <http://example.org/test/abc/Stax_Digital> ;
    ns1:targetName "Stax Digital" ;
    ns1:whereRaw "llc" .

Previous version

@prefix ns1: <http://example.org/test/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

<http://example.org/test/abc/Core_Scientific> a <http://www.w3.org/ns/org#Organization> ;
    ns1:basedInLow "USA" ;
    ns1:description "Artificial Intelligence and Blockchain technologies" ;
    ns1:foundName "Core Scientific" ;
    ns1:industry "Artificial Intelligence and Blockchain technologies" ;
    ns1:name "Core Scientific" ;
    ns1:spender <http://example.org/test/abc/Purchase_Stax_Digital> .

<http://example.org/test/abc/Acquired_Assets_Stax_Digital_Llc> a ns1:TargetDetails ;
    ns1:label "Assets" ;
    ns1:name "Acquired Assets Stax Digital, LLC" ;
    ns1:nextEntity "Stax Digital, LLC" ;
    ns1:previousEntity "acquired" ;
    ns1:targetEntity <http://example.org/test/abc/Stax_Digital> .

<http://example.org/test/abc/Purchase_Stax_Digital> a ns1:Activity ;
    ns1:activityType "Purchase" ;
    ns1:documentDate "2022-11-28T05:06:07.000008"^^xsd:dateTime ;
    ns1:documentExtract "Core Scientific (www.corescientific.com) has acquired assets of Stax Digital, LLC, a specialist blockchain mining company with extensive product development experience and a strong track record of developing enterprise mining solutions for GPUs.",
        "Core Scientific completes acquisition of Stax Digital." ;
    ns1:label "Acquired",
        "Acquisition" ;
    ns1:name "Purchase Stax Digital" ;
    ns1:targetDetails <http://example.org/test/abc/Acquired_Assets_Stax_Digital_Llc> ;
    ns1:whenRaw "has happened, no date available" .

<http://example.org/test/abc/Stax_Digital> a <http://www.w3.org/ns/org#Organization> ;
    ns1:basedInLow "LLC" ;
    ns1:description "blockchain mining" ;
    ns1:foundName "Stax Digital, LLC",
        "Stax Digital." ;
    ns1:industry "blockchain mining" ;
    ns1:name "Stax Digital" .

DIFF
1a2
> @prefix org: <http://www.w3.org/ns/org#> .
4,5c5,7
< <http://example.org/test/abc/Core_Scientific> a <http://www.w3.org/ns/org#Organization> ;
<     ns1:basedInLow "USA" ;
---
> <http://example.org/test/abc/Core_Scientific> a org:Organization ;
>     ns1:basedInRawLow "USA" ;
>     ns1:buyer <http://example.org/test/abc/Stax_Digital_Assets_Acquisition> ;
9,10c11
<     ns1:name "Core Scientific" ;
<     ns1:spender <http://example.org/test/abc/Purchase_Stax_Digital> .
---
>     ns1:name "Core Scientific" .
12,31c13,14
< <http://example.org/test/abc/Acquired_Assets_Stax_Digital_Llc> a ns1:TargetDetails ;
<     ns1:label "Assets" ;
<     ns1:name "Acquired Assets Stax Digital, LLC" ;
<     ns1:nextEntity "Stax Digital, LLC" ;
<     ns1:previousEntity "acquired" ;
<     ns1:targetEntity <http://example.org/test/abc/Stax_Digital> .
< 
< <http://example.org/test/abc/Purchase_Stax_Digital> a ns1:Activity ;
<     ns1:activityType "Purchase" ;
<     ns1:documentDate "2022-11-28T05:06:07.000008"^^xsd:dateTime ;
<     ns1:documentExtract "Core Scientific (www.corescientific.com) has acquired assets of Stax Digital, LLC, a specialist blockchain mining company with extensive product development experience and a strong track record of developing enterprise mining solutions for GPUs.",
<         "Core Scientific completes acquisition of Stax Digital." ;
<     ns1:label "Acquired",
<         "Acquisition" ;
<     ns1:name "Purchase Stax Digital" ;
<     ns1:targetDetails <http://example.org/test/abc/Acquired_Assets_Stax_Digital_Llc> ;
<     ns1:whenRaw "has happened, no date available" .
< 
< <http://example.org/test/abc/Stax_Digital> a <http://www.w3.org/ns/org#Organization> ;
<     ns1:basedInLow "LLC" ;
---
> <http://example.org/test/abc/Stax_Digital> a org:Organization ;
>     ns1:basedInRawLow "LLC" ;
36a20,34
> 
> <http://example.org/test/abc/Stax_Digital_Assets_Acquisition> a ns1:CorporateFinanceActivity ;
>     ns1:activityType "acquisition" ;
>     ns1:documentDate "2022-11-28T05:06:07.000008"^^xsd:dateTime ;
>     ns1:documentExtract "Core Scientific (www.corescientific.com) has acquired assets of Stax Digital, LLC, a specialist blockchain mining company with extensive product development experience and a strong track record of developing enterprise mining solutions for GPUs.",
>         "Core Scientific completes acquisition of Stax Digital." ;
>     ns1:foundName "acquired",
>         "acquisition" ;
>     ns1:name "acquired",
>         "acquisition" ;
>     ns1:status "completed" ;
>     ns1:targetDetails "assets" ;
>     ns1:targetEntity <http://example.org/test/abc/Stax_Digital> ;
>     ns1:targetName "Stax Digital" ;
>     ns1:whereRaw "llc" .
Software Development

HackerNews wisdom on dealing with a huge, crappy codebase

Hmmm, is it really that crappy? It’s supporting 20 million USD of revenue with just 3 developers. Yes, ok, I buy that the code is terrible and makes you feel ill whenever you have to look at it.

The comments are heartening to read. In general people recommend against a major re-write. They recommend writing tests to confirm how certain behaviour works and then refactoring piece by piece. It’s along the lines of https://martinfowler.com/bliki/StranglerFigApplication.html

If you read the HN thread and still think that the best approach is to start again from scratch, I refer you to https://www.joelonsoftware.com/2000/04/06/things-you-should-never-do-part-i/

And if you still think a clean slate is the way to go, I give you this story: In 1972 the UK changed from using pounds, shillings and pence to using decimal currency. In 1995 I was training as a programmer. The teacher told us that 2 of the main 4 banks in the UK still calculated interest payments in pounds, shillings and pence. They had just put a wrapper that converted to/from decimal currency around this legacy code.

If you are dealing with poorly-understood code that you can’t really touch without risking everything falling to pieces, you are definitely not alone. It is fixable, given enough time and money. But it’s very unlikely that it can be fixed in one fresh re-write. Most realistic is to address it piece by piece.

In this vein I did find this comment particularly interesting because it represented a concrete small step that could help start solving the problem:

You need to develop an small app that will handle authentication/authorization. Next time a feature comes you will implement that page in the new stack, the rest of production pages will still run in the old code base .

That’s it.

A concurrent small migration to the new system without changing all the system at once.

Why it works? New systems often fail to encapsulate all the complexity. also two systems, duplicate your workload until you decide to drop new system because of the first statement.

Finally, get stats from nginx and figure out which routes aren’t used in a month, try disabled some and see how much dead routes you can find and clean

https://news.ycombinator.com/item?id=32887439

Will it work? Maybe, maybe not, but this sort of mindset – of looking for specific problems that can be fixed and not getting overwhelmed by the extent of the all the problems in the codebase – is the best way to be addressing this kind of challenge.

Software Development

Impress your CTO – avoid these boring complaints

Here are some of the most boring complaints I hear.

  • No-one knows how it works
  • You can’t measure the quality of what I’m doing
  • It’s because of all of our technical debt
  • We need to re-write this from scratch

They each contain an element of truth but at the same time they manage to completely miss the point. Hence boring.

No-one knows how it works

This really means “I don’t know how it works and nor does the person I usually work closest with”. I once saw a developer spend a huge amount of time trying to recreate, by reading the friendly source, the possible state transitions for a particularly key entity. Because “no-one understood it”. This had two problems:

  1. He may have missed something
  2. By reverse engineering requirements from the working code, what he ended up describing might have included bugs rather than how the feature was supposed to work.

And in this particular instance I could have pointed the developer to one person still in the business and one person who had left but would still be helpful. Next time you’re tempted to assert “no-one understands these state transitions” just change it to a question: “Is there anyone either on the team here, or who has left but is still friendly, who can help me understand how these state transitions are supposed to work?”

You can’t measure the quality of what I’m doing

This is invariably an attempt to hide something. I once worked with a team who didn’t report their test coverage because the lead developer felt that software is too complicated for a metric as simple as test coverage to be meaningful. We debated the subject and eventually agreed that although 100% coverage is probably not that meaningful, it is worth at least knowing where you are. Where were they when they measured code coverage? About 15%. I was amazed. Here we were debating the costs and benefits of 90% or 100% code coverage and all the time we were staring 15% in the face. I cannot think of anyone who would seriously argue that code coverage of 15% is in any way acceptable. For sure, you can’t measure everything, but the skill of a good developer is in helping finding a useful metric to use. For example on a recent project we agreed on a simple metric that if RubyCritic gives us anΒ  A or a B grade then that’s good, if it’s any worse then we need to know why. It’s not perfect but it’s a lot better than hiding behind “you can’t measure what I’m doing”.

It’s because of all of our technical debt

As an experiment, I once agreed with a team to spend one month just doing technical debt cleanup. The results? Nothing noticeably better or faster for the users, nothing notably better quality as far as the QA people were concerned, no metric to show anything had improved, and I was still getting the same developers 3 months later making the same complaints about technical debt. The reality is that there will always be technical debt of some shape or form. Just like real world debt, some technical debt can be good if it helps you achieve other ends that you couldn’t otherwise achieve. Better developers would have a plan that says, for example, “Technical debt in module X is causing us problems, in order to fix those we will need to do Y”. This is better because it is specific and measurable and, if defined well enough, deliverable.

We need to re-write this from scratch

Stop thinking that. It’s a dreadful idea. Whenever you think of re-writing you are thinking of the 20% of the system that is a PITA to maintain. You’re not thinking of the 80% of the system that you will also need to re-write. I remember one project where a 5-month re-write was still going on 18 months later, still with no end in sight. And another where a re-architected system was being built in parallel but wasn’t able to keep up with the new features being added to the “legacy” platform. In shoert I’ve never seen a complete re-write that people were glad to have done. If you do need to make dramatic changes then you will need to find some way to change very specific parts of the application one by one. It will take a very long time: make sure you do your homework before advocating this.