Technology Adoption

Where is the GenAI disruption going to happen?

I recently heard Saad Ansari (former Director of AI at Jasper) speaking about how he sees the GenAI space evolving. It was really thought-provoking, so here are some of my notes from the talk.

He sees four key use cases for GenAI:

  1. Co-piloting. GitHub users are already familiar with an AI tool called Copilot. More generally, you can think of ChatGPT or similar as a co-pilot who is there to help you do your tasks, whether that is by drafting an email or a job spec for you, or by helping you learn a new topic or prepare for a meeting.
  2. Personalization
  3. Bringing everyone the “power of Pixar”
  4. Robotics (both virtual agents and physical robots)

I’m going to go into a bit more on the personalization piece.

Go back far enough and the internet was all about search¹. You go to Google and get “about 8,400,000,000 results (0.35 seconds)”. Then you scan page 1 and possibly page 2 to see if there’s anything relevant.

Then, over time, things became more personalized for the user. One high-profile example was the Netflix Prize, awarded in 2009. This was a competition with a $1m prize to use machine learning to improve Netflix’s recommendation algorithm (“if you liked show X then probably you will like shows Y and Z”). At the time this ML work was pretty groundbreaking.

Now with GenAI we are in a new world again. In this world, new things can be created to the user’s taste. Saad used the words “synthesis” and “remixing” to describe this. The GenAI models have seen enormous amounts of text, images, audio, etc. in their training, which they can use to synthesise new things. They are like a music producer doing a remix. From their training data they can make something that is just what the user is interested in, that has never fully existed before, but that is similar to what they have been trained on.

What does this sea change in personalization mean for future disruption?

From this perspective, Saad believes, someone like Adobe or TurboTax is safe. It’s easier for them to enhance their products with GenAI than it is for a new GenAI entrant to add the core features that companies like this have.

On the other hand, someone like Amazon might not be safe. A more personalized shopping service could well disrupt them. Imagine a service like:

  1. You upload some photos of your family
  2. Based on the photos an AI figures out your interests
  3. It gives you some ideas of local activities to do nearby
  4. And gives you some links to things you might want to buy

Be honest, it sounds pretty realistic, doesn’t it?

Notes

  1. Or you could go back a bit further to the dark days of domain dipping, but it’s the same principle.
Technology Adoption

What Bing Chat can teach us about technology adoption

Some thoughts prompted by this great write-up about what an outstanding success Microsoft has made of integrating OpenAI’s chatbot tech into Bing: https://www.bigtechnology.com/p/wacky-unhinged-bing-chatbot-is-still

“The fact that people are even writing about Microsoft Bing at all is a win,” one Microsoft employee told me this week. “Especially when the general tenor is not negative. Like, it’s funny that it’s arguing with you over if it’s 2022 or not.”

compared to

when Google’s Bard chatbot got a question wrong in a demo last week, it lost $100 billion in market cap within hours.

Part of this is due to Microsoft’s underdog status in search. But much of it, I think, is how they have brought the users (us) along with them on the journey. They have made us think of Microsoft + ChatGPT as part of “us” vs Google being “them”.

Consider the following disasters with Large Language Models:

The common theme linking all of these is that they came out of nowhere: they were launched to great fanfare and raised expectations really high.

Bing Chat couldn’t be more different. ChatGPT was released as an experimental tool, got feedback from early users and rapidly iterated to improve the initial versions. It got us onside and loving it despite its flaws.

Then Microsoft announced their mega investment, getting us even more invested in the product and creating excitement about integrating it into Bing.

Finally, Microsoft iterated at pace to get something working in their product, building on the excitement and momentum that we, the users, were generating.

So when it was finally released, we were really excited and keen to use it (witness the app download stats) and sympathetic to its deficiencies – or perhaps we even enjoyed them.

Some obvious lessons in here about telegraphing your intentions early, bringing your users along with you and iterating at pace.

Machine Learning

From AI to Assistive Computation?

This post on Mastodon has been playing on my mind. It was written on 27th November, after the debacle with Galactica but before ChatGPT burst into the public’s consciousness.

Link to the full thread on Mastodon

I love the challenge it poses.

I am sure there are some areas where the term “AI” is meaningful, for example in academic research. But in the wider world, Ilyaz has a very strong argument.

Usually when people think of AI they’ll imagine something along the lines of 2001: A Space Odyssey or Aliens or I, Robot or Blade Runner or Ex Machina: something that seems uncannily human but isn’t. I had this image in mind when I first wanted to understand AI and so read Artificial Intelligence: A Modern Approach. What an anti-climax that book was. Did you know that, strictly speaking, the ghosts in Pac-Man are AIs? A piece of code that has its own objectives to carry out, like a Pac-Man ghost, counts as AI. It doesn’t have to ‘think’.

Alan Turing proposed the Turing Test in 1950 as a test for AI. For a long time this seemed like a decent proxy for AI: if you’re talking to two things and can’t tell which is the human and which is the machine, then we may as well say that the machine is artificially intelligent.

But these days you have large language models that can easily pass the Turing Test. In fact, ChatGPT has been explicitly coded/taught to fail the Turing Test: the AIs can fake being human so well that they’re being programmed to not sound like humans!

A good description of these language models is ‘Stochastic Parrots’: ‘Parrots’ because they repeat the patterns they have seen without necessarily understanding any meaning, and ‘Stochastic’ because there is randomness in the way they have learnt to generate text.

Services like ChatGPT are bringing this sort of tech into the mainstream and transforming what we understand is possible with computers. This is a pattern we’ve seen before. The best analogy I can think of for where we are today in the world of AI tech is how Spreadsheets and then Search Engines and then Smartphones changed the world we live in.

They don’t herald the advent of Skynet (any more than any other tech from one of the tech titans), nor do they herald a solution for the world’s ills.

So maybe we should reserve the term ‘AI’ for the realms of academic study and instead use a term like ‘Assistive Computation’ as Ilyaz suggests when it comes to real-world applications.

Pretty provocative but at the same time pretty compelling.

To end this post, I’ll leave you with an old AI/ML joke that is somewhat relevant to the discussion here (though these days you’d have to replace ‘linear regression’ with ‘text-davinci-003’ to get the same vibe):

Edited 2023-01-30: Added link to the full thread on Mastodon

Software Development

Coding with ChatGPT

I’ve been using ChatGPT to help with some coding problems. In all the cases I’ve tried it has been wrong but has given me useful ideas. I’ve seen some extremely enthusiastic people who are saying that ChatGPT writes all their code for them. I can only assume that they mean it is applying common patterns for them and saving boilerplate work. Here is a recent example of an interaction I had with ChatGPT as an illustration.

The initial prompt:

Hi, I want to write a python function that will find common subsets that can be extracted from a list of sets. A common subset is one where several elements always appear together.

For example with the following sets:
s1 = {"a","b","c"}
s2 = {"a","b","c"}
s3 = {"c"}
s4 = {"d","e"}
s5 = {"d","e","f"}
s6 = {"d","e","f","g"}

The function should return:
[{"a","b"},{"d","e"}]

What I liked about using it:

  1. It forced me to think about an individual function that can be tested in isolation
  2. It forced me to think really explicitly in terms of the inputs and outputs of the function
  3. The answers it provided, specifically using itertools.combinations, gave me a good direction to try out.
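
As a quick aside (this snippet is mine, not part of the ChatGPT conversation), here is a minimal illustration of what itertools.combinations produces:

import itertools

# All 2-element combinations drawn from a 3-element set
print(list(itertools.combinations({"a", "b", "c"}, 2)))
# Prints something like [('a', 'b'), ('a', 'c'), ('b', 'c')]
# (the exact tuple order depends on the set's arbitrary iteration order)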

What I didn’t like about using it:

  1. The code didn’t work
  2. It gives the impression of knowing what it’s talking about. I have to keep reminding myself that it’s just producing a wall of text/code based on what it has seen in the past. Some of the stuff it generates is complete nonsense (see below when it tries to explain what output a particular function would create, or whether a set is a subset of another set).
  3. It’s a walled garden. Whereas in another life I might have posted my question on StackOverflow and got an answer that others could also benefit from, this discussion is just between me and ChatGPT. Not that StackOverflow is a panacea – I’ve still got an unanswered question on there from 3+ years ago – but, nevertheless, there is an important principle at stake.

See below the screenshots of the whole interaction:

Needless to say, this final piece of code didn’t return the expected output either.

Full code of the above conversation is below for anyone who wants to do some cut and paste.

Python 3.10.6 (main, Nov 14 2022, 16:10:14) [GCC 11.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> def common_subsets(sets):
...     # Initialize an empty set to store the common subsets
...     common = set()
...     # Iterate through all possible subsets of the first set
...     for subset in itertools.chain.from_iterable(itertools.combinations(sets[0], r) for r in range(len(sets[0])+1)):
...         subset = set(subset)
...         # Check if the current subset is present in all other sets
...         if all(subset <= s for s in sets[1:]):
...             common.add(subset)
...     return list(common)
... 
>>> s1 = {"a","b","c"}
>>> s2 = {"a","b","c"}
>>> s3 = {"c"}
>>> s4 = {"d","e"}
>>> s5 = {"d","e","f"}
>>> s6 = {"d","e","f","g"}
>>> 
>>> print(common_subsets([s1,s2,s3,s4,s5,s6]))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 5, in common_subsets
NameError: name 'itertools' is not defined
>>> import itertools
>>> s1 = {"a","b","c"}
>>> s2 = {"a","b","c"}
>>> s3 = {"c"}
>>> s4 = {"d","e"}
>>> s5 = {"d","e","f"}
>>> s6 = {"d","e","f","g"}
>>> 
>>> print(common_subsets([s1,s2,s3,s4,s5,s6]))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 9, in common_subsets
TypeError: unhashable type: 'set'
>>> def common_subsets(sets):
...     # Initialize an empty set to store the common subsets
...     common = set()
...     # Iterate through all possible subsets of the first set
...     for subset in itertools.chain.from_iterable(itertools.combinations(sets[0], r) for r in range(len(sets[0])+1)):
...         subset = frozenset(subset)
...         # Check if the current subset is present in all other sets
...         if all(subset <= s for s in sets[1:]):
...             common.add(subset)
...     return list(common)
... 
>>> s1 = {"a","b","c"}
>>> s2 = {"a","b","c"}
>>> s3 = {"c"}
>>> s4 = {"d","e"}
>>> s5 = {"d","e","f"}
>>> s6 = {"d","e","f","g"}
>>> 
>>> print(common_subsets([s1,s2,s3,s4,s5,s6]))
[frozenset()]
>>> from typing import List
>>> 
>>> def common_subsets(sets:List[set]) -> List[set]:
...     common = set()
...     for subset in itertools.chain.from_iterable(itertools.combinations(sets[0], r) for r in range(1,len(sets[0])+1)):
...         subset = frozenset(subset)
...         if all(subset <= s for s in sets[1:]):
...             common.add(subset)
...     return list(common)
... 
>>> s1 = {"a","b","c"}
>>> s2 = {"a","b","c"}
>>> s3 = {"c"}
>>> s4 = {"d","e"}
>>> s5 = {"d","e","f"}
>>> s6 = {"d","e","f","g"}
>>> 
>>> print(common_subsets([s1,s2,s3,s4,s5,s6]))
[]
>>> from typing import List
>>> 
>>> def common_subsets(sets:List[set]) -> List[set]:
...     subsets = [set(itertools.combinations(s, r)) for s in sets for r in range(1, len(s) + 1)]
...     common = set.intersection(*subsets)
...     return list(common)
... 
>>> s1 = {"a","b","c"}
>>> s2 = {"a","b","c"}
>>> s3 = {"c"}
>>> s4 = {"d","e"}
>>> s5 = {"d","e","f"}
>>> s6 = {"d","e","f","g"}
>>> 
>>> print(common_subsets([s1,s2,s3,s4,s5,s6]))
[]
>>> from typing import List
>>> 
>>> def common_subsets(sets:List[set]) -> List[set]:
...     # find all subsets of all sets
...     subsets = [set(itertools.combinations(s, r)) for s in sets for r in range(1, len(s) + 1)]
...     # find subsets that are in all subsets list
...     common = set.intersection(*subsets)
...     # check if subsets are present in all sets
...     common = [subset for subset in common if all(subset <= s for s in sets)]
...     return list(common)
... 
>>> s1 = {"a","b","c"}
>>> s2 = {"a","b","c"}
>>> s3 = {"c"}
>>> s4 = {"d","e"}
>>> s5 = {"d","e","f"}
>>> s6 = {"d","e","f","g"}
>>> 
>>> print(common_subsets([s1,s2,s3,s4,s5,s6]))
[]

And at the risk of embarrassing myself in front of the whole world, and for the benefit of any future ChatGPT iteration that is reading this post, below is the solution I did come up with. Improvements welcome 😀

import itertools
from typing import List

def get_all_set_combinations(sets: List):
    all_combinations = set()
    for s in sets:
        if len(s) == 1:
            continue
        for l in range(2,len(s)+1):
            combos = itertools.combinations(s,l)
            for x in combos:
                all_combinations.add(tuple(x))
    return all_combinations

def find_extractable_subsets(sets: List):
    combos = get_all_set_combinations(sets)
    matching = set()
    for combo in sorted(combos, key=len, reverse=True):
        combo_set = set(combo)
        if not is_candidate_set_extractable(combo_set, sets):
            continue
        addable = True
        for x in matching:
            if combo_set & set(x) == combo_set:
                addable = False
                break
        if addable:
            matching.add(combo)
    return matching

def is_candidate_set_extractable(candidate, sets):
    for s in sets:
        # if this candidate is fully included in a set then it's a candidate to be extractable
        if (candidate & s) == candidate or (candidate & s) == set():
            continue
        else:
            return False
    return True


### And can be tested with:
s1 = {"a","b","c"}
s2 = {"a","b","c"}
s3 = {"c"}
s4 = {"d","e"}
s5 = {"d","e","f"}
s6 = {"d","e","f","g"}
find_extractable_subsets([s1,s2,s3,s4,s5,s6])

# With the expected result:
# {('b', 'a'), ('e', 'd')}

# it only picks the longest matching subsets, e.g.
find_extractable_subsets([s1,s2,s4,s5,s6])

# produces expected result:
# {('e', 'd'), ('b', 'c', 'a')}
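
Since improvements are welcome, here is one alternative sketch (a new snippet of mine, not from the ChatGPT conversation): group each element by the “signature” of which input sets it appears in. Elements that share a signature always appear together, so any signature group with two or more elements is an extractable subset.

from typing import List, Set
from collections import defaultdict

def common_subsets_by_signature(sets: List[Set[str]]) -> List[Set[str]]:
    # Map each element to the indices of the sets that contain it
    signatures = defaultdict(set)
    for i, s in enumerate(sets):
        for element in s:
            signatures[element].add(i)
    # Elements sharing exactly the same signature always appear together
    groups = defaultdict(set)
    for element, sig in signatures.items():
        groups[frozenset(sig)].add(element)
    # Only groups of two or more elements count as extractable subsets
    return [group for group in groups.values() if len(group) >= 2]

# common_subsets_by_signature([s1,s2,s3,s4,s5,s6]) -> [{'a', 'b'}, {'d', 'e'}]
# common_subsets_by_signature([s1,s2,s4,s5,s6])    -> [{'a', 'b', 'c'}, {'d', 'e'}]

This avoids enumerating combinations altogether, so it scales roughly linearly with the total number of elements rather than exponentially with set size.
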
Technology Adoption

The ChatGPT Arms Race

ChatGPT makes it so easy to produce good-looking content that people are getting concerned about the scope for cheating in school.

This is a story about the arms race between those who want to use ChatGPT to create content and those who want to be able to spot ChatGPT-created content.

Putting something like this out there was always going to be a red rag to a bull.

Smileys broke the checker. Obviously you wouldn’t do this in real life: smileys might fool a tool checking whether content was created by a language model, but they won’t fool a human reader. But they are just here as an illustration – you could equally insert characters that a human wouldn’t see.

But it looks like you don’t even need to go to these lengths … surprise … someone had the bright idea of using the language model to re-write its content to make it look more like a human wrote it:

(!)

This genie is now out of the bottle. Trying to ban ChatGPT is a fool’s errand. It might even be counterproductive:

  • There will be a proliferation of similar tools built on large language models. Perhaps not as optimized for human-sounding chat, but certainly good enough at producing content
  • Schoolkids who don’t have access to these tools will find themselves at a disadvantage in the real world compared to those who learn how to make best use of them

One really basic example: one teacher I know was so impressed with the output of ChatGPT that they said they’d use it to help students learn how to structure their essays. I’m sure with a bit of imagination there’d be plenty of other ways to use large language models to help teach students better.

It’s a better use of people’s energy to find ways to use large language models than to spend the same energy trying to fight them.

Machine Learning

Revisiting Entity Extraction

In September 2021 I wrote about the difficulties of getting anything beyond basic named entity recognition. You could easily get the names of companies mentioned in a news article, but not whether one company was acquiring another or whether two companies were forming a joint venture, etc. Not to mention the perennial “Bloomberg problem”: Bloomberg is named in loads of different stories. Usually it is referenced as the company reporting the story, sometimes as the owner of the Bloomberg Terminal. Only a tiny proportion of mentions of Bloomberg are about actions that the Bloomberg company itself has taken.

These were very real problems that a team I was involved in was facing around 2017, and they were still not fixed in 2021. I figured I’d see if more recent ML technologies, specifically Transformers, could help solve them. I’ve made a simple Heroku app, called Syracuse, to showcase the results. It’s very alpha, but the quality is not too bad right now.

Meanwhile, the state of the art has moved on in leaps and bounds over the past year. So I’m going to compare Syracuse with the winner from my 2021 comparison, Expert.ai’s Document Analysis Tool, and with ChatGPT – the new kid on the NLP block.

A Simple Test

Article: Avalara Acquires Artificial Intelligence Technology and Expertise from Indix to Aggregate, Structure and Deliver Global Product and Tax Information

The headline says it all: Avalara has acquired some Tech and Expertise from Indix.

Expert.AI

It is very comprehensive – for my purposes, too comprehensive. It identifies three companies: Avalara, ICR and Indix IP. The story is about Avalara acquiring IP from Indix. ICR is the communications company issuing the press release; ICR appearing in this list is an example of the “Bloomberg Problem” in action. It’s also incorrect to call Indix IP a company – the company is Indix. The relevant sentence in the article mentions Indix’s IP, not a company called Indix IP: “Avalara believes its ability to collect, organize, and structure this content is accelerated with the acquisition of the Indix IP.”

It also identifies many geographic locations, but many of them are irrelevant to the story as they are just lists of where Avalara has offices. If you wanted to search a database of UK-based M&A activity you would not want this story to come up.

Expert.AI’s relationship extraction is really impressive, but again, overly comprehensive. This first graph shows that Avalara gets expertise, technology and structure from Indix IP to aggregate things.

But there are also many many other graphs which are less useful, e.g:

Conclusion: Very powerful. Arguably too powerful. It reminds me of the age-old Google problem – I don’t want 1,487,585 results in 0.2 seconds. I’m already drowning in information, I want something that surfaces the answer quickly.

ChatGPT

I tried a few different prompts. First I included the background text then added a simple prompt:

I’m blown away by the quality of the summary here (no mention of ICR, LLC, so it’s not suffering from the Bloomberg Problem). But it’s not structured. Let’s try another prompt.

Again, it’s an impressive summary, but it’s not structured data.

Expert.ai + ChatGPT

I wondered what the results would be from combining a ChatGPT summary with Expert.AI document analysis. Turns out, not much use.

Syracuse

Link to data: https://syracuse-1145.herokuapp.com/m_and_as/1

Anyone looking at the URLs will recognise that this is the first entry in the database. This is the first example that I tried as an unseen test case (no cherry-picking here).

It shows the key information in a more concise graph, as below: Avalara is a spender, Indix is receiving some kind of payment, and the relevant target is some Indix technology (the downward triangle represents something that is not an organization).

I’m pretty happy with this result. It shows that, despite how impressive tools like Expert.AI and ChatGPT are, they have limitations when applied to more specific problems like this one. Fortunately there are other open-source ML technologies out there that can help, though it’s a job of work to stitch them together appropriately to get a decent result.

In future posts I’ll share more comparisons of more complex articles and share some insights into what I’ve learned about large language models through this process (spoiler – there are no silver bullets).

Machine Learning

Large Language Models, Hype and Prompt Chaining

Meta released Galactica recently to great fanfare and then rapidly removed it.

Janelle Shane poked some fun at Galactica in a post that showed how you can get it to give you nonsense answers, while making the serious point that you should be very aware of the hype. From a research point of view, Galactica is obviously super exciting. From a real-life point of view, you’re not about to replace your chatbot with Galactica, not while it suffers from hallucinations.

But there are serious use cases for large language models like Galactica and Google’s Flan-T5. Just not writing fully-fledged research articles.

Instead, you have to ask the model a number of smaller questions, one after the other. In the jargon: ‘prompt chaining’. For example, referring to Janelle’s example question that fooled Galactica:

Prompt: how many giraffes are in a mitochondria?
Galactica: 1

Don’t treat the language model as the holder of all knowledge. Treat the language model as an assistant who is super keen to help and is desperate not to offend you. You have to be careful what you ask, and perhaps ask several questions to get to the real answer. Here is an example I did with Flan T5 using a playground space on HuggingFace.

Prompt: Does mitochondria contain giraffes?
Flan T5: no

Prompt: How many giraffes are in a mitochondria?
Flan T5: ten

Using the same question that Galactica was given, we get a nonsense answer. Flan T5 is even more keen than Galactica to give us an impressive-sounding answer. But if you take both questions together then you can draw a more meaningful conclusion: chain the prompts, ask the ‘yes/no’ question first, and only ask the second question if the answer to the first warrants it.
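
To make the chaining concrete, here is a minimal sketch in Python. The ask_model function is a hypothetical stand-in for however you actually query the model; the canned answers are just the Flan T5 responses quoted above:

def ask_model(prompt: str) -> str:
    # Hypothetical stub: swap in a real call to Flan-T5, ChatGPT, etc.
    canned = {
        "Does mitochondria contain giraffes?": "no",
        "How many giraffes are in a mitochondria?": "ten",
    }
    return canned.get(prompt, "unknown")

def count_things(container: str, thing: str) -> str:
    # Step 1: ask the yes/no question first.
    if ask_model(f"Does {container} contain {thing}?").strip().lower().startswith("no"):
        return f"{container} does not contain any {thing}."
    # Step 2: only ask 'how many' if the first answer was yes.
    count = ask_model(f"How many {thing} are in a {container}?")
    return f"{container} contains {count} {thing}."

print(count_things("mitochondria", "giraffes"))
# -> "mitochondria does not contain any giraffes."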

Having written all of this, today I learnt about OpenAI’s ChatGPT, which seems like a massive step forward towards solving the hallucination problem. I love how fast this space is moving these days.