Technology Adoption

Where is the GenAI disruption going to happen?

I recently heard Saad Ansari (former Director of AI at Jasper) speaking about how he sees forthcoming evolution in the GenAI space. It was really thought-provoking, so here are some notes of mine from the talk.

He sees four key use cases for GenAI:

  1. Co-piloting. GitHub users are already familiar with an AI tool called Copilot. More generally, you can think of ChatGPT or similar as a co-pilot who is there to help you do your tasks, whether that is by drafting an email or a job spec for you, or helping you learn a new topic or prepare for a meeting.
  2. Personalization
  3. Bringing everyone the “power of Pixar”
  4. Robotics (both virtual agents and physical robots)

I’m going to go into a bit more on the personalization piece.

Go back far enough and the internet was all about search [1]. You go to Google and get “about 8,400,000,000 results (0.35 seconds)”. Then you scan page 1 and possibly 2 to see if there’s anything relevant.

Then, over time, things become more personalised for the user. One high-profile example was the Netflix Prize, awarded in 2009. This was a competition with a $1m prize to use machine learning to improve Netflix’s recommendation algorithm (“if you liked show X then probably you will like shows Y and Z”). At the time this ML work was pretty groundbreaking.

Now with GenAI we are in a new world again. In this world new things can be created to the user’s taste. Saad used the words “synthesis” and “remixing” to describe this. The GenAI models have seen enormous amounts of text, images, audio etc. in their training, which they can use to synthesise new things. They are like a music producer doing a remix: from their training data they can make something that is just what the user is interested in, something that has never fully existed before but is similar to what they have been trained on.

What does this sea change in personalization mean for future disruption?

From this perspective, Saad believes, someone like Adobe or TurboTax is safe. It’s easier for them to enhance their products with GenAI than it is for a new GenAI entrant to add the core features that companies like this have.

On the other hand, someone like Amazon might not be safe. A more personalized shopping service could well disrupt them. Imagine a service like:

  1. You upload some photos of your family
  2. Based on the photos an AI figures out your interests
  3. It gives you some ideas of local activities to do nearby
  4. And gives you some links to things you might want to buy

Be honest, it sounds pretty realistic, doesn’t it?
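Step 2 is the part that sounds most like magic, but it is already within reach of off-the-shelf models. Here is a minimal sketch of inferring interests from a photo with zero-shot image classification (CLIP via the Hugging Face transformers pipeline); the interest labels and the filename are invented for illustration.

```python
# Hypothetical sketch: guess a household's interests from a photo using
# zero-shot image classification. Labels and filename are made up.
from transformers import pipeline

classifier = pipeline("zero-shot-image-classification",
                      model="openai/clip-vit-base-patch32")

candidate_interests = ["hiking", "cycling", "cooking", "board games",
                       "camping", "football", "painting"]

# Results come back sorted by score, highest first.
scores = classifier("family_photo.jpg", candidate_labels=candidate_interests)
top_interests = [s["label"] for s in scores[:3]]
print(top_interests)  # feed these into the activity and shopping suggestions
```

From there, steps 3 and 4 are largely retrieval: match the inferred interests against local activity listings and product catalogues.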

Notes

  1. Or you could go back a bit further to the dark days of domain dipping, but it’s the same principle.
Machine Learning, Technology Adoption

Productizing AI is Hard: Part 94

This is a story about how a crude internet joke got from Reddit to ChatGPT to the top of Google.

Here is what you get currently if you ask Google “Are there any African countries starting with K?”:

The featured snippet is obviously nonsense. Once you get past the featured snippet the content is sensible, but the whole point of the featured snippet, in Google’s own words, is: “We display featured snippets when our systems determine this format will help people more easily discover what they’re seeking, both from the description about the page and when they click on the link to read the page itself. They’re especially helpful for those on mobile or searching by voice.”

So you’d be forgiven for thinking that Google has quite a high bar for what it puts into a featured snippet.

With my debugging hat on, my first hypothesis is that Google is interpreting this search query as the first line in a joke rather than as a genuine question. If that’s the case then this featured snippet is a great one to show because it builds on the joke.

But. Even if that explains the logic behind showing the snippet, it doesn’t mean that this is the best snippet to show. I’d still consider this a bug if it were in one of my systems. At the very least there should be some context to say: “if this is the first line in a joke, then here is the expected response”.

How did this joke get into the featured snippet?

Here’s the page that the Google featured snippet links to. It’s a web page showing a purported chat in which ChatGPT agrees that there are no African countries starting with K. It’s from emergentmind.com, a website that includes lots of content about ChatGPT.

I don’t know whether this is a genuine example of ChatGPT producing text that looks grammatically correct but is actually nonsense, or whether it’s a spoof that was added to emergentmind.com as a joke. But there is definitely a lot of this “African countries starting with k” content on Reddit, and we know that Reddit was used to train ChatGPT. So it’s very plausible that ChatGPT picked up this “knowledge”, but, being a language model, can’t tell whether it’s reality, fake or just a joke.

Either way, the fact that this is presented as ChatGPT text on emergentmind.com helps give it enough weight to get into a featured snippet.

One obvious lesson is don’t trust featured snippets on Google. Only last month I wrote about another featured snippet that got things wrong, this time about terms of use for LinkedIn. Use DuckDuckGo if you just want a solid search engine that finds relevant pages, no more, no less.

But this example raises some interesting food for thought ….

Food for thought for people working with LLMs:

  1. If you are training your model on “the entire internet”[1] then you will get lots of garbage in there
  2. As more and more content gets created by large language models, the garbage problem will only get worse

And food for thought for people trying to build products with LLMs:

  1. Creating a demo of something that looks good using LLMs is super easy, but turning it into a user-facing product that can handle all these garbage cases remains hard. Not impossible, but still hard work.
  2. So how do you design your product to maximize the benefits from LLMs while minimizing the downside risk when your LLM gets things wrong?[2]

I’ve written in the past about the hype cycle related to NLP. That was 4 months ago in April. Back then I was uncomfortable that people were hyping LLMs out of all proportion to their capabilities. Now it seems that we are heading towards the trough of disillusionment – with people blowing the negative aspects out of all proportion. The good news is that, if it’s taken less than 6 months to get from the peak of “Large Language Models are showing signs of sentience and they’re going to take your job” to the trough of “ChatGPT keeps getting things wrong and OpenAI is about to go under”[3], then this must mean that the plateau of productivity beckons. I think it’s pretty close (months vs years).

Hat tip to https://mastodon.online/@rodhilton@mastodon.social/110894818521176741 for the context. (Warning before you click – the joke is pretty crude and it’s arguable how funny it is).

Notes

[1] For whatever definition you have for “entire” and “internet”

[2] I saw a Yann LeCun quote (can’t find it now, sadly, so perhaps it’s apocryphal) about one company using 30 LLMs to cross-check results and decrease the risk of any one of them hallucinating. I’m sure this brute force approach can work, but there will also be other smarter ways, depending on the use case
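For what it’s worth, here is a rough sketch of what that brute-force cross-checking could look like. The ask() helper is a placeholder for whichever LLM clients you actually use, not a real API.

```python
# Sketch of cross-checking several LLMs and only trusting an answer when
# enough of them agree. ask() is hypothetical; wire it to real clients.
from collections import Counter

def ask(model_name: str, prompt: str) -> str:
    """Hypothetical wrapper around a single LLM call."""
    raise NotImplementedError("plug in your LLM client here")

def cross_checked_answer(prompt: str, models: list[str], min_agreement: float = 0.6):
    answers = [ask(m, prompt).strip().lower() for m in models]
    best, count = Counter(answers).most_common(1)[0]
    if count / len(answers) >= min_agreement:
        return best        # enough models agree, so return the consensus answer
    return None            # models disagree: escalate to a human or refuse to answer
```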

[3] Whether OpenAI succeeds or fails as a company has very little to do with the long-term productivity gains from LLMs, in much the same way that Friendster’s demise didn’t spell the end of social networking platforms

Technology Adoption

Google vs ChatGPT correctness

A story about expectations from Google vs expectations from ChatGPT.

We all know that ChatGPT creates patterns of plausible text that can sometimes be complete nonsense.

But we tend to expect Google to be accurate. The top hit or two have typically been very high quality. These days the top results are attempts to answer the question rather than links to relevant pages. An unfortunate side-effect is that this means Google can suffer from the same sorts of accuracy errors that ChatGPT is famous for.

Conclusion – I need to train my kids not to take for granted what Google tells them.

Here’s a simple example. I was curious whether I should be encouraging my kids to get a LinkedIn account and have a presence somewhere where old Gen X’ers are likely to look. So I asked Google. Guess what? The answer was out of date.

The first link says “13” and links to 8 Things Teenagers (and Their Parents) Need to Know about LinkedIn.

The second link points to LinkedIn’s User Agreement. Pretty clear from the LinkedIn Ts & Cs that the answer is 16.

For those who are curious, the “Minimum Age” hyperlink links to the following text: Members who were below this new Minimum Age when they started using the Services under a previous User Agreement which had allowed certain persons under 16 to use the Services, may continue to use the Services. As of June 2017 persons under the age of 16 are not eligible to use our Services.

Technology Adoption

How close are the machines to taking over?

We overestimate the impact of technology in the short-term and underestimate the effect in the long run.

Amara’s Law (see https://fs.blog/gates-law/)

At one end of the debate we have people like Geoffrey Hinton flagging concerns about AI becoming able to control us. At the other end you’ve got people like Yann LeCun who tend to have a more optimistic outlook. Both have similar levels of credibility in the space.

I’m going to suggest where I see the disconnect.

It’s in the language we use.

To most people, AI means something out of science fiction. Literally Skynet or I, Robot or Ex Machina: Something with its own motivations that are often at odds with humanity.

For researchers, the AI space is much broader. The NPCs you play against in computer games are AIs. You can even read about the AI behind the ghosts in the classic Pac-Man game. When AI researchers think about science fiction AI they use a different term: “Artificial General Intelligence” (AGI).
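To make the point concrete, the chase behaviour of a classic arcade ghost boils down to a few lines: at every junction, take whichever legal move brings you closest (as the crow flies) to a target tile. The sketch below is a simplified illustration of that idea, not the original arcade code.

```python
# Simplified illustration of classic ghost "AI": greedily pick the move that
# minimises straight-line distance to a target tile (e.g. Pac-Man's tile).
def next_move(ghost, target, legal_moves):
    """ghost/target are (x, y) tiles; legal_moves is a list of (dx, dy) options."""
    def dist_after(move):
        nx, ny = ghost[0] + move[0], ghost[1] + move[1]
        return (nx - target[0]) ** 2 + (ny - target[1]) ** 2
    return min(legal_moves, key=dist_after)

# Ghost at (5, 5), Pac-Man at (10, 5): the rule simply heads right.
print(next_move((5, 5), (10, 5), [(1, 0), (-1, 0), (0, 1)]))  # -> (1, 0)
```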

If you read that a researcher is talking about “AI” then you should be thinking: “wow, look how far we have come since Pac-Man”. If they are talking about “AGI” then that is the beginning of the path to science-fiction AI. But still just the beginning.

I’ve made a handy graphic that shows where I think we are on this journey between Pac-Man and Ex Machina. Obviously it’s somewhat tongue in cheek, but it’s informed by Amara’s law: there is a lot of hype about any new technology, so people inevitably overestimate how much it will change things over the next year or two. But over the longer term … a different story.

Technology Adoption

What Bing Chat can teach us about technology adoption

Some thoughts prompted by this great write-up about what an outstanding success Microsoft has made of integrating OpenAI’s chatbot tech into Bing: https://www.bigtechnology.com/p/wacky-unhinged-bing-chatbot-is-still

“The fact that people are even writing about Microsoft Bing at all is a win,” one Microsoft employee told me this week. “Especially when the general tenor is not negative. Like, it’s funny that it’s arguing with you over if it’s 2022 or not.”

compared to

when Google’s Bard chatbot got a question wrong in a demo last week, it lost $100 billion in market cap within hours.

Part of this is due to Microsoft’s underdog status in search. But much of it, I think, is how they have brought the users (us) along with them on the journey. They have made us think of Microsoft + ChatGPT as part of “us” vs Google being “them”.

Consider the following disasters with Large Language Models:

The common theme linking all of these: they came out of nowhere, were launched to great fanfare, and raised expectations really high.

Bing Chat couldn’t be more different. ChatGPT was released as an experimental tool, got feedback from early users and rapidly iterated to improve the initial versions. It got us onside and loving it despite its flaws.

Then Microsoft announced their mega investment, getting us even more invested in the product and creating excitement about integrating it into Bing.

Finally, Microsoft iterated at pace to get something working into their product, building on the excitement and momentum that we, the users, were generating.

So when it was finally released, we were really excited and keen to use it (witness the app download stats) and sympathetic to its deficiencies; perhaps we even enjoyed the deficiencies.

Some obvious lessons in here about telegraphing your intentions early, bringing your users along with you and iterating at pace.

Technology Adoption

The ChatGPT Arms Race

ChatGPT makes it so easy to produce good-looking content that people are getting concerned about the scope for cheating in school.

This is a story about the arms race between those who want to use ChatGPT to create content and those who want to be able to spot ChatGPT-created content.

Putting a checker like this out there was always going to be a red rag to a bull.

Smileys broke the checker. Obviously you wouldn’t do this in real life: smileys might fool a tool checking whether content was created by a language model, but they won’t fool a human reader. But they are just here as an illustration – you could equally insert characters that a human wouldn’t see.
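As a toy illustration of “characters that a human wouldn’t see”, here is how easily zero-width characters can be sprinkled through a passage: the perturbed text renders identically for a human reader, yet it is no longer the same string the checker analyses.

```python
# Purely illustrative: insert zero-width spaces that don't change how the text
# renders but do change the underlying character sequence.
ZERO_WIDTH_SPACE = "\u200b"

def perturb(text: str, every: int = 4) -> str:
    out = []
    for i, ch in enumerate(text):
        out.append(ch)
        if i % every == 0 and ch != " ":
            out.append(ZERO_WIDTH_SPACE)
    return "".join(out)

essay = "The causes of the French Revolution were varied and complex."
print(perturb(essay) == essay)  # False, even though both look identical on screen
```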

But it looks like you don’t even need to go to these lengths … surprise … someone had the bright idea of using the language model to re-write its content to make it look more like a human wrote it:

(!)

This genie is now out of the bottle. Trying to ban ChatGPT is a fool’s errand. It might even be counterproductive:

  • There will be a proliferation of similar tools built on large language models. Perhaps not as optimized for human-sounding chat, but certainly good enough at producing content
  • Schoolkids who don’t have access to these tools will find themselves at a disadvantage in the real world compared to those who learn how to make best use of them

One really basic example: one teacher I know was so impressed with the output of ChatGPT they said they’d use it to help students learn how to structure their essays. I’m sure with a bit of imagination there’d be plenty of other ways to use large language models to help teach students better.

It’s a better use of people’s energy to find ways to use large language models than to spend the same energy trying to fight them.

Technology Adoption

Stories of Technology Adoption in the 19th Century

The Royal Society of Arts, formally the Royal Society for the Encouragement of Arts, Manufactures and Commerce, and sometimes just known as “the Society”, was founded in 1754 to encourage new technological innovations. They did this by hosting competitions to invent certain things and then giving out prizes (called ‘premiums’) to the winners.

It was one thing for the Society to encourage technological innovations. It could be quite another for those innovations to achieve mass adoption. I’m going to share stories about:

  • eliminating the need for child chimney sweeps and
  • introducing public toilets.

to show how things that seem so obvious to us now were once highly contentious and radical ideas which took a lot of work to become part of daily life.

I’m taking these examples from “Arts and Minds: How the Royal Society of Arts Changed a Nation” by Anton Howes.

CHILD CHIMNEY SWEEPS

One of the Society’s most significant campaigns to promote the invention and adoption of a technology came in 1796, when it offered a premium for a mechanical means of cleaning chimneys. The Society’s aim in this was to abolish the employment of children, sometimes as young as 4, who were forced to climb up inside chimneys in order to clean them.

The Society of Arts’ premium was won in 1805 by George Smart, a timber merchant and engineer. His tool, the ‘scandiscope’, could be operated from the fireplace, was cheap, effective on all but the bendiest of flues, and weighed ‘no more than a musket’… Yet the existence of an effective invention was not enough to abolish the use of climbing boys.

At first, … campaigners tried to cooperate with the sweeps, offering prizes for the number of flues swept using the scandiscope, subsidising their purchase of the machines, and advertising the reliable sweeps who used them. But the sweeps took advantage of this generosity, purposefully misusing the scandiscopes in an effort to turn customers against them. By 1809, the campaigners had had enough…. They encouraged brand new entrants into the sweeping trade, extolling the modest profits that might be made by using the machines. They also encouraged the owners of larger homes to buy their own machines (to be used by domestic servants), so to actively remove customers from the market.

The campaign eventually met with success. The scandiscopes were gradually brought into use, in London as well as further afield, and the lot of the climbing boys improved. Crucially, the scandiscope made laws banning the use of climbing boys possible, although this took decades of more campaigning as well as further improvements to Smart’s machine.

(pages 76-79)

In this story, the Royal Society thought that chimney sweeps would flock to using a new tool that meant they would no longer need to use little boys to go up chimneys.

The author suggests that the chimney sweeps actively tried to subvert the new invention. You don’t even need to go that far. I can easily imagine a sweep battling with his first use of a scandiscope for half an hour before giving up and just going back to the old way of doing things.

Either way, it’s interesting how in the end the campaigners stopped trying to convert existing chimney sweeps. They took a completely different approach by appealing to chimney owners or brand new market entrants.

PUBLIC TOILETS

This was a different story. In this case the resistance to change came from the powers that be, in whose view it was a ridiculous idea. So the Royal Society of Arts was forced to create a massive proof of concept to show the value of the idea. This took place at the Great Exhibition of 1851. The Henry Cole of this story is this person: https://en.wikipedia.org/wiki/Henry_Cole

In mid-nineteenth century London, the options for going to the toilet were limited. There were only a few facilities available to the public. Wealthy people might buy a small item from a shop, and ask to use the shop’s toilet… Most people, however, just relieved themselves in alleyways, doorways and on walls… To the reformers in the Society of Arts, it was obvious that there needed to be a system of public toilets. With the expected influx of people for the Great Exhibition, the campaign gained a sense of urgency.

The Society suggested a ‘general system’ of what became known, euphemistically, as ‘public conveniences’ … The suggestion was ignored, but this did not stop Henry Cole. He reasoned that the exhibition itself might help ‘reconcile the public to the use of a convenience’. Exposing the public to novelties was, after all, one of the Great Exhibition’s main purposes. They simply did not know they wanted them, Cole decided, because the concept was so unfamiliar. He arranged for the Crystal Palace to have its own toilets, for which entry was charged at a halfpenny or penny, depending on the services required. Over the course of the Exhibition, these ‘waiting rooms’ were visited by over 700,000 women and 820,000 men. The figures did not even include the use of urinals, which were free. For Cole and the reformers, this was evidence that their suggested system of public conveniences for London might be self-supporting.

(page 150)

It’s hard to imagine a time when the idea of public toilets could be considered such a radical notion. Indeed, Henry Cole’s view that the public didn’t even know they wanted these is very similar to the familiar quote associated with Henry Ford about faster horses. It took quite an epic spectacle to get people to see the benefits of this new approach.

TAKEAWAYS

On one level these are interesting case studies of bygone times.

But look a bit deeper and you’ll see some lessons for workers in ‘change’ roles that will resonate through the years:

  1. The Chimney Sweeps story shows people resisting a new technology, perhaps because they fundamentally disagreed with it, or perhaps because they didn’t see how it could compete with the status quo. In this case the Royal Society had to give up on getting them on board and work around them instead of working with them, and
  2. The Public Toilets story shows one relatively common objection to change: “if this new way is so much better, then someone would have done it already”. To convince these people the Royal Society had to come up with an industrial strength proof of concept within an environment that they could control.
Machine Learning, Software Development, Technology Adoption

Transformers for Use-Oriented Entity Extraction

The internet is full of text information. We’re drowning in it. The only way to make sense of it is to use computers to interpret the text for us.

Consider this text:

Foo Inc announced it has acquired Bar Corp. The transaction closed yesterday, reported the Boston Globe.

This is a story about a company called ‘Foo’ buying a company called ‘Bar’. (I’m just using Foo and Bar as generic tech words, these aren’t real companies).

I was curious to see how the state of the art has evolved for pulling out these key bits of information from the text since I first looked at Dandelion in 2018.

TL;DR – existing Natural Language services vary from terrible to tolerable. But recent advances in language models, specifically transformers, point towards huge leaps in this kind of language processing.

Dandelion

Demo site: https://dandelion.eu/semantic-text/entity-extraction-demo/

While it was pretty impressive in 2018, the quality for this type of sentence is pretty poor. It only identified that the Boston Globe is an entity, but Dandelion tagged this entity as a “Work” (i.e. a work of art or literature). As I allowed more flexibility in finding entities, it also found that the terms “Inc” and “Corp” usually relate to a Corporation, and it found a Toni Braxton song. Nul points.

Link to video

Explosion.ai

Demo site: https://explosion.ai/demos/displacy-ent

This organisation uses pretty standard named entity recognition. It successfully identified that there were three entities in this text. Pretty solid performance at extracting named entities, but not much help for my use case because the Boston Globe entity is not relevant to the key points of the story.

Link to video
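For reference, “pretty standard named entity recognition” is only a few lines of code these days. Here is a sketch using spaCy, Explosion’s open-source library behind this demo; the exact entities and labels you get back will vary with the model version.

```python
# Standard NER with spaCy: finds entities and their types, but says nothing
# about the role each entity plays in the story.
import spacy

# Small English pipeline; install with: python -m spacy download en_core_web_sm
nlp = spacy.load("en_core_web_sm")

doc = nlp("Foo Inc announced it has acquired Bar Corp. "
          "The transaction closed yesterday, reported the Boston Globe.")

for ent in doc.ents:
    print(ent.text, ent.label_)  # e.g. ORG-style labels, no buyer/seller roles
```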

Microsoft

Demo site: https://aidemos.microsoft.com/text-analytics

Thought I’d give Microsoft’s text analytics demo a whirl. Completely incomprehensible results. Worse than Dandelion.

Link to video

Completely WTF

Expert.ai

Demo site: https://try.expert.ai/analysis-and-classification

With Microsoft’s effort out of the way, time to look at a serious contender.

This one did a pretty good job. It identified Foo Inc and Bar Corp as businesses. It identified The Boston Globe as a different kind of entity. There was also some good inference that Foo had made an announcement and that something had acquired Bar Corp. But it didn’t go so far as to join the dots that Foo was the buyer.

In this example, labelling The Boston Globe as Mass Media is helpful. It means I can ignore it unless I specifically want to know who is reporting which story. But this helpfulness can go too far. When I changed the name “Bar Corp” to “Reuters Corp” then the entity extraction only found one business entity: Foo Inc. The other two entities were now tagged as Mass Media.

Long story short – Expert.ai is the best so far, but a user would still need to implement a fair bit of post-processing to be able to extract the key elements from this text.

Link to video.

Expert.ai is identifying entities based on the nature of that entity, not based on the role that they are playing in the text. The relations are handled separately. I was looking for something that combined the relevant information from both the entities and their relations. I’ll call it ‘use-oriented entity extraction’ following Wittgenstein’s quote that, if you want to understand language: “Don’t look for the meaning, look for the use”. In other words, the meaning of a word in some text can differ depending on how the word is used. In one sentence, Reuters might be the media company reporting a story. In another sentence, Reuters might be the business at the centre of the story.

Enter Transformers

I wondered how Transformers would do with the challenge of identifying the different entities depending on how the words are used in the text. So I trained a custom RoBERTa using a relatively small base set of text and some judicious pre-processing. I was blown away by the results. When I first saw all the 9s appearing in the F1 score my initial reaction was “this has to be a bug, no way is this really this accurate”. Turns out it wasn’t a bug.
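I’m not sharing the real training code here, but the shape of it is standard token classification, with role labels in place of entity types. Below is a minimal, hypothetical sketch along those lines using Hugging Face transformers; the label scheme, the single training example and the hyper-parameters are all invented for illustration.

```python
# Hypothetical sketch of role-based ("use-oriented") token classification with
# a RoBERTa backbone. Labels, data and settings are illustrative only.
import torch
from transformers import (AutoModelForTokenClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

labels = ["O", "BUYER", "TARGET", "REPORTER"]          # invented role labels
label2id = {l: i for i, l in enumerate(labels)}

tokenizer = AutoTokenizer.from_pretrained("roberta-base", add_prefix_space=True)
model = AutoModelForTokenClassification.from_pretrained(
    "roberta-base", num_labels=len(labels),
    id2label={i: l for l, i in label2id.items()}, label2id=label2id)

# One toy example, pre-split into words, with one role label per word.
words = ["Foo", "Inc", "announced", "it", "has", "acquired", "Bar", "Corp", "."]
word_labels = ["BUYER", "BUYER", "O", "O", "O", "O", "TARGET", "TARGET", "O"]

enc = tokenizer(words, is_split_into_words=True, truncation=True)
# Align word-level labels to sub-word tokens; special tokens get -100 (ignored).
enc["labels"] = [-100 if wid is None else label2id[word_labels[wid]]
                 for wid in enc.word_ids()]

class ToyDataset(torch.utils.data.Dataset):
    def __init__(self, encoding): self.encoding = encoding
    def __len__(self): return 1
    def __getitem__(self, idx):
        return {k: torch.tensor(v) for k, v in self.encoding.items()}

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="napoli-sketch", num_train_epochs=1,
                           per_device_train_batch_size=1, report_to="none"),
    train_dataset=ToyDataset(enc),
)
trainer.train()
```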

I’ve called the prototype “Napoli” because I like coastal locations and Napoli includes the consonants N, L and P. This is a super-simple proof of concept and would have a long way to go to become production-ready, but even these early results were pretty amazing:

  1. It could tell me that Foo Inc is the spending party that bought Bar Corp
  2. If I changed ‘Bar’ to ‘Reuters’ it could tell me that Foo Inc bought Reuters Corp
  3. If I changed the word “acquired” to “sold” it would tell me that Foo Inc is the receiving party that sold Reuters Corp (or Bar Corp etc).
  4. It didn’t get confused by the irrelevant fact that Boston Globe was doing the reporting.

Link to video

Technology Adoption

AIR spam

So I was installing a fresh version of Acrobat Reader 9 on an XP machine a few days ago and guess what … it comes bundled with the AIR runtime by the looks of it (and there is no option to select/deselect it when you install).

“Interesting” tactic by Adobe to rapidly grow the installed base who can run AIR applications.

Installer options:

Acrobat Installer

My Add/Remove programs after installation:

AIR installs with Acrobat Reader

Technology Adoption

Information Democracy

Nick Carr wrote a piece a while ago about the pros and cons of the easy distribution of information. Specifically, the double-edged sword of GPS: http://www.roughtype.com/archives/2008/01/looking_at_a_se.php

As GPS transceivers become common accessories in cars, the benefits have been manifold. Millions of us have been relieved of the nuisance of getting lost or, even worse, the shame of having to ask a passerby for directions.

But, as with all popular technologies, those dashboard maps are having some unintended consequences. In many cases, the shortest route between two points turns out to run through once-quiet neighborhoods and formerly out-of-the-way hamlets.

Scores of villages have been overrun by cars and lorries whose drivers robotically follow the instructions dispensed by their satellite navigation systems.

That’s the problem with the so-called transparency that’s resulting from instantly available digital information. When we all know what everyone else knows, it becomes ever harder to escape the pack.

There is, of course, much to be said for the easy access to information that the internet is allowing. Information that was once reserved for the rich, the well-connected, and the powerful is becoming accessible to all. That helps level the playing field, spreading economic and social opportunities more widely and fairly.

At the same time, though, transparency is erasing the advantages that once went to the intrepid, the dogged, and the resourceful … The commuter who pored over printed maps to find a short cut to work finds herself stuck in a jam with the GPS-enabled multitudes.

You have to wonder whether, as what was once opaque is made transparent, the bolder among us will lose the incentive to strike out for undiscovered territory. What’s the point when every secret becomes, in a real-time instant, common knowledge?

I don’t buy the argument.

It’s safe to assume that technological advances (like GPS) will lead to externalities (unintended side effects, both positive and negative). I live in a village which suffers every now and again from GPS-enabled truck drivers getting stuck in narrow lanes – so I can see where he’s coming from with this particular criticism. But it seems to me that Nick is arguing about something more than just side effects. He seems to be arguing that there is something in the very nature of the technological advances – “what was once opaque is made transparent” – that devalues us. That makes us less willing to strike out for something new.

Hogwash. What is the point of striking out for undiscovered territory? People will always do that, for a number of reasons: just for the hell of it, in the hope of commercial gain, or for personal aggrandisement. What they are striking out for may be different to what their grandparents considered undiscovered, but they will continue nonetheless.

I like analogies. When I think of Nick Carr railing against GPS making it “too easy” to get from A to B, I try to imagine similar scenarios. So how about going back 1,000 years to the introduction of the abacus into Western Europe, which made it far easier to do your maths than it would have been to multiply CLXI by XIV in Roman numerals. In the same way that GPS democratises navigational data, the abacus democratised the ability to do mathematical calculations. That’s a good thing, right?
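If you want to feel that difference for yourself, here is a tiny sketch that converts those Roman numerals to integers, at which point the multiplication becomes trivial; doing the same sum directly in Roman numerals offers no convenient column-by-column procedure.

```python
# CLXI x XIV is awkward in Roman numerals but trivial once converted to integers.
def roman_to_int(s: str) -> int:
    values = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}
    total = 0
    for ch, nxt in zip(s, s[1:] + " "):
        v = values[ch]
        total += -v if values.get(nxt, 0) > v else v  # subtractive notation (e.g. IV)
    return total

a, b = roman_to_int("CLXI"), roman_to_int("XIV")
print(a, b, a * b)  # 161 14 2254 (i.e. MMCCLIV)
```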