Q&A: Peter Fairfax, Data Science Manager
We spoke to Brandwatch Data Science Manager Peter Fairfax to understand more about ChatGPT, the large language models (LLMs) behind these new developments, and what Brandwatch and our industry can expect in the future.
Hi Peter, these developments being built using ChatGPT are definitely exciting! Can you tell us more about ChatGPT and how it works?
“ChatGPT is a language model, and there are many others out there. In simple terms, a language model analyses the likelihood of words being written, based on existing texts the model has been shown. For example, imagine we have the sentence: Will Smith likes to eat [BLANK].
A language model can compare the likelihood of [BLANK] being lots of words, such as spaghetti or cats. Will Smith isn’t a cat-eating psychopath, and there has been a recent fascination online with crude, AI-generated videos of him eating spaghetti. A model trained on that data will probably think spaghetti has a greater likelihood than cats.
Predictive text on your phone is powered by a simple language model. Bigger models can be more complicated, but the important thing to remember is that language models are powered by word probabilities.”
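To make the idea of word probabilities concrete, here is a minimal sketch of the counting approach Peter describes. It is an illustration only, not how ChatGPT works internally: the tiny corpus and the function name `next_word_probabilities` are hypothetical, and real LLMs use neural networks trained on vastly more text rather than raw counts.

```python
from collections import Counter

# Hypothetical toy corpus standing in for "existing texts the model has been shown".
corpus = [
    "will smith likes to eat spaghetti",
    "will smith likes to eat spaghetti",
    "will smith likes to eat pizza",
    "my cat likes to eat fish",
]

def next_word_probabilities(corpus, context):
    """Estimate P(next word | context) by counting which words follow the context."""
    context_tokens = context.lower().split()
    n = len(context_tokens)
    counts = Counter()
    for sentence in corpus:
        tokens = sentence.lower().split()
        # Slide over the sentence and record the word after each match of the context.
        for i in range(len(tokens) - n):
            if tokens[i:i + n] == context_tokens:
                counts[tokens[i + n]] += 1
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()} if total else {}

probs = next_word_probabilities(corpus, "Will Smith likes to eat")
print(probs.get("spaghetti", 0.0))  # seen after this context in the toy corpus
print(probs.get("cats", 0.0))       # never seen after this context
```

In this toy corpus, "spaghetti" follows the context more often than any other word and "cats" never does, so the model ranks spaghetti as the likelier fill for [BLANK].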
Why is ChatGPT a big deal if it’s just another language model?
“A few reasons. Firstly, it’s a particularly large and sophisticated one. As LLMs become bigger and are fed more data, they tend to be able to solve more complex tasks. For example, ChatGPT and GPT-4 can write code and answer complex questions. Bigger models with more training can do more abstract and complicated tasks, and help us unlock more features like the ones we are launching soon. For me, the capacity of the most recent LLMs shows that we’re on the cusp of AI overtaking humans on certain language-based tasks, like the language equivalent of Deep Blue beating the reigning world chess champion in 1997.
Unfortunately for machines, this means from now on they’ll probably spend more time answering questions about lost parcels than having fun playing board games, but I’m fine with that.”
There’s a lot of hype, but what are the limitations?
“ChatGPT is stunningly good at a lot of things, but even the best tech has weaknesses. In ChatGPT’s case, it doesn’t have up-to-date knowledge and sometimes struggles with numeracy.
As well as that, the internet can be a dark and strange place at times, with people sharing all sorts of views. This material can end up in the training data of LLMs, so they sometimes parrot views we find unsavoury or wrong.”
How can Brandwatch tech work alongside more sophisticated LLMs going forward?
“As I said, these models aren’t perfect but they do complement our tech.
ChatGPT doesn’t have any idea what’s happening right now: its training data cutoff is September 2021. Brandwatch Consumer Research takes in up to 50,000 new documents per second and enriches each with sentiment, location, GPT-powered entities, and all the other metadata within a few minutes.
ChatGPT can struggle with arithmetic, confidently giving plausible but wrong answers. By contrast, Brandwatch is built to analyze data at scale and create insights using AI, statistics, and complex aggregations. Our new features mesh the best of ChatGPT and Brandwatch, so you’ve got reliable live quantitative insights that are even easier for a user to understand.
One example is AI-powered conversation insights. This lets you click on any data point in our dashboard, such as peaks or segments within your data. ChatGPT’s ability to summarize large amounts of text then gives you a succinct, natural language overview of what is causing the trend.
More generally, I expect we’ll see a move towards more chatbot-like functionality for non-expert users, allowing people to interact with all our data without needing to know all the tricks for drawing insights from the noise. Behind the scenes, this will require explaining more information in text form, and I really hope one side effect of this will be to improve accessibility for blind and low-vision people.”
How will LLMs affect the future of our industry?
“It’s impossible to really anticipate, but off the top of my head I’d say: more modes of interaction, seamlessly combining numerical AI with language models, and promoting human creativity.
We should be able to make social listening available to more people, and build on technologies like AI-powered Search. It’s great that you don’t need to be an expert in Boolean to write queries anymore, but new LLMs will also make synthesizing insights and telling stories quicker and easier.
There will always be a key role for statistics, and the world of AI beyond language models. These will continue to be the backbone powering real-time analysis and monitoring as well as long-term predictions, and they will mesh more closely with language models.
ChatGPT and other LLMs are able to produce human-sounding content extremely well, and will continue to improve. The companies with the brightest long-term future are those who listen carefully to their users and think creatively about the core problems which can be solved by AI, plus what parts of the job users want to spend less time doing.
There will be risks along the way, and we shouldn’t get so caught up in the hype that we forget our ethical and scientific standards. As data scientists, we have ethical obligations to accuracy and robustness. But there will be huge serendipity too, and I’m genuinely excited to see what cool stuff we’re now able to build, and who we can help along the way.”