The difference between image analysis and image recognition
We’re excited by the opportunities image analysis presents, but have also learnt a lot about the current challenges in this space.
First off, it’s important to understand the difference between ‘image recognition’ and ‘image analysis’. In short:
- Image recognition finds images shared online or in a platform’s archive that contain certain things – usually brand logos.
- Image analysis identifies what’s within an image you already have. For example, show it the image below and it identifies (and tags) what it contains – in this case, two women, mugs, an indoor setting and so on.
In theory, image recognition promises something amazing: the ability to never miss an image of your brand or logo, even when the accompanying text doesn’t mention you at all.
There have been some interesting early examples of how this can be used – such as Miller Lite’s digital agency, DigitasLBi, finding new audiences to target through images relevant to the brand, or brands using user-generated images as campaign assets.
But it’s image analysis where I think the most exciting opportunities lie.
Image analysis represents the possibility of organizing the web’s images into an archive that is fully searchable and analyzable. Looking for an image of your beverage being enjoyed on a beach for your next campaign? No problem – here are all the images of exactly that, posted by your fans, so you can contact them about using them.
Want to know where your new snack is being consumed so you can target your next ad more appropriately? No problem – here’s a breakdown of the types of places where images of your product are taken.
Image analysis can identify all manner of things in images, from the types of people and surroundings, to objects, emotions and even the time of day.
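To make that concrete, here’s a minimal sketch of what automated tagging looks like in code, using Google’s Cloud Vision API (the same service mentioned later in this post). The client calls are real, but the file name is a placeholder, and you’d need the google-cloud-vision library installed and your own credentials configured.

```python
# A minimal tagging sketch using Google's Cloud Vision API.
# Assumes the google-cloud-vision library is installed and application
# credentials are configured; "photo.jpg" is a placeholder path.
from google.cloud import vision

def tag_image(path):
    """Return (label, confidence) pairs for a local image file."""
    client = vision.ImageAnnotatorClient()
    with open(path, "rb") as f:
        image = vision.Image(content=f.read())
    response = client.label_detection(image=image)
    # Each annotation has a description ("Coffee cup") and a 0-1 score.
    return [(label.description, label.score)
            for label in response.label_annotations]

if __name__ == "__main__":
    for description, score in tag_image("photo.jpg"):
        print(f"{description}: {score:.2f}")
```

A response for an image like the one above might include labels such as ‘woman’, ‘mug’ and ‘indoor’, each with a confidence score – exactly the raw material the use cases described here are built on.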
This is not only powerful for uncovering user-generated content that can then be used in campaigns and amplified, but also for better understanding your audience and how they view, consume, and promote your products.
Understanding the challenges
But it’s not without its challenges, and the technology is still evolving – take the recent example of Flickr’s auto-tagging system causing offence by tagging images of concentration camps as ‘jungle gyms’.
Clearly not the desired outcome, but this case raises an important consideration with image analysis: context.
Whilst recently working with our product team on how we might tag and group particular images, we saw just how difficult it can be for a computer to fully understand context to the same degree that humans can – but also what a huge opportunity it will be when that problem is cracked.
For example, what do you see in the image below?
You or I would look at this and probably surmise that this is a man and a woman, possibly a couple, having a serious or emotional conversation. We’d probably assume it’s night-time, and that they’re in some kind of bar.
Think about all of the things we’ve taken into account – through years of life experience – to come to that conclusion. The subtleties of the red lighting, the type of seat, the lights in the corner, the body language.
Image analysis technology, on the other hand, can likely work out that it’s dark, so probably night-time, and can assume they’re indoors because it detects no sky, buildings or trees.
It can see there’s a man and a woman. It might even be able to work out that they’re having a conversation from their positions. If it’s really clever, it might be able to understand their emotions (for example, Google’s Cloud Vision can detect basic emotions).
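As a rough illustration of what that basic emotion detection looks like in practice, here’s a hedged sketch using Cloud Vision’s face detection, which reports each emotion as a likelihood rather than a hard yes or no. The file name is a placeholder, and the same setup assumptions as the earlier sketch apply.

```python
# A sketch of basic emotion detection via Cloud Vision face detection.
# Assumes the google-cloud-vision library and credentials are set up;
# "bar_photo.jpg" is a placeholder path.
from google.cloud import vision

client = vision.ImageAnnotatorClient()
with open("bar_photo.jpg", "rb") as f:
    image = vision.Image(content=f.read())

response = client.face_detection(image=image)
for face in response.face_annotations:
    # Emotions come back as likelihoods (VERY_UNLIKELY ... VERY_LIKELY),
    # one set per detected face, not as a single mood for the whole scene.
    print("joy:     ", vision.Likelihood(face.joy_likelihood).name)
    print("sorrow:  ", vision.Likelihood(face.sorrow_likelihood).name)
    print("anger:   ", vision.Likelihood(face.anger_likelihood).name)
    print("surprise:", vision.Likelihood(face.surprise_likelihood).name)
```

Note what’s missing: the API scores each face in isolation, so even perfect per-face emotion results wouldn’t tell you the overall mood of the scene – which is exactly the context gap described next.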
But what it is unlikely to understand – at this stage, anyway – is the mood and context of the image. The overall tone. The moodiness, the fact that it’s a bar, the emotions on their faces combined with the scenario. All those things humans understand instinctively.
Another example is celebrities. It’s not currently possible for a computer to know every single face of every A-to-Z-list celebrity in the world, so whilst it may well know this is an image of a man with his (very cute) dog, today’s technology probably won’t pick up on the fact that it’s actually world-famous footballer Lionel Messi – something you might want to know if you’re a brand and he’s holding your product rather than a dog.
The tech is developing quickly, but it’s not quite there yet.
But imagine a future where it can. Where you can look for images with a particular emotion and answer questions like ‘are people happy or sad when using my product?’, or ‘are any celebrities using my products in their photos?’
This issue of context is such an interesting challenge and it’s going to be incredibly exciting to see how far computers can go as this area develops.
We’re certainly fired up about seeing how far we can push this technology and are already working on some exciting things with partners and our own team.
We’re only at the beginning of this journey but hope to be able to tell you more about it soon. In the meantime, have you got any great ideas about using image analysis? How would it help your brand? Let us know in the comments or send us a tweet to @brandwatch.