Earlier this week, my colleague Ryan Thornburg retweeted this news from Google Research:
Computers are starting to deliver reasonable captions for images: http://t.co/BatxOmHjAq
— Matt Cutts (@mattcutts) November 18, 2014
The post from Google describes the process of object detection, classification, and labeling. The researchers include examples of effective computer-generated captions as well as others that fall short.
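To make that pipeline concrete, here is a minimal, purely illustrative sketch of the detect–classify–label idea. The function names and the rule-based "captioner" are my own stand-ins, not Google's actual system, which pairs a vision model with a language-generating model rather than assembling captions from templates.

```python
def detect_objects(image):
    """Stand-in detector: returns the regions found in the image.

    A real system would run a vision model here; for this sketch,
    the "image" is already a list of annotated regions.
    """
    return image

def classify(region):
    """Stand-in classifier: maps a region to an object label."""
    return region["label"]

def caption(image):
    """Assemble a bare-bones caption from the detected object labels."""
    labels = [classify(region) for region in detect_objects(image)]
    if not labels:
        return "A photo."
    return "A photo of " + " and ".join(labels) + "."

# A toy "image" pre-annotated with regions:
photo = [{"label": "a dog"}, {"label": "a frisbee"}]
print(caption(photo))  # → A photo of a dog and a frisbee.
```

Even this toy version shows the ceiling of the approach: the output describes only what the detector sees, which is exactly the limitation discussed below.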
As an editor who has written many captions (and called them cutlines back in the day), I read the post with great interest. Could this lead to computers replacing editors?
Probably not. Even the best of these computer-generated captions state only the obvious.
They don’t provide background and context. They don’t connect the image to a larger story. They don’t tell us what we cannot see. Effective captions, written by people, do all of those things in addition to describing the photograph.
Still, I appreciate the value of robo-captions on another level, if not for journalism. The Google scientists put it this way:
This kind of system could eventually help visually impaired people understand pictures, provide alternate text for images in parts of the world where mobile connections are slow, and make it easier for everyone to search on Google for images.
I’ll be curious to see how computer-generated captions evolve. For now, though, I view them as I view robo-articles: sometimes functional, but in need of human editors.