Today, Google open-sourced the latest version of its image captioning system, making it available as an open source model in TensorFlow. This release contains significant improvements to the computer vision component of the captioning system, is much faster to train, and produces more detailed and accurate descriptions than the original system. These improvements are outlined and analyzed in the paper Show and Tell: Lessons learned from the 2015 MSCOCO Image Captioning Challenge, published in IEEE Transactions on Pattern Analysis and Machine Intelligence.
The Google Brain team started working in 2014 on a system that could analyze an image and write a caption for it. The first version relied on the Inception V1 image classification model, which achieved 89.6% top-5 accuracy on the ImageNet 2012 image classification task. It was later upgraded to Inception V2, which raised that figure to 91.8%.
The current release uses the Inception V3 image classification model, which reaches 93.9% top-5 accuracy on the ImageNet 2012 image classification task. With this vision component, the system can detect multiple objects in an image along with their characteristics, and write more relevant captions.
The image captioning system is now available as open source and runs on TensorFlow. Compared to the original system, this release trains much faster and produces more detailed and accurate descriptions.