Natural Language Processing and Computer Vision used to be two completely different fields. Well, at least back when I started to learn machine learning and deep learning, I felt like there were multiple paths to follow, and each of them, including NLP and Computer Vision, led me to a completely different world. Over time, AI has become more and more advanced, and intersections between fields of study have become more common, including the two I just mentioned.
Today, many language models are capable of generating images based on a given prompt. That’s one example of the bridge between NLP and Computer Vision. But I guess I’ll save it for my upcoming article since it is a bit more complex. Instead, in this article I’m going to discuss the simpler one: image captioning. As the name suggests, this is essentially a technique where a specific model accepts an image and returns a text that describes the input image.
One of the earliest papers on this topic is the one titled “Show and Tell: A Neural Image Caption Generator” written by Vinyals et al. back in 2015 [1]. In this article, I’ll focus on implementing the deep learning model proposed in the…