"If you go to Bing and you search for a bird, you get a bird picture. But here, the pictures are created by the computer, pixel by pixel, from scratch. These birds may not exist in the real world -- they are just an aspect of our computer's imagination of birds," Xiaodong He from Microsoft's research lab said in a blog post late on Thursday.
According to results on an industry-standard test, reported in a research paper posted on arXiv.org, the bot produced a nearly three-fold boost in image quality compared to the previous state-of-the-art technique for text-to-image generation.
The core of this bot is a technology known as a "Generative Adversarial Network" or GAN.
The network consists of two machine learning models -- one, the generator, that creates images from text descriptions and another, known as a discriminator, that uses text descriptions to judge the authenticity of generated images.
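The interplay between the two models can be illustrated with a toy sketch. This is not Microsoft's model: the sizes, weights, and function names (`generator`, `discriminator`, `adversarial_losses`) are hypothetical, each model is reduced to a single random linear layer, and no training is performed. It only shows how a text description feeds both models and how the two standard GAN losses pull in opposite directions.

```python
import math
import random

random.seed(0)

# Toy sizes, chosen purely for illustration (assumptions, not from the paper).
TEXT_DIM, NOISE_DIM, IMAGE_DIM = 4, 3, 6


def random_matrix(rows, cols):
    """Return a rows x cols matrix of small random weights."""
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Multiply a matrix by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


# Untrained weights for each model; a real system would learn these.
G_WEIGHTS = random_matrix(IMAGE_DIM, TEXT_DIM + NOISE_DIM)
D_WEIGHTS = random_matrix(1, IMAGE_DIM + TEXT_DIM)


def generator(text_embedding, noise):
    """Map a text embedding plus random noise to a fake 'image' in [-1, 1]."""
    return [math.tanh(v) for v in matvec(G_WEIGHTS, text_embedding + noise)]


def discriminator(image, text_embedding):
    """Score (0 to 1) how likely the image is real and matches the text."""
    logit = matvec(D_WEIGHTS, image + text_embedding)[0]
    return 1.0 / (1.0 + math.exp(-logit))


def adversarial_losses(real_image, text_embedding):
    """Compute the standard GAN losses for one (image, text) pair."""
    noise = [random.gauss(0, 1) for _ in range(NOISE_DIM)]
    fake_image = generator(text_embedding, noise)
    p_real = discriminator(real_image, text_embedding)
    p_fake = discriminator(fake_image, text_embedding)
    # The discriminator wants p_real high and p_fake low.
    d_loss = -math.log(p_real) - math.log(1.0 - p_fake)
    # The generator wants the discriminator to rate its fake as real.
    g_loss = -math.log(p_fake)
    return d_loss, g_loss


if __name__ == "__main__":
    text = [0.2, -0.1, 0.7, 0.4]                 # stand-in for an encoded caption
    real = [0.5, -0.3, 0.1, 0.9, -0.6, 0.0]      # stand-in for a real photo
    print(adversarial_losses(real, text))
```

In training, the two losses would be minimized in alternation, so that the generator improves only by producing images the discriminator can no longer tell apart from real ones.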
The researchers said that text-to-image generation technology could find practical applications as a sort of sketch assistant for painters and interior designers, or as a tool for voice-activated photo refinement.
For now, the technology is imperfect.
"For AI and humans to live in the same world, they have to have a way to interact with each other. The language and vision are the two most important modalities for humans and machines to interact with each other," the blog post explained.