MIT researchers have trained a deep-learning algorithm to generate short videos from a single still frame.
Computer scientists from the Computer Science and Artificial Intelligence Laboratory (CSAIL) built the system by feeding their algorithm two million unlabelled videos, roughly two years of footage if played one after the other.
The footage showed snippets of everyday life, which helped the computer build a corpus of knowledge about real-life situations.
The computer was then asked to generate videos starting from random frames, based on what it had learnt from analysing the dataset. Each of the computer’s creations was assessed by a second machine-learning programme, which judged whether a video was more likely to be genuine or machine-generated, in a process called “adversarial learning.”
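The core of adversarial learning is a pair of opposing objectives: the judging network is rewarded for scoring genuine videos high and generated ones low, while the generating network is rewarded for fooling it. A minimal sketch of these two objectives (the standard minimax losses, not the researchers' actual code; function names are illustrative):

```python
import numpy as np

def discriminator_loss(d_real, d_fake):
    # The judge wants scores near 1 for genuine videos (d_real)
    # and near 0 for machine-generated ones (d_fake).
    return -np.mean(np.log(d_real) + np.log(1.0 - d_fake))

def generator_loss(d_fake):
    # The generator wants its videos to be scored as genuine,
    # i.e. to push d_fake towards 1.
    return -np.mean(np.log(d_fake))
```

A confident, correct judge (scoring real clips 0.9 and fakes 0.1) incurs a lower loss than a confused one, and the generator's loss falls as its fakes start to pass for genuine footage; training alternates updates to the two networks until neither can easily improve.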
According to Motherboard, the system initially tried to fool its fellow algorithm by simply tweaking the video’s background. The issue was solved by adding a constraint that forced the computer to keep the background static while animating objects in the foreground.
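One way to enforce such a constraint is to have the model produce three things: an animated foreground stream, a single static background image, and a per-pixel mask saying where the foreground should show through. The final video blends the two, so any motion must come from the foreground. A hedged sketch of that composition step, assuming greyscale arrays for simplicity:

```python
import numpy as np

def compose_video(foreground, mask, background):
    """Blend a moving foreground with a single static background.

    foreground: (T, H, W) animated stream, one frame per timestep
    mask:       (T, H, W) values in [0, 1]; 1 = show foreground
    background: (H, W)    one static image, reused for every frame
    """
    T = foreground.shape[0]
    # Repeat the single background image across all T timesteps.
    static = np.broadcast_to(background, (T,) + background.shape)
    # Per-pixel blend: masked regions animate, the rest stays fixed.
    return mask * foreground + (1.0 - mask) * static
```

Because the background enters as one image broadcast over time, pixels outside the mask are identical in every frame by construction; the network can no longer "cheat" by wiggling the background.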
With that fixed, the computer’s creations became more convincing. The one-second videos generated by the algorithm are fairly plausible guesses at how ordinary scenes captured in hospitals, on beaches or at stations might play out. One main shortcoming is that, in some cases, the objects in the footage move like shapeless globs of matter, with eerie results when babies’ faces are involved.
Still, the researchers find it “promising that our model can generate plausible motion.”
A paper on the study will be presented next week at the Conference on Neural Information Processing Systems in Barcelona.