A team of AI researchers at Facebook published a paper earlier this week describing a technique for extracting playable characters from real-world videos.
The method, rather self-explanatorily named 'Vid2Game', is powered by two AI networks. The first, Pose2Pose, drives a given pose forward based on an input stream of control signals, such as those from a joystick or gamepad. The second, Pose2Frame, then renders a high-resolution output frame by compositing the character onto a given background image (which, apparently, can also be dynamic).
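To make the two-stage pipeline concrete, here is a minimal sketch of the control loop it implies. This is purely illustrative: the real Pose2Pose and Pose2Frame are deep networks trained on video, whereas here each stage is stubbed with a toy function (the pose dimension, the linear update, and the keypoint-painting renderer are all assumptions made for the sketch, not details from the paper).

```python
import numpy as np

rng = np.random.default_rng(0)

POSE_DIM = 34            # assumption: 17 2-D keypoints flattened
FRAME_SHAPE = (64, 64, 3)

# Toy stand-in for "Pose2Pose": advance the pose one step given the
# current pose and a 2-D joystick-like control vector (autoregressive).
W_pose = rng.normal(scale=0.01, size=(POSE_DIM + 2, POSE_DIM))

def pose2pose(pose, control):
    x = np.concatenate([pose, control])
    return pose + x @ W_pose  # residual update keeps poses coherent

# Toy stand-in for "Pose2Frame": place the posed character on a
# (possibly dynamic) background. Here we just paint each keypoint red;
# the real network synthesizes the character and blends it with a mask.
def pose2frame(pose, background):
    frame = background.copy()
    h, w, _ = frame.shape
    for x, y in pose.reshape(-1, 2):
        i = int(np.clip((y + 1) / 2 * (h - 1), 0, h - 1))
        j = int(np.clip((x + 1) / 2 * (w - 1), 0, w - 1))
        frame[i, j] = [1.0, 0.0, 0.0]
    return frame

# Drive the character for a few frames with a constant "move right" input.
pose = rng.normal(scale=0.1, size=POSE_DIM)
background = np.zeros(FRAME_SHAPE)
video = []
for _ in range(8):
    pose = pose2pose(pose, control=np.array([1.0, 0.0]))
    video.append(pose2frame(pose, background))

print(len(video), video[0].shape)  # 8 frames of the rendered sequence
```

The key structural point the sketch captures is that only the pose network sees the controller input, while the frame network only ever sees a pose and a background, which is what makes the background freely replaceable.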
As the paper's conclusion notes, this AI-based technique could lead to new types of games that are 'realistic and personalized':
"In this work, we develop a novel method for extracting a character from an uncontrolled video sequence and then reanimating it, on any background, according to a 2D control signal. Our method is able to create long sequences of coarsely-controlled poses in an autoregressive manner.

These poses are then converted into a video sequence by a second network, in a way that enables the careful handling and replacement of the background, which is crucial for many applications. Our work paves the way for new types of realistic and personalized games, which can be casually created from everyday videos. In addition, controllable characters extracted from YouTube-like videos can find their place in the virtual worlds and augmented realities."
The full paper is available on arXiv, the open-access repository maintained by Cornell University, a private research university based in Ithaca, New York. Credit for the work goes to Oran Gafni, Lior Wolf, and Yaniv Taigman; below you can take a look at how 'Vid2Game' works in practice.