Attention-guided Algorithms to Retarget and Augment Animations, Stills, and Videos
Artists use still pictures, animations, and videos to tell stories visually. Computer graphics algorithms create visual stories too, either automatically or by assisting artists. Why is it so hard to create algorithms that perform like a trained visual artist? The reason is that artists think about where a viewer will look and how the viewer’s attention will flow across the scene, whereas algorithms do not have a similarly sophisticated understanding of the viewer.
Our key insight is that computer graphics algorithms should be designed to take into account how viewer attention is allocated. We first show that designing optimization terms based on viewers’ attentional priorities allows an algorithm to handle artistic license in the input data, such as geometric inconsistencies in hand-drawn shapes. We then show that measurements of viewer attention enable algorithms to infer high-level information about a scene, for example, the object of storytelling interest in every frame of a video.
All the presented algorithms retarget or augment the traditional form of a visual art. Traditional art includes artwork such as printed comics, i.e., pictures that were created before computers became mainstream. It also refers to artwork that can be created in the way it was done before computers, for example, hand-drawn animation and live action films. Connecting traditional art with computational algorithms allows us to leverage the unique strengths on either side. We demonstrate these ideas on three applications:
Retargeting and augmenting animations: Two widely practiced forms of animation are two-dimensional (2D) hand-drawn animation and three-dimensional (3D) computer animation. To apply the techniques of the 3D medium to 2D animation, researchers have attempted to compute 3D reconstructions of the shape and motion of the hand-drawn character, which are meant to act as the character’s ‘proxy’ in the 3D environment. We argue that a perfect reconstruction is excessive because it does not leverage the characteristics of viewer attention. We present algorithms that generate a 3D proxy at different levels of detail, such that at each level the error terms account for the quantities that will attract viewer attention. These algorithms allow a hand-drawn animation to be retargeted to a 3D skeleton and to be augmented with physically simulated secondary effects.
Augmenting stills: Moves-on-stills is a technique to engage the viewer while presenting still pictures on television or in movies. This effect is widely used to augment comics to create ‘motion comics’. Although state-of-the-art software, such as iMovie, allows a user to specify the parameters of the camera move, it does not solve the problem of how to choose those parameters. We believe that a good camera move respects the visual route designed by the artist who crafted the still picture; if we record the gaze of viewers looking at composed still pictures, we can reconstruct the artist’s intention. We show, through a perceptual study, that the artist succeeds in directing viewer attention in comic book pictures, and we present an algorithm to predict the parameters of camera moves-on-stills from statistics derived from eyetracking data.
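As a rough illustration of how camera-move parameters might be derived from gaze statistics, the sketch below pools timestamped fixations from several viewers and places the start and end of a pan at the mean locations of the early and late fixations. This is a simplified heuristic under stated assumptions, not the algorithm presented in this work: the function name `predict_pan`, the `(t, x, y)` fixation format, and the early/late split at the median time are all hypothetical choices made for illustration.

```python
from statistics import mean


def predict_pan(fixations):
    """Estimate start and end points of a camera pan over a still picture.

    fixations: list of (t, x, y) tuples pooled across viewers, where t is
    the time since the picture appeared and (x, y) is the gaze location
    in pixels. (Hypothetical format, assumed for this sketch.)

    Returns ((start_x, start_y), (end_x, end_y)).
    """
    if len(fixations) < 2:
        raise ValueError("need at least two fixations")

    # Order fixations by time and split them into an early and a late half.
    ordered = sorted(fixations, key=lambda f: f[0])
    mid = len(ordered) // 2
    early, late = ordered[:mid], ordered[mid:]

    # The pan starts where viewers looked first and ends where they
    # looked last, so it follows the average direction of gaze flow.
    start = (mean([x for _, x, _ in early]), mean([y for _, _, y in early]))
    end = (mean([x for _, x, _ in late]), mean([y for _, _, y in late]))
    return start, end


if __name__ == "__main__":
    sample = [(0.1, 100, 200), (0.2, 120, 220), (0.8, 400, 240), (0.9, 420, 260)]
    print(predict_pan(sample))
```

A fuller treatment would also need to choose the framing size (zoom) and the duration of the move; those are omitted here.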
Retargeting video: Video retargeting is the process of altering the original video to fit a new display size, while best preserving content and minimizing artifacts. Recent techniques define content as color, edges, faces and other image-based saliency features. We suggest that content is, in fact, what people look at. We introduce a novel operator that extends the classic “pan-and-scan” by adding cuts, in addition to automatic pans, based on viewer eyetracking data. We also present a gaze-based evaluation criterion to quantify the performance of our operator.
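The idea of a gaze-driven pan-and-scan with cuts can be sketched as follows. For each frame, a horizontal crop window is centered on the median gaze position of the viewers; small changes are followed with a smoothed pan, and a large jump triggers a cut. A simple coverage score, the fraction of gaze samples that fall inside the crop window, stands in for a gaze-based evaluation criterion. This is a minimal sketch under assumptions, not the operator or criterion defined in this work: the function names, the `cut_threshold` and `alpha` parameters, and the use of the median are illustrative choices.

```python
from statistics import median


def crop_centers(gaze_x, frame_w, crop_w, cut_threshold, alpha=0.2):
    """Compute a crop-window center for each frame from viewer gaze.

    gaze_x: list with one entry per frame; each entry is a non-empty list
    of gaze x-coordinates (pixels), one per viewer. (Assumed format.)
    Returns (centers, cuts), where cuts lists the frame indices at which
    the window jumps instead of panning.
    """
    half = crop_w / 2.0
    centers, cuts = [], []
    prev = None
    for i, xs in enumerate(gaze_x):
        # Aim at the median gaze position, keeping the window in the frame.
        target = min(max(median(xs), half), frame_w - half)
        if prev is None:
            center = target
        elif abs(target - prev) > cut_threshold:
            # Gaze moved too far to follow smoothly: cut to the new position.
            center = target
            cuts.append(i)
        else:
            # Otherwise pan toward the target with exponential smoothing.
            center = prev + alpha * (target - prev)
        centers.append(center)
        prev = center
    return centers, cuts


def gaze_coverage(gaze_x, centers, crop_w):
    """Fraction of gaze samples that fall inside the crop window."""
    half = crop_w / 2.0
    inside = total = 0
    for xs, c in zip(gaze_x, centers):
        for x in xs:
            total += 1
            if c - half <= x <= c + half:
                inside += 1
    return inside / total if total else 0.0


if __name__ == "__main__":
    gaze = [[100, 110, 120]] * 3 + [[500, 510, 520]] * 3
    centers, cuts = crop_centers(gaze, frame_w=640, crop_w=200, cut_threshold=150)
    print(centers, cuts, gaze_coverage(gaze, centers, crop_w=200))
```

The threshold trades off between the two behaviors: a low `cut_threshold` produces frequent cuts, while a high one forces the window to pan across large distances.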