Guide to the Carnegie Mellon University Multimodal Activity (CMU-MMAC) Database
Journal contribution posted on 01.04.2008 by Fernando De la Torre, Jessica Hodgins, Adam Bargteil, Xavier Martin, Justin Macey, Alex Collado, Pep Beltran
This document summarizes the technology, procedures, and database organization of the CMU Multi-Modal Activity Database (CMU-MMAC). The CMU-MMAC database contains multimodal measures of the human activity of subjects performing the tasks involved in cooking and food preparation. The database was collected in Carnegie Mellon University’s Motion Capture Lab. A kitchen was built, and to date five subjects have been recorded cooking five different recipes: brownies, pizza, sandwich, salad, and scrambled eggs.

The following modalities were recorded:

• Video: (1) Three high spatial resolution (1024 × 768) color video cameras at low temporal resolution (30 Hertz). (2) Two low spatial resolution (640 × 480) color video cameras at high temporal resolution (60 Hertz). (3) One wearable low spatial resolution (640 × 480) camera at low temporal resolution (12 Hertz).
• Audio: (1) Five balanced microphones. (2) Wearable watch.
• Motion capture: A Vicon motion capture system with 12 infrared MX-40 cameras. Each camera records images of 4 megapixel resolution at 120 Hertz.
• Five 3-axis accelerometers and gyroscopes.

Several computers were used for recording the various modalities. The computers were synchronized using the Network Time Protocol (NTP).
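Because the modalities were sampled at different rates on NTP-synchronized machines, one moment in a recording corresponds to a different frame number in each stream. The following is a minimal sketch of that arithmetic and not the database's own tooling. It assumes every stream starts at a common time zero, runs at its nominal rate, and drops no frames. The stream names and function names are hypothetical.

```python
# Nominal sampling rates in Hertz, taken from the modality list above.
# The dictionary keys are hypothetical labels, not names used by the database.
RATES_HZ = {
    "video_high_res": 30,   # three 1024 x 768 cameras
    "video_low_res": 60,    # two 640 x 480 cameras
    "video_wearable": 12,   # one wearable 640 x 480 camera
    "mocap": 120,           # Vicon motion capture
}


def frame_index(t_seconds, rate_hz):
    """Return the frame nearest to time t, measured from the shared start.

    Assumes a constant sampling rate and no dropped frames.
    """
    return round(t_seconds * rate_hz)


def aligned_frames(t_seconds, rates=RATES_HZ):
    """Map one point in time to the nearest frame index in every stream."""
    return {name: frame_index(t_seconds, hz) for name, hz in rates.items()}


def convert_frame(frame, from_hz, to_hz):
    """Convert a frame index from one stream's rate to another's."""
    return round(frame * to_hz / from_hz)


if __name__ == "__main__":
    # Frames that correspond to 2.5 seconds after the shared start.
    print(aligned_frames(2.5))
    # Motion capture frame 300 expressed as a 30 Hertz video frame.
    print(convert_frame(300, RATES_HZ["mocap"], RATES_HZ["video_high_res"]))
```

In practice, recorded timestamps should take precedence over nominal rates, since real streams can start at slightly different times and may drop frames.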