Guide to the Carnegie Mellon University Multimodal Activity (CMU-MMAC) Database

Authors: Fernando De la Torre, Jessica Hodgins, Adam Bargteil, Xavier Martin, Justin Macey, Alex Collado, Pep Beltran
DOI: 10.1184/R1/6555020.v1
URL: https://kilthub.cmu.edu/articles/journal_contribution/Guide_to_the_Carnegie_Mellon_University_Multimodal_Activity_CMU-MMAC_Database/6555020

This document summarizes the technology, procedures, and database organization of the CMU Multi-Modal Activity Database (CMU-MMAC). The CMU-MMAC database contains multimodal measures of human activity, recorded from subjects performing tasks involved in cooking and food preparation. The database was collected in Carnegie Mellon University’s Motion Capture Lab. A kitchen was built, and to date five subjects have been recorded cooking five different recipes: brownies, pizza, sandwich, salad, and scrambled eggs. The following modalities were recorded:

• Video: (1) Three high spatial resolution (1024 × 768) color video cameras at low temporal resolution (30 Hertz). (2) Two low spatial resolution (640 × 480) color video cameras at high temporal resolution (60 Hertz). (3) One wearable low spatial resolution (640 × 480) camera at low temporal resolution (12 Hertz).
• Audio: (1) Five balanced microphones. (2) Wearable watch.
• Motion capture: A Vicon motion capture system with 12 infrared MX-40 cameras. Each camera records images of 4 megapixel resolution at 120 Hertz.
• Five 3-axis accelerometers and gyroscopes.

Several computers were used to record the various modalities. The computers were synchronized using the Network Time Protocol (NTP).

Date: 2008-04-01
Category: Robotics
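
Because the modalities are sampled at different rates (12, 30, 60, and 120 Hertz) on NTP-synchronized computers, a common first step when working with such data is to map a frame in one stream to the nearest frame in another. The following Python sketch illustrates the arithmetic only. It assumes uniform sampling and a known start time for each stream on the shared clock; the stream names, the function names, and the start-time parameters are hypothetical and do not describe the database's actual file layout or tools.

```python
from fractions import Fraction

# Nominal sampling rates in Hertz, taken from the modality list.
# The dictionary keys are hypothetical labels, not database identifiers.
RATES_HZ = {
    "video_high_res": 30,    # 1024 x 768 cameras
    "video_high_speed": 60,  # 640 x 480 cameras
    "video_wearable": 12,    # wearable camera
    "mocap": 120,            # Vicon motion capture
}


def frame_to_time(frame, rate_hz, start_time=Fraction(0)):
    """Return the time in seconds of a frame index, assuming uniform sampling."""
    return start_time + Fraction(frame, rate_hz)


def nearest_frame(t, rate_hz, start_time=Fraction(0)):
    """Return the index of the frame closest to time t on a stream."""
    return round((t - start_time) * rate_hz)


def map_frame(frame, src, dst, src_start=Fraction(0), dst_start=Fraction(0)):
    """Map a frame index on stream src to the nearest frame on stream dst.

    src_start and dst_start are the (assumed known) start times of each
    stream, in seconds, on the shared synchronized clock.
    """
    t = frame_to_time(frame, RATES_HZ[src], src_start)
    return nearest_frame(t, RATES_HZ[dst], dst_start)
```

For example, under these assumptions `map_frame(30, "video_high_res", "mocap")` returns 120, since frame 30 of a 30 Hertz camera and frame 120 of the 120 Hertz motion capture stream both fall one second after the start. Exact fractions are used instead of floating point so that rate conversions do not accumulate rounding error; real recordings may instead carry per-frame timestamps, in which case a nearest-timestamp search would replace the uniform-sampling assumption.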