Lifelog systems, inspired by Vannevar Bush’s concept of
“MEMory EXtenders” (MEMEX), are capable of storing a
person’s lifetime experience as a multimedia database. Despite
such systems’ huge potential for improving people’s
everyday lives, there are major challenges that must be addressed
to make such systems practical. One of them is how
to index the inherently large and heterogeneous lifelog data
so that a person can efficiently retrieve the log segments that
are of interest. In this paper, we present a novel approach to
indexing lifelogs using activity language. By quantizing the
heterogeneous, high-dimensional sensory data into a text representation,
we are able to apply statistical natural language
processing techniques to index, recognize, segment, cluster,
retrieve, and infer high-level semantic meanings of the collected
lifelogs. Based on this indexing approach, our lifelog
system supports easy retrieval of log segments representing
similar past activities and generation of salient summaries
serving as overviews of segments.
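
As a rough illustration of the idea of quantizing sensor data into text and then applying language-style statistics, the following minimal sketch maps each sensor vector to the label of its nearest codebook centroid and compares log segments by the cosine similarity of their bigram counts. The codebook, the two-dimensional readings, the choice of nearest-centroid quantization, and all function names are assumptions made for this example; the abstract does not specify the quantization or similarity method the system actually uses.

```python
import math
from collections import Counter

# Hypothetical codebook: each centroid acts as one "word" of the
# activity vocabulary. Values and meanings are invented for illustration.
CODEBOOK = {
    "A": (0.0, 0.0),  # e.g. standing still
    "B": (1.0, 0.0),  # e.g. walking
    "C": (0.0, 1.0),  # e.g. sitting
}


def quantize(sample):
    """Return the label of the centroid nearest to one sensor vector."""
    return min(CODEBOOK, key=lambda w: math.dist(sample, CODEBOOK[w]))


def to_text(samples):
    """Turn a sequence of sensor vectors into a sequence of activity words."""
    return [quantize(s) for s in samples]


def ngram_counts(words, n=2):
    """Count the n-grams (bigrams by default) in a word sequence."""
    return Counter(tuple(words[i:i + n]) for i in range(len(words) - n + 1))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def most_similar(query, segments):
    """Name of the stored segment whose bigram profile best matches the query."""
    q = ngram_counts(to_text(query))
    scores = {
        name: cosine(q, ngram_counts(to_text(seg)))
        for name, seg in segments.items()
    }
    return max(scores, key=scores.get)


# Two stored log segments and one query, all with made-up readings.
segments = {
    "walk": [(0.9, 0.1), (1.1, 0.0), (0.95, -0.1), (1.0, 0.05)],
    "sit": [(0.1, 0.9), (0.0, 1.1), (0.05, 1.0), (0.0, 0.95)],
}
query = [(1.0, 0.1), (0.9, 0.0), (1.05, 0.0)]
best = most_similar(query, segments)
```

Once the data is in word form, other standard text techniques (for example inverted indexes or n-gram language models) could be applied in the same way; which of these the system uses is not stated here.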