Carnegie Mellon University

Parallelization Strategies for a Dynamic Lexical Tree Decoder

journal contribution
posted on 2011-01-01, 00:00 authored by Matthias Vogelgesang, Florian Metze

Physical limitations are increasingly driving a shift from high-clocked single-core processors to CPUs with eight or more independent but slower cores, and to multi-core or even multi-CPU computers. To retain performance gains in the future, the speech decoding process has to be reorganized to exploit thread-level parallelism on those CPUs. In this work, we compare two common approaches for dynamic prefix tree decoders, Parallel Score Computation and Parallel Search, as well as a combination of both. Both approaches have already been studied intensively; however, we show that the latter suffers from hardware cache effects, which limit absolute speed-ups and scalability in general. We propose a cache-efficient variation of Parallel Score Computation that is more scalable and faster than any other parallel strategy we compared it with.


Publisher Statement

© 2011, F. Metze


