People detection and tracking are essential capabilities in several fields, such as ambient intelligent systems, visual servoing applications, augmented reality, human-computer interaction, video compression, and robotics.
Detection and tracking in monocular images are widely explored topics in the related literature. However, the use of stereo vision for these purposes is an emerging research area. Stereo vision brings several advantages over monocular images. First, all the methods designed for tracking in monocular images can still be applied, but with much richer per-pixel information (colour or luminance plus depth). Depth information can be employed to achieve better tracking of people as well as a better understanding of their gestures. Depth is also an important cue for the development of robust background estimation techniques. Second, disparity information (from which depth is obtained) is relatively invariant to illumination changes. Therefore, systems that employ stereo vision are expected to be more robust in real scenarios where sudden illumination changes might occur.
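As a reminder of how depth is obtained from disparity, the relation below is a minimal sketch that assumes a rectified, parallel two-camera setup; the camera geometry is not specified in the text, and the symbols $f$ (focal length in pixels), $B$ (baseline between the two cameras) and $d$ (disparity in pixels) are introduced here for illustration only.

```latex
% Assumption: rectified stereo pair with parallel optical axes.
% x_L, x_R: horizontal image coordinates of the same scene point
% in the left and right images.
\begin{equation}
  d = x_L - x_R, \qquad Z = \frac{f\,B}{d}, \quad d > 0
\end{equation}
```

Under this assumption, $d$ is a geometric offset between matched points rather than an intensity value, so a brightness change that affects both images alike tends to leave it unchanged, provided the matching itself still succeeds.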