Boosting != Boosting Cascade

In the machine learning literature, boosting and the support vector machine (SVM) are arguably the most popular out-of-the-box algorithms.

Boosting emerged earlier than SVM, and from an almost engineering-based perspective (combine multiple weak classifiers to yield a strong one), so it is less mathematically deliberate than SVM.

Since the introduction of rapid face detection by Viola and Jones, boosting, at least in the computer vision context, has almost become synonymous with fast and simple. This is, however, not exactly true.

First of all, the 2001 paper by Viola and Jones uses Haar-like features, which can be computed very quickly from an integral image, which in turn is easily obtained from a grayscale image. That speed comes from the feature, not from boosting itself.
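To make this concrete, here is a minimal sketch of the integral image trick, assuming the input is a grayscale image stored as a 2D numpy array; the function names are my own, not from the paper.

```python
import numpy as np

def integral_image(gray):
    # ii[y, x] = sum of all pixels in gray[0..y, 0..x], inclusive
    return gray.astype(np.int64).cumsum(axis=0).cumsum(axis=1)

def box_sum(ii, top, left, h, w):
    # Sum of any h-by-w rectangle in constant time: four lookups,
    # regardless of the rectangle's size.
    a = ii[top - 1, left - 1] if top > 0 and left > 0 else 0
    b = ii[top - 1, left + w - 1] if top > 0 else 0
    c = ii[top + h - 1, left - 1] if left > 0 else 0
    d = ii[top + h - 1, left + w - 1]
    return d - b - c + a

def haar_two_rect(ii, top, left, h, w):
    # A two-rectangle Haar-like feature: upper half minus lower half.
    half = h // 2
    return box_sum(ii, top, left, half, w) - box_sum(ii, top + half, left, half, w)
```

The point is that box_sum costs four array lookups no matter how large the rectangle is, which is what makes these features cheap to evaluate densely over an image.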

Secondly, they employ a modified boosting scheme that evaluates one stage at a time instead of the whole ensemble. As a reminder, in plain boosting the score is expressed as a weighted vote of the individual weak classifiers:

H(\mathbf x) = \alpha_1 h_1(\mathbf x) + \alpha_2 h_2(\mathbf x) + \cdots + \alpha_m h_m(\mathbf x)

Evaluating this expression is by no means cheaper than evaluating an SVM decision function: every one of the m weak classifiers must be computed. Structurally, it does not differ from any monolithic classifier.
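As a sketch of what this costs, here is the full weighted vote written out, assuming decision stumps over feature values as the weak classifiers; the stump representation (feature index, threshold, polarity) is my own illustration, not the paper's notation.

```python
def boosted_score(features, stumps, alphas):
    # Plain boosting: every weak classifier h_i is evaluated, then the
    # weighted votes are summed -- exactly like one monolithic classifier.
    # stumps: list of (feature_index, threshold, polarity) triples
    # alphas: the weights alpha_i of the weighted vote
    score = 0.0
    for (j, theta, p), alpha in zip(stumps, alphas):
        h = 1.0 if p * features[j] < p * theta else -1.0
        score += alpha * h
    return score  # sign(score) is the predicted label
```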

The two reasons above debunk the idea that boosting == fast.

The boosting cascade introduced by Viola and Jones divides classification into multiple stages. At each stage only one test h_i(\mathbf x) is considered: if the test passes, move on to the next test; otherwise, reject immediately. Only a candidate that passes all m rounds is regarded as positive. This is effectively a degenerate decision tree, and it boosts classification speed by rejecting the many candidates that fail at early stages.
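Here is a minimal sketch of that early-rejection loop, following the simplified one-test-per-stage description above (in the actual detector each stage is itself a small boosted classifier, but the rejection logic is the same); the names are my own.

```python
def cascade_classify(features, stages):
    # stages: list of (test, threshold) pairs, cheapest tests first.
    # A candidate window is rejected at the first failed test, so the
    # vast majority of (negative) windows cost only a few evaluations.
    for test, threshold in stages:
        if test(features) < threshold:
            return False  # early rejection
    return True  # passed all m rounds: regarded as positive
```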

Therefore, boosting cascade == fast.

My first CVPR paper, yay

Automatic Adaptation of a Generic Pedestrian Detector to a Specific Traffic Scene