Face-tracking with OpenCV

I've been toying with OpenCV lately, and it turns out to be surprisingly easy to implement face tracking with it. The OpenCV documentation is a bit lacking, though, especially for the Python bindings.

Installing OpenCV

On OS X (yes, I've moved from Windows to OS X since I do a lot of *nix development now), this is pretty easy.

$ brew install opencv

Note: There is a caveat if you're using pyenv to manage your Python versions. I had to use the system Python in order to build the OpenCV Python binding correctly, i.e. in the terminal, issue the command pyenv global system to make pyenv use the Python provided by the system. On my machine, this sets pyenv to use my brewed Python.
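If you're not sure which interpreter ended up with the binding, a quick check along these lines can help. It's only a sketch: describe_environment is a name I made up for illustration, and all it does is report the path of the running interpreter and whether a module (cv2 by default) can be imported from it.

```python
import importlib
import sys


def describe_environment(module_name="cv2"):
    """Return (interpreter path, module version or None if not importable).

    Hypothetical helper, not part of OpenCV or pyenv.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # The binding was not built for (or is not visible to) this Python.
        return sys.executable, None
    # Fall back to "unknown" for modules that do not expose __version__.
    return sys.executable, getattr(module, "__version__", "unknown")


interpreter, version = describe_environment()
print("interpreter: %s" % interpreter)
print("cv2 version: %s" % version)
```

If the second line prints None, the Python that pyenv selected is not the one the binding was built against.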



We are going to use a Haar cascade to detect faces. You can read more about how it works in the OpenCV documentation, but in general it looks for regions that have the features that define a face. For example, it scans the image for features such as one area being darker than a neighboring area, which might indicate the eyes, the bridge of the nose, etc. To make this run faster, the features are grouped into a cascade of classifiers (thus the name), and the groups are checked one by one. Most regions fail an early group and are discarded right away, so little time is wasted on them. A region needs to pass all of these classifiers before we can conclude that it is a face.
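To make the idea a bit more concrete, here is a toy sketch of a cascade in plain Python. It is not how OpenCV implements it: the two features, their thresholds, and the 4x4 "images" are all made up for illustration. It only shows the two ingredients described above, namely comparing the brightness of rectangular areas (using a summed-area table so each rectangle sum is cheap) and rejecting a window as soon as one stage fails.

```python
def integral_image(img):
    """Summed-area table with an extra leading row and column of zeros."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, x, y, w, h):
    """Sum of the pixels in the w x h rectangle whose top-left is (x, y)."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]


def eyes_darker_than_cheeks(ii):
    # Hypothetical feature: the top half is darker than the bottom half.
    return rect_sum(ii, 0, 2, 4, 2) - rect_sum(ii, 0, 0, 4, 2)


def bridge_brighter_than_eyes(ii):
    # Hypothetical feature: the middle of the top half is brighter
    # than its left and right edges.
    middle = rect_sum(ii, 1, 0, 2, 2)
    sides = rect_sum(ii, 0, 0, 1, 2) + rect_sum(ii, 3, 0, 1, 2)
    return middle - sides


# Each stage is (feature, minimum value); the thresholds are invented.
STAGES = [(eyes_darker_than_cheeks, 400), (bridge_brighter_than_eyes, 100)]


def passes_cascade(window):
    """Return True only if the window passes every stage, in order."""
    ii = integral_image(window)
    for feature, threshold in STAGES:
        if feature(ii) < threshold:
            return False  # rejected early, later stages never run
    return True


face_like = [
    [20, 200, 200, 20],
    [20, 200, 200, 20],
    [220, 220, 220, 220],
    [220, 220, 220, 220],
]
flat = [[128] * 4 for _ in range(4)]

print(passes_cascade(face_like))
print(passes_cascade(flat))
```

The flat gray window is thrown out by the first stage, which is the whole point: the cheap checks come first, and only promising windows pay for the later ones.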

It is actually possible to use a Haar cascade to detect other objects as well, but you need to train it first, and that takes hours to days. So, I'm just going to use the ones provided by OpenCV.


import cv2
import cv2.cv as cv

def detect(img):
    cascade = cv2.CascadeClassifier("/usr/local/Cellar/opencv/2.4.9/share/OpenCV/haarcascades/haarcascade_frontalface_alt.xml")
    # some other examples
    # cascade = cv2.CascadeClassifier("/usr/local/Cellar/opencv/2.4.9/share/OpenCV/haarcascades/haarcascade_mcs_nose.xml")
    # cascade = cv2.CascadeClassifier("/usr/local/Cellar/opencv/2.4.9/share/OpenCV/haarcascades/haarcascade_upperbody.xml")
    # cascade = cv2.CascadeClassifier("/usr/local/Cellar/opencv/2.4.9/share/OpenCV/haarcascades/haarcascade_mcs_mouth.xml")
    # arguments: image, scaleFactor, minNeighbors, flags, minSize
    rects = cascade.detectMultiScale(img, 1.05, 4, cv.CV_HAAR_SCALE_IMAGE, (20,20))

    if len(rects) == 0:
        return [], img
    # convert each (x, y, w, h) to corner form (x1, y1, x2, y2)
    rects[:, 2:] += rects[:, :2]
    return rects, img

def main():
    vc = cv2.VideoCapture(0)
    # small resolution, so the frame rate is higher
    vc.set(cv.CV_CAP_PROP_FRAME_WIDTH, 320)
    vc.set(cv.CV_CAP_PROP_FRAME_HEIGHT, 240)

    if vc.isOpened(): # try to get the first frame
        rval, frame = vc.read()
    else:
        rval = False

    while rval:
        cv2.imshow("preview", frame)
        rval, frame = vc.read()
        if not rval: # stop if the camera did not return a frame
            break

        # detection
        rects, img = detect(frame)
        for x1, y1, x2, y2 in rects:
            cv2.rectangle(img, (x1, y1), (x2, y2), (127, 255, 0), 2)

        cv2.imshow("preview", img)

        key = cv2.waitKey(20)
        if key == 27: # exit on ESC
            break

if __name__ == '__main__':
    main()
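One line in detect() that may look cryptic is rects[:, 2:] += rects[:, :2]. detectMultiScale returns each rectangle as (x, y, width, height), while cv2.rectangle wants two corner points, so the line adds the top-left corner to the size to get the bottom-right corner. Here is the same conversion spelled out in plain Python; to_corners is just a name I picked for illustration, and it returns a new list instead of modifying the array in place.

```python
def to_corners(rects):
    """Convert (x, y, w, h) rectangles to (x1, y1, x2, y2) corners."""
    return [(x, y, x + w, y + h) for x, y, w, h in rects]


# A 100x120 rectangle whose top-left corner is at (50, 40).
boxes = to_corners([(50, 40, 100, 120)])
print(boxes)
```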
The important piece is detectMultiScale(image[, scaleFactor[, minNeighbors[, flags[, minSize[, maxSize]]]]]). In the code above, scaleFactor is 1.05, minNeighbors is 4 and minSize is (20, 20). To understand what each parameter does, refer to this good SO thread.
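To get a feel for why scaleFactor and minSize matter for speed, here is a rough model of how many window sizes get scanned. This is a simplification and not OpenCV's actual code: window_sizes is a made-up helper that assumes square windows, starts at minSize, and grows the window by scaleFactor until it no longer fits in the frame.

```python
def window_sizes(frame_w, frame_h, scale_factor, min_size):
    """List the square window sizes a scan would try, smallest first.

    Simplified model: start at min_size and multiply by scale_factor
    until the window is larger than the shorter side of the frame.
    """
    if scale_factor <= 1.0:
        raise ValueError("scale_factor must be greater than 1")
    sizes = []
    size = float(min_size)
    limit = min(frame_w, frame_h)
    while size <= limit:
        sizes.append(int(round(size)))
        size *= scale_factor
    return sizes


# Same frame size and minSize as the code above, two scale factors.
fine = window_sizes(320, 240, 1.05, 20)
coarse = window_sizes(320, 240, 1.3, 20)
print("%d sizes at 1.05, %d sizes at 1.3" % (len(fine), len(coarse)))
```

Under this model, a scaleFactor of 1.05 on a 320x240 frame means 51 passes over the image, versus 10 with 1.3. A smaller factor is less likely to skip over a face between two sizes, but every extra pass costs frame rate.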