May 7, 2015 · Computer Vision

Finger Tracking with OpenCV and Python

Tracking the movement of a finger is an important feature of many computer vision applications. Two of the challenges in detecting fingers are separating the hand from the background and identifying the tip of a finger. I'll show you my technique for tracking a finger, which I used in this project. To see finger tracking in action, check out this video from my application.

Skin color histogram

In an application where you want to track a user's movement, a skin color histogram can be a helpful tool. The histogram is used to subtract the background from an image, leaving only the parts of the image that contain skin. A much simpler way to detect skin is to keep the pixels that fall within a fixed RGB or HSV range. The problem is that changing lighting and differing skin tones can easily throw off a fixed range. A histogram built from samples of the user's own skin tends to be more accurate because it takes the current lighting into account.
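For comparison, the fixed-range approach could look something like the sketch below. It is written in plain NumPy so the logic is visible; with OpenCV you would typically reach for cv2.inRange instead. The helper name hsv_range_mask and the bounds are assumptions made up for illustration, not tuned values.

```python
import numpy as np

def hsv_range_mask(hsv, lower, upper):
    """Return a 0/255 mask of pixels whose H, S and V all lie in [lower, upper]."""
    lower = np.asarray(lower, dtype=hsv.dtype)
    upper = np.asarray(upper, dtype=hsv.dtype)
    inside = np.all((hsv >= lower) & (hsv <= upper), axis=-1)
    return inside.astype(np.uint8) * 255

# Illustrative bounds only (OpenCV-style HSV: H in 0..179, S and V in 0..255).
SKIN_LOWER = (0, 40, 60)
SKIN_UPPER = (25, 180, 255)

# Tiny 1x3 "image": a skin-like pixel, a blue pixel, and the skin hue in shadow.
hsv = np.array([[[12, 90, 200], [110, 200, 200], [12, 90, 20]]], dtype=np.uint8)
mask = hsv_range_mask(hsv, SKIN_LOWER, SKIN_UPPER)
```

The third pixel has the same hue and saturation as the first but is much darker, so the fixed range rejects it. That is the lighting sensitivity described above.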

My application takes skin color samples from the user's hand and then creates a histogram. Green rectangles are drawn on the frame and the user places their hand inside these rectangles.

The rectangles are drawn with the following function:

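def draw_hand_rect(self, frame):
    # Assumes "import cv2" and "import numpy as np" at the top of the module.
    rows, cols, _ = frame.shape

    # Top-left (north-west) corners of nine squares in a 3x3 grid.
    self.hand_row_nw = np.array([6*rows//20, 6*rows//20, 6*rows//20,
                                 10*rows//20, 10*rows//20, 10*rows//20,
                                 14*rows//20, 14*rows//20, 14*rows//20])
    self.hand_col_nw = np.array([9*cols//20, 10*cols//20, 11*cols//20,
                                 9*cols//20, 10*cols//20, 11*cols//20,
                                 9*cols//20, 10*cols//20, 11*cols//20])

    # Bottom-right (south-east) corners: each square is 10x10 pixels.
    self.hand_row_se = self.hand_row_nw + 10
    self.hand_col_se = self.hand_col_nw + 10

    for i in range(self.hand_row_nw.size):
        # cv2.rectangle takes (x, y) points, so the column comes before the row.
        cv2.rectangle(frame,
                      (int(self.hand_col_nw[i]), int(self.hand_row_nw[i])),
                      (int(self.hand_col_se[i]), int(self.hand_row_se[i])),
                      (0, 255, 0), 1)

    # Stack a black frame of the same size above the annotated frame.
    black = np.zeros(frame.shape, dtype=frame.dtype)
    frame_final = np.vstack([black, frame])
    return frame_final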

There's nothing too complicated going on here. I created four arrays (hand_row_nw, hand_col_nw, hand_row_se, hand_col_se) to hold the corner coordinates of each rectangle. The code then iterates over these arrays and draws each rectangle on the frame using cv2.rectangle.
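To make the layout concrete, here is a small sketch that computes the same 3x3 grid of corners for a given frame size, without drawing anything. The helper name sample_grid and the 480x640 frame size are assumptions for illustration; the fractions (6, 10 and 14 twentieths of the height, 9, 10 and 11 twentieths of the width) are the ones used above.

```python
def sample_grid(rows, cols, size=10):
    """Return (row_nw, col_nw, row_se, col_se) tuples for a 3x3 grid of squares."""
    row_starts = [6 * rows // 20, 10 * rows // 20, 14 * rows // 20]
    col_starts = [9 * cols // 20, 10 * cols // 20, 11 * cols // 20]
    # Row-major order: all three columns of the first row, then the next row.
    return [(r, c, r + size, c + size) for r in row_starts for c in col_starts]

rects = sample_grid(480, 640)  # assumed frame size: 480 rows by 640 columns
```

For a 480x640 frame the first square spans rows 144 to 154 and columns 288 to 298, so the nine samples cluster around the middle of the image where the user's hand is expected.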

Now that the user knows where to place his or her hand, the next step is to extract pixels from these rectangles and use them to create an HSV histogram.

def set_hand_hist(self, frame):
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    # Nine 10x10 samples are stacked vertically into one 90x10 region.
    roi = np.zeros([90,10,3], dtype=hsv.dtype)

    size = self.hand_row_nw.size
    for i in range(size):
        roi[i*10:i*10+10,0:10] = hsv[self.hand_row_nw[i]:self.hand_row_nw[i]+10, self.hand_col_nw[i]:self.hand_col_nw[i]+10]

    # 2D histogram over hue (channel 0) and saturation (channel 1).
    self.hand_hist = cv2.calcHist([roi],[0, 1], None, [180, 256], [0, 180, 0, 256])
    # Scale the bin counts in place to the range 0..255.
    cv2.normalize(self.hand_hist, self.hand_hist, 0, 255, cv2.NORM_MINMAX)

This function converts the input frame to HSV. It then copies the 900 pixels inside the green rectangles (nine 10x10 samples) into the roi matrix. The magic happens in the last two lines. cv2.calcHist builds a two-dimensional histogram over the hue and saturation channels of the skin samples, and cv2.normalize scales the histogram values to the range 0 to 255. That's it! Now you have a histogram to detect skin regions in the frame.
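If you want to see what such a histogram contains, the sketch below builds a comparable one with NumPy alone. It swaps in np.histogram2d plus a manual min-max scaling as stand-ins for cv2.calcHist and cv2.normalize, so treat it as an approximation of the idea rather than the code used in the application. The helper name hs_histogram and the synthetic sample data are assumptions.

```python
import numpy as np

def hs_histogram(hsv_pixels):
    """Build a 180x256 hue-saturation histogram, min-max scaled to 0..255."""
    h = hsv_pixels[..., 0].ravel()
    s = hsv_pixels[..., 1].ravel()
    hist, _, _ = np.histogram2d(h, s, bins=[180, 256],
                                range=[[0, 180], [0, 256]])
    lo, hi = hist.min(), hist.max()
    if hi > lo:  # avoid dividing by zero on a flat histogram
        hist = (hist - lo) * 255.0 / (hi - lo)
    return hist.astype(np.float32)

# Synthetic 90x10 sample block: 800 pixels of one colour, 100 of a nearby hue.
roi = np.zeros((90, 10, 3), dtype=np.uint8)
roi[..., 0] = 12      # hue
roi[..., 1] = 90      # saturation
roi[:10, :, 0] = 14   # the first 100 pixels get a slightly different hue
hist = hs_histogram(roi)
```

The most common colour ends up at 255 and every other bin is scaled relative to it, so the histogram reads as a rough "how skin-like is this hue and saturation" lookup table.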

Skin detection

Now that you have a skin color histogram you can use it to find the parts of the frame that contain skin. OpenCV provides you with a convenient method, cv2.calcBackProject, that uses a histogram to isolate features in an image. You can read more about back projection here and here. I used this function to apply the skin color histogram to a frame:

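def apply_hist_mask(frame, hist):
    # Back-project the hue/saturation histogram onto the frame.
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    dst = cv2.calcBackProject([hsv], [0,1], hist, [0,180,0,256], 1)

    # Smooth the back projection in place with an 11x11 elliptical kernel.
    disc = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (11,11))
    cv2.filter2D(dst, -1, disc, dst)

    # Binarize, then replicate the mask across three channels to match frame.
    ret, thresh = cv2.threshold(dst, 100, 255, cv2.THRESH_BINARY)
    thresh = cv2.merge((thresh, thresh, thresh))

    # Note: dst is blurred after thresh was computed, so this does not affect res.
    cv2.GaussianBlur(dst, (3,3), 0, dst)

    # Keep only the pixels of frame where the mask is set.
    res = cv2.bitwise_and(frame, thresh)
    return res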

In the first two lines I converted the input frame to HSV and then used cv2.calcBackProject with the skin color histogram hist. Next I smoothed the result by convolving it with an elliptical kernel and thresholded it to get a binary mask. Lastly I masked the input frame with a bitwise AND. The final frame should contain only the skin-colored regions of the original frame.
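Conceptually, back projection is a table lookup: each pixel is replaced by the histogram value of its own hue and saturation. The NumPy sketch below illustrates that idea under simplifying assumptions (one histogram bin per hue and saturation value, no scaling factor); it is not a drop-in replacement for cv2.calcBackProject. The helper name back_project and the toy data are made up for the example.

```python
import numpy as np

def back_project(hsv, hist):
    """Replace each pixel with the histogram value of its (hue, saturation) bin."""
    h = hsv[..., 0].astype(np.intp)
    s = hsv[..., 1].astype(np.intp)
    return np.clip(hist[h, s], 0, 255).astype(np.uint8)

# Toy histogram with a single "skin" bin at hue 12, saturation 90.
hist = np.zeros((180, 256), dtype=np.float32)
hist[12, 90] = 255

# 1x2 image: one pixel that matches the skin bin and one that does not.
hsv = np.array([[[12, 90, 200], [110, 200, 200]]], dtype=np.uint8)
prob = back_project(hsv, hist)

# Same cut-off as above: anything over 100 counts as skin.
mask = np.where(prob > 100, 255, 0).astype(np.uint8)
```

Pixels whose colour was common in the skin samples come out bright, everything else comes out dark, and the threshold turns that into a binary mask.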

The following two images show the output of apply_hist_mask.

Finger tip detection

Great, now we have a frame with skin regions, but what we really want to find is the location of a finger tip. Using OpenCV you can find what are called contours in a frame, which you can read about here. From a contour you can compute convexity defects, which serve as candidate finger tip locations. In my application I wanted to find the tip of the finger with which a user is pointing. To do this I picked the convexity defect that is farthest from the centroid of the contour. This is done by the following code:

def draw_final(self, frame, hand_detection):  
    hand_masked = image_analysis.apply_hist_mask(frame, hand_detection.hand_hist)

    contours = image_analysis.contours(hand_masked)
    if contours is not None and len(contours) > 0:
        max_contour = image_analysis.max_contour(contours)
        hull = image_analysis.hull(max_contour)
        centroid = image_analysis.centroid(max_contour)
        defects = image_analysis.defects(max_contour)

        if centroid is not None and defects is not None and len(defects) > 0:   
            farthest_point = image_analysis.farthest_point(defects, max_contour, centroid)

            if farthest_point is not None:
                self.plot_farthest_point(frame, farthest_point)

Here I used some functions from image_analysis. draw_final first finds the contours of an image.

Then it determines the largest contour. For this contour it finds the hull, centroid and defects. The hull is blue and defects are pink.
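The image_analysis helpers are not shown here, so the following is only a sketch of one way to pick the largest contour and compute its centroid. It assumes each contour is a simple closed polygon given as a list of (x, y) points and uses the shoelace formula in NumPy; in OpenCV the usual tools for this are cv2.contourArea and cv2.moments. The function name polygon_area_centroid and the two toy contours are hypothetical.

```python
import numpy as np

def polygon_area_centroid(points):
    """Return (area, (cx, cy)) of a closed polygon via the shoelace formula."""
    pts = np.asarray(points, dtype=np.float64)
    x, y = pts[:, 0], pts[:, 1]
    x_next, y_next = np.roll(x, -1), np.roll(y, -1)
    cross = x * y_next - x_next * y
    signed_area = cross.sum() / 2.0
    if signed_area == 0:
        return 0.0, None  # degenerate contour: no meaningful centroid
    cx = ((x + x_next) * cross).sum() / (6.0 * signed_area)
    cy = ((y + y_next) * cross).sum() / (6.0 * signed_area)
    return abs(signed_area), (cx, cy)

# Two toy contours: a 4x4 square and a small triangle.
square = [(0, 0), (4, 0), (4, 4), (0, 4)]
triangle = [(10, 10), (12, 10), (10, 12)]

# The largest contour is the one enclosing the most area.
largest = max([square, triangle], key=lambda c: polygon_area_centroid(c)[0])
area, centroid = polygon_area_centroid(largest)
```

Returning None for a zero-area contour mirrors the "centroid is not None" check in draw_final, which guards against contours that are too small to have a usable center.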

Now that you have all these defects you find the one that is farthest from the center of the contour. This point is assumed to be the pointing finger. The center is blue and farthest point is red. And there you have it, you've found a finger tip.
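The last step boils down to a distance comparison. Here is a minimal sketch, assuming the candidate points have already been extracted from the defects as (x, y) pairs and the center is known. The helper name farthest_from and the sample coordinates are hypothetical and stand in for image_analysis.farthest_point, whose code is not shown in this post.

```python
import numpy as np

def farthest_from(points, centre):
    """Return the (x, y) point with the greatest Euclidean distance from centre."""
    pts = np.asarray(points, dtype=np.float64)
    dists = np.hypot(pts[:, 0] - centre[0], pts[:, 1] - centre[1])
    x, y = pts[int(np.argmax(dists))]
    return int(x), int(y)

# Made-up candidate points around a hand whose center is at (115, 170).
candidates = [(100, 200), (120, 60), (150, 190), (90, 180)]
centre = (115, 170)
tip = farthest_from(candidates, centre)
```

In image coordinates y grows downward, so the point (120, 60) sits well above the center, which is where an extended pointing finger would be.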
