Face Tracking – Kinect 4 Windows v2


Since working on the virtual rail project I haven’t had much chance to do any Kinect programming. That all changes now: I ordered a Kinect 4 Windows v2, have been hacking away over the last few weeks, and wanted to share my experiences here. I don’t intend to cover anything introductory, so for that please see the Programming for Kinect for Windows v2 Jumpstart videos at http://channel9.msdn.com/Series/Programming-Kinect-for-Windows-v2 and also the blog series at http://mtaulty.com/CommunityServer/blogs/mike_taultys_blog/archive/tags/Kinect/default.aspx, where Mike takes you right from opening the box to dancing skeletons and onwards in whichever direction it takes him.


I’m going to ease into this gently with a look at face tracking, starting out with managed C# code in a Windows Store application, though I suspect this might change as I learn how best to translate the concepts over to native C++ code. I have been re-learning C++ for a while now and have also taken the opportunity to look into DirectX 11, so I expect a lot of my future samples to go in that direction.

Kinect Sensor

To get the sensor started you will need to install the Kinect for Windows SDK 2.0 Public Preview from here. You will need references to two WinRT components, WindowsPreview.Kinect and Microsoft.Kinect.Face; these are distributed as extension SDKs and can be added via the Add Reference dialog under Extensions. You will also need to select a processor architecture for the project, as “Any CPU” is not supported (x86 leaves the Visual Studio designer working correctly). Finally, remember to set the Windows Store app capabilities to include microphone and webcam.
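For reference, the capability declarations live in Package.appxmanifest; the relevant entries (as I understand the manifest schema for Windows Store apps) look something like this:

```xml
<Capabilities>
  <!-- Required for the Kinect's colour camera and microphone array -->
  <DeviceCapability Name="webcam" />
  <DeviceCapability Name="microphone" />
</Capabilities>
```

You can also tick these in the Capabilities tab of the manifest designer rather than editing the XML by hand.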

For an event-based program the general pattern is: get a KinectSensor object, call Open on it, open a reader object for a data stream, and subscribe to events for when the data arrives. Here is a very simple example:

```csharp
private KinectSensor _kinect = KinectSensor.Default;
private ColorFrameReader _colorReader;

private void StartKinect()
{
    // Start the sensor
    _kinect.Open();

    // Open a reader for the colour stream
    _colorReader = _kinect.ColorFrameSource.OpenReader();

    // Hook up the frame-arrived event
    _colorReader.FrameArrived += OnColorFrameArrivedHandler;
}

private void OnColorFrameArrivedHandler(object sender, ColorFrameArrivedEventArgs e)
{
    // Retrieve the frame reference
    ColorFrameReference colorRef = e.FrameReference;

    // Acquire the colour frame (may be null if the frame has expired)
    using (ColorFrame colorFrame = colorRef.AcquireFrame())
    {
        if (colorFrame == null) return;

        // Process the colour frame...
    }
}
```

 

Reactive Extensions

For me, Reactive Extensions is a natural fit for Kinect programming in an event-driven application, as it removes the need to think about the mechanics of composing and orchestrating the associated event streams. Here is a similar example to the one above, but using Rx: it shows how to subscribe to an observable and how to specify the threading context on which subscription and event delivery should occur.

 

```csharp
private KinectSensor _kinect = KinectSensor.Default;
private ColorFrameReader _colorReader;
private IDisposable _colorFrameSubscription;

private void StartKinect()
{
    // Start the sensor
    _kinect.Open();

    // Open a reader for the colour stream
    _colorReader = _kinect.ColorFrameSource.OpenReader();

    // Wrap the FrameArrived event in an observable. Note the conversion
    // lambda: passing the same handler instance to both += and -= means
    // the event is detached correctly when the subscription is disposed
    // (separate lambdas in the add/remove actions would never unhook).
    var colorFrames = Observable.FromEvent<TypedEventHandler<ColorFrameReader, ColorFrameArrivedEventArgs>, ColorFrameArrivedEventArgs>(
        handler => (s, e) => handler(e),
        h => _colorReader.FrameArrived += h,
        h => _colorReader.FrameArrived -= h)
        .SubscribeOn(Scheduler.TaskPool)
        .ObserveOn(Scheduler.TaskPool);

    _colorFrameSubscription = colorFrames.Subscribe(OnColorFrame);
}

private void OnColorFrame(ColorFrameArrivedEventArgs e)
{
    // Retrieve the frame reference
    ColorFrameReference colorRef = e.FrameReference;

    // Acquire the colour frame (may be null if the frame has expired)
    using (ColorFrame colorFrame = colorRef.AcquireFrame())
    {
        if (colorFrame == null) return;

        // Process the colour frame...
    }
}
```

Rx is available as a NuGet package, so install it into your app and you should be good to go.


Face Detection

To start face tracking it is necessary to use the tracking ID of a body, and I will also need colour frames (so we can see the face). As a result I will use a MultiSourceFrameReader, which I will ask to deliver body and colour frames in the same event (at the same time). The body data contains the skeletons in the camera’s view, but all we need from it is a tracking ID for the face tracker. The colour frames will be copied into a WriteableBitmap for display.
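Opening the multi-source reader follows the same shape as the single-stream readers above; a minimal sketch (the field and handler names are mine):

```csharp
// Ask the sensor for body and colour frames delivered together
_multiReader = _kinect.OpenMultiSourceFrameReader(
    FrameSourceTypes.Body | FrameSourceTypes.Color);

// One event fires when a matched set of frames is available
_multiReader.MultiSourceFrameArrived += OnMultiSourceFrameArrived;
```

The FrameSourceTypes flags are combined bitwise, so adding depth or infrared later is just a matter of OR-ing in another flag.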

 

So, with the preparation work done, we can wire up the Face library:

```csharp
_frameSource = new FaceFrameSource(_kinect, 0,
    FaceFrameFeatures.BoundingBoxInColorSpace |
    FaceFrameFeatures.FaceEngagement |
    FaceFrameFeatures.Happy);
_faceFrameReader = _frameSource.OpenReader();

// Use the FromEvent overload with a conversion lambda so the same
// handler instance is both attached and detached
var faceFrames = Observable.FromEvent<TypedEventHandler<FaceFrameReader, FaceFrameArrivedEventArgs>, FaceFrameArrivedEventArgs>(
    handler => (s, e) => handler(e),
    h => _faceFrameReader.FrameArrived += h,
    h => _faceFrameReader.FrameArrived -= h)
    .SubscribeOn(Scheduler.TaskPool)
    .ObserveOn(Dispatcher);

_faceFrameSubscription = faceFrames.Subscribe(OnFaceFrames);
```

Using a similar pattern to the MultiSourceFrameReader, we can subscribe to events on a FaceFrameSource object and specify the data we are interested in. Here I have requested the face bounding box, whether the face is ‘engaged’ with the Kinect, and whether the face is happy or not.
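The face source won’t produce frames until it is given a body’s tracking ID. A sketch of how that hand-off might look inside the multi-source handler, once the body data has been copied out (here `_bodies` is assumed to be a `Body[]` filled by `BodyFrame.GetAndRefreshBodyData`):

```csharp
// Find the first tracked body and point the face source at it;
// face frames will then start arriving for that person
Body body = _bodies.FirstOrDefault(b => b.IsTracked);
if (body != null && !_frameSource.IsTrackingIdValid)
{
    _frameSource.TrackingId = body.TrackingId;
}
```

Checking IsTrackingIdValid first avoids re-assigning the ID on every frame while the same person is still being tracked.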

Now the general pattern for the event handlers is as follows:

```csharp
// Use a using block so the frame is disposed as soon as possible
using (var frame = args.FrameReference.AcquireFrame())
{
    if (frame != null)
    {
        // Copy out the required data
    }
}

// Schedule further processing...
```

so we hang onto the frame for as short a time period as possible.

In the face frame handler I hook the bounding-box data up to a display rectangle, and whether the face is happy and/or engaged with the camera is represented by some simple UI controls.
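Following the acquire-and-dispose pattern above, the handler boils down to reading a FaceFrameResult; a minimal sketch (the UI update itself is left as a comment):

```csharp
private void OnFaceFrames(FaceFrameArrivedEventArgs e)
{
    using (FaceFrame frame = e.FrameReference.AcquireFrame())
    {
        // The result can be null while no face is being tracked
        if (frame == null || frame.FaceFrameResult == null) return;

        FaceFrameResult result = frame.FaceFrameResult;

        // Bounding box in colour-space coordinates
        RectI box = result.FaceBoundingBoxInColorSpace;

        // Requested features come back as DetectionResult values
        DetectionResult happy = result.FaceProperties[FaceProperty.Happy];
        DetectionResult engaged = result.FaceProperties[FaceProperty.Engaged];

        // ...position a Rectangle over the face and update the
        // happy/engaged indicators here
    }
}
```

Because the subscription observes on the Dispatcher, it should be safe to touch UI elements directly from this handler.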

Here’s a screenshot of the resulting sample running:

screenshot

The sample project can be downloaded here: http://1drv.ms/1uJ9bFV
