metro app and kinect voice control


Peter Daukintis

Recently I needed to reacquaint myself with the latest Kinect SDK. v1.6 has just been released and is downloadable from the Kinect SDK site. I downloaded it and started playing around with some code, and it occurred to me that what I would really like was a way to use the Kinect in a metro application. Using the Kinect SDK from a metro app is not a supported scenario, and neither is communication between a metro app and a desktop app. Having said that, I recall reading somewhere that someone may have used a WCF service to carry out this communication, so I guessed that a local socket connection would probably work too (I believe this is an unsupported mechanism and would likely fail Store certification). My goal was not to release an app, just to be able to use it for experimental/personal purposes.

Now, I had previously written some WinRT code for socket communication using StreamSocket (see StreamSocket Example), and I had seen a few blog posts about using WinRT libraries on the Windows 8 desktop, so I thought I’d give it a try…

The first step was to get a communication path between a metro app and a Windows 8 desktop app. I followed some online instructions for enabling WinRT in a desktop app. Then I copied the server side of the socket code and the XAML into a WPF app. With a few minor adjustments (replacing the OnNavigatedTo event handler and MessageDialog) it was good to go.

 

(WPF app screenshot…)

To test that everything worked OK, I:

  • Ran the WPF application in the debugger
  • Ran the Store UI app in the debugger
  • Started listening from the desktop app
  • Connected from the Store UI app

And….

It worked!
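The listen and connect steps above can be sketched roughly as follows. This is not the code from the downloadable solutions, just a minimal illustration using the WinRT socket classes. The class names and the port number "22112" are hypothetical, and the desktop half assumes the project has already been set up to reference WinRT (including System.Runtime.WindowsRuntime.dll so that WinRT async operations can be awaited).

```csharp
using System;
using System.Threading.Tasks;
using Windows.Networking;
using Windows.Networking.Sockets;

// Desktop (WPF) side: wait for the Store app to connect.
public class CommandServer
{
    private readonly StreamSocketListener _listener = new StreamSocketListener();

    // The most recently connected client, or null if nobody has connected yet.
    public StreamSocket Client { get; private set; }

    public async Task ListenAsync()
    {
        // Remember the socket handed to us when a client connects.
        _listener.ConnectionReceived += (sender, args) => Client = args.Socket;

        // "22112" is an arbitrary, hypothetical port for this sketch.
        await _listener.BindServiceNameAsync("22112");
    }
}

// Store app side: connect to the listener on the same machine.
public static class CommandClient
{
    public static async Task<StreamSocket> ConnectAsync()
    {
        var socket = new StreamSocket();

        // Must match the port the desktop app is listening on.
        await socket.ConnectAsync(new HostName("localhost"), "22112");
        return socket;
    }
}
```

As far as I know, Store apps are normally blocked from making loopback connections and Visual Studio lifts that restriction for apps it deploys for debugging, which would explain why running both apps in the debugger works.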

(side by side apps screenshot…)

So, the next step is to configure the WPF application to respond to Kinect voice commands and ship them over the socket.

First I added references to Kinect SDK and toolkit assemblies, which on my machine were located as follows:

C:\Program Files\Microsoft SDKs\Kinect\Developer Toolkit v1.6.0\Samples\bin\Microsoft.Speech.dll

And

C:\Program Files\Microsoft SDKs\Kinect\v1.6\Assemblies\Microsoft.Kinect.dll

Then I added code to carry out speech recognition using the Kinect with a very simple grammar:

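// Needs: using Microsoft.Kinect; using Microsoft.Speech.AudioFormat;
//        using Microsoft.Speech.Recognition;
// recognizer is the RecognizerInfo of the speech recognizer to use and
// _kinect is an already started KinectSensor (neither is set up in this snippet).
_sre = new SpeechRecognitionEngine(recognizer);

var gb = new GrammarBuilder { Culture = recognizer.Culture };

// Map each spoken word to the command value reported on recognition.
var choices = new Choices();
choices.Add(new SemanticResultValue("go", "NEXT"));
choices.Add(new SemanticResultValue("back", "PREV"));

gb.Append(choices);
var g = new Grammar(gb);

_sre.LoadGrammar(g);
_sre.SpeechRecognized += sre_SpeechRecognized;

// Kinect audio: 16 kHz, 16-bit, mono PCM (32000 bytes per second, block align 2).
_sre.SetInputToAudioStream(_kinect.AudioSource.Start(),
    new SpeechAudioFormatInfo(EncodingFormat.Pcm, 16000, 16, 1, 32000, 2, null));

// Keep recognising until explicitly stopped.
_sre.RecognizeAsync(RecognizeMode.Multiple);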

When the speech recognizer recognises the words "go" (next) and "back" (previous), I will transmit the recognised command as text over the socket to the waiting metro app, which will use it to activate commands.
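A handler along these lines would do the sending. It is a sketch under a few assumptions rather than the code from the download: the class name is hypothetical, the 0.5 confidence threshold is arbitrary, and I am assuming the semantic value ("NEXT" or "PREV") is what gets sent, terminated by a newline so the receiver can tell where one command ends.

```csharp
using System;
using Microsoft.Speech.Recognition;
using Windows.Networking.Sockets;
using Windows.Storage.Streams;

// Hypothetical helper that forwards recognised commands to a connected socket.
public class VoiceCommandSender
{
    private readonly DataWriter _writer;

    public VoiceCommandSender(StreamSocket socket)
    {
        _writer = new DataWriter(socket.OutputStream);
    }

    // Hook up with: _sre.SpeechRecognized += sender.OnSpeechRecognized;
    public async void OnSpeechRecognized(object sender, SpeechRecognizedEventArgs e)
    {
        // Ignore uncertain results; 0.5 is an arbitrary threshold for this sketch.
        if (e.Result.Confidence < 0.5f)
        {
            return;
        }

        // The grammar maps "go" to "NEXT" and "back" to "PREV".
        string command = e.Result.Semantics.Value.ToString();

        // Newline-terminated text is an assumed wire format, not a documented one.
        _writer.WriteString(command + "\n");
        await _writer.StoreAsync();
    }
}
```

Sending the semantic value rather than the spoken word means the metro app does not need to know which words the grammar uses.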

To make a (sort of) useful example, I will create a Metro app which loads images from your pictures library into a FlipView and uses the voice commands to move to the next or previous picture.
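On the metro side, the receiving loop might look something like the sketch below. It assumes the desktop app sends each command as a newline-terminated string ("NEXT" or "PREV"), which is my assumption for illustration; the class and method names are hypothetical too, and the real KinectPictureViewer may well do this differently.

```csharp
using System;
using System.Threading.Tasks;
using Windows.Networking.Sockets;
using Windows.Storage.Streams;
using Windows.UI.Core;
using Windows.UI.Xaml.Controls;

public static class VoiceNavigation
{
    // Works out the new selected index, clamped to the available items.
    public static int NextIndex(int current, int count, string command)
    {
        if (count == 0) return -1;
        switch (command)
        {
            case "NEXT": return Math.Min(current + 1, count - 1);
            case "PREV": return Math.Max(current - 1, 0);
            default: return current; // unknown command: stay put
        }
    }

    // Reads commands from the socket until the other end closes it.
    public static async Task ListenAsync(StreamSocket socket, FlipView flipView)
    {
        var reader = new DataReader(socket.InputStream)
        {
            // Return as soon as some data arrives instead of waiting for 256 bytes.
            InputStreamOptions = InputStreamOptions.Partial
        };
        string pending = string.Empty;

        while (true)
        {
            uint loaded = await reader.LoadAsync(256);
            if (loaded == 0) break; // connection closed

            pending += reader.ReadString(reader.UnconsumedBufferLength);

            int newline;
            while ((newline = pending.IndexOf('\n')) >= 0)
            {
                string command = pending.Substring(0, newline).Trim();
                pending = pending.Substring(newline + 1);

                // The FlipView may only be touched on the UI thread.
                await flipView.Dispatcher.RunAsync(CoreDispatcherPriority.Normal, () =>
                {
                    flipView.SelectedIndex = NextIndex(
                        flipView.SelectedIndex, flipView.Items.Count, command);
                });
            }
        }
    }
}
```

Keeping the index calculation in its own small method makes the clamping at either end of the picture list easy to check without a socket or a Kinect attached.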

(WP_000036 photo…)

Disclaimer: Please note that I haven’t tested this code beyond trying it in my development environment and it was not written to be robust in any way.

Download the solutions here:

KinectSocketServer: http://sdrv.ms/QW12JJ

KinectPictureViewer: http://sdrv.ms/QW1bwX
