System.Speech.Recognition (Speech Recognition) - Ansel Song

Original: http://www.cnblogs.com/fx_guo/archive/2011/01/14/1935546.html

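You can use the general dictation language model in your application, but you will quickly run into development problems around what to do with the recognition results. Take a pizza ordering system as an example. A user might say "I'd like a pepperoni pizza", and the result will contain that string. But the result might equally contain "I'd like pepper on a plaza", or any number of similar-sounding statements, depending on the user's pronunciation or on background noise. Likewise, the user might say "Mary had a little lamb", and the result will contain exactly that, even though it means nothing to a pizza ordering system. None of these erroneous results is of any use to the application. For that reason, an application should always supply a grammar that describes specifically what it is listening for.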

Figure 8:

using System;
using System.Windows.Forms;
using System.ComponentModel;
using System.Collections.Generic;
using System.Speech.Recognition;

namespace Reco_Sample_1
{
    public partial class Form1 : Form
    {
        //create a recognizer
        SpeechRecognizer _recognizer = new SpeechRecognizer();

        public Form1() { InitializeComponent(); }

        private void Form1_Load(object sender, EventArgs e)
        {
           //Create a pizza grammar
           Choices pizzaChoices = new Choices();
           pizzaChoices.AddPhrase("I'd like a cheese pizza");
           pizzaChoices.AddPhrase("I'd like a pepperoni pizza");
           pizzaChoices.AddPhrase("I'd like a large pepperoni pizza");
           pizzaChoices.AddPhrase(
               "I'd like a small thin crust vegetarian pizza");
           Grammar pizzaGrammar =
               new Grammar(new GrammarBuilder(pizzaChoices));

           //Attach an event handler
           pizzaGrammar.SpeechRecognized +=
               new EventHandler<RecognitionEventArgs>(
                   PizzaGrammar_SpeechRecognized);

           _recognizer.LoadGrammar(pizzaGrammar);
        }

        void PizzaGrammar_SpeechRecognized(
            object sender, RecognitionEventArgs e)
        {
            MessageBox.Show(e.Result.Text);
        }
    }
}

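In Figure 8, I started from a plain Windows Forms application and added a few lines of code to get basic speech recognition. First, I import the System.Speech.Recognition namespace and instantiate a SpeechRecognizer object. Then, in Form1_Load, I do three things: build a grammar, attach an event handler to that grammar so that I receive its SpeechRecognized events, and load the grammar into the recognizer. From that point on, the recognizer listens for speech that fits the patterns the grammar defines. When it recognizes something that fits the grammar, the grammar's SpeechRecognized event handler is invoked. The handler itself accesses the Result object and works with the recognized text.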

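The System.Speech.Recognition API supports the W3C Speech Recognition Grammar Specification (SRGS), documented at www.w3.org/TR/speech-grammar. The API even provides a set of classes for creating and working with SRGS XML documents. In most cases, though, SRGS is overkill, so the API also provides the GrammarBuilder class, which serves the needs of the pizza ordering system well.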
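For comparison, here is a minimal sketch of what the SRGS route might look like for the same four phrases, using the classes in the System.Speech.Recognition.SrgsGrammar namespace of the released assembly. The class name SrgsPizzaSketch and the rule id "pizzaOrder" are illustrative choices, not names from the original article.

```csharp
using System.Speech.Recognition;
using System.Speech.Recognition.SrgsGrammar;

static class SrgsPizzaSketch
{
    // Builds an SRGS document with one public root rule that accepts
    // exactly one of the listed phrases, then wraps it in a Grammar.
    public static Grammar Build()
    {
        SrgsOneOf orders = new SrgsOneOf(new string[]
        {
            "I'd like a cheese pizza",
            "I'd like a pepperoni pizza",
            "I'd like a large pepperoni pizza",
            "I'd like a small thin crust vegetarian pizza"
        });

        // "pizzaOrder" is a hypothetical rule id chosen for this sketch.
        SrgsRule root = new SrgsRule("pizzaOrder");
        root.Scope = SrgsRuleScope.Public;
        root.Add(orders);

        SrgsDocument document = new SrgsDocument();
        document.Rules.Add(root);
        document.Root = root;

        return new Grammar(document);
    }
}
```

The resulting Grammar can be loaded into a recognizer in the same way as one built from a GrammarBuilder. The extra ceremony of rules, scopes, and a document root is part of why GrammarBuilder is the lighter choice for a small grammar like this one.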

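GrammarBuilder lets you assemble a grammar from a set of phrases and choices. In Figure 8, I have eliminated the utterances I don't care about ("Mary had a little lamb") and constrained the engine so that it can choose much better between ambiguous sounds. When the user mispronounces "pizza" as "plaza", the engine will not even consider the word "plaza". With just these few lines of code, I have greatly improved the accuracy of the system. The grammar still has some problems, however.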

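Exhaustively listing everything a user might say is tedious, error prone, and hard to maintain, and in practice it is only feasible for very small grammars. It is preferable to define a grammar that describes the ways in which words can be combined. In addition, if the application cares about the size, toppings, and crust type of the pizza, the developer has a lot of work to do to parse those values out of the result string. It is far more convenient if the recognition system identifies these semantic properties in the result itself. System.Speech.Recognition and the Windows Vista recognition engine make this very easy.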
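To make the scale of the problem concrete, the following sketch expands the pattern "[I'd like] a [size] [crust] topping pizza [please]" into every sentence it covers, using the word lists from this pizza example. It is plain C# with no speech API involved, and the class and method names are illustrative only. Even with three sizes, two crusts, and three toppings, the pattern covers 120 distinct utterances.

```csharp
using System;
using System.Collections.Generic;

static class PizzaUtterances
{
    // Expands the pattern into every concrete sentence. The middle part is
    // one of: topping only, size + topping, or size + crust + topping.
    public static List<string> Enumerate()
    {
        string[] preambles = { "I'd like a", "a" };
        string[] sizes = { "small", "regular", "large" };
        string[] crusts = { "thin crust", "thick crust" };
        string[] toppings = { "vegetarian", "pepperoni", "cheese" };
        string[] postambles = { "pizza", "pizza please" };

        List<string> middles = new List<string>();
        foreach (string topping in toppings)
        {
            middles.Add(topping);                                  // topping only
            foreach (string size in sizes)
            {
                middles.Add(size + " " + topping);                 // size and topping
                foreach (string crust in crusts)
                {
                    middles.Add(size + " " + crust + " " + topping); // all three
                }
            }
        }

        List<string> sentences = new List<string>();
        foreach (string pre in preambles)
            foreach (string middle in middles)
                foreach (string post in postambles)
                    sentences.Add(pre + " " + middle + " " + post);

        return sentences;
    }

    static void Main()
    {
        List<string> all = Enumerate();
        Console.WriteLine(all.Count + " distinct utterances");
        Console.WriteLine("First: " + all[0]);
    }
}
```

Adding one more topping or an extra optional word multiplies the list again, which is why describing how words combine scales better than listing phrases.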

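Figure 9 shows how to use the Choices class to compose a grammar in which the user says something from a list of alternatives. In this code, the contents of each Choices instance are specified in the constructor as a sequence of string parameters. You have many other options for populating Choices, though: you can add new phrases iteratively, construct Choices from an array, add Choices to Choices to build up the complex combinatorial rules that people understand naturally, or add GrammarBuilder instances to Choices to produce more flexible grammars (as the permutations part of this example demonstrates).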

private void Form1_Load(object sender, EventArgs e)
{
    //[I'd like] a [<size>] [<crust>] [<topping>] pizza [please]

    //build the core set of choices
    Choices sizes = new Choices("small", "regular", "large");
    Choices crusts = new Choices("thin crust", "thick crust");
    Choices toppings = new Choices("vegetarian", "pepperoni", "cheese");

    //build the permutations of choices...

    //choose all three
    GrammarBuilder sizeCrustTopping = new GrammarBuilder();
    sizeCrustTopping.AppendChoices(sizes, "size");
    sizeCrustTopping.AppendChoices(crusts, "crust");
    sizeCrustTopping.AppendChoices(toppings, "topping");

    //choose size and topping, and assume thick crust
    GrammarBuilder sizeAndTopping = new GrammarBuilder();
    sizeAndTopping.AppendChoices(sizes, "size");
    sizeAndTopping.AppendChoices(toppings, "topping");
    sizeAndTopping.AppendResultKeyValue("crust", "thick crust");

    //choose topping only, and assume the rest
    GrammarBuilder toppingOnly = new GrammarBuilder();
    toppingOnly.AppendChoices(toppings, "topping");
    toppingOnly.AppendResultKeyValue("size", "regular");
    toppingOnly.AppendResultKeyValue("crust", "thick crust");

    //assemble the permutations
    Choices permutations = new Choices();
    permutations.AddGrammarBuilders(sizeCrustTopping);
    permutations.AddGrammarBuilders(sizeAndTopping);
    permutations.AddGrammarBuilders(toppingOnly);

    //now build the complete pattern...
    GrammarBuilder pizzaRequest = new GrammarBuilder();
    //pre-amble "[I'd like] a"
    pizzaRequest.AppendChoices(new Choices("I'd like a", "a"));
    //permutations "[<size>] [<crust>] [<topping>]"
    pizzaRequest.AppendChoices(permutations);
    //post-amble "pizza [please]"
    pizzaRequest.AppendChoices(new Choices("pizza", "pizza please"));

    //create the pizza grammar
    Grammar pizzaGrammar = new Grammar(pizzaRequest);

    //attach the event handler
    pizzaGrammar.SpeechRecognized +=
        new EventHandler<RecognitionEventArgs>(
            PizzaGrammar_SpeechRecognized);

    //load the grammar into the recognizer
    _recognizer.LoadGrammar(pizzaGrammar);
}

void PizzaGrammar_SpeechRecognized(object sender, RecognitionEventArgs e)
{
    //StringBuilder lives in System.Text; add "using System.Text;" to the file
    StringBuilder resultString = new StringBuilder();
    resultString.Append("Raw text result: ");
    resultString.AppendLine(e.Result.Text);
    resultString.Append("Size: ");
    resultString.AppendLine(e.Result.Semantics["size"].Value.ToString());
    resultString.Append("Crust: ");
    resultString.AppendLine(e.Result.Semantics["crust"].Value.ToString());
    resultString.Append("Topping: ");
    resultString.AppendLine(
        e.Result.Semantics["topping"].Value.ToString());
    MessageBox.Show(resultString.ToString());
}

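Figure 9 also shows how to tag results with semantic values. When using GrammarBuilder, you can append a Choices object to the grammar and attach a semantic value to that choice, as in the following statement: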

AppendChoices(toppings, "topping");

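Sometimes a particular utterance carries an implied semantic value that is never spoken aloud. For example, if the user does not specify the size of the pizza, the grammar can set the size to "regular", as in the following statement: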

AppendResultKeyValue("size", "regular");

You retrieve the semantic values from the result by accessing RecognitionEventArgs.Result.Semantics[].
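The listings in this post use AddPhrase, AppendChoices, AppendResultKeyValue, and RecognitionEventArgs. In the System.Speech assembly released with .NET Framework 3.0, the closest equivalents I am aware of are Choices.Add, GrammarBuilder.Append combined with SemanticResultKey, and SpeechRecognizedEventArgs. The sketch below is one possible way to express a similar grammar with those released names, not a line-for-line port: it marks size and crust as optional elements instead of assembling permutations, and it applies the implied defaults in the handler rather than in the grammar. Both of those are substitutions made for this sketch, and the class and helper names are hypothetical.

```csharp
using System;
using System.Speech.Recognition;

static class ReleasedApiPizzaSketch
{
    // Builds "[I'd like] a [<size>] [<crust>] <topping> pizza [please]".
    public static Grammar BuildGrammar()
    {
        Choices sizes = new Choices("small", "regular", "large");
        Choices crusts = new Choices("thin crust", "thick crust");
        Choices toppings = new Choices("vegetarian", "pepperoni", "cheese");

        GrammarBuilder request = new GrammarBuilder();
        request.Append(new Choices("I'd like a", "a"));

        // Optional elements: a repeat count between 0 and 1.
        request.Append(new GrammarBuilder(
            new SemanticResultKey("size", new GrammarBuilder(sizes))), 0, 1);
        request.Append(new GrammarBuilder(
            new SemanticResultKey("crust", new GrammarBuilder(crusts))), 0, 1);

        // The topping is required and tagged with the "topping" key.
        request.Append(new SemanticResultKey("topping", new GrammarBuilder(toppings)));
        request.Append(new Choices("pizza", "pizza please"));

        return new Grammar(request);
    }

    // Attaches the handler and loads the grammar into an existing recognizer.
    public static void Load(SpeechRecognizer recognizer)
    {
        Grammar pizzaGrammar = BuildGrammar();
        pizzaGrammar.SpeechRecognized += OnSpeechRecognized;
        recognizer.LoadGrammar(pizzaGrammar);
    }

    static void OnSpeechRecognized(object sender, SpeechRecognizedEventArgs e)
    {
        SemanticValue semantics = e.Result.Semantics;
        Console.WriteLine("Raw text result: " + e.Result.Text);
        Console.WriteLine("Size: " + ValueOrDefault(semantics, "size", "regular"));
        Console.WriteLine("Crust: " + ValueOrDefault(semantics, "crust", "thick crust"));
        Console.WriteLine("Topping: " + ValueOrDefault(semantics, "topping", "cheese"));
    }

    // Hypothetical helper: returns the spoken value for a key, or the
    // implied default when the user did not say that part of the order.
    static string ValueOrDefault(SemanticValue semantics, string key, string fallback)
    {
        return semantics.ContainsKey(key) ? semantics[key].Value.ToString() : fallback;
    }
}
```

Note that this formulation also accepts a crust without a size, which the permutation-based grammar in Figure 9 does not. Whether that is acceptable depends on how strict the ordering system needs to be.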

Posted 2013-11-29 10:05 by Ansel Song