SpeechRecognitionEngine

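There are two speech recognition options on Windows: the System.Speech namespace built into the .NET Framework, and the Microsoft.Speech namespace provided by the Microsoft Speech Platform SDK. The two are structured very similarly, and many classes and methods are identical. The former is simpler to set up, however: it uses the speech recognition built into Windows itself, so no SDK needs to be installed.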

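Since DR still wanted to write in Python, IronPython, a runtime that combines Python with the .NET Framework, was installed. The operating system used here is Windows 7 Ultimate (64-bit). After installing IronPython, add its installation directory, for example "C:\Program Files (x86)\IronPython 2.7", to the Windows "Path" environment variable so that IronPython programs can be run more conveniently.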
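To confirm that the Path change took effect, a small check like the following can help. It is only a sketch: the function name is_on_path is made up for illustration, and it uses the standard ntpath module so that Windows rules (case-insensitive names, optional trailing backslash) are applied to the comparison.

```python
import ntpath
import os

def is_on_path(directory, path_value=None):
    """Return True if `directory` is one of the entries of a Windows-style PATH.

    `path_value` defaults to the PATH of the current process.
    """
    if path_value is None:
        path_value = os.environ.get("PATH", "")

    def norm(p):
        # Drop surrounding quotes and whitespace, then normalize case,
        # slashes, and any trailing backslash using Windows rules.
        return ntpath.normcase(ntpath.normpath(p.strip().strip('"')))

    wanted = norm(directory)
    entries = [e for e in path_value.split(";") if e.strip()]
    return any(norm(entry) == wanted for entry in entries)
```

Calling is_on_path(r"C:\Program Files (x86)\IronPython 2.7") in a freshly opened command prompt should return True once the variable has been updated. A prompt that was already open keeps the old value.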

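Windows Speech Recognition supports English, French, German, Spanish, Japanese, and Chinese. First check the Speech Recognition options in Control Panel to confirm which languages the system currently supports. To support a language other than the system default, the corresponding Windows language pack must be installed as well. Installing the latest speech recognition language package (MSSpeech_SR_*_TELE.msi) is also recommended to improve accuracy.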
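The installed languages can also be checked from code. The sketch below assumes IronPython with System.Speech available and uses SpeechRecognitionEngine.InstalledRecognizers(), which returns one RecognizerInfo per installed recognizer. The helper names installed_cultures and choose_culture are hypothetical, introduced only for this example.

```python
def installed_cultures():
    """Return the culture names (e.g. "en-US") of all installed recognizers.

    Works only under IronPython, because it needs the clr module.
    """
    import clr  # provided by IronPython; not available in CPython
    clr.AddReference("System.Speech")
    from System.Speech.Recognition import SpeechRecognitionEngine
    return [info.Culture.Name
            for info in SpeechRecognitionEngine.InstalledRecognizers()]

def choose_culture(installed, preferred):
    """Return the first entry of `preferred` that is installed, else None.

    Culture names are compared case-insensitively.
    """
    available = set(name.lower() for name in installed)
    for name in preferred:
        if name.lower() in available:
            return name
    return None
```

Under IronPython, choose_culture(installed_cultures(), ["zh-TW", "en-US"]) gives a name that can be passed to CultureInfo(...), or None if neither language has a recognizer installed.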

The sample code (system_speech.py) uses the SetInputToWaveFile() method of the SpeechRecognitionEngine class which, as the name suggests, takes a WAV audio file as the input for speech recognition:

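# -*- coding: utf-8 -*-
import sys
import time
import clr

# Load the .NET assembly that provides the System.Speech namespace.
clr.AddReference("System.Speech")

from System.Globalization import CultureInfo
from System.Speech.Recognition import SpeechRecognitionEngine, DictationGrammar, RecognizeMode

# Set to 1 by recognize_completed() once recognition has finished.
completed = 0

def speech_recognized(sender, event):
    # Called once for each recognized phrase.
    print(event.Result.Text)

def recognize_completed(sender, event):
    global completed
    completed = 1

def recognition_engine(filename):
    global completed
    completed = 0
    # Free-text dictation grammar rather than a fixed set of commands.
    dictation = DictationGrammar()

    recognizer = SpeechRecognitionEngine(CultureInfo("en-US"))
    recognizer.LoadGrammar(dictation)
    recognizer.SetInputToWaveFile(filename)
    recognizer.SpeechRecognized += speech_recognized
    recognizer.RecognizeCompleted += recognize_completed
    # Multiple mode continues after each phrase instead of stopping at the first.
    recognizer.RecognizeAsync(RecognizeMode.Multiple)

    # Wait for completion; sleep so the loop does not spin the CPU.
    while completed == 0:
        time.sleep(0.1)

if __name__ == "__main__":
    if len(sys.argv) >= 2:
        recognition_engine(sys.argv[1])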

Run it with:

ipy system_speech.py test.wav
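Before handing a file such as test.wav to SetInputToWaveFile(), it can be useful to look at its format. The sketch below uses Python's standard wave module, which is assumed to be present in the IronPython installation. The function name describe_wav is made up for illustration, and which formats a given recognizer handles well is not covered here.

```python
import contextlib
import wave

def describe_wav(filename):
    """Return basic format information read from a WAV file header."""
    # contextlib.closing is used because wave objects are not context
    # managers in Python 2.7.
    with contextlib.closing(wave.open(filename, "rb")) as wav:
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_rate": rate,
            "bits_per_sample": wav.getsampwidth() * 8,
            "seconds": wav.getnframes() / float(rate),
        }
```

Printing describe_wav("test.wav") shows the channel count, sample rate, bit depth, and duration of the file. Note that the wave module only reads uncompressed PCM files and raises wave.Error for anything else.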
