Speech Recognition Systems – Microsoft Professional Program

This Speech Recognition Systems course introduces the components of a modern automatic speech recognition (ASR) system, covering fundamental acoustic and linguistic theory, data preparation, language modeling, acoustic modeling, and decoding.

About this course

This course is part of the Microsoft Professional Program in Artificial Intelligence.

Developing and understanding Automatic Speech Recognition (ASR) systems is an interdisciplinary activity that requires expertise in linguistics, computer science, mathematics, and electrical engineering.

When a person speaks a word, their voice produces a time-varying pattern of sounds. These sounds are waves of pressure that propagate through the air. The sounds are captured by a sensor, such as a microphone or microphone array, and turned into a sequence of numbers representing the pressure change over time. The automatic speech recognition system converts this time-pressure signal into a time-frequency-energy signal. The system has been trained on a curated set of labeled speech sounds, and it labels the sounds it is presented with. These acoustic labels are combined with a model of word pronunciation and a model of word sequences to create a textual representation of what was said.
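The conversion from a time-pressure signal to a time-frequency-energy signal can be illustrated with a short Python sketch. This is not taken from the course labs. The function name log_power_spectrogram, the 25 ms frame length, the 10 ms hop, the Hamming window, and the synthetic 1000 Hz test tone are all assumptions chosen for illustration.

```python
import numpy as np


def log_power_spectrogram(signal, sample_rate, frame_ms=25.0, hop_ms=10.0):
    """Return a (num_frames, num_bins) array of log energy per time frame and frequency bin.

    Hypothetical helper: the frame and hop sizes are assumed values,
    not settings prescribed by the course.
    """
    frame_len = int(round(sample_rate * frame_ms / 1000.0))
    hop_len = int(round(sample_rate * hop_ms / 1000.0))
    if len(signal) < frame_len:
        raise ValueError("signal is shorter than one frame")

    # Slice the waveform into overlapping frames and taper each one.
    num_frames = 1 + (len(signal) - frame_len) // hop_len
    window = np.hamming(frame_len)
    frames = np.stack([
        signal[i * hop_len: i * hop_len + frame_len] * window
        for i in range(num_frames)
    ])

    # Energy in each frequency bin of each frame, on a log scale.
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    return np.log(power + 1e-10)  # small constant avoids log(0)


# Synthetic stand-in for a microphone recording: one second of a 1000 Hz tone.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 1000 * t)

spec = log_power_spectrogram(tone, sample_rate)
n_fft = 2 * (spec.shape[1] - 1)
peak_bin = int(np.argmax(spec.mean(axis=0)))
peak_hz = peak_bin * sample_rate / n_fft
print(spec.shape, peak_hz)
```

Each row of the result describes one short slice of time, and each column describes one frequency band, so the array is a simple time-frequency-energy representation. For the test tone, the strongest column should sit at the tone's frequency.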

Instead of exploring one part of this process deeply, this course is designed to give an overview of the components of a modern ASR system. In each lecture, we describe a component’s purpose and general structure. In each lab, the student creates a functioning block of the system. At the end of the course, we will have built a speech recognition system almost entirely out of Python code.

What you’ll learn

  • Fundamentals of Speech Recognition
  • Basic Signal Processing for Speech Recognition
  • Acoustic Modeling and Labeling
  • Common Algorithms for Language Modeling
  • Decoding Acoustic Features into Speech

Prerequisites

  • Some Python experience
  • Basic Machine Learning principles
  • Knowledge of probability and statistics

Meet the instructors

Adrian Leven

Content Developer
Microsoft Corporation

Adrian Leven is a Content Developer at Microsoft Learning with a focus on Human-Computer Interaction. He received his B.S. in Computer Science from Stanford University.

Start learning Speech Recognition Systems

You can enroll now for the Speech Recognition Systems course at our DataChangers Academy! Want to learn more? Check out our other Data & AI courses.

Please use a Windows Live ID email address to register at the DataChangers Academy if you want to obtain a certificate after finishing the courses.

Show what you know and get a certificate

After finishing this course, you can obtain a Microsoft Professional Program certificate. To obtain the certificate, you can buy a voucher from us (in collaboration with MD2C).

 
