Description
In this course, you will learn:
- How to integrate Google's powerful Speech-to-Text AI models into a Python program.
- The main use cases for Speech-to-Text (STT), along with an overview of the API.
- How to run API demo code that creates a transcription of an audio file. Don't worry, you'll go over each line of code to ensure you understand it.
- About recognition configuration, speech adaptation, and the various speech recognition models.
- About the word error rate and how to calculate transcription accuracy.
- How to incorporate STT into your own Python projects, giving you a valuable new skill to add to your resume.
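To give a feel for what recognition configuration, speech adaptation, and model selection look like in practice, here is a minimal sketch that only builds a JSON request body in the shape used by the Speech-to-Text v1 REST `recognize` method. It does not call the API. The helper name `build_recognize_request`, the bucket URI, and the example values are hypothetical and are not taken from the course material. Audio format fields such as the encoding are left out on the assumption that the file header (for example WAV or FLAC) carries them.

```python
import json


def build_recognize_request(gcs_uri, phrases=None, boost=None, model=None,
                            use_enhanced=False, punctuation=True,
                            language_code="en-US"):
    """Build a request body dict for a recognize call (illustrative helper only)."""
    config = {
        "languageCode": language_code,
        "enableAutomaticPunctuation": punctuation,
    }
    if model:
        # Recognition model, for example "phone_call".
        config["model"] = model
    if use_enhanced:
        # Ask for the enhanced variant of the chosen model.
        config["useEnhanced"] = True
    if phrases:
        # Speech adaptation: phrase hints, optionally weighted by a boost value.
        context = {"phrases": list(phrases)}
        if boost is not None:
            context["boost"] = boost
        config["speechContexts"] = [context]
    return {"config": config, "audio": {"uri": gcs_uri}}


if __name__ == "__main__":
    # "gs://example-bucket/call.wav" is a placeholder, not a real object.
    request = build_recognize_request(
        "gs://example-bucket/call.wav",
        phrases=["Speech-to-Text"],
        boost=10.0,
        model="phone_call",
        use_enhanced=True,
    )
    print(json.dumps(request, indent=2))
```

Keeping the configuration in a plain dictionary makes it easy to see which options are set before any request is sent. The official Python client library expresses the same options through its own configuration classes.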
Syllabus:
1. Your First Program
- Simple Transcription Demo Code
- Demo Code Deep Dive
- Quiz
2. Recognition Configuration
- Create a Storage Bucket and Transcribe an Audio File
- Punctuation
- Multichannel Audio (Stereo)
- Enhancing a Multichannel Transcript
- Speaker Diarization
3. Speech Adaptation
- Overview of Speech Adaptation
- Phrases
- Classes
- Boost
- Boost Tuning
4. Models
- Choosing the Recognition Model
- Enhanced Phone Call Model
- Summary Quiz
5. Word Error Rate (WER)
- How to Measure Transcription Accuracy
- Identifying the Right Boost Value
- WER Demo
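
As a rough illustration of the accuracy measurement covered in the last section, word error rate is commonly defined as WER = (S + D + I) / N, where S, D, and I are the numbers of substituted, deleted, and inserted words in the transcript and N is the number of words in the reference. The sketch below computes it with a word-level edit distance. The function name `word_error_rate` and the simple normalization (lowercasing and splitting on whitespace) are assumptions made for this example and may differ from what the course demo uses.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return (substitutions + deletions + insertions) / reference word count."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    reference = "the quick brown fox"
    hypothesis = "the quick brown box"
    print(word_error_rate(reference, hypothesis))  # one substitution in four words: 0.25
```

Because insertions count as errors, the value can exceed 1.0 when the transcript contains many more words than the reference. Comparing the WER of transcripts produced with different boost values is one way to decide which value works best.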