How to extract Speech from Video using Python?

Author: neptune | 01st-Dec-2022 | Views: 3733
#Python

Extract text using Google Speech Recognition API.

In this article, we will extract speech from a video using the Google Speech Recognition API and save the recognized text to a file. It is a simple machine learning task built on the SpeechRecognition library. Speech recognition is one of the most common applications of machine learning and is now used in many fields.


For example, the automatically generated subtitles that we see on Amazon Prime, Netflix, and YouTube videos are familiar examples of Artificial Intelligence applied to speech recognition.

Steps to follow:

  • Step 1: A little understanding of the API
  • Step 2: Video to Audio Conversion
  • Step 3: Audio to Text Conversion
  • Final Step: Enjoy your day


Libraries used for conversion:

We are going to use two libraries for this conversion task.

  • SpeechRecognition
  • MoviePy

Before we start, let's install them if you haven't already. Installing a library is very easy in Python: a single pip command is enough.





Run the following command in your terminal window:

pip install SpeechRecognition moviepy

The SpeechRecognition module supports several recognition APIs; we are going to use the Google Speech API here. You can learn more about the module in its documentation.

MoviePy is a library that can read and write most common audio and video formats, including GIFs.

Now for the most important part: the code. Open your editor and start by importing the libraries.
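Before writing the conversion code, it can help to confirm that both packages are actually installed. The sketch below is one way to do that with the standard library; the helper name installed_versions is made up for this illustration and is not part of either library.

```python
from importlib import metadata


def installed_versions(packages=("SpeechRecognition", "moviepy")):
    """Return a dict mapping each package name to its installed version.

    A package that is not installed maps to None.
    (Hypothetical helper, written for this article only.)
    """
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

If a package shows up as "not installed", run the pip command above again in the same environment that your editor uses.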



Video to Audio Conversion

Now, we will convert the video into an audio file. Many video formats are available, such as MP4, 3GP, OGG, and WMV. Common audio formats include MP3, AAC, WMA, and AC3 (Dolby Digital). We should know our video’s format so that the conversion goes through without problems.
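As a rough illustration of checking the format first, the sketch below classifies a file by its extension. The helper name media_kind and the two extension sets are assumptions for this example; they only cover the formats listed above, plus WAV. An extension is only a hint, not proof of what the file really contains.

```python
from pathlib import Path

# Extensions taken from the formats mentioned in this article; extend as needed.
# Note: .ogg can also hold audio only, it is listed as video here as in the text.
VIDEO_EXTENSIONS = {".mp4", ".3gp", ".ogg", ".wmv"}
AUDIO_EXTENSIONS = {".mp3", ".aac", ".wma", ".ac3", ".wav"}


def media_kind(filename):
    """Return "video", "audio" or "unknown" based on the file extension."""
    suffix = Path(filename).suffix.lower()
    if suffix in VIDEO_EXTENSIONS:
        return "video"
    if suffix in AUDIO_EXTENSIONS:
        return "audio"
    return "unknown"


if __name__ == "__main__":
    for name in ("sample1.mp4", "Converted_audio.wav", "notes.txt"):
        print(name, "->", media_kind(name))
```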

Now, let's do the conversion using the MoviePy library. It is very easy.

# Import the library
# Note: moviepy.editor and subclip() belong to MoviePy 1.x;
# MoviePy 2.x uses "from moviepy import VideoFileClip" and subclipped().
import moviepy.editor as mp

# Load the video and keep only the part between 10 s and 100 s.
# subclip(start_time, end_time) selects a portion of the video;
# remove the subclip() call to convert the complete video.
clip = mp.VideoFileClip(r"sample1.mp4").subclip(10, 100)

# Write the audio track to the Converted_audio.wav file.
clip.audio.write_audiofile(r"Converted_audio.wav")
print("Finished the conversion into audio...")

I recommend converting it to WAV format. It works well with the SpeechRecognition library, whose AudioFile class reads WAV, AIFF, and FLAC files but not MP3. We will use it in the next step.
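To confirm that the exported file is a readable WAV and to see how long it is, a small check with Python's built-in wave module can help. This is a minimal sketch; the helper name wav_info is an assumption for illustration, and it expects an uncompressed PCM WAV file.

```python
import wave


def wav_info(path):
    """Return channel count, sample rate and duration of a PCM WAV file."""
    with wave.open(path, "rb") as wav_file:
        frames = wav_file.getnframes()
        rate = wav_file.getframerate()
        return {
            "channels": wav_file.getnchannels(),
            "sample_rate": rate,
            "duration_seconds": frames / rate,
        }


if __name__ == "__main__":
    import sys

    # Usage: python wav_info.py Converted_audio.wav
    if len(sys.argv) > 1:
        print(wav_info(sys.argv[1]))
```

Knowing the duration up front is useful because long recordings are more likely to fail when sent to the recognition API in one request.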

 



Audio to text conversion

In this step, we will convert the audio into text using the “recognize_google” method, which sends the audio to the Google Speech API. Finally, we save the text to a file named recognized.txt.

# Import the library
import speech_recognition as sr

# Point to the audio file (it is opened in the "with" block below)
audio = sr.AudioFile("Converted_audio.wav")

# Here the magic starts
# Create an instance of Recognizer as r
r = sr.Recognizer()

# Read the whole audio file into memory
with audio as source:
    audio_file = r.record(source)
print("Audio file read...")

# Here we get our text
result = r.recognize_google(audio_file)

# Now we store the text in a file
with open('recognized.txt', mode='w') as file:
    file.write(result)

print("Wooh.. You did it...")

If you get an error such as a broken pipe, reduce the duration of the audio file, for example with “.subclip(starttime, endtime)” as shown in the code. If you still face any other error, let me know in the comment section.
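Another way to keep each request short is to transcribe the audio in pieces. The sketch below is one possible approach, not the method used in this article: the names chunk_ranges and transcribe_in_chunks and the 30-second chunk length are assumptions for illustration. It relies on the offset and duration parameters of Recognizer.record. Words that fall on a chunk boundary may be cut, so treat the result as approximate.

```python
def chunk_ranges(total_seconds, chunk_seconds=30.0):
    """Return (offset, duration) pairs that cover total_seconds in order."""
    if chunk_seconds <= 0:
        raise ValueError("chunk_seconds must be positive")
    ranges = []
    offset = 0.0
    while offset < total_seconds:
        duration = min(chunk_seconds, total_seconds - offset)
        ranges.append((offset, duration))
        offset += chunk_seconds
    return ranges


def transcribe_in_chunks(wav_path, total_seconds, chunk_seconds=30.0):
    """Transcribe a WAV file piece by piece and join the recognized text."""
    # Imported here so that chunk_ranges() can be used without the library.
    import speech_recognition as sr

    r = sr.Recognizer()
    pieces = []
    for offset, duration in chunk_ranges(total_seconds, chunk_seconds):
        # Re-open the file and read only the current slice.
        with sr.AudioFile(wav_path) as source:
            audio_chunk = r.record(source, offset=offset, duration=duration)
        try:
            pieces.append(r.recognize_google(audio_chunk))
        except sr.UnknownValueError:
            # No intelligible speech in this slice; skip it.
            continue
    return " ".join(pieces)


if __name__ == "__main__":
    # A 90-second clip is split into three 30-second requests.
    print(chunk_ranges(90, 30))
```

For the clip used in this article, which runs from second 10 to second 100, the call would be transcribe_in_chunks("Converted_audio.wav", 90). Network or API failures (sr.RequestError) are deliberately not caught here, so they stay visible.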






We did it! We have finally got our text. We have created a program that converts a video into an audio file, extracts the speech from that audio, and exports the recognized speech to a text document.


Here is the complete code:

# Import the libraries
# Note: moviepy.editor and subclip() belong to MoviePy 1.x;
# MoviePy 2.x uses "from moviepy import VideoFileClip" and subclipped().
import speech_recognition as sr
import moviepy.editor as mp

# Load the video and keep only the part between 10 s and 100 s.
# subclip(start_time, end_time) selects a portion of the video;
# remove the subclip() call to convert the complete video.
clip = mp.VideoFileClip(r"sample1.mp4").subclip(10, 100)

# Write the audio track to the Converted_audio.wav file.
clip.audio.write_audiofile(r"Converted_audio.wav")
print("Finished the conversion into audio...")


# From here on, we convert the audio into text

# Point to the audio file (it is opened in the "with" block below)
audio = sr.AudioFile("Converted_audio.wav")

# Here the magic starts
# Create an instance of Recognizer as r
r = sr.Recognizer()

# Read the whole audio file into memory
with audio as source:
    audio_file = r.record(source)
print("Audio file read...")

# Here we get our text
result = r.recognize_google(audio_file)

# Now we store the text in a file
with open('recognized.txt', mode='w') as file:
    file.write(result)

print("Wooh.. You did it...")

I hope you enjoyed reading this post and working on the project, and that you learned something new today. Working on hands-on programming projects like this one is the best way to sharpen your coding skills.

Link to the complete project on GitHub.

Thanks for reading!





