Using AI for Good Video Creation
In this episode, Jerrel Arkes and Joop Snijder discussed using AI to create good videos, focusing on Descript, an AI-powered tool for converting speech into text.
Joop shared his experience with Descript, noting that it can detect voices in audio files and “perform the speech to text and label each part of the text to the right person.” He added that when voices overlap, the tool has difficulty attributing sentences to the right speaker.
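To picture why overlapping voices are hard to label, here is a minimal sketch in Python. It is not how Descript works internally, which the episode does not describe. It assumes that one step has produced timed speaker turns and another has produced timed words, and it gives each word to the speaker whose turn overlaps it most. The function names, the tuple layout and the `margin` threshold are all hypothetical, chosen only for illustration.

```python
def overlap(a_start, a_end, b_start, b_end):
    """Length in seconds of the overlap between two time ranges (0.0 if none)."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def label_words(words, turns, margin=0.05):
    """Assign each word to the speaker whose turns overlap it the most.

    words: list of (text, start, end) tuples, times in seconds (assumed input)
    turns: list of (speaker, start, end) tuples (assumed input)
    Returns a list of (text, speaker), with speaker set to None when no
    speaker clearly wins, for example when two people talk at once.
    """
    labeled = []
    for text, w_start, w_end in words:
        # Total overlap between this word and each speaker's turns.
        scores = {}
        for speaker, t_start, t_end in turns:
            scores[speaker] = scores.get(speaker, 0.0) + overlap(
                w_start, w_end, t_start, t_end
            )
        ranked = sorted(scores.values(), reverse=True)
        best = ranked[0] if ranked else 0.0
        runner_up = ranked[1] if len(ranked) > 1 else 0.0
        if best == 0.0 or best - runner_up < margin:
            # No speaker, or two speakers cover the word almost equally.
            labeled.append((text, None))
        else:
            winner = max(scores, key=scores.get)
            labeled.append((text, winner))
    return labeled
```

In this sketch, a word spoken while two turns overlap gets no label at all, which mirrors the difficulty Joop describes: the timing alone does not say who spoke.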
Joop went on to explain how Descript is an asset to content marketing. He noted, “after removing words like the hums, it instantly edits the audio and video together to keep them in sync.” He also noted that the tool can be used to “remove some sentences that are not fluent,” and “overdub words with [his] own voice.”
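The idea of editing media by editing text can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Descript's actual implementation: it assumes a word-level transcript with start and end times, and the `Word` type, the `FILLERS` set and the `keep_segments` function are hypothetical names invented here. Removing filler words from the transcript yields a list of time ranges to keep, and cutting both the audio and the video on those same ranges is what keeps them in sync.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    start: float  # seconds from the beginning of the recording
    end: float


# Hypothetical filler list; a real tool would likely use a richer one.
FILLERS = {"um", "uh", "hmm"}


def keep_segments(words, fillers=FILLERS):
    """Return (start, end) time ranges that cover every non-filler word.

    Consecutive kept words are merged into one range, so pauses between
    them are preserved; a new range starts after each removed filler.
    """
    segments = []
    previous_kept = False
    for word in words:
        token = word.text.lower().strip(".,!?")
        if token in fillers:
            previous_kept = False
            continue
        if previous_kept:
            # Extend the current range to the end of this word.
            start, _ = segments[-1]
            segments[-1] = (start, word.end)
        else:
            segments.append((word.start, word.end))
        previous_kept = True
    return segments


transcript = [
    Word("So", 0.0, 0.3),
    Word("um", 0.3, 0.8),
    Word("Descript", 0.8, 1.4),
    Word("works", 1.4, 1.9),
]
print(keep_segments(transcript))
```

Applying one list of ranges to every track is the design choice that matters here: because the audio and the video are cut at identical timestamps, they cannot drift apart.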
Joop shared that Descript uses AI for overdubbing. He said, “I have trained my voice with the script they provide by recording 10 minutes of my voice. It can reproduce my voice by training an AI model out of my recording.”
When asked about the results, Joop said, “[It] sounds great.” He did note, however, that when the model had not been trained on a word he was trying to overdub, it “substitutes the word with gibberish.”
Other Benefits of Descript
Joop shared that the tool “saves [him] a lot of time,” and that he can use the transcript to “transform all the text into a blog post, article or get a summary of the transcript.” He added that he can also “use it to type the text that the video avatar of [himself] should say in [his] voice.”
“It performs the speech to text and it labels each part of the text to the right person,” said Joop Snijder. “And now that I have the transcript of the podcast, I can edit our podcast not by editing the audio file, but by editing the text.”
In conclusion, Jerrel Arkes and Joop Snijder agreed that Descript is a great tool for content marketing. Joop noted that it is “definitely worth the time and money” and that it can be used “for subtitles for videos, or get[ting] a transcription of some audio.”