SEE HOW THIS AI CAN CLONE YOUR VOICE IN 5 SECONDS

Creating a cloned voice normally requires collecting a dataset and then training a model on it. This model, based on the paper ‘Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis’ (SV2TTS), can clone a voice from just 5 seconds of audio input. Users can play an audio file from a dataset or use their own audio clip. A single clip can give impressive results, but the output improves when the model is given three or more clips. After the clips are loaded, the interface displays a map of their embeddings. Each speaker’s embedding can then be applied to a random utterance or to the user’s own text input to produce that text in the cloned voice.
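
To illustrate why several clips can help, here is a minimal sketch. It assumes each clip has already been turned into a fixed-length embedding vector, and it combines clips by L2-normalizing and averaging them. That is one simple approach chosen for illustration, not necessarily what the toolbox does internally. The embeddings are random toy vectors, the dimension of 256 is an arbitrary choice, and `speaker_embedding` and `cosine_similarity` are hypothetical helper names, not part of any library.

```python
import math
import random


def l2_normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def speaker_embedding(clip_embeddings):
    """Hypothetical helper: average the normalized per-clip embeddings,
    then normalize the result again."""
    normalized = [l2_normalize(e) for e in clip_embeddings]
    dim = len(normalized[0])
    mean = [sum(e[i] for e in normalized) / len(normalized) for i in range(dim)]
    return l2_normalize(mean)


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    a, b = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a, b))


# Toy data (an assumption, not real audio): a "true" voice vector, plus
# three noisy observations standing in for embeddings of three short clips.
rng = random.Random(0)
true_voice = [rng.gauss(0, 1) for _ in range(256)]
clips = [[v + rng.gauss(0, 1) for v in true_voice] for _ in range(3)]

single = cosine_similarity(clips[0], true_voice)
averaged = cosine_similarity(speaker_embedding(clips), true_voice)
print(f"one clip:    {single:.3f}")
print(f"three clips: {averaged:.3f}")
```

In this toy setup the averaged embedding tends to land closer to the underlying voice vector than any single clip does, because the per-clip noise partly cancels out. Real clips differ in more structured ways than random noise, so treat this only as intuition.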

Voice cloning technology is relatively accessible on the Internet today. Various AI-based companies have built models that synthesize a person’s voice from only a few audio samples.

This toolbox, however, stands out as one of the fastest tools of its kind!