MyShell is an AI service provider specializing in online identity creation. It has released the OpenVoice model for voice mimicry, which can work from voice samples that are not widely used.
Research on AI voice-mimicry models keeps growing, and OpenVoice stands out for its fine control over voice dynamics, including tone and rhythm, which results in more realistic voices.
The model can be divided into two parts: text-to-speech conversion and voice alignment. The text-to-speech part converts written text into spoken words, and the alignment part then adjusts that speech to match the target voice. This second part is known as the Tone Color Converter.
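The two-part design described above can be pictured as a small pipeline. The sketch below is a toy illustration only: it handles no audio, and every name in it (Speech, base_tts, extract_tone_color, convert_tone_color, clone_voice) is hypothetical and is not OpenVoice's actual API. It also assumes that style attributes such as rhythm are set in the first stage, while the second stage changes only the speaker's tone color.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Speech:
    """Toy stand-in for an audio clip: what is said, how, and in whose voice."""
    words: str
    rhythm: str       # style attribute, assumed to be set by the first stage
    tone_color: str   # speaker identity (timbre)


def base_tts(text: str, rhythm: str = "neutral") -> Speech:
    """Stage 1 (hypothetical): speak the text in a fixed base-speaker voice."""
    return Speech(words=text, rhythm=rhythm, tone_color="base-speaker")


def extract_tone_color(reference: Speech) -> str:
    """Hypothetical helper: read the tone color from a reference clip."""
    return reference.tone_color


def convert_tone_color(speech: Speech, target_tone_color: str) -> Speech:
    """Stage 2 (hypothetical): swap only the tone color, keep words and style."""
    return replace(speech, tone_color=target_tone_color)


def clone_voice(text: str, reference: Speech, rhythm: str = "neutral") -> Speech:
    """Run both stages: text-to-speech first, then match the target voice."""
    spoken = base_tts(text, rhythm=rhythm)
    return convert_tone_color(spoken, extract_tone_color(reference))


reference_clip = Speech(words="hello", rhythm="fast", tone_color="target-speaker")
result = clone_voice("Good morning.", reference_clip, rhythm="slow")
print(result)
```

In this sketch the output keeps the requested text and rhythm and takes only the tone color from the reference clip, which is the separation of concerns the two-part description suggests.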
Although the model and its weights are available for download, they are limited to non-commercial use. MyShell also pointed out that there may be methods to detect whether a voice has been generated with the OpenVoice model.
Source: arXiv, GitHub
TLDR: MyShell offers the OpenVoice model for voice mimicry, which gives fine control over voice dynamics. The model consists of text-to-speech conversion and voice alignment, it is available for non-commercial use only, and voices it generates may be detectable.