Audio and Video Translation (Public Beta)

The audio and video function is currently in public beta. Feedback on your experience is welcome, and I will improve the function as much as I can.

Please also note that the audio and video function relies on Tencent's real-time speech recognition service to convert audio to text. Whether or not you are a member, using this service currently requires purchasing translation points separately. This is because calling the service requires additional information besides the key, and 划词翻译 does not yet support entering that information. I will change this in a future version so that the service can be used with a key; progress can be followed in Issue #1569.

As of v8.3.0, 划词翻译 includes an audio and video translation function that translates the audio and video on web pages.

The audio and video translation function can be opened in either of the following ways:

  • Right-click on the webpage and select "划词翻译" -> "Audio and Video Translation" in the context menu
  • Click the "Video" icon in the upper left corner of the translation panel

After opening the audio and video translation function, a pop-up will appear in the lower right corner of the webpage, and the audio and video on the webpage will automatically pause. Then you can choose one of the following two ways to translate the audio and video on the webpage.

The audio and video function is in internal testing

The audio and video function was tested on YouTube and Bilibili during development and should in theory work on other video websites as well. Some video websites behave differently, however: after you open the audio and video function, it may show "No audio and video on the current page" or fail to recognize any text. If you encounter such a website, please report it to me and I will adapt the function to it.

First: Translate Sentence by Sentence

After you click the "Play video and start recognition" button, the video starts playing and the button changes to "Stop recognition and translate". 划词翻译 then starts receiving the video's speech for recognition, and the recognized text is updated in real time below the button. Text that is still being recognized is shown in gray, and text whose recognition is complete is shown in white.

You can click the "Stop recognition and translate" button at any time. After clicking, 划词翻译 will send the recognized text to the translation panel for translation.
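
The gray and white display described above can be pictured with a small sketch. This is not 划词翻译's actual code: the `Segment` and `Transcript` types and the `applyUpdate`, `render`, and `textToTranslate` functions are hypothetical names, and the sketch assumes that the recognition service sends interim results that are later replaced by a final result.

```typescript
// Hypothetical model of a live transcript; not 划词翻译's actual code.
type Segment = { text: string; final: boolean };

interface Transcript {
  segments: Segment[];
}

// Apply one recognition update. An interim result replaces the trailing
// interim segment (if any); a final result replaces it and locks it in.
function applyUpdate(t: Transcript, text: string, final: boolean): Transcript {
  const last = t.segments[t.segments.length - 1];
  const kept = last && !last.final ? t.segments.slice(0, -1) : t.segments;
  return { segments: [...kept, { text, final }] };
}

// Pair each segment with its display colour: gray while it may still
// change, white once its recognition is complete.
function render(t: Transcript): Array<{ text: string; color: "gray" | "white" }> {
  return t.segments.map((s) => ({ text: s.text, color: s.final ? "white" : "gray" }));
}

// The text that would be handed to the translation panel when
// recognition is stopped.
function textToTranslate(t: Transcript): string {
  return t.segments.map((s) => s.text).join(" ").trim();
}
```

In this sketch, stopping recognition amounts to calling `textToTranslate` on whatever has been collected so far, which is why you decide where each sentence or passage ends.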

Second: Real-time Translation

After you click the "Play video and start translation" button, 划词翻译 starts recognizing the video's speech as text, and the text is updated in real time below the button. When a complete sentence has been recognized, the text and its translation are displayed as real-time subtitles at the top of the webpage, in white text on a black background.
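
As a rough illustration of "when a complete sentence is recognized", the sketch below buffers recognized text and releases it one sentence at a time. It is an assumption-laden sketch rather than 划词翻译's implementation: the `SubtitleBuffer` class is a hypothetical name, and treating sentence-ending punctuation as the break signal is an assumption (the real service reports sentence breaks in its own way).

```typescript
// Hypothetical sentence buffer; the punctuation set below is an assumption.
const SENTENCE_END = /[.!?。！？]/;

class SubtitleBuffer {
  private pending = "";

  // Feed newly recognized text; returns the sentences it completed.
  push(chunk: string): string[] {
    this.pending += chunk;
    const done: string[] = [];
    let idx = this.pending.search(SENTENCE_END);
    while (idx !== -1) {
      const sentence = this.pending.slice(0, idx + 1).trim();
      if (sentence) done.push(sentence);
      this.pending = this.pending.slice(idx + 1);
      idx = this.pending.search(SENTENCE_END);
    }
    return done;
  }

  // Text still waiting for a sentence break.
  get rest(): string {
    return this.pending.trim();
  }
}
```

Each sentence returned by `push` would be translated and shown as a subtitle. If no sentence break arrives for a while, text keeps accumulating in the buffer and is eventually released as one large block, which is the effect the warning below describes.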

Note: 划词翻译 v8.3.0 uses [Google Translate] by default to get the translation result; later versions will let you choose the translation service.

warning

The Tencent Cloud real-time speech recognition service used by 划词翻译 currently cannot break sentences accurately. In videos with dense speech, several sentences may be spoken before 划词翻译 displays one large block of translation results. In that case, sentence-by-sentence translation is recommended. Future versions of 划词翻译 will integrate more speech recognition services to improve recognition accuracy.

How should I choose?

If you want more accurate translation results, choose "Translate Sentence by Sentence". This mode lets you break sentences manually, which also improves translation accuracy. In addition, because the text is translated in the translation panel, results from multiple translation services are displayed for comparison.

If you want translation results displayed in real time as the video plays, choose "Real-time Translation". In this mode, sentence breaks are determined automatically but not accurately, so real-time translation trades translation accuracy for "almost" real-time subtitles. In addition, real-time translation displays the result of only one translation service (currently fixed to [Google Translate]).

Other considerations

The speech recognition service built into 划词翻译 does not support the audio and video translation function, so only Tencent Cloud's real-time speech recognition service can be used. To use it, purchase translation points on the [service application] page.

If you run out of translation points during use, the recognition service disconnects immediately. To avoid interruptions, please make sure you have enough translation points.