Creating subtitles for recordings

Directly to the subtitles styleguide

Congress is over, and now it's time to create something persistent. We'd like to create subtitles for all Congress talks, ideally both in the talk's native language and as a German/English translation (or even other languages), provided we find enough volunteers. This is where we need your help.

From our first experiences:

New: If there are talks waiting for quality checks, consider doing those first before starting new transcriptions of other talks. This will give you a better feeling for what to watch out for when transcribing. But still:

Please read this manual carefully!

Please never click the “Publish” button in Amara before the transcription of the talk is completely done!

Please do not start any translation or any timing of the talk (do not time manually at all!) until the transcription in the talk's original language is completely done!

Please do not click “Finish transcribing” in the subtitles interface before the transcription of the talk is completely done!

Please only use the Amara links generated by our subtitles interface. We can only link subtitles created there.
Do not create your own video links on Amara or use links from dubious YouTube accounts if you want your subtitles to find their way into the CDN!

If you have any questions, please join us on IRC!

Thank you!

In short: how a subtitle is made

This is roughly what our current process looks like:

  • We: Provide all necessary data in the database and create the amara links and the etherpad
  • We: Fill the etherpad with the auto transcript
  • You: Use for example www.otranscribe.com to work on the transcript in the pad and to format it according to our styleguide.
  • You: Click the “Finished Transcribing” button on the corresponding page of the talk on www.c3subtitles.de if a transcript is finished.
  • We: Format and auto time the transcript
  • We: Upload the transcript with the timestamps to amara
  • You: Watch and check the talk in the amara interface and click “Publish” in amara when the talk is finished.
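
The hand-offs in the list above can be sketched as a small table of steps. This is only an illustration of the order of responsibilities, not a script we actually run; the names `STEPS`, `next_step` and `volunteer_steps` are made up for this example.

```python
# Illustrative only: the steps from the list above, in order.
# "we" = subtitles team, "you" = volunteer. All names here are hypothetical.
STEPS = [
    ("we", "provide data, create the Amara links and the Etherpad"),
    ("we", "fill the Etherpad with the auto transcript"),
    ("you", "correct and format the transcript in the pad"),
    ("you", "click 'Finished Transcribing' on c3subtitles.de"),
    ("we", "format and auto-time the transcript"),
    ("we", "upload the timed transcript to Amara"),
    ("you", "check the talk in Amara and click 'Publish'"),
]


def next_step(completed):
    """Return (actor, description) of the next step, or None when all are done.

    `completed` is the number of steps already finished.
    """
    if not 0 <= completed <= len(STEPS):
        raise ValueError("completed must be between 0 and the number of steps")
    return STEPS[completed] if completed < len(STEPS) else None


def volunteer_steps():
    """Return the descriptions of the steps that volunteers carry out."""
    return [description for actor, description in STEPS if actor == "you"]
```

For example, `next_step(4)` returns the auto-timing step, which is ours: at that point there is nothing for you to do on that talk until the timed transcript shows up in Amara.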

Situation

The subtitles created live can rarely be used to create subtitle tracks. Most of the time their quality is too poor to save you any time. Creating clean, recording-quality subtitles from scratch is usually faster.

Should there be a talk where that's not the case and you remember especially good live subtitles, please write a mail to subtitles -!-at-!- lists.cccv.de.

Our quality requirements for recording subtitles differ from those for live subtitles. Unlike what is common elsewhere, we want our subtitles to be as accurate and as close as possible to what was actually said. Apart from fillers like “ähhs” and “ehms”, which should be left out, the text should represent the talk as closely as possible. This time, of course, without errors and with a consistent style.
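
If you want to double-check a transcript for leftover fillers, a small helper along these lines may be handy. It is only a sketch: the pattern covers a few assumed spellings ("äh", "ähm", "ehm", "uh", "uhm"), is not taken from the styleguide, and `find_fillers` is a made-up name. It only flags candidates, so you still decide what to remove.

```python
import re

# Assumed filler spellings; the styleguide may name others.
FILLER_RE = re.compile(r"\b(?:ä+h+m*|e+h+m+|u+h+m*)\b", re.IGNORECASE)


def find_fillers(text):
    """Return (offset, word) pairs for likely filler words in a transcript."""
    return [(match.start(), match.group(0)) for match in FILLER_RE.finditer(text)]
```

Calling `find_fillers` on the pad content gives you the positions to look at, without changing the text itself.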

Meanwhile, we use transcripts from services like trint.com as our base. As a result, transcripts no longer have to be written by hand; they only need to be corrected.

Overview-page for recording-subtitles

On our overview page there's an interface showing the progress and processing status of the subtitles of past Congresses.

Talks not yet transcribed are shown in gray, parts currently being transcribed in red, parts that still have to be processed in other ways in yellow, and completed subtitles in green. The bars represent the actual durations of all Congress talks that have been released as video.

As soon as an event is selected, an overview of all talks of that event is displayed.

When a talk is selected a detailed view is shown.

Click on “etherpad” to find the auto transcript. If the link is missing for a talk or does not work, please talk to us, preferably via IRC. We will then generate the transcript and/or make it public. You do not need an account for this step. Please mark your progress on the c3subtitles page.

You do not need to log in to our interface on c3subtitles.de to create subtitles. However, you will have to log in to Amara later on for the quality control.

Work on the transcript on otranscribe.com

otranscribe has turned out to be quite easy and effective to use.

otranscribe has the advantage that you can use a local video file and adapt the shortcuts for controlling the video to your needs. You do not need to switch windows between the video player and the text editor.

Copy the content of the etherpad into otranscribe when you work on it, and copy it back into the etherpad afterwards.

Please follow our styleguide, which you can find here: Style guide

Once the transcript is finished, please click “Finished Transcribing” in the c3subtitles interface. This tells us that the transcript is ready for auto timing.

(Auto-)Timing

When the transcript of a video is done, it needs to be timed to match the video. The timing is actually supposed to be done in Amara, but in many cases it is quite useful to do this with YouTube auto timing and correct it later in the quality control step.

Please don't spend your efforts on timing by hand. Rather use your time for reviewing automatically timed videos and correcting errors.

When you have finished transcribing a talk and pressed the button for the finished transcript, the process is started automatically.

During this time it doesn't make any sense to work on that video. Once the timing is done, the display of the talk will change to “quality control in progress”.

Quality control

After auto-timing, the subtitle track has to be checked for blunders and mistakes. Ideally that's done by a different person than the one who transcribed the talk. You might find someone on IRC to do QC on your subtitles.

However, if you can't find anyone, it's still better to do it yourself than not to publish the work.

The most important thing is to check for (auto-)timing errors and to adjust the timing if necessary.
That is again done in Amara.

At this point most subtitle blocks are formatted as two lines with a maximum length of 42 characters each. Please do not change this; it is done this way intentionally. It improves readability by creating blocks that are shown for as long as possible, which matters especially in fast tech talks.
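
If you edit blocks during quality control and want to make sure they still fit this layout, the rule can be checked with a few lines of code. This is a minimal sketch under the limits stated above (two lines, 42 characters each); `block_problems` is a made-up name and the function is not part of our tooling or of Amara.

```python
MAX_LINES = 2   # lines per subtitle block, as described above
MAX_CHARS = 42  # characters per line, as described above


def block_problems(block):
    """Return a list of human-readable problems for one subtitle block.

    `block` is the text of a single subtitle, with lines separated by "\n".
    An empty list means the block fits the layout.
    """
    lines = block.split("\n")
    problems = []
    if len(lines) > MAX_LINES:
        problems.append(f"{len(lines)} lines (max {MAX_LINES})")
    for number, line in enumerate(lines, start=1):
        if len(line) > MAX_CHARS:
            problems.append(
                f"line {number} has {len(line)} characters (max {MAX_CHARS})"
            )
    return problems
```

A block such as `"short line\nanother short line"` yields an empty list, while a single line of 43 characters is reported as too long.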

As during transcription, please save your draft regularly and note in the subtitles interface the timecode you have reached.

When you've finished reviewing the subtitles, you can finalize them by hitting “Publish” on Amara.

Optionally you can also mark the subtitles track as finished in the subtitles interface, but the scripts extracting content from Amara do that for you a few minutes later anyway.

The subtitles track is now finished and ready for publication on http://mirror.selfnet.de/c3subtitles/, which happens automatically.

Translation of subtitles

Once there is a finished subtitles track in the native language, it is time to start translating it.

To do this, you also work on the video on Amara. There you add a new language, and you can then translate from the native language into the one you want.

If you start the translation from the finished language in the Amara editor, the timing of the native language is kept and this part of the process can be skipped later. Apart from that, everything works much like the original transcription. If you pause your work or have finished it, please also mark that in our subtitles interface.

en/postprocessing/contribute.txt · Last modified: 2022/10/09 14:14 by thore
