
This is an old revision of the document!


Creating subtitles for recordings

Directly to the subtitles styleguide

Congress is over, and now it is time to create something lasting. We'd like to create subtitles for all Congress talks, ideally both in the native language and as a German/English translation (or even other languages), provided we find enough volunteers. For this we need your help.

From our experience so far:

New: If there are talks waiting for quality checks, consider doing those first before starting new transcriptions of other talks. You will get a better feeling for what to watch out for when transcribing. But still:

Please read this manual carefully!

Please never click the “Publish” button in Amara before the transcription of the talk is completely done!

Please do not start any translation or any timing of the talk (do no manual timing at all!) as long as the transcription in the talk's original language is not completely done!

Please do not click “Finish transcribing” in the subtitles interface before the transcription of the talk is completely done!

Please only use the Amara links generated by our subtitles interface. We can only link subtitles created there.
Do not create your own video links on Amara or use links from dubious YouTube accounts if you want your subtitles to find their way into the CDN!

If you have any questions, please join us on IRC!

Thank you!

In short: how a subtitle is made

This is roughly what our current process looks like:

  • We: Provide all necessary data in the database and create the Amara links and the etherpad.
  • We: Fill the etherpad with the auto-timed transcript.
  • You: Use, for example, www.otranscribe.com to work on the transcript in the pad and to format it according to our styleguide.
  • You: Click the “Finished Transcribing” button on the talk's page on www.c3subtitles.de once the transcript is finished.
  • We: Format and auto-time the transcript.
  • We: Upload the transcript with the timestamps to Amara.
  • You: Watch the talk in the Amara interface and click “Publish” in Amara when the talk is finished.
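
The hand-offs in the list above can be pictured as a small state machine. The following is only an illustrative sketch: the stage and action names are made up for this example and are not taken from our database or scripts.

```python
from enum import Enum


class Stage(Enum):
    """Hypothetical stage names; the real database may use different ones."""
    TRANSCRIBING = "transcribing"        # You: correct the transcript in the pad
    AUTO_TIMING = "auto_timing"          # We: format, auto-time, upload to Amara
    QUALITY_CONTROL = "quality_control"  # You: review the talk in Amara
    PUBLISHED = "published"              # Done: "Publish" was clicked in Amara


# Which action moves a talk from one stage to the next (assumed names).
TRANSITIONS = {
    (Stage.TRANSCRIBING, "finished_transcribing"): Stage.AUTO_TIMING,
    (Stage.AUTO_TIMING, "timing_uploaded"): Stage.QUALITY_CONTROL,
    (Stage.QUALITY_CONTROL, "publish"): Stage.PUBLISHED,
}


def advance(stage: Stage, action: str) -> Stage:
    """Return the next stage, or raise if the action is not allowed yet."""
    try:
        return TRANSITIONS[(stage, action)]
    except KeyError:
        raise ValueError(f"'{action}' is not allowed while {stage.value}") from None
```

In this model, "publish" is rejected while a talk is still being transcribed, which mirrors the rule that “Publish” must not be clicked before the transcription is complete.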

Situation

The subtitles created live can rarely be used to create subtitle tracks. Most of the time their quality is too poor to save you any time. Creating clean recording-quality subtitles from scratch is usually faster.

Should there be a talk where that's not the case and you remember an especially good live subtitle, please write a mail to subtitles -!-at-!- c3voc.de.

Our quality requirements for recording subtitles differ from those for live subtitles. Unlike what is common elsewhere, we want our subtitles to be as close as possible to what has actually been said. Apart from fillers like “ähhs” and “ehms”, which should be left out, the text should represent the talk as closely as possible. This time, of course, without errors and in a consistent style.
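
If you want to spot fillers in a pasted transcript before correcting it, a rough helper like the one below may help. It is only a sketch: the pattern covers just the “äh”/“ehm” style fillers mentioned above, the function name is made up, and it only lists candidates instead of deleting anything, so you still decide what to remove.

```python
import re

# Matches fillers such as "äh", "ähm", "ähhs", "ehm", "ehms" as whole words.
# Deliberately narrow: real words like the German "um" are not matched.
FILLER_PATTERN = re.compile(r"\b(?:ä+h+m*|e+h+m+)s?\b", re.IGNORECASE)


def find_fillers(text: str) -> list[str]:
    """Return all filler candidates in `text`, in order of appearance."""
    return FILLER_PATTERN.findall(text)
```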

We now use transcripts from services like trint.com as our base. Accordingly, transcripts no longer have to be written by hand, only corrected.

Overview-page for recording-subtitles

On our overview page there's an interface showing the progress and processing status of the subtitles of past Congresses.

Talks that are not yet transcribed are shown in gray, parts that are currently being transcribed in red, parts that have to be processed in other ways in yellow, and completed subtitles in green. The bars represent the actual durations of all Congress talks that have been released as video.
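
As a rough illustration of how such a bar could be computed: the colors below follow the description above, while the state names and the function are assumptions made for this sketch and not the actual code of the interface. For simplicity it treats each talk as a whole rather than tracking partially transcribed talks.

```python
# Colors as described above; the state names are hypothetical.
STATE_COLORS = {
    "not_transcribed": "gray",
    "transcribing": "red",
    "other_processing": "yellow",
    "complete": "green",
}


def bar_segments(talks):
    """Sum talk durations (in seconds) per color for one progress bar.

    `talks` is a list of (state, duration_seconds) pairs.
    """
    totals = {color: 0 for color in STATE_COLORS.values()}
    for state, seconds in talks:
        totals[STATE_COLORS[state]] += seconds
    return totals
```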

As soon as an event is selected, an overview of all talks of that event is displayed.

When a talk is selected a detailed view is shown.

With a click on “etherpad” you can find the auto transcript. If the link is missing for a talk or does not work, please talk to us, preferably via IRC. We will then generate the transcript and/or make it public. You do not need an account for this step. Please mark your progress on the c3subtitles page.

There's no need to log in to our interface on c3subtitles.de to be able to create subtitles. However, you have to log in to Amara later on for the quality control.

Work on the transcript on otranscribe.com

otranscribe has turned out to be quite easy and effective to use.

otranscribe has the advantage that you can use a local video file and adapt the video-control shortcuts to your needs. You do not need to switch windows between the video player and the text file.

Copy the content of the etherpad into otranscribe when you work on it, and copy it back into the etherpad afterwards.

Please follow our styleguide, which you can find here: Style guide

When the transcript is finished, please click “Finished Transcribing” in the c3subtitles interface. This tells us that the transcript is ready for auto-timing.

(Auto-)Timing

When the transcript of a video is done, it needs to be timed according to the video. The timing is actually supposed to be done in Amara, but in many cases it is quite useful to do this with YouTube auto-timing and correct it later in the quality control step.

Please do not go to the extra effort of timing manually on Amara. Use that time later for reviewing the auto-timed subtitles.

Once you have finished transcribing a talk and pressed the button for the finished transcript, the process starts automatically.

During this time it makes no sense to work on that video. Once the timing is done, the display of the talk changes to “quality control in progress”.

Creating subtitles on Amara

We use amara.org for the actual subtitle creation process. This is an external website designed specifically for subtitling online videos. Even though we're not happy with all of its features and functions, it is currently the easiest way to share in-progress subtitles online and to continue working on them collaboratively.

Amara requires an account; however, you can use existing social media accounts to log in.

You can find the links to our talks on the overview page.

Create subtitles track

As soon as an Amara link for a talk appears on our site, use it to go to the video-page on Amara.

Click “Add a new language” to create a new subtitles track. You have to define the native language of the talk (the one it is held in; use 'Klingon' for multi-language talks) as well as the language you want to create subtitles for.

Please always create the native language of a talk first and do not start with a translation! That will save us a lot of work later with timing issues and other stuff!

If the talk is in mixed languages (such as lightning talks), use 'Klingon'. Seriously!

After a new language is added, you can start working on it by clicking it.

Actual transcription process

Now you can transcribe subtitles by using the Amara online editor.

We won't describe the editor here, as Amara has its own help page for it.

Please have a look at the styleguide. Please stick to what was actually said.
Just note that we see the “42 character rule” more as a recommendation than a rule. We want comprehensive subtitles and can live with lines that are a bit on the heavy side.
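
To get a feeling for which lines go over the recommendation, a small check like the following can be run over a transcript. This is just a sketch with a made-up function name; it reports long lines and leaves the decision to you, since the limit is a recommendation and not a hard rule.

```python
SOFT_LIMIT = 42  # a recommendation here, not a hard rule


def long_lines(subtitle_text: str, limit: int = SOFT_LIMIT):
    """Return (line_number, length) for every line longer than `limit`."""
    return [
        (number, len(line))
        for number, line in enumerate(subtitle_text.splitlines(), start=1)
        if len(line) > limit
    ]
```

Lines that are only slightly too long are usually fine; very long ones are good candidates for splitting.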

It's also a great idea to look at other people's transcriptions (e.g. 32C3 Opening Event).

While working on a transcription you should save your work from time to time by clicking “Save Draft”, especially if you take a break. Please do not click “Publish” or “Finish transcribing” until you are completely done with the transcription!

Please also note in our interface how far you got.

This way the overall status of all talks can be visualized nicely for everybody. It also makes it easier for others to continue your work.

Our subtitles interface runs a cronjob which checks for new subtitle tracks.
Since that cronjob can only register a track after it has been saved once, we recommend saving your subtitle as soon as you start working on it.

When you've finished transcribing completely, save your draft in Amara and click the “Finish Transcribing” button in the subtitles interface.

Do not go into sync mode in Amara!
Our script will do that.

(Auto-)Timing

After you have finished the transcription, your snippets have to be timed to the correct position in the video.

There is an interface for doing that in Amara itself. However, this is a task that can be automated very well.
Therefore we ask you not to do any timing yourself, but to let the script that is invoked when you hit 'Finish Transcribing' do the job.

Please don't spend your efforts on timing by hand. Rather use your time for reviewing automatically timed videos and doing error correction.

While the subtitles are being auto-timed you can't work on the video. As soon as it's finished, the bar will turn yellow and you can do quality control.

Quality control

After auto-timing, the subtitle track has to be checked for blunders and mistakes.
Ideally that's done by a different person than the one who transcribed the talk. You might find someone to do QC on your subtitle on IRC.
However, if you can't find anyone, it's still better to do it yourself than to not publish the work.

The most important thing is to check for (auto-)timing errors and to adjust the timing if necessary.
That is again done in Amara.

As during the transcription process, please save your draft regularly and note the timecode of how far you got in the subtitles interface.

When you've finished reviewing a subtitle you can finalize it by hitting 'Publish' on Amara.

Optionally you can also mark the subtitles track as finished in the subtitles interface, but the cronjob script that extracts content from Amara does that for you a few minutes later anyway.

The subtitles track is now finished and ready for publication. A script automatically uploads it to the media.ccc.de CDN, where it is then available to everyone.

Translation of subtitles

Once there's a finished subtitles track in the native language, the time has come to start translations.

Translations are also done on the video's page on Amara. There you add a new language and can then translate from the native language into the one you want.

If you start the translation from the finished language in the Amara editor, the timing of the native language is kept and the timing step can be skipped. Apart from that, everything works much like the original transcription. If you pause your work or finish it, please also mark that in our subtitles interface.

en/postprocessing/contribute.1546882876.txt.gz · Last modified: 2020/09/19 22:03 (external edit)
