A platform-independent tool for the rapid transcription of large numbers of speech segments.

Source Code and Requirements

Blitzscribe2 can be obtained as part of the Java Speech Toolkit (JSTK, package de.fau.cs.jstk.app.blitzscribe) or as a self-contained .jar archive (download below). It requires Java 6 or later.

Blitzscribe2 v. 1.0

Usage

Blitzscribe2 is designed mainly for keyboard interaction, which avoids the time lost when switching between mouse and keyboard.

The interface has four fields: the waveform (1), the playback progress (2), the text field for the transcription (3), and the list of available turns with their transcriptions, if any (4).

Mouse interactions:

Double-click on the waveform to (re-)start the playback at the desired position.
Double-click on any turn to select it for transcription.

Keyboard interactions:

[ENTER] commit transcription, load and play next turn
[CTRL+SPACE] start/pause/resume audio playback
[SHIFT+ENTER] commit transcription, load next turn
[SHIFT+BACKSPACE] commit transcription, load previous turn
[CTRL+BACKSPACE] restart audio playback from the beginning
[F2-4] toggle highlighting of turns containing mispronunciations (*), unknown words (?) and unintelligible words (#ui)

Use the Open/Save/Save as buttons to load or save the transcription files. Blitzscribe2 generates a protocol file (yourfile.trl~) that contains a journal of interactions, including timing. This makes it possible to reconstruct the transcription process and to measure how long transcription took.

Audio File Format

Blitzscribe is (for now) limited to 16 kHz, 16-bit, mono RIFF WAV data (with header), but it can easily be modified to read any format supported by the JSTK (raw, speex, alaw, ...).
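Before loading a large batch of turns, it can help to check that every file matches this format. The following is a minimal sketch that uses only the standard javax.sound.sampled API; the class WavFormatCheck and its methods are hypothetical and not part of the JSTK. It also assumes signed PCM samples, which is the usual encoding for 16-bit WAV data but is not stated above.

```java
import java.io.File;
import java.io.IOException;
import javax.sound.sampled.AudioFileFormat;
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.UnsupportedAudioFileException;

/** Hypothetical helper (not part of JSTK): checks for 16 kHz, 16-bit, mono PCM WAV. */
final class WavFormatCheck {
    private WavFormatCheck() { }

    /** Returns true if the format is 16 kHz, 16-bit, mono. Signed PCM is an assumption. */
    static boolean isSupported(AudioFormat f) {
        return AudioFormat.Encoding.PCM_SIGNED.equals(f.getEncoding())
                && f.getSampleRate() == 16000f
                && f.getSampleSizeInBits() == 16
                && f.getChannels() == 1;
    }

    /** Reads only the header of the given file and checks container type and format. */
    static boolean isSupported(File wav) throws IOException, UnsupportedAudioFileException {
        AudioFileFormat aff = AudioSystem.getAudioFileFormat(wav);
        return AudioFileFormat.Type.WAVE.equals(aff.getType())
                && isSupported(aff.getFormat());
    }
}
```

Calling WavFormatCheck.isSupported(new File("file1.wav")) for each entry of a transcription file would flag files that need to be converted first.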

Transcription File Format

The transcription file format (extension .trl) is plain ASCII and is essentially a file list. Each line contains the file name (either absolute or relative to the directory of the .trl file) and, if available, the transcription, separated from the file name by whitespace. If the file name ends in _SOMENUMBER_SOMENUMBER.wav, Blitzscribe interprets the two numbers as time marks in milliseconds and displays the number and duration of the turn instead of the file name.

Example 1 (with partial transcription):

20090427-Hornegger-IMIP01_0001480_0003230.wav so welcome to the
20090427-Hornegger-IMIP01_0004360_0025300.wav
20090427-Hornegger-IMIP01_0025960_0035550.wav

Example 2:

file1.wav
file2.wav
file3.wav here is already something transcribed
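For scripts that need to read or generate .trl files, the format described above can be parsed roughly as follows. This is a sketch, not the parser used by Blitzscribe2: the class TrlLine is hypothetical, and it assumes that file names contain no whitespace, so that the first run of whitespace separates the file name from the transcription.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical parser for one line of a .trl file (not part of JSTK). */
final class TrlLine {
    // File names ending in _<digits>_<digits>.wav carry start and end marks in ms.
    private static final Pattern TIME_MARKS = Pattern.compile("_(\\d+)_(\\d+)\\.wav$");

    final String fileName;
    final String transcription; // empty if the turn is not transcribed yet
    final long startMs;         // -1 if the file name has no time marks
    final long endMs;           // -1 if the file name has no time marks

    private TrlLine(String fileName, String transcription, long startMs, long endMs) {
        this.fileName = fileName;
        this.transcription = transcription;
        this.startMs = startMs;
        this.endMs = endMs;
    }

    /** Splits at the first whitespace run (assumes no whitespace in the file name). */
    static TrlLine parse(String line) {
        String[] parts = line.trim().split("\\s+", 2);
        String file = parts[0];
        String text = parts.length > 1 ? parts[1] : "";
        long start = -1;
        long end = -1;
        Matcher m = TIME_MARKS.matcher(file);
        if (m.find()) {
            start = Long.parseLong(m.group(1));
            end = Long.parseLong(m.group(2));
        }
        return new TrlLine(file, text, start, end);
    }

    boolean hasTimeMarks() {
        return startMs >= 0;
    }

    /** Turn duration in milliseconds, or -1 if the file name has no time marks. */
    long durationMs() {
        return hasTimeMarks() ? endMs - startMs : -1;
    }
}
```

Applied to the first line of the first example, TrlLine.parse yields the transcription "so welcome to the" and the time marks 1480 and 3230, that is, a turn of 1750 ms.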

Protocol File Format

The protocol file has the same name as the transcription file, with a trailing '~' character appended. Each line is formatted as

<UNIXTIMESTAMP_IN_MSEC> <FILE> <TRANSCRIPTION>
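To learn about transcription time from the journal, the protocol lines can be parsed and the gaps between consecutive timestamps computed. The sketch below makes several assumptions that are not confirmed above: the three fields are separated by whitespace, file names contain no whitespace, and the transcription may be empty. The class ProtocolEntry is hypothetical and not part of the JSTK, and reading a gap as the time spent on a turn is an interpretation, not a documented guarantee.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical reader for one line of a protocol (.trl~) file (not part of JSTK). */
final class ProtocolEntry {
    final long timestampMs;     // Unix timestamp in milliseconds
    final String file;
    final String transcription; // empty if the line carries no transcription

    private ProtocolEntry(long timestampMs, String file, String transcription) {
        this.timestampMs = timestampMs;
        this.file = file;
        this.transcription = transcription;
    }

    /** Parses "<timestamp> <file> <transcription>", assuming whitespace-separated fields. */
    static ProtocolEntry parse(String line) {
        String[] parts = line.trim().split("\\s+", 3);
        if (parts.length < 2) {
            throw new IllegalArgumentException("Malformed protocol line: " + line);
        }
        long ts = Long.parseLong(parts[0]);
        String text = parts.length > 2 ? parts[2] : "";
        return new ProtocolEntry(ts, parts[1], text);
    }

    /** Milliseconds elapsed between each entry and its predecessor, in journal order. */
    static List<Long> gapsMs(List<ProtocolEntry> entries) {
        List<Long> gaps = new ArrayList<Long>();
        for (int i = 1; i < entries.size(); i++) {
            gaps.add(entries.get(i).timestampMs - entries.get(i - 1).timestampMs);
        }
        return gaps;
    }
}
```

Dividing each gap by the duration of the corresponding turn (available from the time marks in the file name, if present) would give a rough real-time factor per turn.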

Support

Feel free to post issues at http://code.google.com/p/jstk/issues/list or to the JSTK mailing list jstk(at)speech.informatik.uni-erlangen.de
