
Alexander Street Press

Specifications for Transcription of Video


1) Overview
Alexander Street Press's general requirements are that video transcripts should be keyed as text files, with minimal format tagging, and not saved as XML files. However, several tags will be needed and are defined within this document.

- Transcription: All conversations and onscreen text appearing in the video should be transcribed.
- Tagging: All transcript content should be provided with its corresponding tags.
- Time stamping: Time stamps should be recorded at the beginning of every segment and every 5 seconds throughout the length of the video.
- Ancillary documentation: We have not historically provided files containing release notes and liner notes to support transcription work. ASP is willing to revisit this if the vendor determines it would speed the process.

The quality requirements are defined in greater detail later in the document. To summarize, ASP expects the error rate of each transcript to be 0.5% or lower. This means one incorrect character out of 200 is permissible; if a word is misspelled by one letter, all the characters of the word count as incorrect. Punctuation must accurately reflect participant speech and not interfere with readability. Appropriate tags, including product-specific tags, must conform to the specifications provided in this document.

2) HEADER & FOOTER:


a) HEADER & FOOTER:
The following format should be used as a header on all transcribed files. Two blank lines should be inserted before BEGIN TRANSCRIPT.

Root element tag
<aspnd id="" author="" lang="" lld="" isbn="" lccn="" fd="" lp="">
<p>TRANSCRIPT OF VIDEO FILE:</p>

<p>_____________________________________________________________</p>
<p>BEGIN TRANSCRIPT:</p>

All transcripts should end with the following footer. A single blank line should be inserted between the completed text and END TRANSCRIPT.

<p>END TRANSCRIPT</p>
</aspnd>
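As a rough illustration of the header and footer rules above, the following Python sketch wraps already-tagged paragraph lines in the root element. The function name `wrap_transcript`, the underscore separator line, and the exact placement of the blank lines are assumptions made for this example, not part of the ASP specification.

```python
from xml.sax.saxutils import quoteattr

# Attribute names as listed on the root element in this specification.
ATTRS = ("id", "author", "lang", "lld", "isbn", "lccn", "fd", "lp")

# Separator line between the title and BEGIN TRANSCRIPT (assumed character and length).
RULE = "_" * 61


def wrap_transcript(body, **values):
    """Wrap tagged <p> lines in the header and footer (hypothetical helper)."""
    unknown = set(values) - set(ATTRS)
    if unknown:
        raise ValueError(f"unknown attributes: {sorted(unknown)}")
    # Every attribute is always emitted; missing ones stay empty ("").
    attr_text = " ".join(f"{name}={quoteattr(values.get(name, ''))}" for name in ATTRS)
    lines = [
        f"<aspnd {attr_text}>",
        "<p>TRANSCRIPT OF VIDEO FILE:</p>",
        f"<p>{RULE}</p>",
        "",  # two blank lines before BEGIN TRANSCRIPT
        "",
        "<p>BEGIN TRANSCRIPT:</p>",
        *body,
        "",  # single blank line before END TRANSCRIPT
        "<p>END TRANSCRIPT</p>",
        "</aspnd>",
    ]
    return "\n".join(lines)
```

A call such as `wrap_transcript(["<p>[00:00:00]<sp>NARRATOR</sp> Hello.</p>"], id="v1")` would return the full file text with the remaining attributes left empty.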

1 Alexander Street Press Transcription Specification v3 Proprietary & Confidential

3) General Tagging specifications


Attribute: Paragraph tags

Definition: Designate paragraphs (logical breaks from one speaker or thought to another).

Tagging format & example: Use a standard <p>text</p> to designate paragraphs (and timestamps). A new set of paragraph tags should be used when one or more of the following occur:
- a switch from voice-over <vo> to participant <pa>
- an extended silence (5+ seconds)
- a major shift in news item [if transcribers can identify it: for example, in a newsreel when on-screen text identifies the beginning of a new topic]

Notes: Each line of conversation between 2 or more people should be separated by a blank line (paragraph tags). Every timestamp marked should fall within <p> tags.

Attribute: Timestamps

Definition: Time stamps should be added every five seconds, starting from time code [00:00:00] at the beginning of the video. Every timestamp marked should fall within paragraph <p> tags.

Tagging format & example: NN:NN:NN [hour:minute:second]
Example: [00:03:05] is used to stamp the three minute and five second point in the video.

Notes: Regarding timestamps for indicators, including [sil.], [non-English narration], [non-English song], and [music]:
- When indicators are used to cover lengths of time greater than 5 seconds, there is no need to put a timestamp for each 5-second increment.
- Timestamps go at the beginning of each indicator. There doesn't need to be a closing timestamp for the indicator as long as there is a timestamp at the beginning of the next paragraph.
- A period of silence beginning within 5 seconds of the preceding timestamp does not need another timestamp before it.
- If there is a 5+ second period of silence and the narration/song/music begins following the silence, use additional timestamps before those indicators.

Example:
<p>[00:03:25]<st>MISHIMISHIMABÖWEI-TERI 28 February 1971</st></p>
<p>[00:03:30][sil.]</p>
<p>[00:03:50][non-English narration]</p>
<p>[00:04:10]<sp>Napoleon A. Chagnon</sp></p>

Special Notes: #$ 2%$2 & not currently used.

Timestamps for chapterized or segmented video:

A time stamp must be included immediately preceding the title of the segment. If the normal 5-second time stamp does not fall at this point, an additional time stamp is required. A space is provided before the time stamp except when it appears before and after tags. Titles will normally be indicated by an <st> tag that closely resembles a title listed in the release/liner note. NOT ALL <ST> TAGS INDICATE A NEW DOCUMENT.

Example:
<p><st>Universal Newsreel [time stamp]Report of August 24 1959</st></p>
<p><sp>NARRATOR</sp> blah blah [time stamp]blah….blah [time stamp]blah blah….</p>
[time stamp]<p><st>HAWAII joins the union.</st></p>
<p><sp>NARRATOR</sp> blah blah blah….<st>Joe Blow, Governor of Hawaii</st>blah blah blah….</p>

Attribute: Onscreen text

Definition: Any on-screen text that is not subtitles. This can be on-screen textual cues providing a scene title or a scene transition, e.g. a placard indicating the name and affiliation of a "participant".

Tagging format & example: <st>

Notes: Do not break speech for on-screen text that appears in the middle of speech. Capture on-screen text before the speech it interrupts.

Regarding illegible on-screen text: Put a standard <ill> tag around any text that can't be clearly read off the screen.

Correct:
<p><st>Charles Clifone B-28 Bombardier</st></p>
<p><sp>Charles Clifone</sp> You could see all the ships at sea. [00:46:00]Literally, hundreds and…</p>

Incorrect:
<p><sp>Charles Clifone</sp> You could see all the ships -</p>
<p><st>Charles Clifone B-28 Bombardier</st></p>
<p><sp>Charles Clifone</sp> - at sea. [00:46:00]Literally, hundreds and…</p>

Omit any FBI warning text that precedes the video. Begin transcription when the title card of the video appears or when speaking begins.

Attribute: Subtitles

Definition: Roman (Western alphabet) language subtitles (English or other) displayed on screen should be captured as part of the transcript. Capitalization and spelling should conform to the subtitle as displayed on the screen, but no additional punctuation is required. ALL language subtitles should be captured using appropriate diacritics and spelling as displayed on the screen. Subtitles are usually displayed at the bottom of the screen.

Tagging format & example: <sb> </sb>

Example:
<p>[00:02:45]<sp>NARRATOR</sp> All the members of the family work on the farm.</p>


<p>[00:02:50]<sb><sp>Antal</sp> my children take care of the goats before school every day</sb></p>

Notes: Subtitles for an individual speaker, whether named or unnamed, should be treated as spoken word and grouped together under a single <sp> tag, and should not have a new paragraph for each line of subtitled text on screen. If on-screen text and subtitles display at the same time, transcribe each separately, in the same order as they appear on screen.

Attribute: Speaker

Definition: Person identified as a speaker in the audio or the subtitles. Not used within <st> tags. If a person is mentioned as a subject at one point, then identified as a speaker later on, they should get <sp> tags only where they are the speaker, and <pe> tags where they are a subject.

Tagging format & example: <sp>

Notes: Add the speaker tags to the name of the speaker. Where a speaker is identified either by name or by screen text, that identification is to be used throughout the entire transcript, including places where a participant speaks before they are named in the video (i.e., backfill names if the speaker isn't named until later in the video). If the name is directly indicated in the original file (e.g. screen text, ending credits), key the name in upper/lower case. In cases where the name of the speaker is assumed or guessed at, key the name in ALL CAPS.

Voice-over speaker:
e.g., general narrator:
<p><sp>NARRATOR</sp> text of speech</p>
e.g., off-screen, name of speaker assumed:
<p><sp>ROBERT MCNAMARA</sp> text of speech</p>
e.g., off-screen, known speaker:
<p><sp>Robert McNamara</sp> text of speech</p>

On-screen speakers:
e.g., verbal cue on speaker:
<p><sp>Robert McNamara</sp> text of speech</p>
e.g., on-screen text cue:
<p><st><sp>Robert McNamara</sp> Secretary of State under Lyndon Johnson</st></p>
<p><sp>Robert McNamara</sp> text of speech</p>
e.g., speaker assumed from verbal cue:
<p><sp>ROBERT MCNAMARA</sp> text of speech</p>
e.g., speaker unidentifiable:
<p><sp></sp> text of speech</p>
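The timestamp format described earlier in this section (a bracketed hour:minute:second code, stamped every five seconds from [00:00:00]) can be sketched in Python as below. The function names are hypothetical, and the sketch only lists the nominal stamp points; it does not model the exceptions for indicators such as [sil.] or [music] that run longer than five seconds.

```python
def format_timestamp(total_seconds: int) -> str:
    """Format an offset in whole seconds as a bracketed [HH:MM:SS] time stamp."""
    if total_seconds < 0:
        raise ValueError("offset must not be negative")
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"[{hours:02d}:{minutes:02d}:{seconds:02d}]"


def timestamp_points(duration_seconds: int, interval: int = 5) -> list:
    """List the nominal stamp points from the start of the video to its end."""
    return [format_timestamp(t) for t in range(0, duration_seconds + 1, interval)]


if __name__ == "__main__":
    # The three minute and five second point used as the example in this section.
    print(format_timestamp(185))
    # Nominal stamps for a 12-second clip.
    print(timestamp_points(12))
```

Keeping the formatting in one small function makes it easier to guarantee that every stamp uses the same zero-padded layout, whatever tool produces the transcript.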


4) Conversation in Transcription
Spoken dialogue should be transcribed as verbatim as possible. The following guidelines and indicators should be used:
Audible vocalizations: Vocalizations such as "you know", "uhm", "ah", "uh-huh", and "hmm" should be included in the transcript to reflect speech as accurately as possible. However, when adding tags to a provided (preformatted) transcript, do not add this information.

Common words and idioms: In connection to the above item, shortened words such as "cuz", "kinda", "wanna", "gonna", etc. should be transcribed as they are spoken. However, when adding tags to a provided (preformatted) transcript, do not correct these spellings.

Phonetically closest: (ph) should be used if the transcriber is not 100% sure what the spoken word is. The transcriber should use the word that is phonetically closest to what they hear in the video. It is a way of indicating that whatever is transcribed is just the best guess.

Crosstalk: (crosstalk) should be used when two participants in a conversation speak over one another.

Inaudible: (inaudible [start time]) should be used to mark where speech is inaudible to the transcriber for whatever reason, e.g. tape quality, voice volume too low, crosstalk, etc. However, if more than 10% of the transcript is inaudible, the video should be flagged to ASP before finishing transcription.

Combo crosstalk + inaudible: In cases where the crosstalk makes both sentences inaudible, (crosstalk) should be used with (inaudible).

Non-English: Capture non-English speech with the indicator [non-English narration]. Capture non-English song with [non-English song].

Music: Capture any music with the indicator [music]. Lyrics of songs should not be captured, unless songs are being sung in English by participants.

Numbers: In general, spell out WHOLE numbers zero through nine; use numerals for 10 and above. 750 [rather than "Seven hundred and fifty"] is correct.
Exceptions: Use numerals in anything that is measurable, as in the following:
- Use numerals when numbers are directly used with symbols (i.e. 9%, $10, etc.).
- Use numerals when expressing exact ages; if it is an approximate age, spell it out.
- Use numerals to express size and measurements.
- Use numerals for everything metric: centimeters, millimeters, liters, etc.
Ordinal numbers such as 1st, 2nd, 3rd, etc. will be spelled out as First, Second, Third, etc.

Extended silence: Transcribers need to mark more than 5 seconds of silence as follows: <p>[sil.]</p>. This is used to indicate sections of video where the voice-over or participant speech ceases, but the visual imagery and/or ambient noise continues. For example, if there is rocket launch footage but no accompanying narration for 5+ seconds, a [sil.] indicator should be used.

Ambient noise: Do not indicate random or ambient sound (sound of car driving, sound of birds, etc.).

Emotional vocalizations: For CTIV, emotional vocalizations like laughing and crying should be captured if they occur in speech. This should be in the format (laughing) or (crying).
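The general number rule in this section (spell out whole numbers zero through nine, use numerals for 10 and above, and use numerals for anything measurable) can be illustrated with a short Python sketch. The function name and the `measurable` flag are hypothetical, and the sketch leaves out ordinals and approximate ages, which still need a transcriber's judgment.

```python
# Spelled-out forms for the whole numbers zero through nine.
SMALL_NUMBERS = [
    "zero", "one", "two", "three", "four",
    "five", "six", "seven", "eight", "nine",
]


def render_whole_number(value: int, measurable: bool = False) -> str:
    """Return the transcript form of a non-negative whole number.

    measurable=True covers the exceptions: numbers used with symbols,
    exact ages, sizes, measurements, and metric quantities.
    """
    if value < 0:
        raise ValueError("only non-negative whole numbers are handled")
    if measurable or value >= 10:
        return str(value)
    return SMALL_NUMBERS[value]


if __name__ == "__main__":
    print(render_whole_number(7))
    print(render_whole_number(750))
    print(render_whole_number(9, measurable=True))
```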

5) Baseline Quality Standards for Transcription


- The error rate of each transcript must be 0.5% or lower.
  o This means one incorrect character out of 200 is permissible; if a word is misspelled by one letter, all the characters of the word count as incorrect.
- The (inaudible) tag must be used appropriately and sparingly.
- Spelling must be correct, particularly of easily researched names/places/central themes in the video, and must not interfere with the comprehensibility of the transcript.
- When unsure of the spelling of words or names, the transcriber must appropriately use (ph) tags.
- Each transcript must be reread to ensure comprehensibility (this includes correct use of homophones, tenses, word endings, etc.).
- Punctuation must accurately reflect participant speech and not interfere with readability.
- Appropriate tags, including product-specific tags, must be used as laid out in the specifications provided by ASP.
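To make the arithmetic of the error-rate standard concrete, here is a minimal Python sketch. It assumes a reviewer has already listed the misspelled words and knows the total character count of the transcript; the function names, and the choice of which characters (spaces, tags, punctuation) count toward the total, are assumptions that this specification does not settle.

```python
def error_rate(total_characters: int, misspelled_words: list) -> float:
    """Share of characters counted as incorrect.

    Every character of a misspelled word counts as incorrect,
    even if only one letter is wrong.
    """
    if total_characters <= 0:
        raise ValueError("transcript must contain at least one character")
    incorrect = sum(len(word) for word in misspelled_words)
    return incorrect / total_characters


def meets_standard(total_characters: int, misspelled_words: list,
                   threshold: float = 0.005) -> bool:
    """True when the error rate is 0.5% or lower."""
    return error_rate(total_characters, misspelled_words) <= threshold


if __name__ == "__main__":
    # One misspelled 7-letter word in a 2,000-character transcript: 7 / 2000 = 0.35%.
    print(meets_standard(2000, ["recieve"]))
    # The same word in a 1,000-character transcript: 7 / 1000 = 0.7%.
    print(meets_standard(1000, ["recieve"]))
```

Under these assumptions, a single misspelled seven-letter word already uses up the error allowance of a 1,400-character transcript, which is why careful rereading matters even for short files.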