I needed to transcribe some minutes from a meeting, and only one person was speaking during a particular three-minute segment. So I copied that segment out to its own MP3 file.
I uploaded the file to s3:// and ran a default transcription job. Whoops.
By default, I mean that I mostly clicked Next, Next, Next. I supplied a job name, an input file, and an output file. (Because I used an output file location other than the default, these weren’t exactly default settings.)
Because I had not specified the number of speakers, the finished transcription job left the 'speaker_labels' data out of the JSON file.
I have been using https://github.com/trhr/aws-transcribe-transcript/transcript.py to simplify the JSON into text, but it does not handle files with missing speaker labels.
Sigh. Now I have to re-do the transcription, which will incur another charge. Those speaker_labels are all over the file when present.
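If plain text without speaker attribution is good enough, re-running the job may be avoidable, since the full text should still be in the JSON. Here is a minimal Python sketch, assuming the usual Transcribe output layout where the text sits under results.transcripts. The function names are my own (hypothetical), and this is not what transcript.py does:

```python
import json


def load_transcribe_json(path):
    """Read an Amazon Transcribe output file into a dict."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def transcript_text(data):
    """Return (full_text, has_speaker_labels) for a Transcribe result dict.

    Assumes the layout data["results"]["transcripts"][0]["transcript"].
    """
    results = data["results"]
    # speaker_labels only appears when speaker identification was enabled
    has_labels = "speaker_labels" in results
    # Join all transcript entries, in case there is more than one
    text = " ".join(t["transcript"] for t in results["transcripts"])
    return text, has_labels
```

Calling transcript_text(load_transcribe_json("some_file.json")) would give the text plus a flag saying whether the speaker labels are there, so a missing-labels file can be spotted before feeding it to anything else.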
For what it is worth, the tasks were essentially:
- Upload the file to S3
aws s3 cp /home/david/Documents/some_path/review_of_previous_board_meeting.mp3 s3://some-s3-bucket/
- Log in to Amazon Transcribe and create a job
- Job name was
- Input file was
- Output file was
- This did require clicking the button “Customer specified S3 bucket”
- I used the AWS CLI commands to copy between my local machine and the S3 bucket, so it is easier if I name the bucket I want the files in.
- Click Next
- THE IMPORTANT PIECE: Audio Identification = On, and audio identification type = speaker identification
- Stupidly, you have to define the number of speakers, and 1 is below the allowed minimum. So I had to tell it there were two speakers, even though I had clipped the MP3 file to contain only one.
- Download the file from S3
aws s3 cp s3://some-s3-bucket/review_of_previous_board_meeting.json /home/david/Documents/some_path/
- Clean up the transcription
- And then transcript.py runs without errors. The result is file
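The console steps above could probably be scripted instead. Here is a rough Python sketch, assuming boto3 is installed and credentials are configured. The job name, bucket, and en-US language code in the usage note are placeholders I made up, and I only tried the console route myself, so treat this as a starting point rather than a tested recipe:

```python
def build_job_request(job_name, media_uri, output_bucket,
                      max_speakers=2, language_code="en-US"):
    """Build keyword arguments for Transcribe's start_transcription_job.

    Speaker identification is switched on so the output JSON includes
    speaker_labels. The speaker count cannot be lower than 2.
    """
    if max_speakers < 2:
        raise ValueError("Transcribe needs a speaker count of at least 2")
    return {
        "TranscriptionJobName": job_name,
        "Media": {"MediaFileUri": media_uri},
        "MediaFormat": "mp3",
        "LanguageCode": language_code,   # assumption: English (US) audio
        "OutputBucketName": output_bucket,
        "Settings": {
            "ShowSpeakerLabels": True,   # the important piece
            "MaxSpeakerLabels": max_speakers,
        },
    }


def start_job(request):
    """Submit the request; boto3 is imported here so the builder works without it."""
    import boto3  # third-party: pip install boto3
    client = boto3.client("transcribe")
    return client.start_transcription_job(**request)
```

Something like start_job(build_job_request("some-job-name", "s3://some-s3-bucket/review_of_previous_board_meeting.mp3", "some-s3-bucket")) would then stand in for the Next, Next, Next clicking. Keeping the request builder separate from the API call makes it easy to check the settings before anything gets billed.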