Creating an Event-Driven Architecture for Audio File Transcription with AWS Transcribe

Oct 6, 2024

In cloud computing, automation and efficiency are key. One way to achieve both is an event-driven architecture, where one AWS service reacts to events emitted by another. In this post, we will build an architecture that automatically converts audio files uploaded to an S3 bucket into text using Amazon Transcribe.

Architecture Overview

The architecture involves the following AWS services:

Amazon S3: To store audio files.
AWS Lambda: To start a transcription job whenever a new file is uploaded.
Amazon Transcribe: To convert audio files to text.
Amazon SNS (Simple Notification Service): To send notifications about transcription jobs (optional).

Step-by-Step Implementation

Step 1: Create an S3 Bucket

Log in to your AWS Management Console.
Navigate to S3 and click on Create bucket.
Enter a unique bucket name and choose a region.
Click on Create bucket.
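
If you prefer to script this step, the same bucket can be created with boto3. The following is only a minimal sketch and not part of the console walkthrough: the function name create_audio_bucket and the bucket name in the usage comment are hypothetical, and the S3 client is passed in so the helper can be tried without AWS credentials. It also handles one quirk of the API: us-east-1 must not be sent as a LocationConstraint.

```python
def create_audio_bucket(s3_client, bucket_name, region):
    """Create an S3 bucket in the given region using a boto3 S3 client."""
    kwargs = {'Bucket': bucket_name}
    # us-east-1 is the default location and is rejected if passed
    # explicitly as a LocationConstraint, so only set it elsewhere.
    if region != 'us-east-1':
        kwargs['CreateBucketConfiguration'] = {'LocationConstraint': region}
    return s3_client.create_bucket(**kwargs)


# Usage (assumes boto3 is installed and credentials are configured):
#   import boto3
#   s3 = boto3.client('s3', region_name='eu-west-1')
#   create_audio_bucket(s3, 'my-audio-uploads-example', 'eu-west-1')
```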

Step 2: Set Up AWS Lambda Function

Go to the AWS Lambda service.
Click on Create function.
Choose Author from scratch.
Enter a name for your function, e.g., TranscribeAudioFunction.
Choose a runtime (Node.js and Python are commonly used; the sample code in this post is Python).
Under Permissions, select Create a new role with basic Lambda permissions.
Click Create function.

Step 3: Configure Lambda to Handle S3 Events

Scroll down to the Function code section and add the code that will handle the S3 event and start the transcription job. Here’s a sample code snippet in Python:

import json
import re
import urllib.parse

import boto3

# Create the client once so that warm invocations can reuse it
transcribe = boto3.client('transcribe')

def lambda_handler(event, context):
    # Extract the bucket name and object key from the S3 event
    record = event['Records'][0]
    bucket = record['s3']['bucket']['name']
    # Keys arrive URL-encoded in S3 events (a space becomes '+')
    key = urllib.parse.unquote_plus(record['s3']['object']['key'])

    # Job names may contain only letters, digits, '.', '_' and '-', so drop
    # the extension and replace any other character (such as '/')
    job_name = re.sub(r'[^0-9a-zA-Z._-]', '_', key.rsplit('.', 1)[0])[:200]
    job_uri = f"s3://{bucket}/{key}"

    # Start the transcription job. Job names must be unique, so uploading a
    # second file with the same name raises a ConflictException.
    transcribe.start_transcription_job(
        TranscriptionJobName=job_name,
        Media={'MediaFileUri': job_uri},
        MediaFormat='mp3',  # Change as per your audio file format
        LanguageCode='en-US'  # Change as per your audio language
    )

    return {
        'statusCode': 200,
        'body': json.dumps('Transcription job started successfully!')
    }
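
The handler above hard-codes MediaFormat='mp3'. If the bucket may receive other audio types, one option is to derive the format from the file extension. The helper below is a sketch under that assumption: the name media_format_for is hypothetical, and the set lists the values that the MediaFormat parameter accepts.

```python
import os

# Values accepted by the MediaFormat parameter of start_transcription_job
SUPPORTED_FORMATS = {'mp3', 'mp4', 'wav', 'flac', 'ogg', 'amr', 'webm', 'm4a'}


def media_format_for(key):
    """Return the MediaFormat for an object key, or None if unsupported."""
    extension = os.path.splitext(key)[1].lstrip('.').lower()
    return extension if extension in SUPPORTED_FORMATS else None
```

Inside the handler, the result of media_format_for(key) could be passed as MediaFormat, returning early when it is None so that non-audio uploads are ignored.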

Step 4: Set Up S3 Event Notification

Go back to your S3 bucket and select the Properties tab.
Scroll down to Event notifications and click Create event notification.
Enter a name for your event.
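Under Event types, select Put (to trigger when a file is uploaded). Large files are often uploaded in multiple parts, which emits a multipart upload completed event rather than a Put event, so selecting All object create events is the safer choice for audio.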
Under Destination, choose Lambda Function and select the Lambda function you created earlier.
Click Save changes.
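
The same notification can be expressed in code. The sketch below builds the configuration that boto3's put_bucket_notification_configuration expects; the function name build_notification_config, the default '.mp3' suffix filter, and the bucket name in the usage comment are assumptions for illustration. Note that this API call replaces the bucket's entire notification configuration rather than adding to it.

```python
def build_notification_config(lambda_arn, suffix='.mp3'):
    """Build the NotificationConfiguration dict for the bucket."""
    return {
        'LambdaFunctionConfigurations': [{
            'LambdaFunctionArn': lambda_arn,
            # Covers Put, Post, Copy and CompleteMultipartUpload
            'Events': ['s3:ObjectCreated:*'],
            # Only invoke the function for keys ending in the suffix
            'Filter': {'Key': {'FilterRules': [
                {'Name': 'suffix', 'Value': suffix},
            ]}},
        }]
    }


# Usage (assumes boto3, credentials, and that S3 is already allowed to
# invoke the function, which the console normally sets up for you):
#   import boto3
#   boto3.client('s3').put_bucket_notification_configuration(
#       Bucket='my-audio-uploads-example',
#       NotificationConfiguration=build_notification_config(
#           'arn:aws:lambda:your-region:your-account-id:function:TranscribeAudioFunction'),
#   )
```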

Step 5: IAM Permissions for Lambda

To ensure that your Lambda function has the necessary permissions to access S3 and Transcribe:

Go to the IAM service in AWS Management Console.
Select the role associated with your Lambda function.
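Click on Attach policies and add the following policies (the role created in Step 2 already includes CloudWatch Logs permissions equivalent to AWSLambdaBasicExecutionRole, so that one may not need to be added again):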

AmazonS3ReadOnlyAccess
AmazonTranscribeFullAccess
AWSLambdaBasicExecutionRole
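
The managed policies above are convenient but broad. As a hedged alternative, the sketch below generates a narrower policy document covering only the calls made in this post. The function name build_lambda_policy and the bucket name are hypothetical, and the action list is an assumption based on the code shown here; your setup may need more (for example if Transcribe writes its output to your own bucket).

```python
import json


def build_lambda_policy(bucket_name, topic_arn=None):
    """Return an IAM policy document scoped to what the function calls."""
    statements = [
        {
            'Effect': 'Allow',
            'Action': ['transcribe:StartTranscriptionJob',
                       'transcribe:GetTranscriptionJob'],
            'Resource': '*',
        },
        {
            # Lets the function's role read the uploaded audio
            'Effect': 'Allow',
            'Action': 's3:GetObject',
            'Resource': f'arn:aws:s3:::{bucket_name}/*',
        },
    ]
    if topic_arn:
        # Only needed when the optional SNS step is used
        statements.append({
            'Effect': 'Allow',
            'Action': 'sns:Publish',
            'Resource': topic_arn,
        })
    return {'Version': '2012-10-17', 'Statement': statements}


print(json.dumps(build_lambda_policy('my-audio-uploads-example'), indent=2))
```

The printed JSON can be pasted into the IAM console as an inline policy on the Lambda role.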

Step 6: (Optional) Set Up SNS for Notifications

Create a new SNS Topic in the SNS console.
Subscribe to the topic with your email to receive notifications.
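Update your Lambda function to publish a message to the SNS topic once the transcription job has been started. Note that start_transcription_job returns before the audio is processed, so this message confirms that the job was submitted, not that it has finished. The Lambda role also needs sns:Publish permission on the topic: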

import boto3

sns = boto3.client('sns')

# After starting the transcription job
sns.publish(
    TopicArn='arn:aws:sns:your-region:your-account-id:your-topic-name',
    Message='Transcription job started: ' + job_name
)
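
The snippet above publishes as soon as the job is submitted. To be notified when a job actually finishes, one approach is an EventBridge rule that matches Amazon Transcribe's "Transcribe Job State Change" events and invokes a second Lambda function. The following is a sketch of that idea, not part of the original setup: EVENT_PATTERN is the rule pattern, and completion_message and notify are hypothetical helper names, with the SNS client passed in by the caller.

```python
# EventBridge rule pattern matching finished (or failed) transcription jobs
EVENT_PATTERN = {
    'source': ['aws.transcribe'],
    'detail-type': ['Transcribe Job State Change'],
    'detail': {'TranscriptionJobStatus': ['COMPLETED', 'FAILED']},
}


def completion_message(event):
    """Turn a 'Transcribe Job State Change' event into a notification string."""
    detail = event['detail']
    name = detail['TranscriptionJobName']
    status = detail['TranscriptionJobStatus']  # COMPLETED or FAILED
    if status == 'FAILED':
        reason = detail.get('FailureReason', 'unknown reason')
        return f'Transcription job {name} failed: {reason}'
    return f'Transcription job {name} finished with status {status}'


def notify(sns_client, topic_arn, event):
    """Publish the message for one EventBridge event to an SNS topic."""
    return sns_client.publish(TopicArn=topic_arn,
                              Message=completion_message(event))
```

In the second Lambda function, the handler would call notify(boto3.client('sns'), topic_arn, event) with the event it receives from EventBridge.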

Step 7: Test the Architecture

Upload an audio file to your S3 bucket.
Open the Amazon Transcribe console and check under Transcription jobs that a new job has started.
If you’ve set up SNS, check your email for notifications.
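
The console check can also be done from a script. Below is a minimal polling sketch, assuming a boto3 Transcribe client is passed in; the name wait_for_job and the default interval and retry count are arbitrary choices for illustration.

```python
import time


def wait_for_job(transcribe_client, job_name, poll_seconds=10, max_polls=60):
    """Poll until the job leaves QUEUED/IN_PROGRESS, then return its status."""
    for _ in range(max_polls):
        job = transcribe_client.get_transcription_job(
            TranscriptionJobName=job_name)['TranscriptionJob']
        status = job['TranscriptionJobStatus']
        if status in ('COMPLETED', 'FAILED'):
            return status
        time.sleep(poll_seconds)
    raise TimeoutError(f'{job_name} still running after {max_polls} polls')


# Usage (assumes boto3 and credentials; the job name is hypothetical):
#   import boto3
#   print(wait_for_job(boto3.client('transcribe'), 'my-audio-file'))
```

Polling is fine for a manual test; for production use, reacting to job state change events avoids the repeated API calls.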

Step 8: Retrieve Transcription Results

You can retrieve the transcription results by calling the get_transcription_job method in another Lambda function or manually in the AWS Management Console.

Here’s a simple example of how to retrieve the transcription results:

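import boto3

def get_transcription_result(job_name):
    transcribe = boto3.client('transcribe')
    job = transcribe.get_transcription_job(TranscriptionJobName=job_name)['TranscriptionJob']
    # The transcript URI is only available once the job is COMPLETED; returns None before that
    return job.get('Transcript', {}).get('TranscriptFileUri')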
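
The URI returned above points to a JSON document rather than plain text. The sketch below shows one way to pull the text out of it. The function names are hypothetical, and download_transcript_text assumes the URI can be fetched directly, as with the pre-signed URL returned when Transcribe stores output in its own service-managed bucket; output written to your own bucket would need an authenticated S3 read instead.

```python
import json
import urllib.request


def extract_text(transcript_document):
    """Join the transcript text from a parsed Transcribe result document."""
    transcripts = transcript_document['results']['transcripts']
    return ' '.join(item['transcript'] for item in transcripts)


def download_transcript_text(transcript_uri):
    """Fetch the JSON at the URI and return the transcript text."""
    with urllib.request.urlopen(transcript_uri) as response:
        return extract_text(json.load(response))
```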

Conclusion

In this post, we created an event-driven architecture using AWS services to automate the transcription of audio files uploaded to S3. Once it is deployed, every upload starts a transcription job on its own, with no manual step between storing the audio and getting the text.