
How to unzip a .zip file at the same location on S3 using Lambda.

Suppose an application uploads zip files directly to S3, but the individual files inside those archives are needed later. We can create a Lambda function in Python and add an S3 trigger, so that whenever a zip file is uploaded to a folder, the function fires and unzips the contents into that same folder.
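The trigger itself can be set up from the S3 console or scripted. As a rough sketch (the bucket name and function ARN below are placeholders, not values from this article), the notification can be attached with boto3; note the .zip suffix filter, which also stops the function from re-triggering on the files it writes back:

import boto3

s3 = boto3.client('s3')

# Hypothetical bucket and function ARN: substitute your own.
s3.put_bucket_notification_configuration(
    Bucket='my-upload-bucket',
    NotificationConfiguration={
        'LambdaFunctionConfigurations': [{
            'LambdaFunctionArn': 'arn:aws:lambda:us-east-1:123456789012:function:unzip',
            'Events': ['s3:ObjectCreated:*'],
            # Only fire for .zip uploads.
            'Filter': {'Key': {'FilterRules': [{'Name': 'suffix', 'Value': '.zip'}]}},
        }]
    },
)

S3 also needs permission to invoke the function; the console adds this automatically, while scripted setups need a lambda add-permission call as well.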

First, create the handler and derive the source and destination keys from the event that S3 passes to Lambda when it triggers.

def handler(event, context):
    record = event['Records'][0]['s3']
    source_bucket = record['bucket']['name']
    # Object keys arrive URL-encoded in S3 events, so decode them first.
    source_key = unquote_plus(record['object']['key'])
    # The destination is the same bucket and the same "folder" as the zip.
    destination_bucket = source_bucket
    destination_key = source_key.rsplit('/', 1)[0] + '/' if '/' in source_key else ''

Here, source_bucket and source_key come straight from the S3 trigger event, and destination_bucket and destination_key are derived from them: the destination is simply the source key with the file name stripped off.
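For reference, the only parts of the trigger event the handler touches look roughly like this (a trimmed sample with hypothetical names; real events carry many more fields):

event = {
    'Records': [{
        's3': {
            'bucket': {'name': 'my-upload-bucket'},    # hypothetical bucket
            'object': {'key': 'uploads/archive.zip'},  # hypothetical key
        }
    }]
}
# With this event: source_key == 'uploads/archive.zip'
# and destination_key == 'uploads/'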

We need a few imports to complete the script: boto3 (already bundled in the Lambda Python runtime) for the AWS calls, zipfile from the standard library for working with the archive, and unquote_plus for decoding the object key.

import os
import zipfile
from urllib.parse import unquote_plus

import boto3

Each Lambda invocation gets temporary storage in the /tmp directory (512 MB by default, configurable up to 10 GB), where we'll create a temporary zip location to work in.

temp_zip = '/tmp/file.zip'
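Since /tmp persists across warm invocations of the same Lambda container, a unique name avoids collisions when events are processed back to back. A minimal sketch using the standard tempfile module, if you want that safety:

import tempfile

# mkstemp returns an open descriptor and a unique path under /tmp;
# close the descriptor, since boto3 will write to the path itself.
fd, temp_zip = tempfile.mkstemp(suffix='.zip', dir='/tmp')
os.close(fd)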

We'll create an S3 client and download the uploaded file to the temp location. Inside Lambda, the client picks up credentials from the function's execution role; passing keys explicitly (the commented-out variant) is only needed when running the script elsewhere.

s3_client = boto3.client('s3')
# OR, outside Lambda, with explicit credentials:
# s3_client = boto3.client('s3',
#                          aws_access_key_id='',
#                          aws_secret_access_key='',
#                          region_name='us-east-1')
  
s3_client.download_file(source_bucket, source_key, temp_zip)

With the archive now sitting at temp_zip, we open it, list every entry, and build the paths we need to upload each file back to the same S3 location.

zfile = zipfile.ZipFile(temp_zip)

file_list = [(name,
              '/tmp/' + os.path.basename(name),
              destination_key + os.path.basename(name))
             for name in zfile.namelist()]
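To make the shape concrete: for a hypothetical uploads/archive.zip containing docs/a.txt and b.csv, file_list would come out roughly as:

# (name inside the zip, local staging path, destination S3 key)
[('docs/',      '/tmp/',      'uploads/'),       # directory entry
 ('docs/a.txt', '/tmp/a.txt', 'uploads/a.txt'),
 ('b.csv',      '/tmp/b.csv', 'uploads/b.csv')]

Directory entries end in '/', so their basename is empty and their local path is just '/tmp/'; that is what the loop below skips.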

ZipFile opens the archive without extracting anything yet, and namelist() returns every entry inside it. From those entries we build file_list: the name inside the zip, a staging path under /tmp, and the target S3 key for each file. Note that os.path.basename flattens any folder structure inside the archive. This list drives the loop that uploads the files to S3.

zip_stem = os.path.basename(source_key).rsplit('.', 1)[0]

for file_name, local_path, s3_key in file_list:
    # Directory entries have an empty basename, so skip them.
    if local_path == '/tmp/':
        continue
    data = zfile.read(file_name)
    with open(local_path, 'wb') as f:
        f.write(data)
    del data  # free up some memory
    # Upload into a folder named after the zip, next to the original file.
    s3_client.upload_file(local_path, destination_bucket,
                          destination_key + zip_stem + '/' + os.path.basename(s3_key))
    os.remove(local_path)

Here, we loop over file_list: read each file out of the archive, write it to the temporary location, upload it with upload_file, and then delete the local copy to free up space. Each file lands under a folder named after the zip, so uploads/archive.zip unpacks into uploads/archive/.
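As a side note, the /tmp round-trip can be skipped entirely by streaming each entry straight from the archive to S3 with upload_fileobj. A minimal sketch of that variant, reusing the variables above:

for file_name, local_path, s3_key in file_list:
    if local_path == '/tmp/':
        continue
    # zfile.open() returns a file-like object that upload_fileobj
    # streams to S3 without writing anything to disk.
    with zfile.open(file_name) as entry:
        s3_client.upload_fileobj(entry, destination_bucket,
                                 destination_key + zip_stem + '/' + os.path.basename(s3_key))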

Finally, we return a JSON-style summary of everything that was uploaded to S3.

return {"files": ['s3://' + destination_bucket + '/' + s.replace(s.split('/')[-1], '') + '' + source_key.split('/')[-1].split('.')[0] + '/' + s.split('/')[-1] for f,l,s in file_list]}

This is how we can unzip .zip files on S3 without any hassle, using Lambda triggers.
