Move data from AWS S3 to Azure Storage Container

Summary: Consider the following use case: you have data in an AWS S3 bucket that you need to automatically move to another cloud provider, in this case Azure, because of continuity requirements. I decided to use an Azure DevOps pipeline and azcopy to transfer the files, so follow along to see how to set this up.
Date: 7 July 2023
Refactor: 15 December 2024: Checked links and formatting.

Get Access to the S3 Bucket

The first step is to configure an IAM user with programmatic access to the S3 bucket. Create the user in the AWS console, generate an access key, and store the access key ID and secret access key in your password manager; the pipeline will need both later on.
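
The exact permissions depend on your setup, but a minimal read-only policy for that user could look something like the sketch below (the bucket name 'backup' is taken from the source bucket URL used later in the pipeline):

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "ListBackupBucket",
      "Effect": "Allow",
      "Action": "s3:ListBucket",
      "Resource": "arn:aws:s3:::backup"
    },
    {
      "Sid": "ReadBackupObjects",
      "Effect": "Allow",
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::backup/*"
    }
  ]
}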

Get Access to the Azure Storage Container

To get access to the storage container we'll use a Shared Access Signature (SAS). You can generate one in the Azure portal under the storage account's 'Shared access signature' blade.

Note: Keep track of the signing key you're using. If the SAS ever gets compromised, you need to rotate the key that was used to create it.
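
If you prefer the command line over the portal, a sketch with the Az PowerShell module could look like this (the resource group name 'backup-rg' is a placeholder, and the permission set is a simplified version of the one in the token shown below):

# A sketch, assuming the Az.Storage module is installed and you are logged in with Connect-AzAccount.
# 'backup-rg' is a placeholder resource group name; tune permissions and expiry to your needs.
$key = (Get-AzStorageAccountKey -ResourceGroupName 'backup-rg' -Name 's3backup')[0].Value
$ctx = New-AzStorageContext -StorageAccountName 's3backup' -StorageAccountKey $key
New-AzStorageAccountSASToken -Service Blob -ResourceType Container,Object `
  -Permission 'rwdlac' -Protocol HttpsOnly `
  -StartTime (Get-Date) -ExpiryTime (Get-Date).AddMonths(6) -Context $ctx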

Configure the Storage Account and SAS

If you've copied the SAS URL, you'll notice it consists of three parts:

  1. The base URL, consisting of the storage account name and the Azure Blob Storage base URL: 'https://s3backup.blob.core.windows.net/'
  2. The SAS token, which holds the configuration: '?sv=2022-11-02&ss=b&srt=c&sp=rwdlaciytfx&se=2023-07-07T15:21:13Z&st=2023-07-07T07:21:13Z&spr=https&sig='
  3. The signature, which is everything that comes after 'sig=' in the SAS token. This is the sensitive part, so again, add it to your password manager
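
Splitting the URL up like this means only the signature has to be treated as a secret; the pipeline script further down simply reassembles the full destination URL by concatenating the three parts:

# Destination URL for azcopy = container URL + SAS token + secret signature
$targetUrl = "$($env:TargetContainer)$($env:SasToken)$($env:SASSIG)"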

Before we can copy data to the storage account we need to add a container to it. In the Azure portal, go to the storage account, then Data storage and select Containers. Click Add to create a container, and keep the public access level set to Private. Now append the name of the container (s3backup) to the base URL you got from the SAS, so it becomes something like this: 'https://s3backup.blob.core.windows.net/s3backup'
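
If you still have the storage context from the SAS sketch above, creating the container from PowerShell is a one-liner (again just a sketch; the container name matches the one used in the pipeline):

# Creates the 's3backup' container with no anonymous (public) access
New-AzStorageContainer -Name 's3backup' -Permission Off -Context $ctx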

Set up the Pipeline

Now go to Azure DevOps. Before we create the actual pipeline, we first add the AWS secret access key and the SAS signature as secret variables (AWS_KEY and SAS_SIG) to a variable group called Backups.
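
You can do this via Pipelines > Library in the Azure DevOps portal, or script it with the Azure DevOps CLI extension; a rough sketch (the organization, project, group id and secret values are placeholders):

# Rough sketch using the azure-devops CLI extension; <org>, <project>, <group-id> and the secret values are placeholders
az devops configure --defaults organization=https://dev.azure.com/<org> project=<project>
az pipelines variable-group create --name Backups --variables placeholder=temp
az pipelines variable-group variable create --group-id <group-id> --name AWS_KEY --secret true --value '<aws-secret-access-key>'
az pipelines variable-group variable create --group-id <group-id> --name SAS_SIG --secret true --value '<sas-signature>'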

Now that we have all the requirements in place, we can create the pipeline. The pipeline definition looks like this:

Note that the YAML below has the following configuration (from top to bottom):

  1. A run name based on the pipeline definition name and build id, without the commit message appended
  2. No CI trigger; the pipeline only runs on a schedule
  3. A Microsoft-hosted windows-latest agent pool
  4. A schedule for Sunday and Wednesday at 02:00 that always runs, even when there are no code changes
  5. The Backups variable group that holds the secrets
  6. Stage variables for the source bucket, the target container, the SAS token and the AWS access key id
  7. A PowerShell task that sets the AWS credentials for azcopy, lists the target container, copies yesterday's FS and DB backup files and prints the azcopy log

name: $(Build.DefinitionName)-$(Build.BuildId)
appendCommitMessageToRunName: false

trigger: none

pool:
  vmImage: 'windows-latest'

schedules:
- cron: "0 2 * * 0,3"
  displayName: Weekly backup on Sunday and Wednesday at 02:00
  branches:
    include:
    - main
  always: true

variables:
- group: Backups

stages:
- stage: S3Backup
  displayName: "Backups"
  condition: always()
  variables:
    SourceBucket          : 'https://s3.eu-central-1.amazonaws.com/backup'
    TargetContainer       : 'https://s3backup.blob.core.windows.net/s3backup'
    SasToken              : '?sv=2022-11-02&ss=b&srt=c&sp=rwdlaciytfx&se=2023-07-07T15:21:13Z&st=2023-07-07T07:21:13Z&spr=https&sig='
    AWSAccessKey          : 'AKIAXXXXXXXXXXXXXXXXXX'

  jobs:
  - job: RisktoolData
    displayName: "Data backup to Azure"
    steps:
    - checkout: self

    - task: PowerShell@2
      displayName: "Backup risktool data"
      condition: always()
      env:
        AWSKEY: $(AWS_KEY)
        SASSIG: $(SAS_SIG)
      inputs:
        pwsh: true
        targetType: 'inline'
        script: |
          Write-Host "`n##[section]Task: $env:TASK_DISPLAYNAME `n"
          azcopy --version
          # Backup files are named with yesterday's date
          $yesterday = Get-Date $((Get-Date).AddDays(-1)) -Format yyyyMMdd
          # azcopy picks up the S3 credentials from these environment variables
          $env:AWS_ACCESS_KEY_ID=$env:AWSAccessKey
          $env:AWS_SECRET_ACCESS_KEY=$env:AWSKEY
          # Destination URL = container URL + SAS token + secret signature
          $targetUrl = "$($env:TargetContainer)$($env:SasToken)$($env:SASSIG)"
          Write-Host "`n##[section]Show target container objects: $targetUrl"
          azcopy list $targetUrl
          Write-Host "`n##[section]Start FS Backup from $yesterday"
          azcopy copy "$($env:SourceBucket)/PRD/FS_db_$($yesterday).all.out.gz.enc" $targetUrl --recursive=true
          Write-Host "`nLog:`n"
          $logfile = Get-ChildItem "C:\Users\VssAdministrator\.azcopy\" | sort LastWriteTime | select -last 1
          Get-Content -Path "C:\Users\VssAdministrator\.azcopy\$($logfile.Name)"
          Write-Host "`n##[section]Start DB Backup from $yesterday"
          azcopy copy "$($env:SourceBucket)/PRD/DB_db_$($yesterday).all.out.gz.enc" $targetUrl --recursive=true
          Write-Host "`nLog:`n"
          $logfile = Get-ChildItem "C:\Users\VssAdministrator\.azcopy\" | sort LastWriteTime | select -last 1
          Get-Content -Path "C:\Users\VssAdministrator\.azcopy\$($logfile.Name)"

Clean up Copied Files in the Azure Storage Container

Use a lifecycle management policy to clean up the copied files automatically after 30 days.
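
A rule like the one below should do the trick; the rule name and the 's3backup/' prefix filter are examples, so adjust them to your own container. You can paste it in the Code view of the storage account's Lifecycle management blade:

{
  "rules": [
    {
      "enabled": true,
      "name": "delete-s3backup-after-30-days",
      "type": "Lifecycle",
      "definition": {
        "actions": {
          "baseBlob": {
            "delete": {
              "daysAfterModificationGreaterThan": 30
            }
          }
        },
        "filters": {
          "blobTypes": [ "blockBlob" ],
          "prefixMatch": [ "s3backup/" ]
        }
      }
    }
  ]
}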

See lifecycle management overview and configure a lifecycle management policy for more information.

Troubleshooting

Common errors:

failed to perform copy command due to error: failed to initialize enumerator: cannot transfer individual files/folders to the root of a service. Add a container or directory to the destination URL

Solution: You forgot to create a container in the storage account and add it to the TargetContainer URL in the stage variables (the destination must point to 'https://<account>.blob.core.windows.net/<container>', not to the account root).

This request is not authorized to perform this operation using this resource type.. When Put Blob from URL.

Solution: Configure the allowed resource types for the SAS as 'Container' and 'Object' (in the SAS token the srt parameter should then read 'srt=co' instead of 'srt=c').