Summary: Consider the following use case: you have data in an AWS S3 bucket that needs to be moved automatically to another cloud provider, in this case Azure, because of business continuity requirements. I decided to use an Azure DevOps pipeline and azcopy to transfer the files, so follow along to see how to set this up.
Date: 7 July 2023
Refactor: 15 December 2024: Checked links and formatting.
The first step is to configure an IAM user with access to the S3 bucket:
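If you prefer scripting this over clicking through the AWS console, a minimal sketch with the AWS CLI could look like the following. The user name and policy name are assumptions; the bucket name `backup` is taken from the SourceBucket URL used later in the pipeline. azcopy only needs to list the bucket and read its objects.

```powershell
# Sketch: create a read-only IAM user for the backup bucket (names are examples)
aws iam create-user --user-name s3backup-reader

# Inline policy granting list + read on the bucket and its objects
$policy = @'
{
  "Version": "2012-10-17",
  "Statement": [
    { "Effect": "Allow", "Action": ["s3:ListBucket"], "Resource": "arn:aws:s3:::backup" },
    { "Effect": "Allow", "Action": ["s3:GetObject"],  "Resource": "arn:aws:s3:::backup/*" }
  ]
}
'@
Set-Content -Path policy.json -Value $policy
aws iam put-user-policy --user-name s3backup-reader --policy-name s3backup-read --policy-document file://policy.json

# Generate the access key pair; the secret access key goes into the variable group later on
aws iam create-access-key --user-name s3backup-reader
```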
To get access to the storage container, we'll use a Shared Access Signature (SAS):
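The portal works fine for this, but as a sketch, an account SAS with the same scope can also be generated with the Azure CLI. The storage account name and expiry date below are placeholders; note the resource types `co` (container and object), which matters for the second error described at the end of this post.

```powershell
# Sketch: generate an account SAS for the blob service.
# Assumes you are logged in with rights to read the account keys (or pass --account-key).
# Prints the SAS token (query string) that you append to the blob URL.
az storage account generate-sas `
  --account-name s3backup `
  --services b `
  --resource-types co `
  --permissions acdlrw `
  --expiry 2023-08-01T00:00Z `
  --https-only
```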
Note: keep track of the signing key you used. If the SAS is ever compromised, you need to rotate the key that was used to create it.
If you've copied the SAS URL, you'll notice it consists of three parts: the base URL of the storage account's blob endpoint, the SAS token parameters (everything from '?sv=' up to and including 'sig='), and the signature itself.
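That split is exactly how the SAS is stored for the pipeline further down: everything up to and including 'sig=' goes into the SasToken stage variable, and the signature goes into the secret SAS_SIG. A small sketch of that split (the SAS URL is an example value):

```powershell
# Sketch: split a copied SAS URL into the three parts used by the pipeline
$sasUrl = 'https://s3backup.blob.core.windows.net/?sv=2022-11-02&ss=b&srt=co&sp=acdlrw&se=2023-08-01T00:00:00Z&spr=https&sig=REDACTED'

$baseUrl   = $sasUrl.Split('?')[0]                                    # blob endpoint, extend with the container name
$query     = $sasUrl.Split('?')[1]
$sasToken  = '?' + $query.Substring(0, $query.IndexOf('sig=') + 4)    # token parameters, ending in 'sig='
$signature = $query.Substring($query.IndexOf('sig=') + 4)             # the secret part

Write-Host "Base URL : $baseUrl"
Write-Host "SAS token: $sasToken"
Write-Host "Signature: $signature (store this as the SAS_SIG secret)"
```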
Before we can copy any data, we need to add a container to the storage account. In the Azure portal, go to the storage account, then Data storage, and select Containers. Click Add to create a container and keep the public access level set to Private. Now append the name of the container (s3backup) to the base URL you got from the SAS, so it becomes something like this: 'https://s3backup.blob.core.windows.net/s3backup'
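Creating the container can also be scripted; a sketch with the Azure CLI (the storage account name is assumed to be s3backup, matching the URL above):

```powershell
# Sketch: create the target container; public access stays disabled by default
az storage container create `
  --name s3backup `
  --account-name s3backup `
  --auth-mode login
```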
Now go to Azure DevOps. Before we create the actual pipeline, we first add the AWS secret access key and the SAS signature as secret variables (AWS_KEY and SAS_SIG) to a variable group called Backups.
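If you prefer the CLI over the portal, a rough sketch of the same setup with the Azure DevOps extension for the Azure CLI (requires `az extension add --name azure-devops`; organization, project and the secret values are placeholders, and the group name must match what the pipeline references):

```powershell
# Sketch: create the 'Backups' variable group and add the two secrets
az devops configure --defaults organization=https://dev.azure.com/<org> project=<project>

$groupId = az pipelines variable-group create --name Backups --variables placeholder=true --query id -o tsv
az pipelines variable-group variable create --group-id $groupId --name AWS_KEY --value '<aws-secret-access-key>' --secret true
az pipelines variable-group variable create --group-id $groupId --name SAS_SIG --value '<sas-signature>' --secret true
```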
Now we have all the requirements in place and we can create the pipeline. The YAML below has the following configuration (from top to bottom):

- a run name built from the pipeline definition name and the build id, without the commit message appended
- no CI trigger; the pipeline only runs on a cron schedule at 02:00 twice a week, and always runs even when there are no new changes on the main branch
- a Microsoft-hosted windows-latest agent
- the Backups variable group that contains the AWS_KEY and SAS_SIG secrets
- a single stage with the non-secret values (source bucket URL, target container URL, SAS token parameters and AWS access key id) as stage variables
- a PowerShell task that maps the secrets to environment variables, builds the target URL from the container URL, the SAS token and the signature, and runs azcopy to copy yesterday's FS and DB backup files from the S3 bucket to the container, printing the azcopy log after each copy
```yaml
name: $(Build.DefinitionName)-$(Build.BuildId)
appendCommitMessageToRunName: false
trigger: none

pool:
  vmImage: 'windows-latest'

schedules:
- cron: "0 2 * * 0,2"
  displayName: Weekly backup on Sunday and Wednesday 2h
  branches:
    include:
    - main
  always: true

variables:
- group: Backups

stages:
- stage: S3Backup
  displayName: "Backups"
  condition: always()
  variables:
    SourceBucket: 'https://s3.eu-central-1.amazonaws.com/backup'
    TargetContainer: 'https://s3backup.blob.core.windows.net/s3backup'
    SasToken: '?sv=2022-11-02&ss=b&srt=c&sp=rwdlaciytfx&se=2023-07-07T15:21:13Z&st=2023-07-07T07:21:13Z&spr=https&sig='
    AWSAccessKey: 'AKIAXXXXXXXXXXXXXXXXXX'
  jobs:
  - job: RisktoolData
    displayName: "Data backup to Azure"
    steps:
    - checkout: self
    - task: PowerShell@2
      displayName: "Backup risktool data"
      condition: always()
      env:
        AWSKEY: $(AWS_KEY)
        SASSIG: $(SAS_SIG)
      inputs:
        pwsh: true
        targetType: 'inline'
        script: |
          Write-Host "`n##[section]Task: $env:TASK_DISPLAYNAME `n"
          azcopy --version

          $yesterday = Get-Date $((Get-Date).AddDays(-1)) -Format yyyyMMdd
          $env:AWS_ACCESS_KEY_ID=$env:AWSAccessKey
          $env:AWS_SECRET_ACCESS_KEY=$env:AWSKEY
          $targetUrl = "$($env:TargetContainer)$($env:SasToken)$($env:SASSIG)"

          Write-Host "`n##[section]Show target container objects: $targetUrl"
          azcopy list $targetUrl

          Write-Host "`n##[section]Start FS Backup from $yesterday"
          azcopy copy "$($env:SourceBucket)/PRD/FS_db_$($yesterday).all.out.gz.enc" $targetUrl --recursive=true
          Write-Host "`nLog:`n"
          $logfile = Get-ChildItem "C:\Users\VssAdministrator\.azcopy\" | sort LastWriteTime | select -last 1
          Get-Content -Path "C:\Users\VssAdministrator\.azcopy\$($logfile.Name)"

          Write-Host "`n##[section]Start DB Backup from $yesterday"
          azcopy copy "$($env:SourceBucket)/PRD/DB_db_$($yesterday).all.out.gz.enc" $targetUrl --recursive=true
          Write-Host "`nLog:`n"
          $logfile = Get-ChildItem "C:\Users\VssAdministrator\.azcopy\" | sort LastWriteTime | select -last 1
          Get-Content -Path "C:\Users\VssAdministrator\.azcopy\$($logfile.Name)"
```
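Before committing the pipeline it can help to test the copy locally: azcopy reads the AWS credentials from the AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY environment variables and authenticates against the target with the SAS in the URL, exactly as the script above does. A sketch with placeholder object name and SAS:

```powershell
# Sketch: run the same azcopy S3 -> Blob copy locally (object name and SAS are placeholders)
$env:AWS_ACCESS_KEY_ID     = 'AKIAXXXXXXXXXXXXXXXXXX'
$env:AWS_SECRET_ACCESS_KEY = '<aws-secret-access-key>'

$source = 'https://s3.eu-central-1.amazonaws.com/backup/PRD/FS_db_20230706.all.out.gz.enc'
$target = 'https://s3backup.blob.core.windows.net/s3backup?<sas-token-including-signature>'

azcopy copy $source $target
azcopy list $target
```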
Use a lifecycle management policy to clean up the backup files automatically after 30 days:
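The policy itself is a small JSON document applied at the storage account level. A sketch of a rule that deletes blobs 30 days after their last modification, applied with the Azure CLI (the rule name, the container prefix and the resource group are assumptions):

```powershell
# Sketch: delete blobs under the s3backup container 30 days after last modification
$policy = @'
{
  "rules": [
    {
      "enabled": true,
      "name": "delete-backups-after-30-days",
      "type": "Lifecycle",
      "definition": {
        "actions": {
          "baseBlob": { "delete": { "daysAfterModificationGreaterThan": 30 } }
        },
        "filters": {
          "blobTypes": [ "blockBlob" ],
          "prefixMatch": [ "s3backup/" ]
        }
      }
    }
  ]
}
'@
Set-Content -Path policy.json -Value $policy
az storage account management-policy create --account-name s3backup --resource-group <resource-group> --policy '@policy.json'
```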
See lifecycle management overview and configure a lifecycle management policy for more information.
Common errors:
failed to perform copy command due to error: failed to initialize enumerator: cannot transfer individual files/folders to the root of a service. Add a container or directory to the destination URL
Solution: you forgot to create a container in the storage account and add its name to the TargetContainer URL in the stage variables.
This request is not authorized to perform this operation using this resource type. (returned when azcopy performs a Put Blob From URL)
Solution: configure the allowed resource types for the SAS as 'Container' and 'Object' (srt=co).