How to download the entire contents of a folder in an S3 bucket with Ansible
Just ran into this issue today. I needed to be able to download all the files inside a folder stored in an S3 bucket with Ansible.
The aws_s3 module docs on the Ansible site don't include an example of how to do this, but I found a couple of articles that pointed me in the right direction.
- name: Get list of files from S3
  aws_s3:
    mode: list
    aws_access_key: "{{ aws_access_key_id }}"
    aws_secret_key: "{{ aws_secret_access_key }}"
    bucket: "{{ aws_storage_bucket_name }}"
    prefix: "my-folder/"
    marker: "my-folder/"
  register: s3_bucket_items
- name: Download files from S3
  aws_s3:
    mode: get
    aws_access_key: "{{ aws_access_key_id }}"
    aws_secret_key: "{{ aws_secret_access_key }}"
    bucket: "{{ aws_storage_bucket_name }}"
    object: "{{ item }}"
    dest: "destination/files/{{ item|basename }}"
  with_items: "{{ s3_bucket_items.s3_keys }}"
What these two tasks do is first retrieve the list of S3 object keys under the specified folder prefix and register that list in a variable called s3_bucket_items. The second task then iterates over the s3_keys list in that variable and downloads each object one by one, using the basename filter to strip the folder prefix from the destination file name.
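For comparison, here is a minimal sketch of the same list-then-download flow in Python with boto3. The download_folder function is hypothetical and requires AWS credentials to actually run; dest_path mirrors the playbook's "destination/files/{{ item|basename }}" expression:

```python
import os

def dest_path(key, dest_dir="destination/files"):
    # Mirrors "destination/files/{{ item|basename }}" from the playbook:
    # keep only the file name, drop the folder prefix.
    return f"{dest_dir}/{os.path.basename(key)}"

def download_folder(bucket, prefix, dest_dir="destination/files"):
    # Hypothetical boto3 equivalent of the two Ansible tasks.
    # StartAfter=prefix plays the same role as the marker argument:
    # listing begins after the folder placeholder key itself.
    import boto3
    s3 = boto3.client("s3")
    resp = s3.list_objects_v2(Bucket=bucket, Prefix=prefix, StartAfter=prefix)
    for obj in resp.get("Contents", []):
        s3.download_file(bucket, obj["Key"], dest_path(obj["Key"], dest_dir))

print(dest_path("my-folder/report.csv"))  # destination/files/report.csv
```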
One important thing to note: in the first task, the marker argument is set to the same value as the prefix (i.e. the folder name that contains the files). I had to add this because by default the module includes the my-folder/ key itself in the list of S3 bucket items; setting marker excludes that entry, since listing starts after the marker key. If you don't add the marker argument, you'll get an error along the lines of "attempted to take checksum of directory," because that initial item is treated as if it were a directory.
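The marker behavior described above can be illustrated in plain Python: S3 returns keys in lexicographic order, and a marker means "return only keys that sort strictly after this value," which conveniently skips the folder placeholder key. A small sketch with example key names:

```python
# Example keys as an S3 list for prefix "my-folder/" might return them,
# including the folder placeholder key itself.
keys = ["my-folder/", "my-folder/a.txt", "my-folder/b.txt"]

marker = "my-folder/"
# S3 marker semantics: keep only keys strictly greater than the marker,
# which drops the "my-folder/" placeholder but keeps the real files.
downloadable = [k for k in keys if k > marker]

print(downloadable)  # ['my-folder/a.txt', 'my-folder/b.txt']
```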
Sources: