How to Resume Large File Transfers Using rsync
In this mini post I'll show you how to use rsync to transfer large files between servers and resume the transfer if the connection is interrupted. An interruption while transferring small files isn't a big deal: re-running rsync picks up after the last file that transferred successfully. A big file is a different story. If you have transferred, for example, 50% of its total size and the connection drops, rsync by default removes the incomplete file from the destination server, and when you re-run rsync it starts transferring the big file from the beginning. This is a big issue, as it wastes your time and network resources.
To overcome connection interruptions when using rsync to transfer large files, we will use rsync with any of the following three options: --append-verify, --append, or --partial.
--append-verify is the best of the three options. It is available in rsync 3.0.0 and higher. It appends to the existing file and includes the data already on the receiving side in the full-file checksum verification step; if this checksum fails, the file is sent again. This option keeps partially transferred files on the destination server/location.
--append causes rsync to update a file by appending data onto the end of the file, which presumes that the data that already exists on the receiving side is identical to the start of the file on the sending side. If a file needs to be transferred and its size on the receiver is the same as or longer than its size on the sender, the file is skipped. This option keeps partially transferred files on the destination server/location.
--partial keeps partially transferred files on the destination server/location, and when you resume the rsync transfer, rsync reuses the data already in the partial file to complete the transfer. This option needs extra disk space while resuming, because the partial file is kept next to the copy being completed, but it works with all versions of rsync, including older ones.
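Since --append-verify depends on the rsync version, you may want to check the version before picking an option. Here is a minimal sketch of one way to do that. The function names supports_append_verify and installed_rsync_version are made up for this post, and the parsing assumes the first line of "rsync --version" looks like "rsync  version 3.2.7  protocol version 31" (some builds print the number with a leading "v").

```shell
#!/bin/sh
# Hypothetical helper (not part of rsync): succeeds when the given version
# string is 3.0.0 or newer, the first release that has --append-verify.
supports_append_verify() {
    ver=${1#v}            # some builds print "v3.2.3"; drop the leading "v"
    major=${ver%%.*}      # keep only the part before the first dot
    case $major in
        ''|*[!0-9]*) return 2 ;;   # not a number: the version could not be parsed
    esac
    [ "$major" -ge 3 ]
}

# Assumption: the first line of `rsync --version` carries the version in field 3.
installed_rsync_version() {
    rsync --version | awk 'NR == 1 { print $3 }'
}

# Example use (commented out so that sourcing this file runs nothing):
# if supports_append_verify "$(installed_rsync_version)"; then
#     echo "--append-verify is available"
# fi
```

Note that this only looks at the rsync installed on the machine where you run it; the rsync on the other server may be a different version.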
So, to resume partially transferred files, start with --append-verify. If rsync runs fine, your rsync version is 3.0.0 or higher. If it doesn't work, try --append next; if that works, your partially transferred files will be completed without a checksum. Finally, if both options fail, you are using an old version of rsync that only supports --partial. That option takes up extra disk space on top of the existing partially transferred file while it completes the transfer.
Here's an example: I need to transfer a single 10GB log file from a remote server. Normally we would use the scp command or the rsync command to transfer that file. For this situation let's focus on rsync, as follows:
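The try-in-order advice above can be sketched as a small shell function. Treat it as an illustration under a few assumptions: resume_transfer and the RSYNC_BIN override are names I made up, and the fallback relies on rsync exiting with code 1 ("syntax or usage error") when the local rsync doesn't recognize an option. An old rsync on the remote side may fail with a different code, which this sketch does not handle.

```shell
#!/bin/sh
# RSYNC_BIN is a hypothetical override hook (handy for dry testing);
# by default the real rsync is used.
RSYNC_BIN="${RSYNC_BIN:-rsync}"

# Hypothetical helper: try the three resume options in the recommended order.
resume_transfer() {
    src=$1
    dest=$2
    for flag in --append-verify --append --partial; do
        status=0
        "$RSYNC_BIN" -av "$flag" --rsh=ssh "$src" "$dest" || status=$?
        # Exit code 1 means the option was rejected, so move on to the next one.
        # Any other code (0 for success, or a real transfer error) is final.
        if [ "$status" -ne 1 ]; then
            echo "used $flag (exit status $status)"
            return "$status"
        fi
    done
    return 1
}

# Example use (commented out so that sourcing this file runs nothing):
# resume_transfer host:remote_file local_file
```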
rsync -av --rsh=ssh host:remote_file local_file
If the transfer is interrupted, the partially transferred file on my server is removed. You can terminate the rsync connection partway through to see for yourself.
But with any of the options --append-verify, --append, or --partial, the partially transferred file stays on my server (incomplete, but it exists), and re-running the same command completes it. So for all your large file transfers, use rsync with one of the following options:
rsync -av --append-verify --rsh=ssh host:remote_file local_file
OR
rsync -av --append --rsh=ssh host:remote_file local_file
OR
rsync -av --partial --rsh=ssh host:remote_file local_file
This keeps all incomplete large files in their destination location so the rsync transfer can be resumed later. Always try the above examples in order, as the last one needs extra disk space to resume the transfer.
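Because re-running the same command completes the partial file, you can let a small loop do the re-running for you. This is only a sketch: retry_transfer is a made-up helper name, and the attempt limit and delay in the example are arbitrary values you would tune for your own network.

```shell
#!/bin/sh
# Hypothetical helper: run a command until it succeeds or the attempt
# limit is reached, waiting a fixed number of seconds between attempts.
# Usage: retry_transfer MAX_ATTEMPTS DELAY_SECONDS command [args...]
retry_transfer() {
    max_attempts=$1
    delay=$2
    shift 2
    attempt=1
    while :; do
        if "$@"; then
            echo "completed on attempt $attempt"
            return 0
        fi
        if [ "$attempt" -ge "$max_attempts" ]; then
            echo "giving up after $attempt attempts" >&2
            return 1
        fi
        echo "attempt $attempt failed, retrying in ${delay}s" >&2
        sleep "$delay"
        attempt=$((attempt + 1))
    done
}

# Example use (commented out so that sourcing this file runs nothing):
# retry_transfer 10 30 rsync -av --append-verify --rsh=ssh host:remote_file local_file
```

The loop only makes sense together with one of the three options; without them each retry would start the big file from the beginning.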
Hints:
1. Always start with the --append-verify option. It performs a checksum, but works only with newer versions of rsync (3.0.0 and higher).
2. If --append-verify doesn't work, use the second option, --append. It doesn't perform a checksum.
3. You will need extra free space when using rsync with the --partial option after an interrupted connection. For example, if you have a 10GB file and almost 3GB of it has been transferred to the destination server/location, then when you resume the transfer you will notice that an extra 3GB is missing from your total available disk space. The 10GB file now needs 13GB of disk space to be transferred, and so on.
4. When resuming, --partial completes the file in a hidden temporary file, using the data from the existing incomplete file, which is why the disk usage grows. The previously transferred part of the file is not hidden.
5. Once the file is fully transferred, the temporary space is freed, and the total space used by the file is its actual size.
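The arithmetic in hint 3 can be written out as a tiny script, which is handy if you want to check the destination before resuming with --partial. Both function names are made up for this sketch, and free_kb assumes the POSIX output format of "df -P", where the fourth column of the second line is the available space.

```shell
#!/bin/sh
# Hypothetical helper: peak disk usage (in GB) while resuming with --partial,
# i.e. the kept partial file plus the temporary copy that grows to full size.
peak_space_gb() {
    full_gb=$1
    partial_gb=$2
    echo $((full_gb + partial_gb))
}

# Hypothetical helper: free space in KiB on the filesystem holding a directory.
free_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# The example from hint 3: a 10GB file with 3GB already transferred.
peak_space_gb 10 3   # prints 13
```

Since the partial file is already on disk, the free space you need at the moment you resume is roughly the full size of the file, which you can compare against the output of free_kb for the destination directory.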