Anybody using rclone?

I saw a tech news article about a cloud storage provider reducing their rates ( https://techcrunch.com/2023/06/02/dropbox-like-cloud-storage-service-shadow-... ) and this reminded me that I've been thinking about using cloud storage as backup ... so long as it's encrypted with a key that's on my side (only). A quick search pointed me to "rclone" ( https://rclone.org/ ), which sounds outstanding if their self-promotion is to be believed. The fact that rclone is included as a standard package in Debian stable goes a long way toward convincing me. Ironically, "Shadow Drive" isn't one of the providers that rclone lists as supported.

This seems like a good starting guide (and makes rclone look fairly straightforward), though I haven't used it so I'm uncertain of its accuracy: https://www.linuxuprising.com/2020/05/how-to-encrypt-cloud-storage-files-wit...

My current backup systems live and die by 'rsync', so I'm quite familiar with a program rclone seems to be partially based on. I've been trying/hoping to move to 'rsnapshot', although it's kind of a PITA (but good). I use Fedora occasionally, but mostly Debian.

This has left me with so many questions, to which I would happily take any and all answers:

- is rclone good?
- is rclone easy to use?
- does rclone handle encryption of remotes (mostly) transparently?
- in particular, is mounting remote encrypted cloud drives as local drives fairly easy?
- what cloud storage providers have you used rclone with?
- do you recommend a particular cloud storage provider? Why?
- do you disrecommend a particular cloud storage provider? Why?

--
Giles
https://www.gilesorr.com/
gilesorr@gmail.com

I use rclone in production. Let me try to answer some of your questions. rclone aims to be an rsync-like utility for interacting with commercial/proprietary cloud services, but it also supports all of the usual protocols, such as SFTP, SMB, etc. Here is a list of supported services: https://rclone.org/overview/
- is rclone good? I think it's "good". I've used it since 2019 to transfer an average of 1TB of data per 24h to object storage providers (AWS S3, Backblaze B2).
- is rclone easy to use? As easy as rsync.
- does rclone handle encryption of remotes (mostly) transparently? In my case the data is encrypted at rest by the object storage provider. In transit the data is protected with TLS. I have not looked into using the rclone crypt backend.
- in particular, is mounting remote encrypted cloud drives as local drives fairly easy? I've never tried that.
- what cloud storage providers have you used rclone with? SFTP, Backblaze B2, Amazon AWS S3, Storj.
- do you recommend a particular cloud storage provider? Why? I mostly use B2 for large object storage needs. The speeds are not great, but that doesn't matter much for my use case. They have otherwise been reliable and very competitive on pricing (storage and egress are billed separately). It's the company famous for its hard drive reliability stats.
I use Storj for backing up logs.
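For the encryption question, the rclone docs describe layering a "crypt" remote on top of an ordinary remote so data is encrypted client-side before upload. A hedged sketch, since I haven't used crypt myself (remote names, bucket, and paths are placeholders):

```shell
# Create an ordinary B2 remote (credentials are placeholders).
rclone config create b2remote b2 account YOUR_KEY_ID key YOUR_APP_KEY

# Layer a crypt remote over a path inside it; rclone will prompt for the
# encryption password, which never leaves your machine.
rclone config create secret crypt remote b2remote:mybucket/encrypted

# Everything synced through "secret:" is encrypted before upload.
rclone sync /home/you/data secret:

# Browsing the decrypted view as a local directory needs FUSE:
rclone mount secret: /mnt/backup --read-only
```

The mount step is the part I'd test most carefully before relying on it, since FUSE performance and caching behaviour vary.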
On Jun 3, 2023, at 12:18, Giles Orr via talk <talk@gtalug.org> wrote:

On Sat, Jun 03, 2023 at 12:18:25PM -0400, Giles Orr via talk wrote:
Personally I use rsnapshot for backups, with the target being a Linux server at my parents' house, and then they back up to mine the same way. No cloud providers, nothing complicated; it just works and it's automatically offsite. Of course it's great for making a backup of your data; it is not for making a system backup should you need to restore the system, but I don't consider that to be a big task in general. I also tend to use at least RAID1. -- Len Sorensen
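An offsite pull like the one described above might look roughly like this in /etc/rsnapshot.conf (hostnames, paths, and retention counts are placeholders; note that rsnapshot requires literal tabs, not spaces, between fields):

```
# Where snapshots land on the backup target.
snapshot_root	/backup/snapshots/

# How many snapshots of each interval to keep.
retain	hourly	6
retain	daily	7
retain	weekly	4

# Pull /home from the remote machine over ssh.
backup	backupuser@parents-house.example.org:/home/	remote/
```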

On 2023-06-04 09:48, Lennart Sorensen via talk wrote:
Personally I use rsnapshot for backups, with the target being a Linux server at my parents' house, and then they back up to mine the same way.
No cloud providers, nothing complicated, it just works and it's automatically offsite.
Of course it's great for making a backup of your data; it is not for making a system backup should you need to restore the system, but I don't consider that to be a big task in general. I also tend to use at least RAID1.
I found that rsnapshot does a lot of filesystem churn on the target system, which can start to be an issue with data that has a huge number of small files wanting hourly snapshots, all hitting RAID6. It will do an rm -rf of older snapshots, resulting in each directory having to be cleaned out entry-by-entry and all the files having their link counts decremented, with lots and lots of inode activity; then it builds new snapshots and has to increment link counts and build new directories. The solution I found was to recycle older snapshots, letting rsync bring a recently-retired snapshot up to date, making only the new changes before calling that the current snapshot.

For absolutely vital data there's also a lot to be said for keeping it in a git repo to track changes and be able to revert damage; then a git push of a packfile can be your backup.

There's still a lot to be said for tarball backups, since they hit backup target storage as a single coherent file and don't do the disk thrashing. Nowadays something in a squashfs image could let you mount that and copy out individual files without having to restore the whole tarball, so that's an interesting direction too.

Anthony

rclone looks interesting, and given that object storage is becoming a ubiquitous storage technology, it is something I will be looking into.

Since we are on the subject of backups: I have a client with a multi-TB file system that has close to 100 million files. Any kind of file-based backup would take days if not weeks. A simple find on the file system took 4 days to run. Currently there is a Veeam solution in place that is doing an image backup of the LVM and somehow tracking disk writes, but I am not a fan of the solution because it requires custom kernel mods and has caused the system to crash on occasion.

Does anybody know of a volume-based backup solution that can work in an incremental manner?

-- Alvin Starr || land: (647)478-6285 Netvel Inc. || Cell: (416)806-0133 alvin@netvel.net ||

On 2023-06-05 09:14, Alvin Starr via talk wrote:
Does anybody know of a volume-based backup solution that can work in an incremental manner?
This question has a big unstated conditional. Are you looking for:

A) 'a volume-based backup that is agnostic of the volumes it is backing up', for which I have no answers; or

B) 'an alternative filesystem / volume solution that can support incremental backup'?

ZFS snapshots fit the bill here. Most folks will jump to the conclusion that you have to stream from one ZFS to another, but in reality zfs send just writes to standard out, which is only connected to a zfs receive by convention. You can dump the incremental stream as a file/object that doesn't need to be applied to a receiving ZFS immediately. You are then into the same known problem set of full vs incremental offline database backups.

-- Scott Sullivan
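The "zfs send is just a stream" point could look something like this (pool and dataset names are placeholders):

```shell
# Take bracketing snapshots of the dataset.
zfs snapshot tank/data@monday
# ... a day of changes later ...
zfs snapshot tank/data@tuesday

# Capture the incremental stream as a plain compressed file instead of
# piping it straight into zfs receive.
zfs send -i tank/data@monday tank/data@tuesday | gzip > /backup/data-mon-tue.zfs.gz

# Replay it later, on this or any other pool:
# gunzip -c /backup/data-mon-tue.zfs.gz | zfs receive tank/restored
```

As Scott notes, you then inherit the classic full-vs-incremental chain problem: restoring requires the full stream plus every incremental in order.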

On 2023-06-05 11:16, Scott Sullivan via talk wrote:
On 2023-06-05 09:14, Alvin Starr via talk wrote:
Does anybody know of a volume-based backup solution that can work in an incremental manner?
This question has a big unstated conditional. Are you looking for
A) 'a volume-based backup that is agnostic of the volumes it is backing up'
For which I have no answers. That is kind of my preference.
B) 'an alternative filesystem / volume solution that can support incremental backup'
ZFS snapshots fit the bill here. Most folks will jump to the conclusion that you have to stream from one ZFS to another, but in reality zfs send just writes to standard out, which is only connected to a zfs receive by convention. You can dump the incremental stream as a file/object that doesn't need to be applied to a receiving ZFS immediately. You are then into the same known problem set of full vs incremental offline database backups.
Changing the filesystem would be a major lift and would likely take months to copy the data over. This is also the underlying storage for a Gluster volume, so I am not sure of ZFS's support for Gluster. But it is food for thought.

We are in this situation because some "wizard" said "don't store PDF data in a database; keep it on a file system and just link using the paths". Well, that works well until you get tens of millions of files. Backing up a database in the TB size range is orders of magnitude faster than trying to back up that same data in a filesystem. If I had my way I would convert the data to something like an Elastic database, where the numbers of files are more manageable.

-- Alvin Starr || land: (647)478-6285 Netvel Inc. || Cell: (416)806-0133 alvin@netvel.net ||

On 05/06/2023 12:03, Alvin Starr via talk wrote:
On 2023-06-05 11:16, Scott Sullivan via talk wrote:
On 2023-06-05 09:14, Alvin Starr via talk wrote:
Does anybody know of a volume-based backup solution that can work in an incremental manner?
This question has a big unstated conditional. Are you looking for
A) 'a volume-based backup that is agnostic of the volumes it is backing up'
For which I have no answers. That is kind of my preference.
This is not incremental but is agnostic:

dd if=/dev/sdaX of=/mountpoint/folder/file

On 2023-06-05 14:41, Aurelian Melinte via talk wrote:
On 05/06/2023 12:03, Alvin Starr via talk wrote:
On 2023-06-05 11:16, Scott Sullivan via talk wrote:
On 2023-06-05 09:14, Alvin Starr via talk wrote:
Does anybody know of a volume-based backup solution that can work in an incremental manner?
This question has a big unstated conditional. Are you looking for
A) 'a volume-based backup that is agnostic of the volumes it is backing up'
For which I have no answers. That is kind of my preference.
This is not incremental but is agnostic:
dd if=/dev/sdaX of=/mountpoint/folder/file

Well, on a live file system that would result in a possibly corrupt backup copy. It is also local-only.
For just a simple copy you could use deltacp. I have a version on one of my systems but it seems to have disappeared from the internet at large. -- Alvin Starr || land: (647)478-6285 Netvel Inc. || Cell: (416)806-0133 alvin@netvel.net ||
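For what it's worth, the usual workaround for the live-filesystem problem is to dd from a short-lived LVM snapshot, which gives a crash-consistent image without taking the volume offline. A sketch with placeholder VG/LV names and sizes (still not incremental):

```shell
# Create a temporary snapshot with enough space to absorb writes
# made while the copy runs.
lvcreate --size 10G --snapshot --name data-snap /dev/vg0/data

# Image the frozen snapshot, not the live volume.
dd if=/dev/vg0/data-snap bs=64M status=progress | gzip > /backup/data.img.gz

# Drop the snapshot before its copy-on-write space fills up.
lvremove -f /dev/vg0/data-snap
```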

On Mon, 5 Jun 2023 at 09:14, Alvin Starr via talk <talk@gtalug.org> wrote:
Does anybody know of a volume-based backup solution that can work in an incremental manner?
I am speaking here of a particular solution, not a general one. In Proxmox VE, backups of VMs can be performed in three modes: stop mode, suspend mode and snapshot mode. They differ in the length of VM downtime and the risk of losing consistency: https://pve.proxmox.com/wiki/Backup_and_Restore

Then there is Borg, a highly performant non-snapshotting deduplicating backup tool: https://borgbackup.readthedocs.io/en/stable/deployment/image-backup.html

There is also a wrapper for Borg, called Borgmatic, for ease of configuration and execution: https://torsion.org/borgmatic/
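The Borg image-backup route from the linked deployment doc might look roughly like this (repo path and device name are placeholders): because Borg deduplicates by content-defined chunks, re-backing up a mostly unchanged device image only stores the changed chunks, which gives incremental-like behaviour for a volume.

```shell
# One-time: create an encrypted repository.
borg init --encryption=repokey /backup/borg-repo

# Back up a block device; --read-special makes Borg read the device
# contents rather than storing it as a device node.
borg create --read-special --stats \
    /backup/borg-repo::data-{now} /dev/vg0/data-snap

# Keep a bounded history.
borg prune --keep-daily 7 --keep-weekly 4 /backup/borg-repo
```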

On 2023-06-05 11:52, Val Kulkov wrote:
On Mon, 5 Jun 2023 at 09:14, Alvin Starr via talk <talk@gtalug.org> wrote:
Does anybody know of a volume-based backup solution that can work in an incremental manner?
I am speaking here of a particular solution, not a general one. In Proxmox VE, backups of VMs can be performed in three modes: stop mode, suspend mode and snapshot mode. They differ by the length of VM downtime and the risk of losing consistency: https://pve.proxmox.com/wiki/Backup_and_Restore
Then, there is Borg, a highly performant non-snapshotting deduplicating backup tool: https://borgbackup.readthedocs.io/en/stable/deployment/image-backup.html There is also a wrapper for Borg, called Borgmatic, for ease of configuration and execution: https://torsion.org/borgmatic/
I can snapshot the volume and then back up the snapshot, but that is a 40TB image. Veeam tries to look at the file systems and zero unused space, like Borg appears to do, but that feature had to be disabled because it was causing random system crashes. I will take a closer look at Borg.

There was once a tool called lvmsync, but it seems to be a dead project. There was also wyng-backup, but that also looks to have gone away.

-- Alvin Starr || land: (647)478-6285 Netvel Inc. || Cell: (416)806-0133 alvin@netvel.net ||

On 2023-06-05 6:45 PM, Alvin Starr via talk wrote:
I can snapshot the volume and then back up the snapshot, but that is a 40TB image. Veeam tries to look at the file systems and zero unused space, like Borg appears to do, but that feature had to be disabled because it was causing random system crashes.
I will take a closer look at Borg.
At Canonical we used an in-house tool called Turku to handle sharded backups for many thousands of systems: https://canonical.com/blog/introducing-turku-cloud-friendly-backups-for-your... I think in 2018, when I was there, we had 4-5 storage nodes with 12-16TB of backup storage in each. Any VM that wanted a backup just had to run the agent (a Python app that invokes rsync) and have a copy of the storage system's public key. The original lives here: https://bazaar.launchpad.net/~turku/turku/turku-storage/files and my former colleague who wrote it has a fork of all three components: https://github.com/rfinnie/turku-storage and so on.

For object storage with deduplication and B2 (Backblaze) support, I use restic. I've got ~500k files in about 100GB of deduped space stored for less than $1 USD/month. Restic is fast (a standalone Go binary), encrypted in transit and at rest, supports compression and deduplication, and handles many different storage backends. I can't vouch for it scaling beyond 1-10TB though, and I would be looking at some kind of incremental+sharding solution for anything larger than that anyways.

Cheers, Jamon
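The restic/B2 setup described above could be sketched like this (bucket name and credentials are placeholders):

```shell
# restic's B2 backend reads credentials and the repo passphrase from the
# environment.
export B2_ACCOUNT_ID=YOUR_KEY_ID
export B2_ACCOUNT_KEY=YOUR_APP_KEY
export RESTIC_PASSWORD=your-repo-passphrase

# One-time repository creation in the bucket.
restic -r b2:mybucket:backups init

# Each run is encrypted, deduplicated, and effectively incremental.
restic -r b2:mybucket:backups backup ~/data

# List what's stored.
restic -r b2:mybucket:backups snapshots
```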
participants (9)
- Alex Kink
- Alvin Starr
- Anthony de Boer
- Aurelian Melinte
- Giles Orr
- Jamon Camisso
- Lennart Sorensen
- Scott Sullivan
- Val Kulkov