
Wishes for 2022 #896

Open
vt-alt opened this issue Dec 22, 2021 · 5 comments
@vt-alt
Contributor

vt-alt commented Dec 22, 2021

Not to assign blame, but here is a list of weaknesses in burp that we sometimes run into. (By the way, it seems development has stalled?)

  • Sometimes big trees (like an unpacked Linux kernel source) are backed up very, very slowly (many hours) with very low CPU load. We see this on about 1 in 10 notebooks. I wish to debug it further, but it is hard to reproduce (it is 100% reproducible for the affected people once it starts happening to them, but I cannot take their notebooks for experiments).
  • Under normal circumstances (and most importantly for first backups), backup speed is limited by zlib compression pegging a single CPU core at 100%. I would suggest supporting faster, parallelizable compression algorithms such as zstd.
  • Restoring a particular directory is very slow. Maybe this is related to the fact that files can only be selected for restore by a regex.

Recently I wanted to restore the package database from a backup several days old, like this:

$ burp -ar -b 0000687 -d 2021-11-21 -r '^/var/[^/]+/(apt|rpm)/' -v

It's ~400M, but one restore takes about an hour. Also, when I wanted to re-run the command under time, I could not restart the restore quickly because of the repository lock: I had to wait another hour for the server process to finish. The inability to run restores in parallel is a real problem.

  • I only use protocol 1. Protocol 2 is permanently not production-ready, while competitors have long offered chunked, deduplicated backups.
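As a side note on the restore selector above: burp's `-r` option takes a regex, and the one used here matches anything under an `apt` or `rpm` directory one level below `/var`. A minimal sketch of what it selects (the sample paths are hypothetical, chosen only to illustrate the pattern):

```python
import re

# The same selector passed to burp's -r option above.
pattern = re.compile(r'^/var/[^/]+/(apt|rpm)/')

# Hypothetical sample paths, not taken from any real backup.
paths = [
    "/var/lib/rpm/Packages",          # matches: rpm database under /var/lib
    "/var/cache/apt/archives/a.deb",  # matches: apt cache under /var/cache
    "/var/lib/dpkg/status",           # no match: dpkg is neither apt nor rpm
    "/etc/rpm/macros",                # no match: not under /var
]

selected = [p for p in paths if pattern.match(p)]
print(selected)
```

Only the first two paths are selected, which is why the restore pulls in both the apt and rpm databases in one pass.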
@grke
Owner

grke commented Dec 22, 2021

Hello,

Yes, I am a bit stalled at the moment due to lack of time, and I am the only developer.
I intend to keep working on burp when I get some time.

Thank you for the suggestions.
I don't think implementing zstd is as simple as you might think. It requires parallel threads, which I think would basically require rewriting most of the internals of burp. And that wouldn't help if you had multiple clients backing up at the same time.
Actually - which part of the backup are you talking about here - phase2 or something else?

Some ideas for two of the speed issues above, if you are not doing these already:

If you have lots of small files to back up, you might want to turn off librsync (set librsync=0).

For faster restores, you might want to try using hardlinked_archive=1.
With hardlinked backups, the restore doesn't have to apply any diffs when restoring a file, so it can feed the bytes straight off the disk. You can see which backups are already hardlinked by standing in the client's storage directory on the server and running an 'ls */hardlinked'.
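Putting the two suggestions above into a config sketch (server-side `burp-server.conf`; option names are exactly those mentioned above, but check `man burp.conf` for where each belongs in your setup):

```
# Sketch based on the suggestions above -- verify against man burp.conf.
hardlinked_archive = 1   # keep backups fully hardlinked, so restores stream
                         # bytes straight off the disk with no diffs to apply
librsync = 0             # skip delta-diffing; can help with many small files
```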

@vt-alt
Contributor Author

vt-alt commented Dec 22, 2021

Thanks for the reply and suggestions!

phase2 or something else?

Yes, where file transfer occurs.

@grke
Owner

grke commented Dec 22, 2021

Yes, where file transfer occurs.

Do you see the 100% cpu on the client, or server, or both?

@pagalba-com

I think if these are Windows clients, they may be hitting the Windows Task Scheduler reduced-priority issue. Please look at https://aavtech.site/2018/01/windows-task-scheduler-changing-task-priority/
After some update, Windows changed the default task priority.

@pagalba-com

One more thing, @vt-alt: when the rsync library is used on large files, low CPU and network usage can be seen on both client and server while it is busy finding differences, especially for very large files. Data that is already duplicated is neither sent nor reprocessed, but the search itself takes time. In some cases it is faster to set the rsync-library file-size cutoff in the config file.
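A sketch of that cutoff as a config line. I believe the relevant option is `librsync_max_size` (files larger than this skip the delta-diff path and are sent whole), but that name and the size syntax are my assumption here; confirm them in `man burp.conf` before using:

```
# Assumed option name -- verify in man burp.conf.
# Files above this size bypass librsync delta processing entirely.
librsync_max_size = 512Mb
```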
