ig11987
Contributor

@ig11987 ig11987 commented May 23, 2025

Closes: #16

@ig11987 ig11987 changed the title Draft: Add incremtntal transfer Draft: Add incremental transfer May 23, 2025
@ig11987 ig11987 force-pushed the 16-differential branch 8 times, most recently from 77988c7 to 8d1505c Compare May 24, 2025 13:29
@ig11987 ig11987 force-pushed the 16-differential branch 3 times, most recently from 085add4 to 589871c Compare May 24, 2025 21:05
@ig11987 ig11987 marked this pull request as ready for review May 24, 2025 21:05
@ig11987 ig11987 changed the title Draft: Add incremental transfer Add incremental transfer May 24, 2025
@ig11987 ig11987 force-pushed the 16-differential branch from 589871c to 6fd174f Compare May 24, 2025 21:13
@ig11987 ig11987 requested a review from ruuda May 26, 2025 10:06
@ruuda
Contributor

ruuda commented May 26, 2025

Ooh, nice! I didn’t review in depth yet, but one thing that comes to mind, our goal for this is to sync a directory, where the sending side may still be producing new files, right? So we sync once, but by the time it’s complete, there may be new files, so we do it again, etc., until the difference is very small, then we stop the process that’s writing files, we do a final sync, and at that point we know the copy is complete.

With the current implementation, would we need to execute fastsync for multiple iterations? Because if this is the goal anyway, and if the receiver and sender negotiate anyway, then maybe it makes sense to build that loop right into fastsync itself, and make the protocol more interactive. After the sender sends its last piece, it could re-scan its input files to see if any of them changed, and if so go for a new round. Not sure if that would be a major complication though. What do you think? I don’t mean to scope-creep things, we can also leave it for a later version, but if it’s a few-line modification I think that could improve the practical usability a lot.
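The "re-scan and go for a new round" idea above can be sketched roughly as follows. This is only an illustration, not fastsync's actual protocol: `FileState`, `Snapshot`, `changed_paths`, `sync_until_stable`, and the `max_rounds` cap are all hypothetical names, and the `scan` and `send` closures stand in for the real directory walk and network transfer.

```rust
use std::collections::BTreeMap;

/// Hypothetical per-file fingerprint: size in bytes and mtime in seconds.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct FileState {
    pub size: u64,
    pub mtime: u64,
}

/// Snapshot of the sender's input files, keyed by path.
pub type Snapshot = BTreeMap<String, FileState>;

/// Return the paths that are new or changed compared to what was already sent.
pub fn changed_paths(sent: &Snapshot, current: &Snapshot) -> Vec<String> {
    current
        .iter()
        .filter(|(path, state)| sent.get(*path) != Some(*state))
        .map(|(path, _)| path.clone())
        .collect()
}

/// Run rounds until a re-scan finds nothing left to send, or until
/// `max_rounds` is reached. Returns the number of rounds that sent files.
pub fn sync_until_stable<S, T>(mut scan: S, mut send: T, max_rounds: usize) -> usize
where
    S: FnMut() -> Snapshot,
    T: FnMut(&str),
{
    let mut sent = Snapshot::new();
    for round in 0..max_rounds {
        // Re-scan the inputs after the previous round has finished.
        let current = scan();
        let todo = changed_paths(&sent, &current);
        if todo.is_empty() {
            return round;
        }
        for path in &todo {
            send(path.as_str());
            // Record the state seen at scan time, so that a file modified
            // while it was being sent is picked up again next round.
            sent.insert(path.clone(), current[path]);
        }
    }
    max_rounds
}
```

The round cap is one possible way to bound a loop that might otherwise never finish if the producer keeps writing; leaving that decision to the user would work as well.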

@ig11987 ig11987 force-pushed the 16-differential branch from 6fd174f to f449afc Compare May 27, 2025 00:37
@ig11987
Contributor Author

ig11987 commented May 27, 2025

> With the current implementation, would we need to execute fastsync for multiple iterations?

In my use-case it's 2 times.
a: fastsync without incremental (target directory is empty, so does not matter if it gets deleted before accepting packages - current behavior). (1)
b: Then stop the workload that produces the files being transferred
c: Sync with incremental for any leftover files. (2)

> So we sync once, but by the time it’s complete, there may be new files, so we do it again, etc., until the difference is very small, then we stop the process that’s writing files, we do a final sync, and at that point we know the copy is complete.

The transfer is usually much faster than the speed of producing files, so I have not observed this being very useful for the workloads I have worked with so far.
But I do see the potential of continuous mode being useful.

There are some potential risks:

  1. the process never ending - we can offload this to the user
  2. how to deal with deletes on sender (I don't think it's currently handled) - could produce many files piling up on the receiver.
  3. the code becoming too complicated for the added utility when compared to incremental.
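To make the second risk concrete: if deletes on the sender were to be handled, the receiver would need some way to find files that no longer exist on the sending side. A minimal sketch, assuming both sides can exchange their path lists (the function name `stale_on_receiver` is hypothetical and not part of fastsync):

```rust
use std::collections::BTreeSet;

/// Paths present on the receiver that the sender no longer has.
/// These are the files that would pile up if deletes are not propagated.
fn stale_on_receiver(sender: &BTreeSet<String>, receiver: &BTreeSet<String>) -> Vec<String> {
    receiver.difference(sender).cloned().collect()
}
```

Whether such files should be deleted, reported, or ignored is a separate policy decision.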

> but if it’s a few-line modification I think that could improve the practical usability a lot.

Yes, considering we would not keep the incremental (one-time continuous) mode, the changes (wip) compared to the current PR would not be too much.
Allow me some further time to properly test this use-case and implementation.

Edit: this is now done and tested.

@ig11987 ig11987 force-pushed the 16-differential branch from 0467bcd to a9fc92a Compare May 28, 2025 19:40
@ig11987 ig11987 marked this pull request as draft May 28, 2025 19:40
@ig11987 ig11987 force-pushed the 16-differential branch 6 times, most recently from 84969ff to 67b7bbe Compare May 28, 2025 22:06
@ig11987 ig11987 changed the title Add incremental transfer Add continuous transfer mode May 28, 2025
@ig11987 ig11987 force-pushed the 16-differential branch from 67b7bbe to 1c320a9 Compare May 28, 2025 22:19
@ig11987 ig11987 marked this pull request as ready for review May 28, 2025 22:19
@ig11987 ig11987 requested a review from patrickjeremic May 29, 2025 11:37
@ig11987 ig11987 force-pushed the 16-differential branch from 1c320a9 to ed0ec5a Compare May 31, 2025 14:25
@ig11987 ig11987 marked this pull request as draft June 3, 2025 15:34
@ig11987 ig11987 force-pushed the 16-differential branch 8 times, most recently from d9a1c4e to 737e2db Compare June 5, 2025 14:33
@ig11987 ig11987 force-pushed the 16-differential branch from 737e2db to 716c0e8 Compare June 5, 2025 16:18
@ig11987 ig11987 marked this pull request as ready for review June 14, 2025 00:06
@ig11987 ig11987 force-pushed the 16-differential branch 2 times, most recently from 1ede88c to 7c49b26 Compare July 19, 2025 13:57
src/main.rs Outdated
Comment on lines 101 to 100
    // Check for auto-accept environment variable (useful for testing)
    if std::env::var("FASTSYNC_AUTO_ACCEPT").is_ok() {
        println!("Receiving will overwrite existing files with those names. Continue? [y/N] y");
        return Ok(());
    }

Should this be a command-line option or a documented env variable? I think an option would be the better choice, with proper documentation. I would make it a separate commit, though.

Contributor

Let’s not mix env vars and CLI flags for configuration, I would put it in the CLI flags then.

Contributor Author

@ig11987 ig11987 Jul 27, 2025

Thank you for highlighting.

I've removed this option now and am using #[cfg(not(test))] for the part of the code where the tests would otherwise block waiting for user input.
This reduces the size of the current diff.

With my use-case with fastsync, I don't use it in non-interactive environments (as it requires synchronized action on 2 hosts).
If you think it would be a benefit to have this CLI flag now, I can add it back here or in another PR.
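For readers unfamiliar with the attribute, the general pattern looks roughly like this. It is a sketch of the technique, not the code in this PR: `is_yes` and `confirm_overwrite` are hypothetical names, and only the prompt text is taken from the diff above.

```rust
use std::io::{self, BufRead, Write};

/// Parse a yes/no answer; anything other than "y" or "yes" counts as "no".
fn is_yes(answer: &str) -> bool {
    matches!(answer.trim().to_ascii_lowercase().as_str(), "y" | "yes")
}

/// Normal builds: ask the user before overwriting files.
#[cfg(not(test))]
fn confirm_overwrite() -> io::Result<bool> {
    print!("Receiving will overwrite existing files with those names. Continue? [y/N] ");
    io::stdout().flush()?;
    let mut line = String::new();
    io::stdin().lock().read_line(&mut line)?;
    Ok(is_yes(&line))
}

/// Test builds: skip the prompt so tests never block on stdin.
#[cfg(test)]
fn confirm_overwrite() -> io::Result<bool> {
    Ok(true)
}
```

The trade-off compared to a CLI flag is that the bypass exists only under `cargo test`, so it cannot be used for non-interactive runs of the released binary.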

Comment on lines +155 to +150
fn set_file_mtime(path: &str, mtime: u64) -> Result<()> {
    let file = OpenOptions::new().write(true).open(path)?;

    let system_time = std::time::UNIX_EPOCH + std::time::Duration::from_secs(mtime);
    let times = FileTimes::new()
        .set_accessed(system_time)
        .set_modified(system_time);

    file.set_times(times)?;
    Ok(())
}

Right, I wrote above that ctime should be checked too, but ctime cannot be set directly, so I guess the check should be whether the src ctime is newer than the dest ctime.

However, I think as-is with mtime and no ctime it's fine too. I don't have strong feelings. I would just write it that way myself, as I know some backup software (I don't remember which) checks ctime and not mtime, but backup software has different constraints, requirements, and kinds of files and directory sets.
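A ctime-based check along those lines might look like the sketch below. It assumes a Unix platform (ctime is read through `std::os::unix::fs::MetadataExt`) and reasonably synchronized clocks on both hosts; `ctime_secs` and `source_is_newer` are hypothetical helper names.

```rust
#[cfg(unix)]
use std::os::unix::fs::MetadataExt;
use std::{fs, io, path::Path};

/// Read a file's ctime (last status change) in seconds since the Unix epoch.
#[cfg(unix)]
fn ctime_secs(path: &Path) -> io::Result<i64> {
    Ok(fs::metadata(path)?.ctime())
}

/// Transfer when the destination is missing or the source changed after the
/// destination copy was written. Since ctime cannot be set, the destination's
/// ctime is simply the time the receiver last wrote the file.
fn source_is_newer(src_ctime: i64, dest_ctime: Option<i64>) -> bool {
    match dest_ctime {
        None => true,
        Some(dest) => src_ctime > dest,
    }
}
```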

Contributor Author

@ig11987 ig11987 Jul 27, 2025

Thank you, I agree that the fact that ctime can't be changed is useful from an archival / security perspective.

With fastsync continuous mode I was mostly concerned with keeping the 2 hosts "in sync continuously".

The main reason to use mtime was that it can be set on the file (matching source and destination).
In combination with file size (and filename) this gives an acceptable signal that the files will be the same (without having to rely on extra metadata).
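As an illustration of that signal, a receiver-side check could compare the size and mtime announced by the sender against the local file. This is a sketch under assumptions, not the code from the PR: `size_and_mtime` and `needs_transfer` are hypothetical names, and mtime is truncated to whole seconds to match the `u64` seconds used by `set_file_mtime` above.

```rust
use std::fs;
use std::io;
use std::path::Path;
use std::time::UNIX_EPOCH;

/// Return (size in bytes, mtime in whole seconds since the Unix epoch).
fn size_and_mtime(path: &Path) -> io::Result<(u64, u64)> {
    let meta = fs::metadata(path)?;
    let mtime = meta
        .modified()?
        .duration_since(UNIX_EPOCH)
        .map_err(|e| io::Error::new(io::ErrorKind::Other, e))?
        .as_secs();
    Ok((meta.len(), mtime))
}

/// The file needs to be transferred if it is missing locally, or if its
/// size or mtime differs from what the sender reported.
fn needs_transfer(sender_size: u64, sender_mtime: u64, dest: &Path) -> io::Result<bool> {
    match size_and_mtime(dest) {
        Ok(local) => Ok(local != (sender_size, sender_mtime)),
        Err(e) if e.kind() == io::ErrorKind::NotFound => Ok(true),
        Err(e) => Err(e),
    }
}
```

This only works if the receiver sets the mtime of each received file to the sender's value, which is what makes a later comparison meaningful.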

@ig11987 ig11987 force-pushed the 16-differential branch 5 times, most recently from 9e87bd6 to 99334f2 Compare July 27, 2025 15:28
Allows the transfer to happen continuously.
Files can change on the source and will be picked up
and transferred as long as fastsync is running.
This also means fastsync now supports incremental
transfers i.e. stopping the operation and resuming it later.

Closes: #16
Development

Successfully merging this pull request may close these issues.

Differential (incremental) transfer
3 participants