Feature request: Optimize Performance by Asking only Posters relevant to Library #204

Open
LordMartron94 opened this issue Jan 10, 2025 · 4 comments

Comments

@LordMartron94

No idea if this is possible, but it would make syncing with the Google Drives immensely faster: only retrieve the posters you actually need.

@Drazzilb08
Owner

This would be a huge optimization, but I'm not sure exactly how it would be accomplished through rclone. I don't know whether you can get the names of all the files within a given gdrive and download only the ones you want.

@LordMartron94
Author

I'll do some research into this when I have time.

@zhdenny
Collaborator

zhdenny commented Jan 15, 2025

It is possible to do this. rclone is very robust and its documentation is very detailed. I'm not sure about the performance impact of this approach, though. It would certainly cut the time it takes to perform the very first pull of posters from the gdrives, but I'm not sure there would be any major performance improvement after that.

Off the top of my head (without looking at the documentation), I know that at least this is possible:

  1. `rclone lsf` or `rclone lsjson`: this will just spit out the entire list of files, either as a human-readable list (`lsf`) or as JSON (`lsjson`).
  2. Compare that list against what you have/need to create a new list of items to download.
  3. Use the filtering, include, or exclude features of the `rclone sync` command. These features can read their filter rules from a file.
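
The three steps above could be sketched in Python roughly as follows. This is only an illustration under assumptions, not how DAPS does or would do it: the helper names, the remote name `posters:`, and the idea of matching posters by file stem (for example `Alien (1979)`) are all hypothetical. The rclone pieces it relies on are `rclone lsf` with `--files-only` and `-R`, and the `--files-from` filter flag on `rclone sync`.

```python
import os
import subprocess
import tempfile
from pathlib import PurePosixPath


def list_remote_files(remote: str) -> list[str]:
    """Step 1: list every file under `remote` with `rclone lsf`."""
    result = subprocess.run(
        ["rclone", "lsf", "--files-only", "-R", remote],
        capture_output=True, text=True, check=True,
    )
    return [line for line in result.stdout.splitlines() if line]


def select_needed(remote_files: list[str], wanted_titles: list[str]) -> list[str]:
    """Step 2: keep only the remote files whose stem matches a wanted title.

    Assumption: posters are named after the library item, e.g.
    "Alien (1979).jpg". Real matching would likely need to be fuzzier.
    """
    wanted = {title.casefold() for title in wanted_titles}
    return [
        path for path in remote_files
        if PurePosixPath(path).stem.casefold() in wanted
    ]


def build_sync_command(remote: str, dest: str, list_path: str) -> list[str]:
    """Step 3: build an `rclone sync` call restricted by --files-from."""
    return ["rclone", "sync", remote, dest, "--files-from", list_path]


def sync_needed(remote: str, dest: str, wanted_titles: list[str]) -> list[str]:
    """Run all three steps; returns the list of files that were requested."""
    needed = select_needed(list_remote_files(remote), wanted_titles)
    # Write one path per line, relative to the remote root.
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as handle:
        handle.write("\n".join(needed) + "\n")
        list_path = handle.name
    try:
        subprocess.run(build_sync_command(remote, dest, list_path), check=True)
    finally:
        os.remove(list_path)
    return needed
```

A call might look like `sync_needed("posters:", "/path/to/posters", ["Alien (1979)", "Dark (2017)"])`, where the title list would come from whatever the media library reports. Listing the whole remote still costs one pass over the drive, so the saving is in the transfer, not in the listing.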

Looking ahead, if this sort of feature were introduced into DAPS, running the sync_gdrive script standalone would no longer serve any purpose, and its functions should be absorbed into poster_renamerr. This would also clear up some of the confusion newcomers have in the config file about how closely these two scripts are linked.

Relevant documentation:
https://rclone.org/commands/rclone_ls/
https://rclone.org/filtering/
https://rclone.org/commands/rclone_sync/

@LordMartron94
Author

Awesome, that is exactly what I was looking for. I agree that the performance benefits are probably biggest on first pulls, although later syncs would benefit too whenever you don't need anything new from the drives (that is, until new shows are added). I think it would be a huge overall optimization.

Unless either of you beats me to it, I might look into implementing this within the next couple of weeks, lol.
