Support for >260 char filenames on Windows #146
Some more information including an alternative approach (not really appropriate as it would require the user to change group policy and an appmanifest to be added to the exe opting in):
Good catch! And thanks for the write-up, I appreciate it. The reason I didn't experience this issue myself is probably related to the Python installer for Windows -- it offers an option to disable this path length limitation, which I've always enabled.
That's because nxdumptool itself truncates individual path elements to 255 bytes (not UTF-8 codepoints) whenever possible. So the filename is indeed transferred like that to the host device. Furthermore, if the truncated size for any given path element ends right in the middle of a UTF-8 codepoint, the entire codepoint is removed, regardless of its size (2, 3 or 4 bytes).
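For illustration, the truncation rule described above can be sketched in Python. This is not nxdumptool's actual code, and the helper names `truncate_utf8` and `truncate_path_elements` are hypothetical; it only shows the idea of cutting each path element at 255 bytes and dropping a code point that would otherwise be split.

```python
def truncate_utf8(name: str, max_bytes: int = 255) -> str:
    """Return name cut so that its UTF-8 encoding fits in max_bytes.

    If the cut lands inside a multi-byte code point, the whole code point
    is dropped rather than leaving a partial sequence behind.
    """
    encoded = name.encode("utf-8")
    if len(encoded) <= max_bytes:
        return name
    # The input was valid UTF-8, so the only invalid bytes after slicing are
    # an incomplete trailing sequence; errors="ignore" discards exactly that.
    return encoded[:max_bytes].decode("utf-8", errors="ignore")


def truncate_path_elements(path: str, max_bytes: int = 255) -> str:
    """Apply truncate_utf8 to every '/'-separated element of path."""
    return "/".join(truncate_utf8(part, max_bytes) for part in path.split("/"))
```

Note that a name truncated this way loses everything past the cut, including any extension that happened to sit there; the sketch makes no attempt to preserve it.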
So, what's the story there, if you don't mind my asking? What problem is being avoided?
Ah, I see. In my case I usually use choco to install Python(s) and don't think to specify any arguments, and anyway in this case I was using a relatively new machine (for now; I can't really afford it and will likely need to flip it, sob) that I hadn't even installed Python onto yet. But even if you set such a key, either manually or via a group policy, apparently this feature needs to be opted into on an app-by-app basis via an AppManifest (per the second MS link above). I guess PyInstaller actually does this? In any case, if we make this change, we needn't worry about how nxdt_host is run.
That said! There are some drawbacks to bypassing this limitation, regardless of how we do it: it appears that at least some tools that respect the 256+4 limit will revert to using DOS-style "8.3" folder and file names when referencing a file of such length. This happened to me with the Hashcheck shell extension. Unfortunately these 8.3 names are not constant: if you copy the file to another computer or drive, say, the short name won't necessarily stick, rendering any hash file you've made useless until you've renamed the file to something reasonable and updated the hash file with the new filename. (The hash itself is still correct; the file just won't be found when rechecking after a move to another directory or machine.)
I have not had trouble copying the file via Windows Explorer generally, or copying to a network drive via SMB, fwiw (I have not tried rsync yet), and other community tools (hactoolnet, nsz) can also access the file without renaming it (well, I enabled the keyarea appending feature you mentioned, so I have to wait for nsz to fully support that, but they're working on it, and anyway the point is it can be read). Still, dealing with what happens to the file after it has successfully been written out by nxdt seems like it might be beyond the scope of what we might want to deal with.
An alternative might involve taking into account the entire path to the output file, doing a much more severe truncation of the filename to make sure it is not longer than 260 characters in total (might want to change the parent dir name to 'gc' instead of "Gamecard" or something like that), and perhaps saving what the full file name would have been elsewhere, maybe in a companion text file or a logfile that is unconditionally written to the output directory. This seems kind of inelegant but I can't come up with anything better, if you did want to try to get ahead of this. Otherwise, adopting this PR as is and then adding some note in the docs/faq/readme re long filenames on Windows (and perhaps on other platforms, if there are any) may be something to consider. (Actually, an option for the session log to be written out to the output directory is something I'd like regardless; I've been manually copying it at the end of each session, but anyway that's a story for another day!)
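A rough sketch of that alternative, purely to make the arithmetic concrete. The function name `fit_filename` and the `max_total` parameter are hypothetical and nothing like this exists in the PR; it assumes the budget is counted in characters and covers the parent directory, one separator, and the filename.

```python
import posixpath


def fit_filename(parent_dir: str, filename: str, max_total: int = 260) -> str:
    """Shorten filename so parent_dir + separator + filename fits in max_total.

    The extension is kept and only the stem is cut. max_total is an assumed
    budget; pass a smaller value to leave room for a terminating NUL.
    """
    stem, ext = posixpath.splitext(filename)
    # Characters left for the stem after the directory, separator and extension.
    budget = max_total - len(parent_dir) - 1 - len(ext)
    if budget < 1:
        raise ValueError("output directory is too deep to fit any filename")
    return stem[:budget] + ext
```

The untruncated name would then have to be recorded somewhere else, such as the companion text file mentioned above, since it cannot be recovered from the shortened one.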
255 bytes is the maximum name length for each path component in multiple filesystem types (even under Linux's EXT). The idea is simplifying the logic on the host side as much as possible by letting nxdumptool itself take care of the bulk of the work. This would also make it easier for anyone to quickly come up with their own host-side implementation.
Yeah, I'm familiar with the DOS 8.3 naming scheme. I'd say it's not really that much of a problem, since the Windows API is capable of transparently dealing with these filesystem entries under most circumstances, even if a FAT filesystem is used. If a particular program has issues with them, I'd say it'd be in the best interest of their devs to find a way to fix that.
Agreed, yeah.
Honestly, I'd just add a note about it and call it a day. This is because nxdumptool also offers multiple options for virtual FS extraction, including RomFS sections with full directory trees that may contain entries with absolute paths that could easily exceed the 256+4 limit. An option to instantly transfer the logfile via USB sounds good -- feel free to create an issue for it.
Yesterday I had trouble dumping a cartridge with a very long filename, itself longer than the 260 character limit of Windows using standard pathnames and APIs (before even considering how deep or long the output directory was):
The following is 264 characters long by itself:
/Gamecard/FINAL FANTASY [01000EA014150000][v0] + FINAL FANTASY III [01002E2014158000][v0] + FINAL FANTASY IV 1.0.1 [01004B301415A000][v65536] + FINAL FANTASY II [01006B7014156000][v0] + FINAL FANTASY VI 1.0.1 [0100AA001415E000][v65536] + FINAL FAN [KA][NC][NT].xci
This immediately fails, with `usbHandleSetFileProperties()` just hanging. The filename can be truncated before being set, however, it's clear that it already is for some other reason (Final Fantasy V is rendered above as merely FINAL FAN). And of course, this wouldn't help if the output directory is already somewhere rather deep. Fortunately there is a workaround: Windows isn't actually limited to 260 character path lengths, but Windows' default APIs seem to be. An alternative syntax that prepends `\\?\` to 'proper' Windows paths (so using backslashes throughout instead of slashes as the separator) will trigger the codepath necessary to exceed the maximum path length limitation. See here for more info on this; apparently this also disables all string parsing, but if this has any negative effect I haven't come across it.
In any case, I've tested this with Python `3.11.6` and `3.12.0` under Windows 11 and...it worked fine!
I've also added an extra unconditional conversion of all forward-slashes to backslashes as well in order to accommodate compatibility with Python running under msys2, because the default separator under that is a forward slash, which will error out as an invalid path. This works under `msys2/mingw64`, `msys2/ucrt64`, and `msys2/clang64`.
I have /not/ tested this with Cygwin as I don't use it, but would also hope that it doesn't represent itself as Windows to `platform.system()`'s check. If that's a problem I can try to make time for it, but this will solve this issue for most users, particularly on Windows, where they'll most likely be using the packaged version. (I have also tested this, and it worked fine as expected, though not on a separate machine. I had some trouble creating a valid package at first, but as you do not seem to have the same issue I would rather discuss it first/separately). I also have not tested these changes on *nix, BSD, Mac, or anything else, but unless any of those systems represent themselves as Windows to `platform.system()` I don't anticipate that being an issue.
Anyway, hopefully this is acceptable, there really isn't much going on here and what I did was pretty straightforward, but if not/if you need anything else, let me know. I did save the logs and take a few screenshots along the way but have no other proof; that said it shouldn't be too difficult to verify it works.
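For readers who want the gist without opening the diff, the approach described above might look roughly like the following. This is a sketch and not the exact code in this PR: the function name `to_extended_length_path` and its `system` parameter are made up here (the parameter exists only so the behaviour can be exercised on any OS), and the input is assumed to already be an absolute path, since the `\\?\` prefix does not work with relative ones.

```python
import platform
from typing import Optional

EXTENDED_PREFIX = "\\\\?\\"  # the literal four characters \\?\


def to_extended_length_path(path: str, system: Optional[str] = None) -> str:
    """Return path in extended-length form on Windows, unchanged elsewhere."""
    if system is None:
        system = platform.system()
    if system != "Windows":
        return path
    # Forward slashes are not accepted after the prefix (this also covers
    # msys2 Pythons that default to '/' as the separator).
    path = path.replace("/", "\\")
    if path.startswith(EXTENDED_PREFIX):
        return path  # already in extended-length form
    if path.startswith("\\\\"):
        # UNC paths (\\server\share\...) use the \\?\UNC\ form instead.
        return EXTENDED_PREFIX + "UNC\\" + path[2:]
    return EXTENDED_PREFIX + path
```

With this, `to_extended_length_path("C:/dumps/Gamecard/x.xci", system="Windows")` yields `\\?\C:\dumps\Gamecard\x.xci`, which can then be passed to `open()` as usual.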