implement FileNodeData classes #19
This is indeed quite big! I'm also getting some segfaults but we'll see that on the way, I don't find it that important for the moment. Let's do that bit by bit. I started with the first commit (6da0f78): could you split that up to have some intermediate steps which would help in the digestibility of the changes? I'd imagine it being split up in a list of commits like this:
This will need to be rebased anyway as I only started reading this. I can help you in splitting it up if you get stuck! Don't hesitate to get in touch :) As the changes are also quite big, it would be nice to have more descriptive commit messages telling what is being done. (but that's not important for the moment, as this is WIP anyway) I'll continue getting the hang of the rest of the changes (and hope that my initial suggestions won't be invalidated by later changes) :) |
At a more thorough look, this will need rebasing anyway. At a high level, the code itself seems good structure-wise (I'm quite fond of this interface usage and derived types implementing it). I'd like to see commits which at least build on every step ("atomic changes"?), which would help in bisecting regressions afterwards. I'll have a go at rebasing all the commits after 6da0f78 myself and will get back some results, if that's okay with you. |
I did a pass on rebasing this myself (including 6da0f78). I've got a (very) crude branch up at https://github.com/tshikaboom/libone/tree/filenode-rebased, whose diff against this branch should be minimal. That branch is going to be subject to force-pushes. The method I followed was to try to separate logical changes and get every commit to compile (this helps a lot with bisecting). I made some progress, but it is still not in a usable state. I saw there were changes not directly related to this branch (but which this branch uses). I think a good starting point would be to extract any changes not directly related to this feature and get them merged independently, so as to minimize the diff on the rest of this branch (i.e. the interesting stuff). This can include all the stuff I mentioned in #19 (comment), but would not be limited to that. I'll continue cleaning this up tomorrow.
So, I found a bug in FileNode: the
This gets me to the file node skipping, where you asked what it is used for. I originally implemented it to test FileNodeListFragment parsing, i.e. to skip any "unsupported" node. This was convenient for jumping past the node to the next one. As far as I could tell, it did not skip every second file node, as the function did seek exactly past the file node itself (i.e. the start of the next file node, which can be inferred as being
I also got to a certain point in extracting useful changes in the branch, to cut it into smaller changes that are easier to review. I think the first four commits in my branch (https://github.com/tshikaboom/libone/tree/filenode-rebased) are in a good enough shape to be used/reviewed; you can check these out if you want (although I think the Stp/Cb refactoring could use some more work...). I also noticed you're adding the
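The skipping described above boils down to seeking to the node's start offset plus its total size. A minimal sketch of that idea follows, assuming the node's size counts the whole node, header included; `ByteStream`, `NodeExtent` and `skip_node` are hypothetical names for illustration, not libone or librevenge API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical in-memory stand-in for the input stream (not librevenge's API).
class ByteStream {
public:
  explicit ByteStream(std::vector<std::uint8_t> data) : m_data(std::move(data)) {}

  std::size_t tell() const { return m_pos; }

  // Absolute seek; refuses to move past the end of the buffer.
  void seek(std::size_t pos) {
    if (pos > m_data.size()) throw std::out_of_range("seek past end of stream");
    m_pos = pos;
  }

private:
  std::vector<std::uint8_t> m_data;
  std::size_t m_pos = 0;
};

// Hypothetical record of where a node begins and how many bytes it occupies
// in total (assumption: the size includes the node's own header).
struct NodeExtent {
  std::size_t start;
  std::size_t size;
};

// Skip an unsupported node: wherever the parser currently is inside the node,
// land exactly on the first byte of the next node.
inline void skip_node(ByteStream &input, const NodeExtent &node) {
  assert(node.size > 0);  // a zero-sized node would never advance the parser
  input.seek(node.start + node.size);
}
```

Seeking to an absolute position, rather than relative to the current one, means the result does not depend on how much of the node was already consumed before the skip.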
Hi Oskar, thanks for looking into this! =) I am a bit lost though: where would you like me to continue? It's obvious you have already put work into dissecting the commits in this branch: https://github.com/tshikaboom/libone/tree/filenode-rebased. Would you like to have PRs to that branch, or would you prefer to have approved commits in here?
Thanks for your insight. I'll look into this. Maybe I missed a seek somewhere and came to that impression. But which FNCR do you mean? Not every FileNode references a remote location. So the
I'd imagine it being split up in a list of commits like this:
I intended to make the enum more accessible, so I defined an enum class. And since it's a class, CamelCase is suggested.
Ok, this will be fixed. I had the intention to extrapolate the case of the other directory names, which are also lower case at the beginning. Moreover, I continued to play with the source a bit. I think the segfault is caused by deleting the child object pointers after the parents, which in turn try to delete their children's pointers. I am not experienced enough yet to deal with pointers smoothly. Maybe this could be addressed with more smart pointer usage... but then again, this makes things more complicated. I'll need to review this topic in more depth, though.
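As a rough illustration of the smart pointer idea mentioned above: if each parent holds its children through `std::unique_ptr`, every child has exactly one owner and nobody calls `delete` by hand, so a double delete of the kind suspected here cannot happen. This is only a sketch; `Node` is a hypothetical class, not one from this PR.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical tree node; not a class from libone.
class Node {
public:
  explicit Node(int id) : m_id(id) {}

  // The parent takes sole ownership of the child. The returned reference stays
  // valid while the parent lives, because the child itself sits on the heap.
  Node &add_child(std::unique_ptr<Node> child) {
    assert(child != nullptr);
    m_children.push_back(std::move(child));
    return *m_children.back();
  }

  int id() const { return m_id; }
  std::size_t child_count() const { return m_children.size(); }

  // Number of nodes in this subtree, counting this node itself.
  std::size_t subtree_size() const {
    std::size_t n = 1;
    for (const auto &child : m_children) n += child->subtree_size();
    return n;
  }

  // No user-written destructor: destroying m_children destroys each child
  // exactly once, recursively.

private:
  int m_id;
  std::vector<std::unique_ptr<Node>> m_children;
};
```

A side effect of this design is that `Node` becomes move-only, so accidental shallow copies of a subtree are rejected at compile time instead of crashing at run time.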
Anything you prefer! For ease, I'd say let's keep this branch for new commits and as a playground of sorts, to see what's working and what isn't, and extract anything useful in a more proper form to other branches which would get PR'd as we go. Eventually, this PR would get closed as everything would then be integrated, in one way or another. I dunno, let's see how it goes, it will surely need some rebasing at one point too. For the moment, extracting the diffs does not seem too cumbersome for me. But really, anywhere you push or PR is fine by me, just keep in mind that there will be rebases somewhere at one point :)
Yeah, I'm wondering myself too about this. If i understand you correctly, we could actually get rid of the
This is a path both of us will have to go through, it seems :) I'm not that fluent in C++ pointers either. We'll get there one day! |
I'll have a go at refactoring FileNode like this, on top of what you did with the FileNodeChunkReference refactoring, to get a more clear idea about all this, maybe I'll have something before this weekend. Just keep me posted if you're already on it, so we don't step on each others' toes and do redundant work :) |
5327412 to fd73115
This branch now reflects/picked most rebases you proposed. I tried to keep commits cleaner now. (Btw, there was a small bug in ObjectDeclarationWithRefBody, initially it read 8 bytes of data after A few changes from the original PR are not yet recovered. I'll likely finish this tomorrow evening. |
I had some progress on my side in rebasing too: I've got the fourth commit (FNCR/FileNode changes) to a point where there are no "functional" changes anymore, and kept them in separate commits afterwards, which I'll use as inspiration to simplify the code of Doing this, I also got rid of the FileNode offset bug introduced before and reactivated file node skipping. And weirdly enough I don't get any segfaults anymore! I also get the same number of lines when doing Nothing has changed (I think) in the other commits (besides those to-be-redone commits, about which I'll do something later on), so you won't have to look at them. |
d70e458 to 4606453
Oh well, this took way more energy than anticipated... I tried to keep the commits cleaner now, as suggested. Moreover, I tried to keep most logic intact. The current state also does not change all occurrences of
There are also two bugs fixed in this PR. The first one was initially present: 6bd80cd, the
However, I still have an error introduced which I haven't fully located yet. Something puts the parsing at the wrong offset after the first
I have put some more time into libmson. Again, I see this as a playground, or quick-and-dirty prototyping. However, I managed to make good progress. It now has an executable which converts a OneNote file into an XML structure. Trying this tool out might be interesting for you too, I guess, at least as a comparison. There are a number of issues, though. As much as libmson is a toy, it brought me closer to understanding the MS-ONESTORE spec. And we can already discover undocumented PropertyIDs.
Furthermore, did you read about the update of qt-creator to 4.13? It now has a native meson project manager. Although I often use vim, I do like using qt-creator to some extent.
There is a chance you might not like these latest commits. I did not succeed in fully understanding the flow of parsing the
The segfaults I got originated from the destructor of
This required deep-copying
One FileNodeData class also needed its own copy constructor, since it holds a raw array. Maybe this can be replaced by a
After that, the latest commits propose a different parsing structure with fewer side loops, and also an entry point for when we have fully implemented the transaction log map.
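To illustrate why a class that holds a raw array needs its own copy constructor: the compiler-generated copy only copies the pointer, so two objects end up freeing the same memory. Below is a minimal sketch of a deep copy; `RawBuffer` is a hypothetical class for illustration and is not the FileNodeData class from this PR.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>

// Hypothetical class owning a raw, heap-allocated byte array.
class RawBuffer {
public:
  explicit RawBuffer(std::size_t size)
    : m_size(size), m_data(new std::uint8_t[size]()) {}  // zero-initialised

  // Deep copy: allocate a fresh array and copy the bytes, so both objects
  // can be destroyed independently.
  RawBuffer(const RawBuffer &other)
    : m_size(other.m_size), m_data(new std::uint8_t[other.m_size]) {
    std::copy(other.m_data, other.m_data + other.m_size, m_data);
  }

  // Copy-and-swap assignment: 'other' is already a deep copy (taken by value).
  RawBuffer &operator=(RawBuffer other) {
    swap(other);
    return *this;
  }

  ~RawBuffer() { delete[] m_data; }

  void swap(RawBuffer &other) noexcept {
    std::swap(m_size, other.m_size);
    std::swap(m_data, other.m_data);
  }

  std::uint8_t &at(std::size_t i) {
    assert(i < m_size);
    return m_data[i];
  }

  std::size_t size() const { return m_size; }

private:
  std::size_t m_size;
  std::uint8_t *m_data;
};
```

The copy constructor, copy assignment and destructor have to be written together here; leaving any one of them compiler-generated brings back either the leak or the double free.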
Yeah, there is a lot of stuff going on in here, although I've started reading this, I won't get to the bottom of that by this evening. FYI, |
This was actually quite straightforward to look at! Thanks for rebasing this and separating the commits in a more logical way than before, this made it much easier to follow through. I thought it would take a longer time, but at this point, 90% of the newly added code consists of classes which do (some) parsing and have getters and setters. These are "simple" enough, I must say I also like the conciseness of streaming the input pointer into the respective members of the class in the The FileNodeListFragment changes also look good. (I'm not a grumpy guy wrt. changes to my code, I just like non-regressions :) but one can always argue that this library was not functional anyway, to begin with) Regarding I'll have a fresh look at this branch tomorrow again to see if there'd be anything "big" to do, but at this point I'd be more inclined to merge this and deal with the rest later, otherwise all the rebasing and merging and resolving and combining is going to get tiresome (as you'd imagine yourself).. If we get stuck in the details, we'll never get out of this. And there is nothing screaming "obviously wrong" either in the code, so, yeah :) I'll have another look at the fragment parsing anyway tomorrow to see where it gets used, and how we could dedup some code (if any), but, say, I can already say that the |
I added this because in libmson I also had about three occurrences of that same loop. When I fixed a bug at one of these occurrences and continued to work on something else, I came across another bug which was familiar... it was the same bug at a different location.
We'll need this in the
Hey, sorry, I got caught up into stuff and haven't had the energy to at least do a high-level todo-list (to not forget anything) before merging this, I'll try to do that this weekend. |
4e1e30e to dfe927e
This commit replaces most occurrences of librevenge::RVNGInputStream *input with the shared_ptr specified in libone_util.h, which is typedef'd as libone::RVNGInputStreamPtr_t.
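For readers unfamiliar with the pattern, the change from a raw pointer to a shared_ptr typedef looks roughly like the sketch below. It does not reproduce the actual contents of libone_util.h: `sketch::InputStream` stands in for librevenge::RVNGInputStream, and `InputStreamPtr_t` and `read_u8` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

namespace sketch {

// Hypothetical stand-in for the real stream class.
struct InputStream {
  std::vector<std::uint8_t> bytes;
  std::size_t pos = 0;
};

// One alias used everywhere instead of spelling out 'InputStream *input'.
typedef std::shared_ptr<InputStream> InputStreamPtr_t;

// Functions take the alias by const reference: they share the stream without
// copying the shared_ptr (and without touching its reference count).
inline std::uint8_t read_u8(const InputStreamPtr_t &input) {
  assert(input && input->pos < input->bytes.size());
  return input->bytes[input->pos++];
}

}  // namespace sketch
```

Because every holder shares the same stream object, a read through one copy of the pointer advances the position seen through all the others.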
dfe927e to 872b60f
…ush getters into header
While refactoring FileNode, the wrong FileChunkReference was inserted. That wrong FCR references the FileNode, not the Chunk. Now the respective FileNodeData is cast to get the correct Ref.
This doubles as marker when no FileNodes are left in a FileNodeListFragment
This removes the structures which check for the end of the FileNodeListFragment. Moreover, it removes FileNode skipping, since all types of nodes can be parsed now. And finally, the FileNodeListFragment's padding skipper is removed because it's prone to finding invalid FileNodes when the padding is not all zeroed. Additionally, a switch for the TransactionEntry count of a specific FileNodeListFragment is added, though there is no direct way to get that information yet. Maybe the transaction log could be parsed for each FileNodeListFragment if the overhead is small, since its location is given in the header, which is also found in the input stream given to FileNodeListFragment.
The previous commit modified the parsing structure for fileNodeListFragments. There is no more walker for zeroed padding. That's why FileNode can parse a zeroed head - which allowed d to become zero, which is not valid. However, this validity check only applies to valid filenode structures.
The respective switch structure interferes with the discrete FileNodeData. The previous FileChunkReference m_fnd is now contained in the respective FileNodeData (if applicable).
872b60f to 310f672
I had another go at rebasing to get rid of the issues I had missed the previous time. Whenever possible, I tested building and running one2raw. Most steps seem to not break anything. Regarding
From another perspective, keeping
Let me know whether to drop 797b8b5.
Large Chunk ahead =)
This implements the FileNode types regarding #15, specified in Section 2.5 of the MS-ONESTORE spec. Fortunately, I could transfer a lot of code from my playground.
Furthermore, this PR contains some changes to related classes:
FileNode: FileNodeData classes, and renamed the FileNodeChunkReference to FNCR, and get_size() and get_location() to cb() and stp(), respectively, to match the labeling from the spec.
JCID:
libone_utils: operator>> for uints and ints.
Moreover, this PR contains the changes presented in #17. I used those changes to conveniently implement the parsing. However, this is not absolutely necessary. To keep the option of not using overloaded operators, all parsing methods are kept public and can be used in conventional fashion. This will require modifying the parsing methods only, but should be straightforward.
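The arrangement described above, where an overloaded `operator>>` is only a thin wrapper around a public parse method, can be sketched as follows. `Reader` and `ChunkRef` are hypothetical stand-ins and not the classes from this PR, and the two little-endian 32-bit fields are an assumption chosen for illustration; only the `stp()`/`cb()` naming is taken from the description.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical byte reader standing in for the real input stream.
class Reader {
public:
  explicit Reader(std::vector<std::uint8_t> bytes) : m_bytes(std::move(bytes)) {}

  // Reads a little-endian 32-bit unsigned integer and advances the position.
  std::uint32_t read_u32() {
    assert(m_pos + 4 <= m_bytes.size());
    std::uint32_t value = 0;
    for (int i = 0; i < 4; ++i)
      value |= static_cast<std::uint32_t>(m_bytes[m_pos + i]) << (8 * i);
    m_pos += 4;
    return value;
  }

private:
  std::vector<std::uint8_t> m_bytes;
  std::size_t m_pos = 0;
};

// Streaming overload for plain integers.
inline Reader &operator>>(Reader &in, std::uint32_t &value) {
  value = in.read_u32();
  return in;
}

// Hypothetical chunk reference: a stream pointer (stp) and a byte count (cb).
class ChunkRef {
public:
  // Public parse method: usable directly, without any overloaded operator.
  void parse(Reader &in) { in >> m_stp >> m_cb; }

  std::uint32_t stp() const { return m_stp; }
  std::uint32_t cb() const { return m_cb; }

private:
  std::uint32_t m_stp = 0;
  std::uint32_t m_cb = 0;
};

// The operator only forwards to parse(), so dropping it later would not
// touch the parsing logic itself.
inline Reader &operator>>(Reader &in, ChunkRef &ref) {
  ref.parse(in);
  return in;
}
```

With this split, `reader >> ref;` and `ref.parse(reader);` are interchangeable, which is what keeps the non-operator style available.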
This PR definitely contains bugs, since I noticed a segfault when running the converters. I haven't tested the PR enough yet, but wanted to give some material to discuss. That way, we can adopt another coding style if you feel this PR is insufficiently compatible.
I won't mind if assessing the code is going to take a month or two or whatever; no stressing-out!