Support DataInput as source for StoredField #14213
Summary

I am opening this proposed change to support writing a stored field from a byte source that does not require a contiguous array allocation. We sometimes need to store large stored fields, and the requirement to provide a fully contiguous byte array can cause issues on smaller heaps, particularly when the original data is already on heap in a non-contiguous source.

I took a stab at this using a [...] I wrapped the [...]

If this approach has support I will continue to refine the PR. In particular, I was uncertain whether Lucene would want [...] Finally, would we want to modify [...]

Alternatives

DataInput is only one potential approach. I took it because there was already some work around [...] A [...]

Any of these approaches is fine for my use case, and I would be happy to work on whichever has the most support and consensus.
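To make the motivation above concrete, here is a minimal sketch of the general idea using only JDK classes. It is not Lucene's API and not the code in this PR: `ChunkedValueSketch`, `SizedInput`, `fromChunks`, and `copyTo` are hypothetical names chosen for illustration, and `java.io.InputStream` stands in for a Lucene `DataInput`. The sketch pairs a stream with an explicit length, so a consumer can copy a value that lives in several separate chunks without first merging them into one contiguous array.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.SequenceInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration only; none of these names come from Lucene.
final class ChunkedValueSketch {

  /** A byte source plus the number of bytes the consumer should read from it. */
  record SizedInput(InputStream in, int length) {
    SizedInput {
      if (length < 0) {
        throw new IllegalArgumentException("length must be >= 0");
      }
    }
  }

  /** Exposes separate chunks as one logical stream without concatenating them. */
  static SizedInput fromChunks(List<byte[]> chunks) {
    int total = 0;
    List<InputStream> streams = new ArrayList<>();
    for (byte[] chunk : chunks) {
      total = Math.addExact(total, chunk.length); // fail loudly on int overflow
      streams.add(new ByteArrayInputStream(chunk));
    }
    InputStream joined = new SequenceInputStream(Collections.enumeration(streams));
    return new SizedInput(joined, total);
  }

  /** Copies exactly value.length() bytes through a small, fixed-size buffer. */
  static void copyTo(SizedInput value, OutputStream out) throws IOException {
    byte[] buffer = new byte[8]; // deliberately tiny to show the chunked copy
    int remaining = value.length();
    while (remaining > 0) {
      int n = value.in().read(buffer, 0, Math.min(buffer.length, remaining));
      if (n < 0) {
        throw new EOFException("source ended before the declared length");
      }
      out.write(buffer, 0, n);
      remaining -= n;
    }
  }

  public static void main(String[] args) throws IOException {
    List<byte[]> chunks = List.of(
        "large ".getBytes(StandardCharsets.UTF_8),
        "stored ".getBytes(StandardCharsets.UTF_8),
        "value".getBytes(StandardCharsets.UTF_8));
    SizedInput value = fromChunks(chunks);
    ByteArrayOutputStream sink = new ByteArrayOutputStream();
    copyTo(value, sink);
    System.out.println(value.length() + " bytes: " + sink.toString(StandardCharsets.UTF_8));
  }
}
```

The explicit length matters because a plain stream does not say how many bytes belong to the value. Carrying it alongside the source lets the writer record the size up front and stop reading at the right point.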
Indeed, the change that introduced this capability had a similar motivation to yours: #12581.
I'm curious if we should make it an actual record?
I like what you did better. I don't want to have to deal with the case when a value is provided through a [...]
It's slightly awkward to have an argument called "value" that doesn't fully encapsulate all the information that is necessary to know what the actual value is, so I like your proposal to modify it.
This looks like the best approach to me, I like the symmetry with the read API in [...]

In general the change makes sense to me. I'm curious to get @iverase's opinion since he's worked on the same change on the read side ([...]).
Haha probably. I actually did not check Lucene's language level when producing the PR. I'll continue to refine this a bit more based on the premise that [...]
...pache/lucene/backward_codecs/lucene50/compressing/Lucene50CompressingStoredFieldsReader.java
+1, I like the idea of encapsulating the DataInput and length inside [...]
lucene/core/src/java/org/apache/lucene/index/StoredFieldVisitor.java
LGTM
@Tim-Brooks Could you add an entry in CHANGES.txt? It should be under the 10.2 version, thanks!
I made this change and added some more tests. Let me know if any additional tests make sense.
Thanks @Tim-Brooks, tests look good. I will be merging this soon.
Allowing indexing stored-only StoredField directly from DataInput.
Thank you @Tim-Brooks!
Introduces StoredFieldDataInput record to associate a length with
DataInput.