Ignore conversion errors when building StructuredData #5
cjohnson78 wants to merge 8 commits into google-cloudsearch:master from
…dData by setting structuredData.ignoreConversionErrors=true
More generally, this should do the same for both repeatable and non-repeatable data.
cjohnson78 left a comment
The change I made ignores errors thrown for both single and repeatable data. Do you propose a different way of handling repeating values? In general, I think people who use this setting want to prevent the connector from stopping because of one bad value, whether the property is single-valued or multi-valued. We know the data is sometimes bad, and in some cases we just need to push past it. In other cases, we would leave this setting disabled and work through the errors until we have added adequate date formats, for example.
Is there any further feedback at this point?
TanmayVartak left a comment
Thanks for your contribution. Can you also add some unit tests to validate this?
…ndexing/StructuredData.java
Changed logic for repeating properties so that only bad values are rejected and the rest are kept.
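The behavior described in this commit can be pictured with a minimal sketch. It is not the SDK's StructuredData code: the class name `LenientConversionSketch`, the method `convertAll`, and the use of ISO-8601 dates are all assumptions made for illustration. It only shows the idea that, in lenient mode, a bad value in a repeated property is dropped while the good values are kept.

```java
import java.time.LocalDate;
import java.time.format.DateTimeParseException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration only; not the SDK's StructuredData implementation. */
public class LenientConversionSketch {

  /**
   * Converts each raw value of a repeated property to a date.
   * A value that fails to convert is skipped only when ignoreErrors is true;
   * otherwise the first bad value aborts the whole conversion.
   */
  static List<LocalDate> convertAll(List<String> rawValues, boolean ignoreErrors) {
    List<LocalDate> converted = new ArrayList<>();
    for (String raw : rawValues) {
      try {
        // LocalDate.parse expects ISO-8601 input such as 2019-03-01.
        converted.add(LocalDate.parse(raw));
      } catch (DateTimeParseException e) {
        if (!ignoreErrors) {
          throw new IllegalArgumentException("Cannot convert value: " + raw, e);
        }
        // Lenient mode: drop only this bad value and continue with the rest.
      }
    }
    return converted;
  }

  public static void main(String[] args) {
    List<String> raw = Arrays.asList("2019-03-01", "not-a-date", "2019-03-02");
    System.out.println(convertAll(raw, true)); // prints [2019-03-01, 2019-03-02]
  }
}
```

With `ignoreErrors` set to false, the same input throws on "not-a-date", which corresponds to leaving the setting disabled and fixing the data or the formats instead.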
…-errors' into StructuredData-ignore-conversion-errors # Conflicts: # indexing/src/main/java/com/google/enterprise/cloudsearch/sdk/indexing/StructuredData.java
…tor-sdk into StructuredData-ignore-conversion-errors
I had to add a setter for the configuration member because I could not call init(Schema) and initFromConfiguration(indexingService) in the same test case.
I have added the test case.
StructuredData.setIgnoreConversionErrors(true);
StructuredDataObject expected =
new StructuredDataObject()
.setProperties(
The indentation looks different. Please use two spaces.
https://google.github.io/styleguide/javaguide.html#s4.2-block-indentation
IntelliJ was having trouble auto-detecting 2-spaces. I have become aware of the issue and will correct the code.
import java.io.File;
import java.io.IOException;
import java.util.*;
Same here: please refrain from using wildcard imports. Some IDEs, such as IntelliJ, use wildcard imports by default, but this can easily be reconfigured.
I found that IntelliJ cannot disable the feature entirely, but you can set the threshold to 99 instances before a wildcard is applied.
Added the ability to ignore conversion errors when building StructuredData by setting structuredData.ignoreConversionErrors=true
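As a rough illustration of how such a flag might be read, the sketch below parses the key from a properties source using plain `java.util.Properties`. The SDK reads its configuration through its own classes, so the class name `IgnoreConversionErrorsFlagSketch`, the helper `readFlag`, and the default of false are assumptions here, not the SDK's API.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

/** Hypothetical illustration only; the SDK uses its own configuration classes. */
public class IgnoreConversionErrorsFlagSketch {

  // Key name taken from the PR description.
  static final String KEY = "structuredData.ignoreConversionErrors";

  /** Returns the flag value; assumes the setting is disabled when the key is absent. */
  static boolean readFlag(Properties config) {
    return Boolean.parseBoolean(config.getProperty(KEY, "false"));
  }

  public static void main(String[] args) throws IOException {
    Properties config = new Properties();
    // Stands in for a line in the connector's properties file.
    config.load(new StringReader("structuredData.ignoreConversionErrors=true\n"));
    System.out.println(readFlag(config)); // prints true
  }
}
```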