Skip to content
This repository was archived by the owner on May 15, 2025. It is now read-only.

Ignore conversion errors when building StructuredData#5

Open
cjohnson78 wants to merge 8 commits intogoogle-cloudsearch:masterfrom
sadasystems:StructuredData-ignore-conversion-errors
Open

Ignore conversion errors when building StructuredData#5
cjohnson78 wants to merge 8 commits intogoogle-cloudsearch:masterfrom
sadasystems:StructuredData-ignore-conversion-errors

Conversation

@cjohnson78
Copy link

Added the ability to ignore conversion errors when building StructuredData by setting structuredData.ignoreConversionErrors=true

…dData by setting structuredData.ignoreConversionErrors=true
@cjohnson78 cjohnson78 closed this Feb 13, 2019
@cjohnson78 cjohnson78 deleted the StructuredData-ignore-conversion-errors branch February 13, 2019 03:48
@cjohnson78 cjohnson78 restored the StructuredData-ignore-conversion-errors branch February 13, 2019 03:51
@cjohnson78 cjohnson78 reopened this Feb 13, 2019
@Sdwz
Copy link

Sdwz commented Feb 13, 2019

More generally, this should be doing the same for both repeatable and non repeatable data.

Copy link
Author

@cjohnson78 cjohnson78 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The change I made will ignore errors thrown for single and repeatable data. Do you propose a different way of handling repeating values? I feel like in general, people that use this settings have a goal of preventing the connector from stopping because of one bad value (regardless of single vs. multi.). We know there is bad data sometimes and we just need to push past it in some cases. In other cases, we would leave this disabled and we would work through the errors until we have added adequate date formats, for example.

@cjohnson78
Copy link
Author

Is there any further feedback at this point?

@TanmayVartak TanmayVartak self-assigned this Feb 26, 2019
@TanmayVartak TanmayVartak self-requested a review February 26, 2019 03:26
@TanmayVartak TanmayVartak removed their assignment Feb 26, 2019
Copy link
Contributor

@TanmayVartak TanmayVartak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your contribution. Can you also add some unit tests to validate this?

Changed logic for repeating properties, so only bad values are rejected, the rest are kept.
…-errors' into StructuredData-ignore-conversion-errors

# Conflicts:
#	indexing/src/main/java/com/google/enterprise/cloudsearch/sdk/indexing/StructuredData.java
I had to add setter for the configuration member because I could not call init(Schema)
and initFromConfiguration(indexingService) in the same test case.
@cjohnson78
Copy link
Author

I have added the test case.
I had to add setter for the configuration member because I could not call init(Schema)
and initFromConfiguration(indexingService) in the same test case.

StructuredData.setIgnoreConversionErrors(true);
StructuredDataObject expected =
new StructuredDataObject()
.setProperties(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the indentation looks different. please use two spaces.
https://google.github.io/styleguide/javaguide.html#s4.2-block-indentation

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IntelliJ was having trouble auto-detecting 2-spaces. I have become aware of the issue and will correct the code.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed and pushed


import java.io.File;
import java.io.IOException;
import java.util.*;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here, please refrain from using wildcard imports. some IDE like intellij has such default setting but can be easily configured.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Found that IntelliJ cannot disable the feature entirely, but you can set the threshold to 99 instances before wildcard is applied.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed and pushed

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants