This repository was archived by the owner on May 15, 2023. It is now read-only.

Sudden SSLHandshakeException (PKIX path building failed) while committing documents to the Google Cloud Search datastore #19

@sudeshna-majumder


INFO [HttpCrawler] 2 start URLs identified.
INFO [CrawlerEventManager] CRAWLER_STARTED
INFO [AbstractCrawler] bayer-default: Crawling references...
INFO [CrawlerEventManager] REJECTED_REDIRECTED: https://www.bayer.com.tw/
INFO [CrawlerEventManager] DOCUMENT_FETCHED: https://www.bayer.com.tw/zh-hant/
INFO [CrawlerEventManager] CREATED_ROBOTS_META: https://www.bayer.com.tw/zh-hant/
INFO [CrawlerEventManager] REJECTED_FILTER: https://www.bayer.com.tw/sites/bayer_com_tw/files/styles/280x160/public/2020-09/hr.jpg?h=341981b4&itok=cjW2HXv9
INFO [CrawlerEventManager] REJECTED_FILTER: https://www.bayer.com.tw/sites/bayer_com_tw/files/styles/280x160/public/2020-09/teaser-nav-commit.jpg?h=fd24c189&itok=_2ttr7tf
INFO [CrawlerEventManager] REJECTED_FILTER: https://www.bayer.com.tw/sites/bayer_com_tw/files/styles/16_9_aspect_ratio/public/2020-11/movingimages05.jpg?h=d19103a9&itok=8abowRlj
INFO [CrawlerEventManager] DOCUMENT_FETCHED: https://www.bayer.com.tw/zh-hant/bayer-innovation
INFO [CrawlerEventManager] REJECTED_FILTER: https://www.bayer.com.tw/sites/bayer_com_tw/files/styles/16_9_aspect_ratio/public/2020-11/movingimages02.jpg?h=bf3ccb75&itok=-iKRgKQx
INFO [CrawlerEventManager] CREATED_ROBOTS_META: https://www.bayer.com.tw/zh-hant/bayer-innovation
INFO [CrawlerEventManager] REJECTED_FILTER: https://www.bayer.com.tw/sites/bayer_com_tw/files/styles/16_9_small/public/2020-11/Receptionist%20talking%20phone_426.jpg?h=656682cd&itok=oVEc4lU6
INFO [CrawlerEventManager] REJECTED_FILTER: https://www.bayer.com.tw/sites/bayer_com_tw/files/styles/16_9_aspect_ratio/public/2020-11/movingimages01.jpg?h=d19103a9&itok=PoKOHP28
INFO [CrawlerEventManager] REJECTED_FILTER: https://www.bayer.com.tw/sites/bayer_com_tw/files/styles/16_9_small/public/2020-08/consumer-health.jpg?h=c397aecc&itok=WrOpXWdU
INFO [CrawlerEventManager] REJECTED_FILTER: https://www.bayer.com.tw/sites/bayer_com_tw/files/styles/280x160/public/2020-08/taiwan.png?h=fd24c189&itok=Wf0Gpu5H
INFO [CrawlerEventManager] URLS_EXTRACTED: https://www.bayer.com.tw/zh-hant/conditions-of-use
INFO [CrawlerEventManager] DOCUMENT_IMPORTED: https://www.bayer.com.tw/zh-hant/conditions-of-use
INFO [CrawlerEventManager] DOCUMENT_IMPORTED: https://www.bayer.com.tw/zh-hant/bayer-innovation
INFO [CrawlerEventManager] DOCUMENT_IMPORTED: https://www.bayer.com.tw/zh-hant/
INFO [CrawlerEventManager] DOCUMENT_COMMITTED_ADD: https://www.bayer.com.tw/zh-hant/bayer-innovation (GoogleCloudSearchCommitter[queueSize=100,docCount=62872,queue=FileSystemCommitter[directory=../workdir/queue],commitBatchSize=10,maxRetries=0,maxRetryWait=0,operations=[],targetReferenceField=,sourceReferenceField=,keepSourceReferenceField=false,targetContentField=,sourceContentField=,keepSourceContentField=false])
INFO [CrawlerEventManager] DOCUMENT_COMMITTED_ADD: https://www.bayer.com.tw/zh-hant/ (GoogleCloudSearchCommitter[queueSize=100,docCount=62872,queue=FileSystemCommitter[directory=../workdir/queue],commitBatchSize=10,maxRetries=0,maxRetryWait=0,operations=[],targetReferenceField=,sourceReferenceField=,keepSourceReferenceField=false,targetContentField=,sourceContentField=,keepSourceContentField=false])
INFO [CrawlerEventManager] DOCUMENT_COMMITTED_ADD: https://www.bayer.com.tw/zh-hant/conditions-of-use (GoogleCloudSearchCommitter[queueSize=100,docCount=62873,queue=FileSystemCommitter[directory=../workdir/queue],commitBatchSize=10,maxRetries=0,maxRetryWait=0,operations=[],targetReferenceField=,sourceReferenceField=,keepSourceReferenceField=false,targetContentField=,sourceContentField=,keepSourceContentField=false])
INFO [CrawlerEventManager] REJECTED_REDIRECTED: https://www.bayer.com.tw/node/
INFO [CrawlerEventManager] REJECTED_REDIRECTED: https://www.bayer.com.tw/rss
INFO [CrawlerEventManager] DOCUMENT_FETCHED: https://www.bayer.com.tw/sites/bayer_com_tw/files/bayer-organizational-structure-2020-08-21.pdf
INFO [CrawlerEventManager] CREATED_ROBOTS_META: https://www.bayer.com.tw/sites/bayer_com_tw/files/bayer-organizational-structure-2020-08-21.pdf
INFO [CrawlerEventManager] DOCUMENT_FETCHED: https://www.bayer.com.tw/themes/custom/bayer_cpa/logo.svg
INFO [CrawlerEventManager] CREATED_ROBOTS_META: https://www.bayer.com.tw/themes/custom/bayer_cpa/logo.svg
INFO [CrawlerEventManager] REJECTED_IMPORT: https://www.bayer.com.tw/themes/custom/bayer_cpa/logo.svg
INFO [CrawlerEventManager] REJECTED_REDIRECTED: https://www.bayer.com.tw/en/node/2
INFO [CrawlerEventManager] DOCUMENT_FETCHED: https://www.bayer.com.tw/zh-hant/advanced-search
INFO [CrawlerEventManager] CREATED_ROBOTS_META: https://www.bayer.com.tw/zh-hant/advanced-search
INFO [CrawlerEventManager] URLS_EXTRACTED: https://www.bayer.com.tw/en/node/56
INFO [CrawlerEventManager] DOCUMENT_IMPORTED: https://www.bayer.com.tw/en/node/556
Dec 09, 2020 10:50:11 PM com.google.enterprise.cloudsearch.sdk.indexing.IndexingServiceImpl getSchema
WARNING: Schema lookup failed. Using empty schema
javax.net.ssl.SSLHandshakeException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
at sun.security.ssl.Alert.createSSLException(Alert.java:131)
at sun.security.ssl.TransportContext.fatal(TransportContext.java:324)
at sun.security.ssl.TransportContext.fatal(TransportContext.java:267)
at sun.security.ssl.TransportContext.fatal(TransportContext.java:262)
at sun.security.ssl.CertificateMessage$T12CertificateConsumer.checkServerCerts(CertificateMessage.java:654)
at sun.security.ssl.CertificateMessage$T12CertificateConsumer.onCertificate(CertificateMessage.java:473)
at sun.security.ssl.CertificateMessage$T12CertificateConsumer.consume(CertificateMessage.java:369)
at sun.security.ssl.SSLHandshake.consume(SSLHandshake.java:377)
at sun.security.ssl.HandshakeContext.dispatch(HandshakeContext.java:444)
at sun.security.ssl.HandshakeContext.dispatch(HandshakeContext.java:422)
at sun.security.ssl.TransportContext.dispatch(TransportContext.java:182)
at sun.security.ssl.SSLTransport.decode(SSLTransport.java:149)
at sun.security.ssl.SSLSocketImpl.decode(SSLSocketImpl.java:1143)
at sun.security.ssl.SSLSocketImpl.readHandshakeRecord(SSLSocketImpl.java:1054)
at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:394)
at sun.net.www.protocol.https.HttpsClient.afterConnect(HttpsClient.java:559)
at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.connect(AbstractDelegateHttpsURLConnection.java:185)
at sun.net.www.protocol.http.HttpURLConnection.getOutputStream0(HttpURLConnection.java:1340)
at sun.net.www.protocol.http.HttpURLConnection.getOutputStream(HttpURLConnection.java:1315)
at sun.net.www.protocol.https.HttpsURLConnectionImpl.getOutputStream(HttpsURLConnectionImpl.java:264)
at com.google.api.client.http.javanet.NetHttpRequest.execute(NetHttpRequest.java:77)
at com.google.api.client.http.HttpRequest.execute(HttpRequest.java:981)
at com.google.api.client.auth.oauth2.TokenRequest.executeUnparsed(TokenRequest.java:283)
at com.google.api.client.auth.oauth2.TokenRequest.execute(TokenRequest.java:307)
at com.google.api.client.googleapis.auth.oauth2.GoogleCredential.executeRefreshToken(GoogleCredential.java:394)
at com.google.api.client.auth.oauth2.Credential.refreshToken(Credential.java:489)
at com.google.api.client.auth.oauth2.Credential.intercept(Credential.java:217)
at com.google.api.client.http.HttpRequest.execute(HttpRequest.java:868)
at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:499)
at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:432)
at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.execute(AbstractGoogleClientRequest.java:549)
at com.google.enterprise.cloudsearch.sdk.BaseApiService.executeRequest(BaseApiService.java:429)
at com.google.enterprise.cloudsearch.sdk.indexing.IndexingServiceImpl.getSchema(IndexingServiceImpl.java:1143)
at com.google.enterprise.cloudsearch.sdk.indexing.StructuredData.initFromConfiguration(StructuredData.java:199)
at com.norconex.committer.googlecloudsearch.GoogleCloudSearchCommitter.init(GoogleCloudSearchCommitter.java:204)
at com.norconex.committer.googlecloudsearch.GoogleCloudSearchCommitter.commitBatch(GoogleCloudSearchCommitter.java:234)
at com.norconex.committer.core.AbstractBatchCommitter.commitAndCleanBatch(AbstractBatchCommitter.java:179)
at com.norconex.committer.core.AbstractBatchCommitter.cacheOperationAndCommitIfReady(AbstractBatchCommitter.java:208)
at com.norconex.committer.core.AbstractBatchCommitter.commitDeletion(AbstractBatchCommitter.java:148)
at com.norconex.committer.core.AbstractFileQueueCommitter.commit(AbstractFileQueueCommitter.java:225)
at com.norconex.committer.core.AbstractCommitter.commitIfReady(AbstractCommitter.java:146)
at com.norconex.committer.core.AbstractCommitter.add(AbstractCommitter.java:97)
at com.norconex.collector.core.pipeline.committer.CommitModuleStage.execute(CommitModuleStage.java:34)
at com.norconex.collector.core.pipeline.committer.CommitModuleStage.execute(CommitModuleStage.java:27)
at com.norconex.commons.lang.pipeline.Pipeline.execute(Pipeline.java:91)
at com.norconex.collector.http.crawler.HttpCrawler.executeCommitterPipeline(HttpCrawler.java:380)
at com.norconex.collector.core.crawler.AbstractCrawler.processImportResponse(AbstractCrawler.java:600)
at com.norconex.collector.core.crawler.AbstractCrawler.processNextQueuedCrawlData(AbstractCrawler.java:541)
at com.norconex.collector.core.crawler.AbstractCrawler.processNextReference(AbstractCrawler.java:419)
at com.norconex.collector.core.crawler.AbstractCrawler$ProcessReferencesRunnable.run(AbstractCrawler.java:829)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
at sun.security.validator.PKIXValidator.doBuild(PKIXValidator.java:456)
at sun.security.validator.PKIXValidator.engineValidate(PKIXValidator.java:323)
at sun.security.validator.Validator.validate(Validator.java:271)
at sun.security.ssl.X509TrustManagerImpl.validate(X509TrustManagerImpl.java:315)
at sun.security.ssl.X509TrustManagerImpl.checkTrusted(X509TrustManagerImpl.java:223)
at sun.security.ssl.X509TrustManagerImpl.checkServerTrusted(X509TrustManagerImpl.java:129)
at sun.security.ssl.CertificateMessage$T12CertificateConsumer.checkServerCerts(CertificateMessage.java:638)
... 48 more
Caused by: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
at sun.security.provider.certpath.SunCertPathBuilder.build(SunCertPathBuilder.java:141)
at sun.security.provider.certpath.SunCertPathBuilder.engineBuild(SunCertPathBuilder.java:126)
at java.security.cert.CertPathBuilder.build(CertPathBuilder.java:280)
at sun.security.validator.PKIXValidator.doBuild(PKIXValidator.java:451)
... 54 more

INFO [GoogleCloudSearchCommitter] Indexing Service reference count: 1
INFO [GoogleCloudSearchCommitter] Sending 10 documents to Google Cloud Search for addition/deletion.
INFO [CrawlerEventManager] DOCUMENT_COMMITTED_ADD: https://www.bayer.com.tw/en/node/556 (GoogleCloudSearchCommitter[queueSize=100,docCount=62911,queue=FileSystemCommitter[directory=../workdir/queue],commitBatchSize=10,maxRetries=0,maxRetryWait=0,operations=[],targetReferenceField=,sourceReferenceField=,keepSourceReferenceField=false,targetContentField=,sourceContentField=,keepSourceContentField=false])
INFO [GoogleCloudSearchCommitter] Document deleted (38ms): https://www.cropscience.bayer.ca/en/Products/Fungicides/Prosaro-west/Quality
INFO [GoogleCloudSearchCommitter] Document deleted (0ms): https://www.cropscience.bayer.ca/en/Products/Fungicides/Prosaro-west/Quantity
INFO [GoogleCloudSearchCommitter] Document deleted (0ms): https://www.cropscience.bayer.ca/en/Products/Fungicides/Scala
INFO [GoogleCloudSearchCommitter] Indexing Service release reference count: 1
INFO [GoogleCloudSearchCommitter] Stopping indexingService: 0
INFO [CrawlerEventManager] DOCUMENT_FETCHED: https://www.bayer.com.tw/en/node/571
Dec 09, 2020 10:50:11 PM com.google.enterprise.cloudsearch.sdk.BatchRequestService shutDown
INFO: Shutting down batching service. flush on shutdown: true
INFO [CrawlerEventManager] CREATED_ROBOTS_META: https://www.bayer.com.tw/en/node/571
INFO [CrawlerEventManager] DOCUMENT_IMPORTED: https://www.bayer.com.tw/en/node/56
INFO [CrawlerEventManager] REJECTED_FILTER: https://www.bayer.com.tw/sites/bayer_com_tw/files/styles/280x160/public/2020-09/duty_170x100.jpg?h=88f562ca&itok=8eEavwXI
INFO [CrawlerEventManager] REJECTED_FILTER: https://www.bayer.com.tw/sites/bayer_com_tw/files/2020-08/hospital-science-01.jpg
INFO [CrawlerEventManager] REJECTED_FILTER: https://www.bayer.com.tw/sites/bayer_com_tw/files/styles/280x160/public/2020-08/taiwan.png?h=fd24c189&itok=Wf0Gpu5H
INFO [CrawlerEventManager] REJECTED_FILTER: https://www.bayer.com.tw/sites/bayer_com_tw/files/inline-images/hospital-science-02.png
INFO [CrawlerEventManager] REJECTED_FILTER: https://www.bayer.com.tw/sites/bayer_com_tw/files/styles/280x160/public/2020-11/Newspaper_production.jpg?h=78276bf5&itok=5SH9XPoW
INFO [CrawlerEventManager] REJECTED_FILTER: https://www.bayer.com.tw/sites/bayer_com_tw/files/styles/280x160/public/2020-09/teaser-nav-commit.jpg?h=fd24c189&itok=_2ttr7tf
INFO [CrawlerEventManager] REJECTED_FILTER: https://www.bayer.com.tw/sites/bayer_com_tw/files/styles/280x160/public/2020-09/teaser-nav-news.jpg?h=a0a0c8ec&itok=_QhViWhe
Dec 09, 2020 10:50:12 PM com.google.enterprise.cloudsearch.sdk.BatchRequestService$SnapshotRunnable getGoogleJsonError
WARNING: Retrying request failed with exception:
javax.net.ssl.SSLHandshakeException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
at sun.security.ssl.Alert.createSSLException(Alert.java:131)
at sun.security.ssl.TransportContext.fatal(TransportContext.java:324)
at sun.security.ssl.TransportContext.fatal(TransportContext.java:267)
at sun.security.ssl.TransportContext.fatal(TransportContext.java:262)
at sun.security.ssl.CertificateMessage$T12CertificateConsumer.checkServerCerts(CertificateMessage.java:654)
at sun.security.ssl.CertificateMessage$T12CertificateConsumer.onCertificate(CertificateMessage.java:473)
at sun.security.ssl.CertificateMessage$T12CertificateConsumer.consume(CertificateMessage.java:369)
at sun.security.ssl.SSLHandshake.consume(SSLHandshake.java:377)
at sun.security.ssl.HandshakeContext.dispatch(HandshakeContext.java:444)
at sun.security.ssl.HandshakeContext.dispatch(HandshakeContext.java:422)
at sun.security.ssl.TransportContext.dispatch(TransportContext.java:182)
at sun.security.ssl.SSLTransport.decode(SSLTransport.java:149)
at sun.security.ssl.SSLSocketImpl.decode(SSLSocketImpl.java:1143)
at sun.security.ssl.SSLSocketImpl.readHandshakeRecord(SSLSocketImpl.java:1054)
at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:394)
at sun.net.www.protocol.https.HttpsClient.afterConnect(HttpsClient.java:559)
at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.connect(AbstractDelegateHttpsURLConnection.java:185)
at sun.net.www.protocol.http.HttpURLConnection.getOutputStream0(HttpURLConnection.java:1340)
at sun.net.www.protocol.http.HttpURLConnection.getOutputStream(HttpURLConnection.java:1315)
at sun.net.www.protocol.https.HttpsURLConnectionImpl.getOutputStream(HttpsURLConnectionImpl.java:264)
at com.google.api.client.http.javanet.NetHttpRequest.execute(NetHttpRequest.java:77)
at com.google.api.client.http.HttpRequest.execute(HttpRequest.java:981)
at com.google.api.client.auth.oauth2.TokenRequest.executeUnparsed(TokenRequest.java:283)
at com.google.api.client.auth.oauth2.TokenRequest.execute(TokenRequest.java:307)
at com.google.api.client.googleapis.auth.oauth2.GoogleCredential.executeRefreshToken(GoogleCredential.java:394)
at com.google.api.client.auth.oauth2.Credential.refreshToken(Credential.java:489)
at com.google.api.client.auth.oauth2.Credential.intercept(Credential.java:217)
at com.google.api.client.googleapis.batch.BatchRequest$BatchInterceptor.intercept(BatchRequest.java:300)
at com.google.api.client.http.HttpRequest.execute(HttpRequest.java:868)
at com.google.api.client.googleapis.batch.BatchRequest.execute(BatchRequest.java:241)
at com.google.enterprise.cloudsearch.sdk.BatchRequestService$BatchRequestHelper.executeBatchRequest(BatchRequestService.java:447)
at com.google.enterprise.cloudsearch.sdk.BatchRequestService$SnapshotRunnable.execute(BatchRequestService.java:308)
at com.google.enterprise.cloudsearch.sdk.BatchRequestService$SnapshotRunnable.run(BatchRequestService.java:238)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
at sun.security.validator.PKIXValidator.doBuild(PKIXValidator.java:456)
at sun.security.validator.PKIXValidator.engineValidate(PKIXValidator.java:323)
at sun.security.validator.Validator.validate(Validator.java:271)
at sun.security.ssl.X509TrustManagerImpl.validate(X509TrustManagerImpl.java:315)
at sun.security.ssl.X509TrustManagerImpl.checkTrusted(X509TrustManagerImpl.java:223)
at sun.security.ssl.X509TrustManagerImpl.checkServerTrusted(X509TrustManagerImpl.java:129)
at sun.security.ssl.CertificateMessage$T12CertificateConsumer.checkServerCerts(CertificateMessage.java:638)
... 31 more
Caused by: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
at sun.security.provider.certpath.SunCertPathBuilder.build(SunCertPathBuilder.java:141)
at sun.security.provider.certpath.SunCertPathBuilder.engineBuild(SunCertPathBuilder.java:126)
at java.security.cert.CertPathBuilder.build(CertPathBuilder.java:280)
at sun.security.validator.PKIXValidator.doBuild(PKIXValidator.java:451)
INFO [GoogleCloudSearchCommitter] Indexing Service release reference count: 1
INFO [GoogleCloudSearchCommitter] Stopping indexingService: 0
Dec 09, 2020 10:53:58 PM com.google.enterprise.cloudsearch.sdk.BatchRequestService shutDown
INFO: Shutting down batching service. flush on shutdown: true
INFO [GoogleCloudSearchCommitter] Shutting down (took: 2ms)!
INFO [GoogleCloudSearchCommitter] Indexing Service reference count: 0
INFO [AbstractCrawler] bayer-default: Crawler executed in 8 minutes 2 seconds.
INFO [SitemapStore] bayer-default: Closing sitemap store...
ERROR [JobSuite] Execution failed for job: bayer-default
INFO [JobSuite] Running bayer-default: END (Wed Dec 09 22:45:55 UTC 2020

I have checked the JRE keystore, and no certificate has expired recently. I also updated my SDK to the latest version, but nothing worked.
I get this error regardless of which domain I try to index.
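
For anyone reproducing this: "PKIX path building failed" means the JVM could not link the certificate chain presented by the server to any root in its trust store. In both traces above, the failure occurs inside `TokenRequest.execute`, that is, during the OAuth token refresh, before any document is sent. Below is a minimal sketch for double-checking the keystore from inside the same JVM that runs the crawler. The class name `TrustStoreCheck` and its helper methods are hypothetical names made up for this example; only standard JDK classes are used. It inspects whatever trust store the JVM picks by default, which may differ from the file that was inspected by hand.

```java
import java.security.KeyStore;
import java.security.cert.CertificateException;
import java.security.cert.X509Certificate;
import java.util.ArrayList;
import java.util.Date;
import java.util.List;
import javax.net.ssl.TrustManager;
import javax.net.ssl.TrustManagerFactory;
import javax.net.ssl.X509TrustManager;

// Hypothetical helper, not part of the committer or the Cloud Search SDK.
public class TrustStoreCheck {

    // Returns the CA certificates that the JVM's default trust managers accept.
    static List<X509Certificate> defaultTrustedIssuers() throws Exception {
        TrustManagerFactory tmf =
                TrustManagerFactory.getInstance(TrustManagerFactory.getDefaultAlgorithm());
        tmf.init((KeyStore) null); // null selects the JVM's default trust store
        List<X509Certificate> issuers = new ArrayList<>();
        for (TrustManager tm : tmf.getTrustManagers()) {
            if (tm instanceof X509TrustManager) {
                for (X509Certificate cert : ((X509TrustManager) tm).getAcceptedIssuers()) {
                    issuers.add(cert);
                }
            }
        }
        return issuers;
    }

    // Returns the certificates that are expired or not yet valid at the given time.
    static List<X509Certificate> notValidAt(List<X509Certificate> certs, Date when) {
        List<X509Certificate> invalid = new ArrayList<>();
        for (X509Certificate cert : certs) {
            try {
                cert.checkValidity(when);
            } catch (CertificateException e) {
                invalid.add(cert);
            }
        }
        return invalid;
    }

    public static void main(String[] args) throws Exception {
        List<X509Certificate> issuers = defaultTrustedIssuers();
        System.out.println("Trust store in use (system property, may be null): "
                + System.getProperty("javax.net.ssl.trustStore"));
        System.out.println("Trusted issuers loaded: " + issuers.size());
        for (X509Certificate cert : notValidAt(issuers, new Date())) {
            System.out.println("Not valid now: " + cert.getSubjectX500Principal()
                    + " (notAfter=" + cert.getNotAfter() + ")");
        }
    }
}
```

If this reports a healthy trust store, the next thing to look at would be the chain the server actually presents to this machine. That is a suggestion only; nothing in the log identifies the cause.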
