I have tried to set up a pipeline following the instructions on Andy Halterman's blog. Everything goes well until I try to run the Stanford pipeline on a collection of articles in MongoDB.
At first the process seems to go well, but after ~20 minutes an error occurs:
RuntimeError: maximum recursion depth exceeded in cmp
INFO:StanfordSocketWrap:Subprocess seems to be stopped, exit code 1
INFO:StanfordSocketWrap:Subprocess seems to be stopped, exit code 1
I thought it might simply be exceeding the recursion limit because of the number of articles the custom scraper pulls from DW (984 in my case), but the same thing happens with a collection of only 409 articles.
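In case it's relevant, this is the kind of experiment I had in mind - a minimal sketch assuming the pipeline runs under Python 2 (the traceback mentions cmp, which only exists there); the limit value is just a guess:

```python
import sys

# Default recursion limit is usually 1000; raising it is only a diagnostic
# experiment to see whether the "maximum recursion depth exceeded" error
# goes away, not a proper fix.
print(sys.getrecursionlimit())
sys.setrecursionlimit(10000)
```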
However, stanford.log shows the run as successful:
INFO 2018-10-04 13:18:18,371: Getting today's unparsed stories from db 'event_scrape', collection 'dw_test'
INFO 2018-10-04 13:18:18,371: Querying for all unparsed stories added within the last day
INFO 2018-10-04 13:18:18,373: Returning 984 total stories.
INFO 2018-10-04 13:18:18,375: Setting up CoreNLP.
INFO 2018-10-04 17:00:42,344: Running.
I've checked the database in the mongo shell with db.dw_test.findOne() - the documents still contain unparsed text.
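For reference, here is roughly how I check from Python how many stories are still unparsed (the 'stanford' field name is my assumption about the schema - substitute whatever flag the pipeline actually sets on parsed stories):

```python
from pymongo import MongoClient

# Connect to the same database/collection named in stanford.log.
coll = MongoClient()['event_scrape']['dw_test']

total = coll.count()
# 'stanford' is an assumed marker field; adjust to the real one.
unparsed = coll.find({'stanford': {'$exists': False}}).count()
print('{} of {} stories still unparsed'.format(unparsed, total))
```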
Just in case, I've also tried running the collection through the Phoenix pipeline, but it coded 0 sentences.
UPD: (just in case) I've also tested it on a Mongo collection containing only one article, and the result is the same: after numerous retries it still throws the error quoted above. Here is a chunk of the terminal output in case it says something in particular.