related_queries NULL, data is not loaded #465

Open
pricerices opened this issue Aug 25, 2024 · 12 comments

Comments

@pricerices

Hi,

The scraping process is working properly, but the related_queries data is not loaded (NULL).

I checked the Google Trends website manually and it shows keywords as usual.

Could you solve this, or is it on Google's side?

thx
[Attached screenshot: IMG_20240825_083450]

@PMassicotte
Owner

Please provide a reproducible example.

@pricerices
Author

I run this:

gtrends(c("what","where","when","who","why"), geo = "", cat = "0", time = "now 7-d")

and as simple as:

gtrends("what")

@eddelbuettel
Collaborator

What happens when you run the known working example from the README.md and documentation?

@pricerices
Author

I can't get the related_queries data. This has been happening for the past 5 days.

@pricerices
Author

For a more detailed example, I run this:

allcat <- gtrends(c("what","where","when","who","why"), geo = "", cat = "0", time = "now 7-d")

allcatrq <- allcat$related_queries %>% filter(related_queries == "rising") %>% select(value)

and this is the error shown in the console:

Error in UseMethod("filter") : no applicable method for 'filter' applied to an object of class "NULL"

I am experiencing this for the first time, even though keywords/related_queries appear normally on the Google Trends website.
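
While the cause is being investigated, a NULL check avoids the `filter` error. This is only a minimal sketch: `rising_queries()` is a hypothetical helper (not part of gtrendsR), and the mock objects below merely imitate the `related_queries` and `value` columns used in the code above, so no request to Google is made.

```r
# Hypothetical helper, not part of gtrendsR: returns the rising related
# queries, or an empty character vector when related_queries is NULL.
rising_queries <- function(res) {
  rq <- res$related_queries
  if (is.null(rq)) {
    warning("related_queries is NULL; returning an empty result")
    return(character(0))
  }
  as.character(rq$value[rq$related_queries == "rising"])
}

# Mock objects standing in for a gtrends() result (assumed column names).
empty_res <- list(related_queries = NULL)
mock_res <- list(related_queries = data.frame(
  related_queries = c("top", "rising"),
  value = c("what is", "what time"),
  stringsAsFactors = FALSE
))

suppressWarnings(rising_queries(empty_res))  # empty character vector
rising_queries(mock_res)                     # "what time"
```

This does not bring the missing data back; it only lets a script continue (or fail with a clearer message) when Google returns nothing for related queries.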

@eddelbuettel
Collaborator

I can confirm with a test query. We don't seem to be getting that data back even though the web interface has it for the same query. :-/

@PMassicotte
Owner

I will take a look tomorrow. Likely something changed in the payload options.

@PMassicotte
Owner

I just did a quick comparison, and there is not much difference between the payload we send and the one sent by the Google Trends web page:

waldo::compare(fromJSON(p1), fromJSON(p2), x_arg = "gtrendsR", y_arg = "web")
`gtrendsR$restriction$time`: "2024-08-18T14\\:05\\:57 2024-08-25T14\\:05\\:57"
`web$restriction$time`:      "2024-08-18T14\\:43\\:56 2024-08-25T14\\:43\\:56"

gtrendsR$restriction$complexKeywordsRestriction$keyword vs web$restriction$complexKeywordsRestriction$keyword
                                                               value
- gtrendsR$restriction$complexKeywordsRestriction$keyword[1, ]   NHL
+ web$restriction$complexKeywordsRestriction$keyword[1, ]        nhl

`gtrendsR$restriction$complexKeywordsRestriction$keyword$value`: "NHL"
`web$restriction$complexKeywordsRestriction$keyword$value`:      "nhl"

`gtrendsR$trendinessSettings$compareTime`: "2024-08-11T14\\:05\\:57 2024-08-18T14\\:05\\:57"
`web$trendinessSettings$compareTime`:      "2024-08-11T14\\:43\\:56 2024-08-18T14\\:43\\:56"

`gtrendsR$userConfig$userType`: "USER_TYPE_SCRAPER"   
`web$userConfig$userType`:      "USER_TYPE_LEGIT_USER"
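
For anyone who wants to repeat this kind of comparison without waldo, the same idea can be sketched with jsonlite and base R. The two payload strings below are hypothetical and heavily shortened (the real payloads contain many more fields, and the flat `keyword` field is a simplification), so treat this as an illustration only.

```r
library(jsonlite)

# Hypothetical, heavily shortened payloads; the real ones are much larger.
p1 <- '{"restriction":{"keyword":"NHL"},"userConfig":{"userType":"USER_TYPE_SCRAPER"}}'
p2 <- '{"restriction":{"keyword":"nhl"},"userConfig":{"userType":"USER_TYPE_LEGIT_USER"}}'

# Flatten each parsed payload into a named character vector.
a <- unlist(fromJSON(p1))
b <- unlist(fromJSON(p2))

# Names of the fields whose values differ between the two payloads.
shared <- intersect(names(a), names(b))
differing <- shared[a[shared] != b[shared]]
differing
```

With these mock inputs, `differing` lists the keyword and the `userType` fields, mirroring the kind of differences reported by `waldo::compare()`.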

@nicolacaravaggio

nicolacaravaggio commented Sep 8, 2024

I am following this thread since (I guess) it is connected to my problem. I am having trouble downloading data from Google Trends through gtrendsR: the results are always empty, whereas the exact same query with pytrends returns results with no problem. I checked with some colleagues of mine and they have the same problem.

Here is an example where I get NULL as the result:

library(gtrendsR)

gt_data <- gtrends(
  keyword = "Taylor Swift",         
  geo = "US",               
  time = "today 12-m"        
)

head(gt_data$interest_over_time)

@PMassicotte
Owner

As previously said, this is not something easily fixable, as it appears that Google knows that we are not using the official web page and thus seems to block some content (#465 (comment)).

@cmp-nct

cmp-nct commented Oct 8, 2024

There is more happening on the browser side.
When the /api/explore request is made, the browser sends a 23 KB POST object in addition to the JSON-encoded request.
It starts like this: FEWyJzZXRva2VuIiwiMDNBRmNXZUE2ZTBUaEE1X1Q

Looking a bit deeper, this is a slightly obfuscated base64 string.
If you remove the first two characters and base64-decode the rest, it becomes a JSON object: ["setoken","03AFcWeA6e0ThA5.....

So it appears that Google has sneaked in a huge token in addition to the small token.

Update:
It's probably Google reCAPTCHA v3 data. I guess this approach will be dead in the future without using a full browser.
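
The decoding step described above can be reproduced in R. This is a minimal sketch using only the truncated prefix quoted in this comment (the full token is not reproduced here), and it assumes the jsonlite package is installed for `base64_dec()`.

```r
library(jsonlite)

# Only the truncated prefix quoted above; the full token is much longer.
blob <- "FEWyJzZXRva2VuIiwiMDNBRmNXZUE2ZTBUaEE1X1Q"

# Drop the two leading characters that obfuscate the base64 payload.
b64 <- substring(blob, 3)

# Pad to a multiple of four characters so the truncated string decodes cleanly.
pad <- (4 - nchar(b64) %% 4) %% 4
b64 <- paste0(b64, strrep("=", pad))

# Decode to raw bytes, then to text.
decoded <- rawToChar(base64_dec(b64))
decoded  # begins with ["setoken","03AFcWeA6e0ThA5
```

Because the input is cut off, the decoded text is an incomplete JSON array and cannot be passed to `fromJSON()` as is.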

@PMassicotte
Owner

Nice detective work. It looks like it won't be easy to fool Google.


5 participants