Is there a way to extract a model's download stats (e.g., last 30 days) in times series format? #2390
Comments
Hi @frr717, thanks for your interest. There is currently no way to get this data as a time series. The only information you can get is the downloads in the last 30 days and overall downloads. What would be your use case for a time series format?
Thank you for your reply. I am working on a research project that needs this time series to run regression analyses on the companies these models belong to. Does your team have any plans to implement it?
BTW, I would also like to ask another question about the image (an SVG element in the HTML of a model page, such as https://huggingface.co/openbmb/MiniCPM-Llama3-V-2_5, in the top right corner, next to "Downloads last month"):
it's each day in the last 30 days
thank you!
Hi @julien-c, could you kindly tell me the formula it uses? Thank you!
0 means 0 downloads, i.e. we don't move the origin. So yes, you can get the daily downloads from the last-30-days total plus the graph. It's a bit hacky, but it'll work.
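A minimal sketch of that workaround, assuming the graph is linear with its origin at zero (as stated above) and that it covers the same 30 days as the reported total. The function name `daily_from_sparkline` and the sample heights are hypothetical; extracting the heights from the SVG is not shown. Note that SVG y-coordinates grow downward, so a height would be the baseline y minus the point's y. Because the heights are only relative, each day's estimate is its share of the summed heights times the 30-day total:

```python
from typing import List, Sequence


def daily_from_sparkline(heights: Sequence[float], total_downloads: float) -> List[float]:
    """Estimate per-day downloads from relative bar heights.

    heights: distance of each daily point above the zero baseline, in any unit.
    total_downloads: the "downloads last month" figure reported for the model.
    """
    height_sum = sum(heights)
    if height_sum == 0:
        # A flat graph at the origin means no downloads on any day.
        return [0.0] * len(heights)
    # Each day gets its proportional share of the 30-day total.
    return [total_downloads * h / height_sum for h in heights]


# Hypothetical heights for a 5-day excerpt, for illustration only.
heights = [10.0, 20.0, 0.0, 30.0, 40.0]
estimates = daily_from_sparkline(heights, total_downloads=1000)
print(estimates)
```

The estimates are approximate: pixel rounding in the rendered graph limits precision, especially for models with large day-to-day swings.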
Hi @julien-c @Wauplin, I am a research assistant at LSE for Professor Jonathon Hazell. We have run into the same research need as the one above in an economics project. In our setup, it would be great to have a historical, daily-frequency proxy for how often AI models are used, to include in regressions for macroeconomics research. This looks like the best (if not the only) source available. It is probably a long shot, and I understand this feature is not publicly available, but would it be possible to discuss whether this data could be made available for some models for research purposes? (We can keep the data confidential and show only regression results, cite Hugging Face in the research, and share the research design.)
Hi @pilipentseva, thanks for reaching out. Unfortunately we do not provide this level of detail. The best dataset we have for research purposes is this one https://huggingface.co/datasets/cfahlgren1/hub-stats from @cfahlgren1. It does not provide as much granularity as you would like but it is updated regularly and always available. Go to the Data Studio tab to inspect the data. Hope this can help you move forward in your research 🤗
Is your feature request related to a problem? Please describe.
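Currently, the only code I can find, like the snippet below, returns a single static data point (the download count for the last 30 days as of today):
from huggingface_hub import model_info
info = model_info("bert-base-uncased")
print(info.downloads)  # downloads over the last 30 days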
Describe the solution you'd like
Could Hugging Face provide a method that takes a date as input? For example:
model_info(info.modelId).get_downloads('20240131')
Describe alternatives you've considered
currently no...
I appreciate any help from all of you!