In developer advocacy at Grafana, we use a simple set of scripts to fetch all of our YouTube transcripts and check the text of what's said in the videos into a repo. You can see it in the grafana/developer-advocacy repository.
We do this for a number of reasons:
- We use the transcripts to write first drafts of new docs: we interview engineers for an hour, then use the rich back-and-forth as grist for a blog post, a new docs page, or whatever else is needed.
- We have "expert Q&A" LLM answer agents that can be fed everything the technical experts said on video, which expands the reach of automatic Q&A techniques.
It would not be hard to contribute this code to OTel for its YouTube channel, if that's desirable. What you'd get is a small set of Python scripts and instructions, plus a directory of the community's choosing where all the transcripts would be checked in (effectively the output of the scripts).
This makes video "greppable" for the community, and enables any downstream LLM approaches the community may want to use.
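To make the "scripts plus a directory of checked-in transcripts" idea concrete, here is a minimal sketch of the write-to-disk step. It is not the actual Grafana script: the function names and the Markdown layout are hypothetical, and the fetching step (calling YouTube) is deliberately left out. The only thing taken from this proposal is the file naming pattern visible in the POC command (publish timestamp, then a slug of the title).

```python
import re
from datetime import datetime, timezone
from pathlib import Path


def transcript_filename(published_at: datetime, title: str) -> str:
    """Build a name like 2023-09-28T04:00:36Z-some-video-title.md.

    Assumption: the pattern is inferred from the file name in the POC
    command, not taken from the real scripts. Note that ':' is not a
    valid file name character on Windows.
    """
    stamp = published_at.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    # Lowercase the title and collapse anything non-alphanumeric into '-'.
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    return f"{stamp}-{slug}.md"


def write_transcript(out_dir: Path, published_at: datetime, title: str,
                     segments: list[str]) -> Path:
    """Write one video's transcript segments as a Markdown file (hypothetical layout)."""
    out_dir.mkdir(parents=True, exist_ok=True)
    path = out_dir / transcript_filename(published_at, title)
    lines = [s.strip() for s in segments if s.strip()]  # drop empty segments
    path.write_text(f"# {title}\n\n" + "\n".join(lines) + "\n", encoding="utf-8")
    return path
```

Keeping one file per video, with the publish time first in the name, means a plain directory listing sorts chronologically and each new video shows up as a single added file in the git history.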
$ cat otel/2023-09-28T04\:00\:36Z-otel-end-user-discussions-amer-january-2023.md | aichat --prompt "Please create a list of questions that are raised in this video transcript; omitting the answers"
1. How do you deal with helping in languages you're not an expert in?
2. How do you define an enabler in a company?
3. Have other companies experienced this struggle with language expertise?
4. Do you create champions in those languages you're not familiar with?
5. Have those people you worked with come back with questions or seek information on their own?
6. How long did it take for individuals to get up to speed with observability concepts?
7. Is auto-instrumentation a good approach to help people get started with OpenTelemetry?
8. What strategies can improve understanding of the purpose and value of OpenTelemetry within a team?
9. Would having real-world case documentation around OpenTelemetry be helpful?
10. Should there be a periodic forum for pointed Q&A with experts in the community?
11. How are you dealing with clock/time drift in data?
12. Is it useful to create a processor that detects data points from the future?
13. What are best practices for bifurcating data in a pipeline using filters and conditions?
14. What would be the advantage of using a router solution versus a filter processor for data separation?
15. How do you scale collector deployments correctly?
16. When should the configuration settings like the number of consumers be modified for optimal performance?
17. How many collectors should be scaled horizontally to meet traffic demands?
18. Is using a number of consumers setting purely for non-scalable deployments?
Benefits to the OpenTelemetry community
Better content reuse & google-ability of technical resources OTel is already publishing; faster process of improving docs & writing blog posts for everyone in the community.
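As a rough illustration of what searchable video content could look like once transcripts are checked in, the sketch below does a case-insensitive search across a directory of transcript files. The function name and the assumption that transcripts are stored as `.md` files are mine; plain `grep -ri` over the directory would do the same job.

```python
from pathlib import Path


def grep_transcripts(root: Path, term: str) -> list[tuple[str, int, str]]:
    """Return (file name, line number, line) for every line containing `term`.

    Assumes transcripts are UTF-8 Markdown files somewhere under `root`.
    """
    needle = term.lower()
    hits = []
    for path in sorted(root.rglob("*.md")):
        text = path.read_text(encoding="utf-8")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if needle in line.lower():
                hits.append((path.name, lineno, line.strip()))
    return hits
```

For example, `grep_transcripts(Path("otel"), "clock drift")` would list every transcript line where that phrase came up, along with the file it came from.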
Reasons for donation
I don't know if "donation" is the best concept here. The code itself is really not complicated; this is just an offer to put it in place if people like the idea. I've found it very useful, it doesn't impact any other components of the ecosystem, and it's focused on getting more value out of the good stuff people are already doing.
Thanks @moxious! On a related note, something like this could also be useful for meeting summaries. OTel has a lot of Zoom meetings. I don't think we should feed those into an answer agent, but it would be nice to have high-quality meeting summaries. We've been testing the Zoom summary feature at the GC meeting and so far it's been pretty underwhelming.
Mini POC
I took this really cool OTel video (thanks @reese-lee!) and pulled its transcript
Repository
https://github.com/grafana/developer-advocacy/
Existing usage
(Covered above)
Maintenance
As new videos are published, the script needs to be periodically re-run, followed by a git commit and git push.
Actual script maintenance is minimal unless the YouTube API changes
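The re-run, commit, and push cycle could be wrapped in a small helper such as the sketch below. This is an assumption about how one might automate it, not part of the existing scripts: the function names, the `transcripts` directory name, and the commit message are all hypothetical. It only uses standard git subcommands (`status --porcelain`, `add`, `commit`, `push`).

```python
import subprocess
from pathlib import Path
from typing import Callable, Sequence


def _run_git(args: Sequence[str], cwd: Path) -> str:
    """Run a git command in `cwd` and return its stdout; raises on failure."""
    result = subprocess.run(["git", *args], cwd=cwd, check=True,
                            capture_output=True, text=True)
    return result.stdout


def publish_new_transcripts(repo_dir: Path,
                            transcripts_subdir: str = "transcripts",  # hypothetical name
                            git: Callable[[Sequence[str], Path], str] = _run_git) -> bool:
    """Commit and push transcript changes, if any. Returns True when a push happened."""
    # --porcelain prints one line per changed or untracked path, nothing if clean.
    status = git(["status", "--porcelain", "--", transcripts_subdir], repo_dir)
    if not status.strip():
        return False  # no new or changed transcripts, so nothing to commit
    git(["add", "--", transcripts_subdir], repo_dir)
    git(["commit", "-m", "Update video transcripts"], repo_dir)
    git(["push"], repo_dir)
    return True
```

Checking the status first keeps a scheduled job from creating empty commits on days when no new videos were published. The `git` parameter exists only so the helper can be exercised without touching a real repository.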
Licenses
I'm the author of this code, and I think I can clear it to be licensed under Apache 2.0, but I need to verify this if/when the proposal is accepted.
Trademarks
N/A
Other notes
No response