Commit aecc304

NimaBoscarino, osanseviero, and stevhliu committed
Docs Revamp: new "Repositories" page for hub-docs (#92)
Add a "Repositories" page with several subpages, outlining how repos are used on the hub, and best practices for creating and maintaining them. Some content remains WIP.

Co-authored-by: Omar Sanseviero <[email protected]>
Co-authored-by: Steven Liu <[email protected]>
1 parent 17b4d9b commit aecc304

9 files changed, with 251 additions and 0 deletions

docs/assets/hub/empty_repo.png

35.2 KB

docs/assets/hub/new_repo.png

-34.6 KB

docs/assets/hub/repo_history.png

82.1 KB

docs/assets/hub/repo_with_files.png

-29.2 KB

docs/hub/_sections.yml

Lines changed: 3 additions & 0 deletions
@@ -1,6 +1,9 @@
 - local: hugging-face-hub
   title: Hugging Face Hub
 
+- local: repositories-main
+  title: Repositories
+
 - local: main
   title: Hub documentation

Lines changed: 25 additions & 0 deletions
@@ -0,0 +1,25 @@
---
title: Best Practices with Repositories
---

<h1>Best practices with repositories</h1>

Here are some additional best practices to help you get the most out of your repository.

## Private repositories

You can choose a repository's visibility when you create it, and you can toggle the visibility of any repository that you own between *public* and *private* in the **Settings** tab. Unless your repository is owned by an organization (more about that [**here!**](TODO)), you are the only user who can make changes to your repo or upload any code. Setting your visibility to *private* will:

- Ensure your repo is not discoverable by other users searching the Hub.
- Return a `404 - Repo not found` error to other users who visit the URL of your private repo.
- Prevent other users from cloning your repo.

## Handling multiple experiments
### TODO
Can use content from https://github.com/huggingface/huggingface_hub/issues/769 and https://github.com/huggingface/hub-docs/issues/53

## Licenses

You can add a license to any repo that you create on the Hugging Face Hub to let other users know which permissions you grant for your code. The license can also be specified in the metadata section of your repository's `README.md` file, known as a *card* on the Hub. Remember to seek out and respect a project's license if you're considering using its code.

A [**full list of the available licenses**](TODO) is available in these docs.
Lines changed: 114 additions & 0 deletions
@@ -0,0 +1,114 @@
---
title: Getting Started with Repositories
---

<h1>Getting Started with Repositories</h1>

This beginner-friendly guide will help you get the basic skills you need to create and manage your repository on the Hub. Each section builds on the previous one, but if you are already comfortable with a topic, feel free to skip ahead!

## Requirements

If you do not have `git` available as a CLI command yet, you will need to [install Git](https://git-scm.com/downloads) for your platform. You will also need to [install Git LFS](https://git-lfs.github.com/), which will be used to handle large files such as images and model weights.

To be able to push your code to the Hub, you'll need to authenticate. The easiest way to do this is by installing the [`huggingface_hub` CLI](https://huggingface.co/docs/huggingface_hub/index) and running the login command:

```bash
python -m pip install huggingface_hub
huggingface-cli login
```

The content in the **Getting Started** section of this document is also available as a video!

<iframe width="560" height="315" src="https://www.youtube-nocookie.com/embed/rkCly_cbMBk" title="Managing a repo" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>

## Creating a repository

Using the Hub's web interface you can easily create repositories, add files (even large ones!), explore models, visualize diffs, and much more. There are three kinds of repositories on the Hub (models, datasets, and Spaces), and in this guide we'll create a **model repository** for demonstration purposes. For information on creating and managing models, datasets, and Spaces, refer to their respective documentation.

1. To create a new repository, visit [huggingface.co/new](http://huggingface.co/new):

![/docs/assets/hub/new_repo.png](/docs/assets/hub/new_repo.png)

2. First, specify the owner of the repository: this can be either you or any of the organizations you’re affiliated with.

3. Next, enter your model’s name. This will also be the name of the repository. Finally, you can specify whether you want your model to be public or private.

You can leave the *License* field blank for now. To learn about licenses, visit the **Licenses** (TODO: LINK TO LICENSES) section of this document.

After creating your model repository, you should see a page like this:

![/docs/assets/hub/empty_repo.png](/docs/assets/hub/empty_repo.png)

Note that the Hub prompts you to create a *Model Card*, which you can learn about in the **Model Cards documentation** (TODO: LINK). Including a Model Card in your model repo is a best practice, but since we're only making a test repo at the moment we can skip this.

## Cloning repositories

Downloading a repository to your local machine is called *cloning*. You can use the following commands to download the repo that we made and navigate to it:

```bash
git clone https://huggingface.co/<your-username>/<your-model-id>
cd <your-model-id>
```

## Adding files to a repository

Now you can add any files you want to the repository! 🔥

Do you have files larger than 10 MB? Those files should be tracked with `git-lfs`, which you can initialize with:

```bash
git lfs install
```

Note that if your files are larger than **5 GB** you'll also need to run the following from inside your repository:

```bash
huggingface-cli lfs-enable-largefiles .
```
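
To get a feel for which files fall into which category, here is a minimal Python sketch that classifies files by size using the two thresholds mentioned above. The function names are hypothetical, and treating "10 MB" and "5 GB" as decimal units is an assumption; the Hub may count sizes differently.

```python
from pathlib import Path

# Assumption: decimal units (1 MB = 1000**2 bytes, 1 GB = 1000**3 bytes).
LFS_THRESHOLD = 10 * 1000**2        # 10 MB: above this, track the file with git-lfs
LARGEFILES_THRESHOLD = 5 * 1000**3  # 5 GB: above this, also enable large-file support


def classify_size(size_in_bytes,
                  lfs_threshold=LFS_THRESHOLD,
                  largefiles_threshold=LARGEFILES_THRESHOLD):
    """Return 'git', 'lfs', or 'lfs+largefiles' for a file of the given size."""
    if size_in_bytes > largefiles_threshold:
        return "lfs+largefiles"
    if size_in_bytes > lfs_threshold:
        return "lfs"
    return "git"


def scan_repo(root="."):
    """Map each file under `root` (skipping the .git folder) to its category."""
    root = Path(root)
    results = {}
    for path in root.rglob("*"):
        relative = path.relative_to(root)
        if path.is_file() and ".git" not in relative.parts:
            results[relative.as_posix()] = classify_size(path.stat().st_size)
    return results
```

Running `scan_repo()` from the root of your clone before your first commit gives a quick overview of which files should go through Git LFS.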

When you use Hugging Face to create a repository, we automatically provide a list of common file extensions for these files in the `.gitattributes` file, which `git-lfs` uses to efficiently track changes to your large files. However, you might need to add new extensions if your file types are not already handled. You can do so with `git lfs track "*.your_extension"`.
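
If you are unsure whether a file type is already handled, you can inspect `.gitattributes` yourself. The sketch below is a simplified illustration, not how `git-lfs` itself resolves attributes: it assumes each tracked pattern sits on a line containing `filter=lfs`, and it matches patterns against the file's base name only. The helper names and the sample contents are hypothetical.

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath


def lfs_patterns(gitattributes_text):
    """Collect the patterns from lines that carry the `filter=lfs` attribute."""
    patterns = []
    for line in gitattributes_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        pattern, *attributes = line.split()
        if "filter=lfs" in attributes:
            patterns.append(pattern)
    return patterns


def is_lfs_tracked(filename, gitattributes_text):
    """Simplified check: match the file's base name against each LFS pattern."""
    name = PurePosixPath(filename).name
    return any(fnmatch(name, pattern) for pattern in lfs_patterns(gitattributes_text))


# Hypothetical sample; a real .gitattributes file usually lists many more extensions.
SAMPLE = """\
*.bin filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
"""

print(is_lfs_tracked("pytorch_model.bin", SAMPLE))
print(is_lfs_tracked("config.json", SAMPLE))
```

If the check comes back negative for a large file type, that is the case where `git lfs track` is needed.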

You can use Git to save new files and any changes to existing files as a bundle of changes called a *commit*, which can be thought of as a "revision" to your project. To create a commit, we have to `add` the files to let Git know that we're planning on saving the changes and then `commit` those changes. In order to sync the new commit with the Hugging Face Hub, we then `push` the commit to the Hub.

```bash
# Create any files you like! Then...
git add .
git commit -m "First model version"  # You can choose any descriptive message
git push
```

And we're done! You can check your repository on Hugging Face with all the recently added files. For example, in the screenshot below the user added a number of files. Note that one of the files in this example has a size of `413 MB`, so the repo uses Git LFS to track it.

![/docs/assets/hub/repo_with_files.png](/docs/assets/hub/repo_with_files.png)

## Viewing a repo's history

Every time you go through the `add`-`commit`-`push` cycle, the repo keeps track of the changes you've made to your files. The UI allows you to explore the model files and commits and to see the difference (also known as a *diff*) introduced by each commit. To see the history, you can click on the **History: X commits** link.

![/docs/assets/hub/repo_history.png](/docs/assets/hub/repo_history.png)

You can click on an individual commit to see what changes that commit introduced:

![/docs/assets/hub/explore_history.gif](/docs/assets/hub/explore_history.gif)

## Renaming or transferring a repo

If you own a repository, you can visit the **Settings** tab to manage its name and ownership. Note that moving a repository is only supported in certain use cases.

Moving works in these use cases ✅
- Renaming a repository within the same user account.
- Renaming a repository within the same organization. The user must be part of the organization and have "write" or "admin" rights in the organization.
- Transferring a repository from a user to an organization. The user must be part of the organization and have "write" or "admin" rights in the organization.
- Transferring a repository from an organization to yourself. You must be part of the organization, and have "admin" rights in the organization.
- Transferring a repository from a source organization to another target organization. The user must have "admin" rights in the source organization **and** either "write" or "admin" rights in the target organization.

Moving does not work for ❌
- Transferring a repository from an organization to another user who is not yourself.
- Transferring a repository from a source organization to another target organization if the user does not have both "admin" rights in the source organization **and** either "write" or "admin" rights in the target organization.
- Transferring a repository from user A to user B.
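
The rules above can be summarized as a small decision function. This is only an illustrative sketch of the list, not the Hub's actual implementation: the function name, the `("user", name)` / `("org", name)` namespace tuples, and the `org_roles` mapping are all hypothetical. Moving a repo out of another user's namespace is assumed to be disallowed, since you must own the repository.

```python
WRITE_ROLES = {"write", "admin"}


def can_move(user, source, target, org_roles):
    """Return True if `user` may rename or transfer a repo from `source` to `target`.

    `source` and `target` are ("user", name) or ("org", name) tuples, and
    `org_roles` maps an organization name to the user's role in it.
    """
    src_kind, src_name = source
    dst_kind, dst_name = target
    src_is_me = src_kind == "user" and src_name == user
    dst_is_me = dst_kind == "user" and dst_name == user
    src_role = org_roles.get(src_name) if src_kind == "org" else None
    dst_role = org_roles.get(dst_name) if dst_kind == "org" else None

    if src_is_me:
        # Rename within your own account, or transfer to an org you can write to.
        return dst_is_me or dst_role in WRITE_ROLES
    if src_kind == "org":
        if source == target:
            # Rename within the same organization.
            return src_role in WRITE_ROLES
        if dst_is_me:
            # Transfer from an organization to yourself.
            return src_role == "admin"
        if dst_kind == "org":
            # Transfer between two organizations.
            return src_role == "admin" and dst_role in WRITE_ROLES
    # Everything else: org to another user, user A to user B, repos you do not own.
    return False
```

For example, `can_move("ana", ("org", "lab"), ("user", "ana"), {"lab": "write"})` is `False`, because transferring from an organization to yourself requires "admin" rights in that organization.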

If these are use cases you need help with, please send us an email at **website at huggingface.co**.

docs/hub/repositories-main.md

Lines changed: 15 additions & 0 deletions
@@ -0,0 +1,15 @@
---
title: Repositories
---

<h1>Repositories</h1>

Models, Spaces, and datasets are hosted on the Hugging Face Hub as [Git repositories](https://git-scm.com/about), which means that version control and collaboration are core elements of the Hub. In a nutshell, a repository (also known as a **repo**) is a place where code and assets can be stored to back up your work, share it with the community, and work in a team.

In these pages, we will go over the basics of getting started with Git and interacting with repositories on the Hub. Once you get the hang of it, you can explore the best practices and next steps that we've compiled for effective repository usage.

## Contents

- [Getting Started](./repositories-getting-started)
- [Best Practices](./repositories-best-practices)
- [Next Steps](./repositories-next-steps)

docs/hub/repositories-next-steps.md

Lines changed: 94 additions & 0 deletions
@@ -0,0 +1,94 @@
---
title: Next Steps
---

<h1>Next steps</h1>

These next sections highlight features and additional information that you may find useful to make the most out of the Git repositories on the Hugging Face Hub.

## Learning more about Git

A good place to visit if you want to continue learning about Git is [this Git tutorial](https://learngitbranching.js.org/). For even more background on Git, you can take a look at [GitHub's Git Guides](https://github.com/git-guides).

## How to use branches

To use Git repos collaboratively and to work on features without releasing premature code, you can use **branches**. Branches allow you to separate your "work in progress" code from your "production-ready" code, with the additional benefit of letting multiple people work on a project without frequently conflicting with each other's contributions. You can use branches to isolate experiments in their own branch, and even [adopt team-wide practices for managing branches](https://ericmjl.github.io/essays-on-data-science/workflow/gitflow/).

To learn about Git branching, you can try out the [Learn Git Branching interactive tutorial](https://learngitbranching.js.org/).

## Using tags

Git allows you to *tag* commits so that you can easily note milestones in your project. As such, you can use tags to mark commits in your Hub repos! To learn about using tags, you can visit [this DevConnected post](https://devconnected.com/how-to-create-git-tags/).

Beyond making it easy to identify important commits in your repo's history, using Git tags also allows you to [clone a repository at a specific tag](https://www.techiedelight.com/clone-specific-tag-with-git/). The `huggingface_hub` library also supports working with tags, such as [downloading files from a specific tagged commit](https://huggingface.co/docs/huggingface_hub/main/en/how-to-downstream#hfhuburl).
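
To illustrate what "a file at a specific tag" means in practice, here is a minimal sketch that builds a download URL for a file at a given revision (a branch, tag, or commit hash). The `resolve/<revision>/<filename>` URL layout is an assumption based on how the library's URL helper is understood to work for model repositories, and the function name, repo ID, and tag below are hypothetical.

```python
from urllib.parse import quote


def file_url_at_revision(repo_id, filename, revision="main"):
    """Build a URL for `filename` in `repo_id` at a given branch, tag, or commit.

    Assumption: model repos expose files at <repo_id>/resolve/<revision>/<filename>.
    """
    return (
        "https://huggingface.co/"
        f"{repo_id}/resolve/{quote(revision, safe='')}/{quote(filename)}"
    )


# Hypothetical repo and tag names, for illustration only.
print(file_url_at_revision("friend/upstream", "config.json", revision="v1.0"))
```

Pinning a tag rather than `main` means that later commits to the repo will not change which version of the file you download.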

## How to duplicate or fork a repo (including LFS pointers)

If you'd like to copy a repository, there are two options, depending on whether you want to preserve the Git history.

### Duplicating without Git history

In many scenarios, if you want your own copy of a particular codebase you might not be concerned about the previous Git history. In this case, you can quickly duplicate a repo with the handy [Repo Duplicator](https://huggingface.co/spaces/osanseviero/repo_duplicator)! You'll have to create a User Access Token, which you can read more about in the [security documentation](TODO).

### Duplicating with the Git history (Fork)

A duplicate of a repository with the commit history preserved is called a *fork*. You may choose to fork one of your own repos, but it is also common to fork other people's projects if you would like to tinker with them.

**Note that you will need to [install Git LFS](https://git-lfs.github.com/) and the [`huggingface_hub` CLI](https://huggingface.co/docs/huggingface_hub/index) to follow this process**. When you want to fork or [rebase](https://git-scm.com/docs/git-rebase) a repository with LFS files, you cannot use the usual Git approach that you might be familiar with, since you need to be careful not to break the LFS pointers. Forking can take time depending on your bandwidth because you will have to fetch and re-upload all the LFS files in your fork.

For example, say you want to fork an upstream repository, **upstream**, into your own repository on the Hub, called **myfork** in this example.

1. Create a destination repository (e.g. **myfork**) on https://huggingface.co

2. Clone your fork repository:

```
git lfs clone https://huggingface.co/me/myfork.git
```

3. Fetch non-LFS files:

```
cd myfork
git lfs install --skip-smudge --local # affects only this clone
git remote add upstream https://huggingface.co/friend/upstream.git
git fetch upstream
```

4. Fetch large files. This can take some time depending on your download bandwidth:

```
git lfs fetch --all upstream # this can take time depending on your download bandwidth
```

4.a. If you want to completely overwrite the fork history (which should only have an initial commit), run:

```
git reset --hard upstream/main
```

4.b. If you want to rebase instead of overwriting, run the following command and resolve any conflicts:

```
git rebase upstream/main
```

5. Prepare your LFS files to push:

```
git lfs install --force --local # this reinstalls the LFS hooks
huggingface-cli lfs-enable-largefiles . # needed if some files are bigger than 5 GB
```

6. And finally push:

```
git push --force origin main # this can take time depending on your upload bandwidth
```

Now you have your own fork or rebased repo on the Hub!
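
If you fork repositories often, it can help to generate the command sequence rather than retype it. The sketch below only assembles the commands from the steps above as strings and does not run anything. The function name and its parameters are hypothetical, and it assumes the working directory is named after the last segment of the fork URL.

```python
def fork_commands(fork_url, upstream_url, branch="main", rebase=False):
    """Return the shell commands for forking `upstream_url` into `fork_url`.

    With rebase=False the fork history is overwritten (step 4.a);
    with rebase=True the fork is rebased onto upstream instead (step 4.b).
    """
    # Assumption: the clone directory is the last URL segment without ".git".
    fork_dir = fork_url.rstrip("/").rsplit("/", 1)[-1]
    if fork_dir.endswith(".git"):
        fork_dir = fork_dir[: -len(".git")]

    if rebase:
        update_history = f"git rebase upstream/{branch}"
    else:
        update_history = f"git reset --hard upstream/{branch}"

    return [
        f"git lfs clone {fork_url}",
        f"cd {fork_dir}",
        "git lfs install --skip-smudge --local",
        f"git remote add upstream {upstream_url}",
        "git fetch upstream",
        "git lfs fetch --all upstream",
        update_history,
        "git lfs install --force --local",
        "huggingface-cli lfs-enable-largefiles .",
        f"git push --force origin {branch}",
    ]


for command in fork_commands(
    "https://huggingface.co/me/myfork.git",
    "https://huggingface.co/friend/upstream.git",
):
    print(command)
```

Printing the commands first lets you review them before pasting them into a terminal, which is worthwhile because the final step is a force push.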

## How to programmatically manage repositories

So far, we've looked at using the Git CLI and the Hugging Face Hub to work with our repos. But Hugging Face also supports accessing repos with Python via the [`huggingface_hub` library](https://huggingface.co/docs/huggingface_hub/index). The operations that we've explored such as downloading repositories and uploading files are available through the library, as well as other useful functions!
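
As a rough illustration, the sketch below wraps a few `huggingface_hub` calls (`HfApi.create_repo`, `HfApi.upload_file`, and `snapshot_download`) in two helper functions. The helper names and their parameters are hypothetical, the exact keyword arguments may differ between library versions, and the calls need network access plus a prior `huggingface-cli login`, so treat this as a starting point rather than a reference.

```python
def publish_file(repo_id, local_path, path_in_repo, private=True):
    """Create the repo if it does not exist yet, then upload a single file to it."""
    # Imported lazily so this module loads even without the library installed.
    from huggingface_hub import HfApi

    api = HfApi()
    api.create_repo(repo_id=repo_id, private=private, exist_ok=True)
    api.upload_file(
        path_or_fileobj=local_path,
        path_in_repo=path_in_repo,
        repo_id=repo_id,
    )


def download_at_revision(repo_id, revision="main"):
    """Download a snapshot of the repo at a branch, tag, or commit; return its local path."""
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_id, revision=revision)
```

A call such as `publish_file("<your-username>/<your-model-id>", "config.json", "config.json")` would mirror the `add`, `commit`, and `push` workflow from the Git CLI in a single step.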
