CSGHub Server is part of CSGHub, an open-source, reliable platform for managing large-model assets. It focuses on managing models, datasets, and other LLM assets through a REST API.
- Creation and management of users and organizations
- Auto-tagging of model and dataset labels
- Search for users, organizations, models, and data
- Online preview of dataset files, such as .parquet files
- Content moderation for both text and images
- Download of individual files, including LFS files
- Tracking of model and dataset activity data, such as download and like counts
To help users quickly understand the features and usage of CSGHub, we have recorded a demo video. It walks through the main features and operating procedures of the program.
Please visit the OpenCSG website to experience the powerful management features.
System resource requirements: 4 CPU cores / 8 GB memory
Docker must be installed separately. This project has been tested in an Ubuntu 22 environment.
You can quickly deploy a local CSGHub Server service with docker-compose:
# The API token should be at least 128 characters long, and HTTP requests to csghub-server require the API token to be sent as a Bearer token for authentication.
export STARHUB_SERVER_API_TOKEN=<API token>
mkdir -m 777 gitea minio_data
curl -L https://raw.githubusercontent.com/OpenCSGs/csghub-server/main/docker-compose.yml -o docker-compose.yml
docker-compose -f docker-compose.yml up -d
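The token requirement described in the comment above (at least 128 characters, sent as a Bearer token) can be sketched in Go as follows. This is a minimal illustration, not code from the project: the helper names `newAPIToken` and `newAuthedRequest` are hypothetical, and the URL is a placeholder rather than a documented csghub-server endpoint.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
	"net/http"
)

// newAPIToken returns a random hex string of exactly n characters.
// n must be a positive even number because each random byte yields two hex characters.
// Hypothetical helper for illustration only.
func newAPIToken(n int) (string, error) {
	if n <= 0 || n%2 != 0 {
		return "", fmt.Errorf("token length must be a positive even number, got %d", n)
	}
	buf := make([]byte, n/2)
	if _, err := rand.Read(buf); err != nil {
		return "", err
	}
	return hex.EncodeToString(buf), nil
}

// newAuthedRequest builds a GET request that carries the token as a Bearer token.
// Hypothetical helper for illustration only.
func newAuthedRequest(url, token string) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return nil, err
	}
	req.Header.Set("Authorization", "Bearer "+token)
	return req, nil
}

func main() {
	token, err := newAPIToken(128)
	if err != nil {
		panic(err)
	}
	// Placeholder URL: substitute the address and path of your own deployment.
	req, err := newAuthedRequest("http://localhost:8080/", token)
	if err != nil {
		panic(err)
	}
	fmt.Println("token length:", len(token))
	fmt.Println("header set:", req.Header.Get("Authorization") != "")
}
```

A token generated this way could be exported as STARHUB_SERVER_API_TOKEN before running docker-compose; the request is only constructed here, not sent.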
CSGHub supports TOML format for config files. When starting any service from the command line, you can specify the config file with the --config option:
go run cmd/csghub-server/main.go start server --config local.toml
go run cmd/csghub-server/main.go deploy runner --config local.toml
...
We provide an example config file, which you can rename and modify as needed. All available configuration options are defined in this Go file. The TOML configuration uses the snake_case naming convention, and names map automatically to the corresponding struct field names.
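To make the snake_case mapping concrete, here is a rough Go sketch of one way a key could be matched to a struct field name: drop the underscores and compare case-insensitively. This is an assumption for illustration, not the project's actual loader logic; the function `fieldMatchesKey` and the key and field names used are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// fieldMatchesKey reports whether a snake_case config key names the given
// Go struct field, by removing underscores and comparing case-insensitively.
// Hypothetical helper: the real config loader may resolve names differently.
func fieldMatchesKey(key, field string) bool {
	return strings.EqualFold(strings.ReplaceAll(key, "_", ""), field)
}

func main() {
	// Hypothetical key and field names, chosen only to illustrate the rule.
	fmt.Println(fieldMatchesKey("api_token", "APIToken"))
	fmt.Println(fieldMatchesKey("api_token", "Port"))
}
```

Comparing case-insensitively avoids having to guess how acronyms are capitalized in field names.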
- Supports different git servers, such as Gitea, GitLab, etc.
- Supports flexible configuration of the LFS storage system, and you can choose to use local or any third-party cloud storage service that is compatible with the S3 protocol.
- Supports enabling content moderation on demand, with any third-party content moderation service.
- Support more Git Servers: Currently supports Gitea, and plans to support mainstream Git repositories in the future.
- Git LFS: Git LFS supports large files, and supports Git command operations and online download through the Web UI.
- Dataset Online Viewer: Dataset preview, with Top20/TopN loading preview of datasets stored in LFS format.
- Model/Dataset AutoTag: Supports custom metadata and automatic extraction of model/dataset tags.
- S3 Protocol Support: Supports S3 (MinIO) storage protocol, providing higher reliability and storage cost-effectiveness.
- Model Format Conversion: Conversion between mainstream model formats.
- Model One-Click Deploy: Integration with OpenCSG llm-inference to start model inference with one click.
We use the Apache 2.0 license; see the LICENSE file for details.
If you wish to contribute, please follow the Contribution Guidelines. We are very excited about your contributions!
Before you begin development, we highly recommend checking out our Backend Developer Guides, which provide helpful information to ensure a smooth development process.
This project is based on open source projects such as Gin, DuckDB, MinIO, and Gitea. We would like to express our sincere gratitude to them for their open source contributions!
If you run into any problems, you can contact us in any of the following ways:
- open an issue on GitHub
- join our WeChat group by scanning the WeChat helper QR code
- join our official Discord channel: OpenCSG Discord Channel
- join our Slack workspace: OpenCSG Slack Channel