This repository has been archived by the owner on May 12, 2023. It is now read-only.

Commit 84fc0ec: migration to pygpt4all
absadiki committed Apr 23, 2023 · 1 parent 7ef5910
Showing 269 changed files with 52 additions and 61,413 deletions.
48 changes: 0 additions & 48 deletions .github/workflows/conda.yml

This file was deleted.

24 changes: 0 additions & 24 deletions .github/workflows/docs.yml

This file was deleted.

22 changes: 0 additions & 22 deletions .github/workflows/format.yml

This file was deleted.

3 changes: 2 additions & 1 deletion .gitignore
@@ -7,14 +7,15 @@ _generate/
 *.egg-info
 *env*
 
-mtests
 
 
 # custom
 .idea
 _docs
+_examples
 src/.idea
+mtests
 
 # Byte-compiled / optimized / DLL files
 __pycache__/
9 changes: 0 additions & 9 deletions .gitmodules
@@ -1,9 +0,0 @@
-[submodule "pybind11"]
-	path = pybind11
-	url = https://github.com/pybind/pybind11.git
-	branch = master
-
-[submodule "llama.cpp"]
-	path = llama.cpp
-	url = https://github.com/ggerganov/llama.cpp.git
-	branch = master
78 changes: 0 additions & 78 deletions .pre-commit-config.yaml

This file was deleted.

116 changes: 25 additions & 91 deletions README.md
@@ -1,125 +1,59 @@
-# PyLLaMACpp
-Officially supported Python bindings for [llama.cpp](https://github.com/ggerganov/llama.cpp) + gpt4all
+# PyGPT4All
+Official Python CPU inference for [GPT4All](https://github.com/nomic-ai/gpt4all) language models based on [llama.cpp](https://github.com/ggerganov/llama.cpp) and [ggml](https://github.com/ggerganov/ggml).
 
 [![License: MIT](https://img.shields.io/badge/license-MIT-blue.svg)](https://opensource.org/licenses/MIT)
 [![PyPi version](https://badgen.net/pypi/v/pyllamacpp)](https://pypi.org/project/pyllamacpp/)
 
-For those who don't know, `llama.cpp` is a port of Facebook's LLaMA model in pure C/C++:
-
-<blockquote>
-
-- Without dependencies
-- Apple silicon first-class citizen - optimized via ARM NEON
-- AVX2 support for x86 architectures
-- Mixed F16 / F32 precision
-- 4-bit quantization support
-- Runs on the CPU
-
-</blockquote>
-
-# Table of contents
-<!-- TOC -->
-* [Installation](#installation)
-* [Usage](#usage)
-* [Supported model](#supported-model)
-  * [GPT4All](#gpt4all)
-* [Discussions and contributions](#discussions-and-contributions)
-* [License](#license)
-<!-- TOC -->
+**NB: Under active development**
 
 # Installation
 1. The easy way is to use the prebuilt wheels:
 ```bash
-pip install pyllamacpp
+pip install pygpt4all
 ```
 
-However, the compilation of `llama.cpp` takes the architecture of the target CPU into account,
-so you might need to build it from source:
+2. Build it from source:
 
 ```shell
-git clone --recursive https://github.com/nomic-ai/pyllamacpp && cd pyllamacpp
+git clone --recursive https://github.com/nomic-ai/pygpt4all && cd pygpt4all
 pip install .
 ```

 # Usage
 
-A simple `Pythonic` API is built on top of `llama.cpp` C/C++ functions. You can call it from Python as follows:
-
-```python
-from pyllamacpp.model import Model
-
-def new_text_callback(text: str):
-    print(text, end="", flush=True)
-
-model = Model(ggml_model='./models/gpt4all-model.bin', n_ctx=512)
-model.generate("Once upon a time, ", n_predict=55, new_text_callback=new_text_callback, n_threads=8)
-```
-If you don't want to use the `callback`, you can get the results from the `generate` method once the inference is finished:
-
-```python
-generated_text = model.generate("Once upon a time, ", n_predict=55)
-print(generated_text)
-```
-
-## Interactive Mode
-
-If you want to run the program in interactive mode, pass a `grab_text_callback` function and set `interactive` to True in the `generate` call. `grab_text_callback` should always return a string unless you wish to signal EOF, in which case you should return None.
-
-```py
-from pyllamacpp.model import Model
-
-def new_text_callback(text: str):
-    print(text, end="", flush=True)
-
-def grab_text_callback():
-    inpt = input()
-    # To signal EOF, return None
-    if inpt == "END":
-        return None
-    return inpt
-
-model = Model(ggml_model='./models/gpt4all-model.bin', n_ctx=512)
-
-# prompt from https://github.com/ggerganov/llama.cpp/blob/master/prompts/chat-with-bob.txt
-prompt = """
-Transcript of a dialog, where the User interacts with an Assistant named Bob. Bob is helpful, kind, honest, good at writing, and never fails to answer the User's requests immediately and with precision. To do this, Bob uses a database of information collected from many different sources, including books, journals, online articles, and more.
-User: Hello, Bob.
-Bob: Hello. How may I help you today?
-User: Please tell me the largest city in Europe.
-Bob: Sure. The largest city in Europe is Moscow, the capital of Russia.
-User:"""
-
-model.generate(prompt, n_predict=256, new_text_callback=new_text_callback, grab_text_callback=grab_text_callback, interactive=True, repeat_penalty=1.0, antiprompt=["User:"])
-```
-
-* You can pass any `llama context` [parameter](https://nomic-ai.github.io/pyllamacpp/#pyllamacpp.constants.LLAMA_CONTEXT_PARAMS_SCHEMA) as a keyword argument to the `Model` class
-* You can pass any `gpt` [parameter](https://nomic-ai.github.io/pyllamacpp/#pyllamacpp.constants.GPT_PARAMS_SCHEMA) as a keyword argument to the `generate` method
-* You can always refer to the [short documentation](https://nomic-ai.github.io/pyllamacpp/) for more details.
+### GPT4All model
+
+Download a GPT4All model from https://the-eye.eu/public/AI/models/nomic-ai/gpt4all/. The easiest approach is to download a file whose name ends in `ggml.bin`.
+
+```python
+from pygpt4all.models.gpt4all import GPT4All
+
+def new_text_callback(text):
+    print(text, end="")
+
+model = GPT4All('./models/gpt4all-model.bin')
+model.generate("Once upon a time, ", n_predict=55, new_text_callback=new_text_callback)
+```
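If you want the whole completion as a single string instead of streaming it to stdout, the `new_text_callback` hook shown above can buffer tokens. A minimal sketch, assuming the `GPT4All` class and `generate` signature from the example in this diff (the buffering helper and the model path are illustrative, not part of the commit):

```python
from pygpt4all.models.gpt4all import GPT4All

# Buffer generated tokens instead of printing them.
tokens = []

def capture_callback(text):
    tokens.append(text)

model = GPT4All('./models/gpt4all-model.bin')  # illustrative model path
model.generate("Once upon a time, ", n_predict=55, new_text_callback=capture_callback)

generated_text = "".join(tokens)
print(generated_text)
```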

-# Supported model
-
-### GPT4All
-
-Download a GPT4All model from https://the-eye.eu/public/AI/models/nomic-ai/gpt4all/.
-The easiest approach is to download a file whose name ends in `ggml.bin`; older model versions require conversion.
-
-If you have an older model downloaded that you want to convert, in your terminal run:
-```shell
-pyllamacpp-convert-gpt4all path/to/gpt4all_model.bin path/to/llama_tokenizer path/to/gpt4all-converted.bin
-```
+### GPT4All-J model
+
+Download the GPT4All-J model from https://gpt4all.io/models/ggml-gpt4all-j.bin
+
+```python
+from pygpt4all.models.gpt4all_j import GPT4All_J
+
+def new_text_callback(text):
+    print(text, end="")
+
+model = GPT4All_J('./models/ggml-gpt4all-j.bin')
+model.generate("Once upon a time, ", n_predict=55, new_text_callback=new_text_callback)
+```
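The GPT4All-J example expects the weights to exist locally before it runs. A minimal sketch of fetching them with the Python standard library, using the URL given above (the `./models` target directory is an assumption):

```python
import os
import urllib.request

# Fetch the GPT4All-J weights referenced in the example above.
# Note: this is a multi-gigabyte download; the target path is illustrative.
url = "https://gpt4all.io/models/ggml-gpt4all-j.bin"
os.makedirs("./models", exist_ok=True)
urllib.request.urlretrieve(url, "./models/ggml-gpt4all-j.bin")
```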

-# FAQs
-* Where to find the llama tokenizer? [#5](https://github.com/nomic-ai/pyllamacpp/issues/5)
 
 # Discussions and contributions
 If you find any bug, please open an [issue](https://github.com/nomic-ai/pyllamacpp/issues).
 
+[//]: # (* You can always refer to the [short documentation]&#40;https://nomic-ai.github.io/pyllamacpp/&#41; for more details.)
+
 If you have any feedback, or you want to share how you are using this project, feel free to use the [Discussions](https://github.com/nomic-ai/pyllamacpp/discussions) and open a new topic.
 
 # License
 
-This project is licensed under the same license as [llama.cpp](https://github.com/ggerganov/llama.cpp/blob/master/LICENSE) (MIT [License](./LICENSE)).
+This project is licensed under the MIT [License](./LICENSE).

Binary file removed docs/demo.gif
Binary file not shown.
9 changes: 2 additions & 7 deletions docs/index.md
@@ -1,10 +1,5 @@
-# PyLLaMaCpp API Reference
+# PyGPT-J API Reference
 
-::: pyllamacpp.model
+::: pygpt4all.models
 
-::: pyllamacpp.constants
-    options:
-        show_if_no_docstring: true
-
-::: pyllamacpp.utils
