llama: add initial support for Falcon-H1 model family #14534
Merged
+585
−9
112 commits
991de6c
v1
younesbelkada f897efd
push more fixes
younesbelkada 71a6848
another fix
younesbelkada 03568c9
fix
younesbelkada 0c93ef6
more fixes
younesbelkada fdd5cff
minor fix
younesbelkada 14c37ec
more cleaning on python code
younesbelkada 8bea922
python fixes
ibrahimkhadraoui 071f4b7
changed precision for multipliers float 32->64
ibrahimkhadraoui 50eadc7
fixes
younesbelkada a39a842
merge
younesbelkada 1415cd8
another fix
younesbelkada 243e4d1
fix
younesbelkada cce3549
pre-norm -> norm
younesbelkada 22de62c
fix
younesbelkada 2fe057c
Revert "fix"
ibrahimkhadraoui d22b4ea
Merge branch 'add-fh1-rebased' of https://github.com/tiiuae/llama.cpp…
ibrahimkhadraoui 6c7d9e2
fix
younesbelkada 15138df
small fix ffn_norm
ibrahimkhadraoui a6d0067
Merge branch 'add-fh1-rebased' of https://github.com/tiiuae/llama.cpp…
ibrahimkhadraoui 1fd0574
try
younesbelkada 250b4f1
mix instead of max
younesbelkada 3ee7983
fix vocab size
ibrahimkhadraoui 2aa48dd
Merge branch 'add-fh1-rebased' of https://github.com/tiiuae/llama.cpp…
ibrahimkhadraoui 9760c8b
conflict solve
ibrahimkhadraoui 7a25441
fixed multipliers
ibrahimkhadraoui 280dd2d
falcon-h1 specefic vocab resolved
ibrahimkhadraoui c56ec07
read arch from gguf.MODEL_ARCH
ibrahimkhadraoui c4af0f3
mamba_d_ssm added to d_inner find_hparam
ibrahimkhadraoui 53304c8
remove unused functions from gguf_writer.py
ibrahimkhadraoui 441d8d6
override modify_tensors instead of get_tensors
ibrahimkhadraoui 6c39e77
fix conversion and d_inner
younesbelkada 8c50893
added some cb functions for debugging puposes
ibrahimkhadraoui 49d7420
inp_out_ids moved outside of layers loop
ibrahimkhadraoui 97011d7
mup_vec create as float64
ibrahimkhadraoui 286e1fa
fix rope_theta
ibrahimkhadraoui b3bc1fb
Merge branch 'add-fh1-rebased' of https://github.com/tiiuae/llama.cpp…
ibrahimkhadraoui a9f3a63
injected mup
younesbelkada e96cc73
clean ups
younesbelkada 3afb2a8
Merge pull request #1 from tiiuae/injected-mup
ibrahimkhadraoui 0ad3502
rm extra space
ibrahimkhadraoui 53446f7
rm unused MAMBA_CHUNK_SIZE
ibrahimkhadraoui ae937f4
rm unused key
ibrahimkhadraoui b6df0a4
add bos False
ibrahimkhadraoui 935d46f
changed ROPE_TYPE
ibrahimkhadraoui 624699c
cleaning debugging stuff
ibrahimkhadraoui 042e5ff
cleaning debug quant
ibrahimkhadraoui f74e266
fix comment
younesbelkada 632861e
some cleanups
younesbelkada 084873c
some cleanups
younesbelkada fd20330
Update src/llama-model-loader.cpp
younesbelkada 68cb784
more cleanups
younesbelkada d2f46f1
moe cleanuips
younesbelkada 7d7da0b
d_ssm -> d_inner;
younesbelkada 67b2664
cleaning unused hparams
ibrahimkhadraoui da8a338
Merge branch 'add-fh1-rebased' of https://github.com/tiiuae/llama.cpp…
ibrahimkhadraoui e63ee46
cleanup
ibrahimkhadraoui d473d42
more cleanups
younesbelkada 8555ee8
more cleanups on python conversion;
younesbelkada 7846c67
minor cleanups
ibrahimkhadraoui 2dee7cf
Apply suggestions from code review
younesbelkada a846d02
remove todo
younesbelkada f028a43
Merge branch 'add-fh1-rebased' of https://github.com/tiiuae/llama.cpp…
ibrahimkhadraoui d41f111
Merge branch 'add-fh1-rebased' of https://github.com/tiiuae/llama.cpp…
ibrahimkhadraoui f266d14
added falcon-h1
ibrahimkhadraoui 4bc9e0c
tensor not required
younesbelkada 2834a4a
clean
ibrahimkhadraoui 823696b
remove unneeded attributes
younesbelkada adff470
more cleanups and fixed conversion
younesbelkada 097df0e
remove final_norm
younesbelkada 9a048d8
flake8 fixes
ibrahimkhadraoui 52d1ef3
Merge branch 'add-fh1-rebased' of https://github.com/tiiuae/llama.cpp…
ibrahimkhadraoui 58e3866
Update src/llama-model.cpp
younesbelkada d28c31a
Merge branch 'master' into add-fh1-rebased
younesbelkada 9b92648
flake8 fixes
ibrahimkhadraoui 7fe1794
Update src/llama-hparams.cpp
ibrahimkhadraoui 40058c0
Update src/llama-model.cpp
ibrahimkhadraoui debf4e5
Update src/llama-model.cpp
ibrahimkhadraoui 212edff
Update src/llama-arch.cpp
ibrahimkhadraoui 90ddf24
Update convert_hf_to_gguf.py
ibrahimkhadraoui 7edf380
Merge branch 'add-fh1-rebased' of https://github.com/tiiuae/llama.cpp…
ibrahimkhadraoui c3c5d51
added hashes
ibrahimkhadraoui f8d7c97
Update src/llama-arch.cpp
younesbelkada 4610ee2
Update src/llama-vocab.cpp
younesbelkada 082ab4a
update the update file
younesbelkada c5515e3
Revert "update the update file"
younesbelkada 1ef53b3
fix: address suggestions
younesbelkada d5efbd0
fix: update convert_hf_to_gguf.py
younesbelkada a5afc8b
Update gguf-py/gguf/constants.py
younesbelkada 99f9a3d
Update src/llama-model-loader.cpp
younesbelkada c3c64c3
d_inner fixed
ibrahimkhadraoui 63e3afc
Update src/llama-model.cpp
younesbelkada d758578
reshaping ssm_norm for 34B
ibrahimkhadraoui 8972c15
Merge branch 'add-fh1-rebased' of https://github.com/tiiuae/llama.cpp…
ibrahimkhadraoui 7897c21
removing generate_mup
ibrahimkhadraoui 6403caa
remove duplicates metadata keys
ibrahimkhadraoui 710630a
rm comment
ibrahimkhadraoui 7b9aa7b
Merge branch 'master' into add-fh1-rebased
younesbelkada ecc5253
final comment
younesbelkada bbca33e
fix unused args
younesbelkada 9f514e3
fix constants
younesbelkada 34c5d83
fix bad merge
younesbelkada 521e823
Update src/llama-model.cpp
younesbelkada 6943f4e
falcon-h1: remove unused ssm_in_b and bad merge
younesbelkada 4d2c94b
Update src/llama-model.cpp
younesbelkada b7c9a99
falcon-h1: fix last comment
younesbelkada 9fd308d
Update convert_hf_to_gguf.py
younesbelkada 51f50bf
falcon-h1: revert add_add_bos(False)
younesbelkada 367d8c5
falcon-h1: fix tied weights
younesbelkada 1fa361b
falcon-h1: remove whitespace
younesbelkada 6dde986
falcon-h1: fix wrong size param
younesbelkada 94ab3a8
falcon-h1: fix whitespace issues
younesbelkada
One thing I've noticed while working through merge conflicts with GR4: it looks like the Falcon H1 entries in the various model architecture lists are inconsistent in their order (next to `FALCON` in one place and after `ERNIE_4_5` in two places in `constants.py`, after `MAMBA2` on the c++ side). Do we want to make this consistent everywhere? I suspect you'll hit this with your merge conflict resolution @compilade