Instruction token masking [WIP] #1
Closed
+19 −0
Conversation
Fix minor issues
Disable row-parallelism for now
…927)
* [bug-fix] enable finetuning option (set optimizer params correctly)
* change load_checkpoint
---------
Co-authored-by: logan.eo <[email protected]>
[Bug] Make Configs Consistent
* fix list[tensor] typing in both scripts
* Update NeoXArgs docs automatically
* add bf16 saving to conversion scripts
* make precision check more complex for v1.0
* Update NeoXArgs docs automatically
* Update NeoXArgs docs automatically
---------
Co-authored-by: haileyschoelkopf <[email protected]>
Co-authored-by: github-actions <[email protected]>
Co-authored-by: Hailey Schoelkopf <[email protected]>
remove password-based login for root
* add bf16 configuration Signed-off-by: Dashiell Stander <[email protected]>
* Update NeoXArgs docs automatically
* pre commit Signed-off-by: Dashiell Stander <[email protected]>
* Update NeoXArgs docs automatically
* Rework deriving precision Signed-off-by: Dashiell Stander <[email protected]>
* Update NeoXArgs docs automatically
* Belt and suspenders Signed-off-by: Dashiell Stander <[email protected]>
* Update NeoXArgs docs automatically
* Make the default setup (of only using fp16 dict) work Signed-off-by: Dashiell Stander <[email protected]>
* Update NeoXArgs docs automatically
* Got rid of bf16 argument Signed-off-by: Dashiell Stander <[email protected]>
* Update NeoXArgs docs automatically
* Update NeoXArgs docs automatically
* Update NeoXArgs docs automatically
* Re-add detailed bf16 message
* Update NeoXArgs docs automatically
* Remove unused import
* Update NeoXArgs docs automatically
* remove useless newline
* Update NeoXArgs docs automatically
* re-add detailed bf16 message to deepspeed_args
* Update NeoXArgs docs automatically
---------
Signed-off-by: Dashiell Stander <[email protected]>
Co-authored-by: github-actions <[email protected]>
Co-authored-by: Quentin Anthony <[email protected]>
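The precision rework above boils down to deriving a single precision value from the fp16/bf16 sections of the DeepSpeed config instead of carrying a separate bf16 flag. A minimal sketch of that idea (hypothetical helper, not the actual NeoX code):

```python
# Hedged sketch: derive one "precision" value from a DeepSpeed-style config
# that may contain fp16 and/or bf16 sections, as the commits above describe.
def derive_precision(ds_config: dict) -> str:
    """Return 'bfloat16', 'fp16', or 'fp32' based on which dict is enabled."""
    fp16_enabled = ds_config.get("fp16", {}).get("enabled", False)
    bf16_enabled = ds_config.get("bf16", {}).get("enabled", False)
    if fp16_enabled and bf16_enabled:
        raise ValueError("fp16 and bf16 cannot both be enabled")
    if bf16_enabled:
        return "bfloat16"
    if fp16_enabled:
        return "fp16"
    return "fp32"

# Example: only an fp16 dict is provided (the "default setup" case).
print(derive_precision({"fp16": {"enabled": True}}))  # -> "fp16"
```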
* update torch and cuda
* Update NeoXArgs docs automatically
* Update NeoXArgs docs automatically
---------
Co-authored-by: github-actions <[email protected]>
Co-authored-by: Quentin Anthony <[email protected]>
Remove duplicate deepspeed config and allow forced multinode
* Pre-commit Signed-off-by: Dashiell Stander <[email protected]>
* Update NeoXArgs docs automatically
* Do not check for overflow if not using fp16 Signed-off-by: Dashiell Stander <[email protected]>
* Update NeoXArgs docs automatically
---------
Signed-off-by: Dashiell Stander <[email protected]>
Co-authored-by: github-actions <[email protected]>
…arding (#907)
* added HF to NeoX 2.0 conversion script with mp and pp sharding
* (1) added missing curly brace to pythia/1-4B config; (2) fixed a bug related to a hardcoded value within the conversion script; (3) fixed possible bugs in the conversion script wrt the mp sharding convention
---------
Co-authored-by: Quentin Anthony <[email protected]>
* remove row parallelism
* Update NeoXArgs docs automatically
---------
Co-authored-by: Quentin-Anthony <[email protected]>
Co-authored-by: github-actions <[email protected]>
…arguments (#948)
* base64 encode the megatron config as well Signed-off-by: Dashiell Stander <[email protected]>
* base64 encode the megatron config as well Signed-off-by: Dashiell Stander <[email protected]>
* Update NeoXArgs docs automatically
* Update NeoXArgs docs automatically
---------
Signed-off-by: Dashiell Stander <[email protected]>
Co-authored-by: github-actions <[email protected]>
Co-authored-by: Quentin Anthony <[email protected]>
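The "base64 encode the megatron config" commits address passing a nested config through the launcher command line as a single shell-safe argument. A rough sketch of the pattern, with illustrative helper names (not the actual NeoX functions):

```python
# Hedged sketch: serialize the config to JSON and base64-encode it so it can
# travel as one launcher argument, then decode it on the worker side.
import base64
import json

def encode_config(config: dict) -> str:
    return base64.urlsafe_b64encode(json.dumps(config).encode()).decode()

def decode_config(encoded: str) -> dict:
    return json.loads(base64.urlsafe_b64decode(encoded.encode()))

cfg = {"train_batch_size": 32, "precision": "bfloat16"}
arg = encode_config(cfg)              # e.g. passed as --megatron_config <arg> (illustrative flag)
assert decode_config(arg) == cfg
```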
Fix yml error
* added a simple script for multi-node data preparation.
* added a simple script for multi-node data preparation.
* fixed minor bugs regarding prefixing of the .bin and .idx files
* fixed minor bugs regarding prefixing of the .bin and .idx files
* fixed minor bugs regarding prefixing of the .bin and .idx files
…heck (#959)
* update conversion script instructions in readme
* rename v1.0 script (now default for 2.0) to module_to_hf
* Update NeoXArgs docs automatically
---------
Co-authored-by: github-actions <[email protected]>
* added HF to NeoX 2.0 conversion script with mp and pp sharding
* (1) added missing curly brace to pythia/1-4B config; (2) fixed a bug related to a hardcoded value within the conversion script; (3) fixed possible bugs in the conversion script wrt the mp sharding convention
* fill in minimal possible mask values
* initialize tensor on the target device
---------
Co-authored-by: Quentin Anthony <[email protected]>
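"fill in minimal possible mask values" and "initialize tensor on the target device" suggest two small correctness and efficiency fixes; a hedged sketch of what they look like in isolation (illustrative code, not the conversion script itself):

```python
# Hedged sketch: use the most negative representable value of the working
# dtype as the attention-mask fill (rather than a hard-coded constant), and
# build the tensor directly on the target device instead of creating it on
# CPU and copying it over.
import torch

def causal_attention_bias(seq_len: int, dtype: torch.dtype, device: torch.device) -> torch.Tensor:
    min_value = torch.finfo(dtype).min                 # "minimal possible mask value"
    bias = torch.full((seq_len, seq_len), min_value, dtype=dtype, device=device)
    return torch.triu(bias, diagonal=1)                # zero on/below the diagonal, large-negative above

mask = causal_attention_bias(4, torch.bfloat16, torch.device("cpu"))
```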
* added HF to NeoX 2.0 conversion script with mp and pp sharding
* (1) added missing curly brace to pythia/1-4B config; (2) fixed a bug related to a hardcoded value within the conversion script; (3) fixed possible bugs in the conversion script wrt the mp sharding convention
* added GeLU fast for HF model, added barriers to enable conversion across multiple nodes, removed partially hardcoded pythia model name
* commented unnecessary logging and timers
---------
Co-authored-by: Quentin Anthony <[email protected]>
* add an optional `label` field passed in parallel with training data.
* minor fix; Add doc
* fix
* fix data can be None
* prevent loading optimizer
* add script
* Remove some print() stmts, make mask documentation clearer
* Add documentation for preprocess_data_with_mask.py
---------
Co-authored-by: Hailey Schoelkopf <[email protected]>
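This is the change closest to the PR's topic: an optional `label` field lets preprocessing mark which tokens should contribute to the loss, so instruction/prompt tokens can be excluded. A minimal sketch of loss masking under that assumption (function and field names are illustrative, not the NeoX implementation):

```python
# Hedged sketch: tokens belonging to the instruction get a zero loss weight,
# so only response tokens contribute to the language-modeling loss.
import torch
import torch.nn.functional as F

def masked_lm_loss(logits: torch.Tensor, labels: torch.Tensor, loss_mask: torch.Tensor) -> torch.Tensor:
    # logits: [batch, seq, vocab]; labels, loss_mask: [batch, seq]
    per_token = F.cross_entropy(logits.transpose(1, 2), labels, reduction="none")
    return (per_token * loss_mask).sum() / loss_mask.sum().clamp(min=1)

# Example: a 6-token sequence where the first 3 tokens are the instruction.
logits = torch.randn(1, 6, 10)
labels = torch.randint(0, 10, (1, 6))
loss_mask = torch.tensor([[0.0, 0.0, 0.0, 1.0, 1.0, 1.0]])
loss = masked_lm_loss(logits, labels, loss_mask)
```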
* Add KTO Post-training example
* fix reward not finalizing
* re-added RM training removed during merge conflict in KTO
* parallel output updated
…283)
* preliminary epoch setting
* first working iteration
* train_epochs_special_case
* handle flags
* fix bugs
* working single path case
* working multi-path without eval
* remove unused files
* additional checks
* remove print statement
* apply precommit
* add lr_decay_fraction
* spelling
---------
Co-authored-by: Quentin Anthony <[email protected]>
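The epoch-setting work above implies translating a requested number of epochs into an iteration count. A hedged sketch of that conversion, with hypothetical names (the real logic lives in the NeoX argument and data handling code):

```python
# Hedged sketch: when the user specifies epochs instead of iterations,
# derive the iteration count from dataset size and global batch size.
import math

def iters_from_epochs(train_epochs: int, num_samples: int, global_batch_size: int) -> int:
    iters_per_epoch = math.ceil(num_samples / global_batch_size)
    return train_epochs * iters_per_epoch

# Example: 100k samples, global batch 256, 3 epochs -> 1173 iterations.
print(iters_from_epochs(3, 100_000, 256))
```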
* hotfix
* precommit
---------
Co-authored-by: Quentin Anthony <[email protected]>
* add asserts and fix post training readme
* precommit
---------
Co-authored-by: Quentin Anthony <[email protected]>
* fix typo
* fix neoxargs usage test
* skip conversion test due to multiprocessing issue
* precommit
---------
Co-authored-by: Quentin Anthony <[email protected]>
* Add ERROR logging prefix and sort alphabetically
* fix comment
- do not create a fake head dim; split the 'mixed_x_layer' into QKV layers directly.
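A hedged sketch of the QKV change described above: chunk the fused projection output along the hidden dimension directly, rather than reshaping it with an artificial head dimension first (illustrative shapes, not the actual NeoX code):

```python
# Hedged sketch: split the fused QKV projection output ("mixed_x_layer")
# directly along the hidden dimension, with no fake head dim in between.
import torch

hidden, heads = 512, 8
mixed_x_layer = torch.randn(2, 16, 3 * hidden)     # [batch, seq, 3*hidden]

# Direct split: three [batch, seq, hidden] tensors.
q, k, v = torch.chunk(mixed_x_layer, 3, dim=-1)

# Per-head views can then be formed on each tensor separately.
q = q.view(2, 16, heads, hidden // heads)
```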
…ype' option was removed (#1309)
* fix 'intermediate_size' in Llama configuration files after the 'mlp_type' option was removed
* config adjustments for llama and gated activations
* pre-commit
---------
Co-authored-by: jahatef <[email protected]>
Co-authored-by: Quentin Anthony <[email protected]>
* Python 3.10 support: Python 3.10 support was added in EleutherAI/gpt-neox#1122
* update wording on torch and python
---------
Co-authored-by: Quentin Anthony <[email protected]>
* adds pyproject files and tests
* formatting and add dev packages to dev req files
* improve req testing
---------
Co-authored-by: Quentin Anthony <[email protected]>
DO NOT MERGE