issues Search Results · repo:GeneZC/MiniMA language:Python
6 results
granite-7b-base is a reproduction of llama-2-7b, but with a more permissive license.
#7 · linux-leo · opened on Jun 27, 2024
@GeneZC Why is the inference latency of the MiniMA-3B model longer than that of the Llama-7B model? [screenshots attached]
#6 · qxpBlog · 23 comments · opened on Apr 15, 2024
Hi, happy new year!! Good work, first of all!! I am trying to use MiniChat-3B as an interactive chatbot in my application. However, the response from the model either returns 1. ? or nothing 2. I am ...
#5 · rsong0606 · 2 comments · opened on Jan 6, 2024
Can you please release code for upcycling LLMs to make MoEs? I have a use case for multilingual LLMs where this would be incredibly helpful!
Labels: enhancement
#4 · ojus1 · 1 comment · opened on Dec 31, 2023
Mistral-7b is a much better model (and perhaps a better teacher) than Llama-2-7b. Would you kindly release checkpoints for a distilled Mistral? Would greatly appreciate it!
Labels: enhancement, wontfix
#3 · ojus1 · 1 comment · opened on Dec 30, 2023
Trying with the llama2 base weights, I get the following error:
File /root/MiniMA/minima/modules/flash_attn_monkey_patch_sparsellama.py, line 47, in forward
assert not use_cache, use_cache is ...
Labels: good first issue
#2 · l3utterfly · 27 comments · opened on Dec 2, 2023
