-
Notifications
You must be signed in to change notification settings - Fork 87
Tweaks to improve cleanliness of code; removes warnings that appear in Eclipse #11
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Changes from 1 commit
c142ec7
7d0a5a5
ea956f6
fdd1a59
6385ded
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,10 @@ | ||
<?xml version="1.0" encoding="UTF-8"?> | ||
<classpath> | ||
<classpathentry kind="src" path="src"/> | ||
<classpathentry exported="true" kind="con" path="org.eclipse.jdt.launching.JRE_CONTAINER"> | ||
<attributes> | ||
<attribute name="module" value="true"/> | ||
</attributes> | ||
</classpathentry> | ||
<classpathentry kind="output" path="bin"/> | ||
</classpath> |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,17 @@ | ||
<?xml version="1.0" encoding="UTF-8"?> | ||
<projectDescription> | ||
<name>llama3.java</name> | ||
<comment></comment> | ||
<projects> | ||
</projects> | ||
<buildSpec> | ||
<buildCommand> | ||
<name>org.eclipse.jdt.core.javabuilder</name> | ||
<arguments> | ||
</arguments> | ||
</buildCommand> | ||
</buildSpec> | ||
<natures> | ||
<nature>org.eclipse.jdt.core.javanature</nature> | ||
</natures> | ||
</projectDescription> |
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Please remove, there's already a .gitignore at the root |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
/com/ |
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Do not move the Llama3.java file outside the root. I understand that this is the way IDEs work, but the project is meant to be cloned and executed with the minimum friction possible. |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -85,44 +85,45 @@ static void runInteractive(Llama model, Sampler sampler, Options options) { | |
conversationTokens.addAll(chatFormat.encodeMessage(new ChatFormat.Message(ChatFormat.Role.SYSTEM, options.systemPrompt()))); | ||
} | ||
int startPosition = 0; | ||
Scanner in = new Scanner(System.in); | ||
while (true) { | ||
System.out.print("> "); | ||
System.out.flush(); | ||
String userText = in.nextLine(); | ||
if (List.of("quit", "exit").contains(userText)) { | ||
break; | ||
} | ||
if (state == null) { | ||
state = model.createNewState(); | ||
} | ||
conversationTokens.addAll(chatFormat.encodeMessage(new ChatFormat.Message(ChatFormat.Role.USER, userText))); | ||
conversationTokens.addAll(chatFormat.encodeHeader(new ChatFormat.Message(ChatFormat.Role.ASSISTANT, ""))); | ||
Set<Integer> stopTokens = chatFormat.getStopTokens(); | ||
List<Integer> responseTokens = Llama.generateTokens(model, state, startPosition, conversationTokens.subList(startPosition, conversationTokens.size()), stopTokens, options.maxTokens(), sampler, options.echo(), token -> { | ||
if (options.stream()) { | ||
if (!model.tokenizer().isSpecialToken(token)) { | ||
System.out.print(model.tokenizer().decode(List.of(token))); | ||
} | ||
} | ||
}); | ||
// Include stop token in the prompt history, but not in the response displayed to the user. | ||
conversationTokens.addAll(responseTokens); | ||
startPosition = conversationTokens.size(); | ||
Integer stopToken = null; | ||
if (!responseTokens.isEmpty() && stopTokens.contains(responseTokens.getLast())) { | ||
stopToken = responseTokens.getLast(); | ||
responseTokens.removeLast(); | ||
} | ||
if (!options.stream()) { | ||
String responseText = model.tokenizer().decode(responseTokens); | ||
System.out.println(responseText); | ||
} | ||
if (stopToken == null) { | ||
System.err.println("Ran out of context length..."); | ||
break; | ||
} | ||
} | ||
try (Scanner in = new Scanner(System.in)) { | ||
while (true) { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The indentation here is too high, please follow the same as the rest of the file (4 spaces). There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It looks a bit long in this snippet, but I'm looking at the code right now and the indentation seems to be fine. It was formatted this way by my IDE automatically. You may need to tweak this on your end to get whatever specific formatting you'd prefer, since I tried the two spaces thing and it made the code look very strange IMO. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. You are using tabs instead of spaces, GitHub render tabs as 8 spaces. Your IDE should have a setting to set it. Please keep 4 spaces as the rest of the file. |
||
System.out.print("> "); | ||
System.out.flush(); | ||
String userText = in.nextLine(); | ||
if (List.of("quit", "exit").contains(userText)) { | ||
break; | ||
} | ||
if (state == null) { | ||
state = model.createNewState(); | ||
} | ||
conversationTokens.addAll(chatFormat.encodeMessage(new ChatFormat.Message(ChatFormat.Role.USER, userText))); | ||
conversationTokens.addAll(chatFormat.encodeHeader(new ChatFormat.Message(ChatFormat.Role.ASSISTANT, ""))); | ||
Set<Integer> stopTokens = chatFormat.getStopTokens(); | ||
List<Integer> responseTokens = Llama.generateTokens(model, state, startPosition, conversationTokens.subList(startPosition, conversationTokens.size()), stopTokens, options.maxTokens(), sampler, options.echo(), token -> { | ||
if (options.stream()) { | ||
if (!model.tokenizer().isSpecialToken(token)) { | ||
System.out.print(model.tokenizer().decode(List.of(token))); | ||
} | ||
} | ||
}); | ||
// Include stop token in the prompt history, but not in the response displayed to the user. | ||
conversationTokens.addAll(responseTokens); | ||
startPosition = conversationTokens.size(); | ||
Integer stopToken = null; | ||
if (!responseTokens.isEmpty() && stopTokens.contains(responseTokens.getLast())) { | ||
stopToken = responseTokens.getLast(); | ||
responseTokens.removeLast(); | ||
} | ||
if (!options.stream()) { | ||
String responseText = model.tokenizer().decode(responseTokens); | ||
System.out.println(responseText); | ||
} | ||
if (stopToken == null) { | ||
System.err.println("Ran out of context length..."); | ||
break; | ||
} | ||
} | ||
} | ||
} | ||
|
||
static void runInstructOnce(Llama model, Sampler sampler, Options options) { | ||
|
@@ -346,7 +347,8 @@ public int byteSize() { | |
} | ||
} | ||
|
||
private void loadModelImpl(FileChannel fileChannel) throws IOException { | ||
@SuppressWarnings("preview") | ||
private void loadModelImpl(FileChannel fileChannel) throws IOException { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Too much indentation. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. |
||
// The header of the file. | ||
readHeader(fileChannel); // gguf_header_t header; | ||
// Tensor infos, which can be used to locate the tensor data. | ||
|
@@ -726,7 +728,7 @@ private static Tokenizer createTokenizer(Map<String, Object> metadata, Vocabular | |
|
||
int allTokens = vocabulary.size(); | ||
int baseTokens = 128000; // assume all tokens after the base ones are special. | ||
int reservedSpecialTokens = allTokens - baseTokens; | ||
//int reservedSpecialTokens = allTokens - baseTokens; | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is indeed unused, do not comment it out, remove instead. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Gotcha. |
||
List<String> specialTokensList = Arrays.stream(vocabulary.tokens(), baseTokens, allTokens).toList(); | ||
|
||
assert specialTokensList.stream().allMatch(token -> vocabulary.getIndex(token).isPresent()); | ||
|
@@ -1866,11 +1868,15 @@ public static Pair<float[], float[]> precomputeFreqsCis(int contextLength, int h | |
if (ropeScaling) { | ||
// Llama 3.1 scaling | ||
float loFreqWavelen = oldContextLength / loFreqFactor; | ||
float hiFreqWavelen = oldContextLength / hiFreqFactor; | ||
//float hiFreqWavelen = oldContextLength / hiFreqFactor; | ||
float wavelen = (float) (2.0 * Math.PI / freq); | ||
if (wavelen < hiFreqWavelen) { | ||
|
||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'm not arguing that, it's just that you're doing this after that check:
This doesn't do anything, technically. You basically comment it out with no change to functionality. That said, I agree that having it there (even if commented out) is crucial to understanding the code. I'll reset the code to how you had it though. I'm curious- what program are you writing this in? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I ported it from the Python snippet, you can refactor it into something equivalent e.g. substitute the freq = freq; by a comment stating that no modification is done in that case, although you may get a warning about an empty code block. It must remain functionally equivalent to the original (this is a hard constraint); and ideally close to the original as well, so it can be easily understood and compared by others.
I use IntelliJ and casually (neo)vim and run it from the command-line via |
||
//This doesn't do anything, so it triggers a warning. | ||
/*if (wavelen < hiFreqWavelen) { | ||
freq = freq; | ||
} else if (wavelen > loFreqWavelen) { | ||
} else */ | ||
|
||
if (wavelen > loFreqWavelen) { | ||
freq = freq / scaleFactor; | ||
} else { | ||
float smooth = (oldContextLength / wavelen - loFreqFactor) / (hiFreqFactor - loFreqFactor); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please do not include any project files, having the whole project as a single file at the root is deliberate.