Enhancing Human-Like Responses in Large Language Models
| 🤗 Models | 📊 Dataset | 📄Paper |
🚀 Human-Like-Llama3-8B-Instruct
This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct, specifically optimized to generate more human-like and conversational responses.
The fine-tuning process employed both Low-Rank Adaptation (LoRA) and Direct Preference Optimization (DPO) to enhance natural language understanding, conversational coherence, and emotional intelligence in interactions.
The process of creating this model is detailed in the research paper “Enhancing Human-Like Responses in Large Language Models”.
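As a rough illustration of the DPO objective mentioned above, the sketch below computes the standard per-pair DPO loss from summed log-probabilities of a preferred and a dispreferred response. It is a minimal sketch, not the training code used for this model: the function name `dpo_loss` is hypothetical, and `beta=0.1` is an assumed placeholder because the configuration on this page does not state a beta value.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Per-pair DPO loss: -log(sigmoid(beta * (chosen log-ratio - rejected log-ratio))).

    beta=0.1 is an assumed placeholder, not a value taken from this model card.
    """
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_logratio - rejected_logratio)
    # -log(sigmoid(x)) equals log(1 + exp(-x)); branch to avoid overflow.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# Policy identical to the reference: the loss is log(2).
neutral = dpo_loss(-10.0, -12.0, -10.0, -12.0)
# Policy that raises the chosen response relative to the reference: lower loss.
improved = dpo_loss(-8.0, -14.0, -10.0, -12.0)
print(neutral, improved)
```

The loss only decreases when the policy widens the gap between chosen and rejected responses relative to the frozen reference model, which is why DPO needs no separate reward model.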
🛠️ Training Configuration
- Base Model: Llama3-8B-Instruct
- Framework: Axolotl v0.4.1
- Hardware: 2x NVIDIA A100 (80 GB) GPUs
- Training Time: ~2 hours 20 minutes
- Dataset: Synthetic dataset with ≈11,000 samples across 256 diverse topics
See axolotl config
axolotl version: 0.4.1
base_model: meta-llama/Meta-Llama-3-8B-Instruct
model_type: LlamaForCausalLM
tokenizer_type: AutoTokenizer
load_in_8bit: true
load_in_4bit: false
strict: false
chat_template: llama3
rl: dpo
datasets:
  - path: HumanLLMs/humanish-dpo-project
    type: llama3.prompt_pairs
    chat_template: llama3
dataset_prepared_path:
val_set_size: 0.05
output_dir: ./humanish-llama3-8b-instruct
sequence_len: 8192
sample_packing: false
pad_to_sequence_len: true
adapter: lora
lora_model_dir:
lora_r: 8
lora_alpha: 4
lora_dropout: 0.05
lora_target_linear: true
lora_fan_in_fan_out:
wandb_project: Humanish-DPO
wandb_entity:
wandb_watch:
wandb_name:
wandb_log_model:
hub_model_id: HumanLLMs/Humanish-LLama3.1-8B-Instruct
gradient_accumulation_steps: 8
micro_batch_size: 2
num_epochs: 1
optimizer: adamw_bnb_8bit
lr_scheduler: cosine
learning_rate: 0.0002
train_on_inputs: false
group_by_length: false
bf16: auto
fp16:
tf32: false
gradient_checkpointing: true
early_stopping_patience:
resume_from_checkpoint:
local_rank:
logging_steps: 1
xformers_attention:
flash_attention: true
s2_attention:
warmup_steps: 10
evals_per_epoch: 2
eval_table_size:
eval_max_new_tokens: 128
saves_per_epoch: 1
debug:
deepspeed:
weight_decay: 0.0
fsdp:
fsdp_config:
save_safetensors: true
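A few quantities implied by this configuration can be worked out directly. The sketch below is a back-of-the-envelope calculation, assuming plain data parallelism across the two GPUs listed under Training Configuration and using the 10,884-sample dataset size given in the Dataset section; how the validation split is rounded is also an assumption, so the step count is approximate.

```python
import math

# Values copied from the config above.
micro_batch_size = 2
gradient_accumulation_steps = 8
lora_r, lora_alpha = 8, 4
val_set_size = 0.05

# Assumptions: 2 GPUs running data parallel; dataset size from the Dataset section.
num_gpus = 2
total_samples = 10_884

# Preference pairs consumed per optimizer step across all GPUs.
effective_batch_size = micro_batch_size * gradient_accumulation_steps * num_gpus

# LoRA scales the low-rank update by alpha / r.
lora_scaling = lora_alpha / lora_r

# Approximate train/eval split and optimizer steps for the single epoch.
eval_samples = int(total_samples * val_set_size)
train_samples = total_samples - eval_samples
steps_per_epoch = math.ceil(train_samples / effective_batch_size)

print(f"effective batch size: {effective_batch_size}")
print(f"LoRA scaling (alpha / r): {lora_scaling}")
print(f"train / eval samples: {train_samples} / {eval_samples}")
print(f"approx. optimizer steps: {steps_per_epoch}")
```

With `lora_alpha` smaller than `lora_r`, the adapter update is scaled down by half, which pairs a fairly conservative adapter with the relatively high learning rate of 0.0002.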
💬 Prompt Template
You can use the Llama 3 prompt template with this model:
<|start_header_id|>system<|end_header_id|>
{system}<|eot_id|>
<|start_header_id|>user<|end_header_id|>
{user}<|eot_id|>
<|start_header_id|>assistant<|end_header_id|>
{assistant}<|eot_id|>
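To make the template concrete, the following sketch fills it in by hand for a single system and user turn and leaves the assistant header open so the model can continue from there. The helper name `format_llama3_prompt` is hypothetical, and the layout simply mirrors the template as printed above; the tokenizer's built-in chat template remains the authoritative version and may differ in whitespace or special tokens.

```python
def format_llama3_prompt(system: str, user: str) -> str:
    """Render one system + user turn following the template shown above.

    The string ends with an open assistant header so generation starts
    at the assistant's reply.
    """
    return (
        "<|start_header_id|>system<|end_header_id|>\n"
        f"{system}<|eot_id|>\n"
        "<|start_header_id|>user<|end_header_id|>\n"
        f"{user}<|eot_id|>\n"
        "<|start_header_id|>assistant<|end_header_id|>\n"
    )


prompt = format_llama3_prompt("You are a helpful AI assistant.", "Hello!")
print(prompt)
```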
This prompt template is available as a chat template, which means you can format messages using the tokenizer.apply_chat_template() method:
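from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "HumanLLMs/Human-Like-LLama3-8B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
messages = [
    {"role": "system", "content": "You are a helpful AI assistant."},
    {"role": "user", "content": "Hello!"},
]
# return_dict=True yields input_ids and attention_mask, so ** unpacking works
gen_input = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_dict=True, return_tensors="pt")
output = model.generate(**gen_input)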
🤖 Models
Model | Download |
---|---|
Human-Like-Llama-3-8B-Instruct | 🤗 HuggingFace |
Human-Like-Qwen-2.5-7B-Instruct | 🤗 HuggingFace |
Human-Like-Mistral-Nemo-Instruct | 🤗 HuggingFace |
🎯 Benchmark Results
Group | Model | Average | IFEval | BBH | MATH Lvl 5 | GPQA | MuSR | MMLU-PRO |
---|---|---|---|---|---|---|---|---|
Llama Models | Human-Like-Llama-3-8B-Instruct | 22.37 | 64.97 | 28.01 | 8.45 | 0.78 | 2.00 | 30.01 |
Llama Models | Llama-3-8B-Instruct | 23.57 | 74.08 | 28.24 | 8.68 | 1.23 | 1.60 | 29.60 |
Llama Models | Difference (Human-Like) | -1.20 | -9.11 | -0.23 | -0.23 | -0.45 | +0.40 | +0.41 |
Qwen Models | Human-Like-Qwen-2.5-7B-Instruct | 26.66 | 72.84 | 34.48 | 0.00 | 6.49 | 8.42 | 37.76 |
Qwen Models | Qwen-2.5-7B-Instruct | 26.86 | 75.85 | 34.89 | 0.00 | 5.48 | 8.45 | 36.52 |
Qwen Models | Difference (Human-Like) | -0.20 | -3.01 | -0.41 | 0.00 | +1.01 | -0.03 | +1.24 |
Mistral Models | Human-Like-Mistral-Nemo-Instruct | 22.88 | 54.51 | 32.70 | 7.62 | 5.03 | 9.39 | 28.00 |
Mistral Models | Mistral-Nemo-Instruct | 23.53 | 63.80 | 29.68 | 5.89 | 5.37 | 8.48 | 27.97 |
Mistral Models | Difference (Human-Like) | -0.65 | -9.29 | +3.02 | +1.73 | -0.34 | +0.91 | +0.03 |
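The "Difference (Human-Like)" rows are the fine-tuned score minus the base-model score for each benchmark. As a quick check, the sketch below recomputes the Llama row from the per-benchmark values in the table; the variable and function names are illustrative only.

```python
BENCHMARKS = ["IFEval", "BBH", "MATH Lvl 5", "GPQA", "MuSR", "MMLU-PRO"]

# Scores copied from the Llama rows of the table above.
human_like = dict(zip(BENCHMARKS, [64.97, 28.01, 8.45, 0.78, 2.00, 30.01]))
baseline = dict(zip(BENCHMARKS, [74.08, 28.24, 8.68, 1.23, 1.60, 29.60]))


def score_differences(tuned, base):
    """Return tuned minus base per benchmark, rounded to two decimals."""
    return {name: round(tuned[name] - base[name], 2) for name in tuned}


diffs = score_differences(human_like, baseline)
for name, delta in diffs.items():
    print(f"{name}: {delta:+.2f}")
```

For all three model families the largest change is the drop on IFEval, while the remaining benchmarks move by a few points at most in either direction.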
📊 Dataset
The dataset used for fine-tuning was generated using LLaMA 3 models. The dataset includes 10,884 samples across 256 distinct topics such as technology, daily life, science, history, and arts. Each sample consists of:
- Human-like responses: Natural, conversational answers mimicking human dialogue.
- Formal responses: Structured and precise answers with a more formal tone.
The dataset has been open-sourced; see the 📊 Dataset link at the top of this page.
More details on the dataset creation process can be found in the accompanying research paper.
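For DPO training, each sample has to be presented as a preference pair. The sketch below shows one plausible mapping, assuming the human-like response is the preferred ("chosen") answer and the formal response is the dispreferred ("rejected") one, and assuming the field names `prompt`, `chosen`, and `rejected`. Neither the field names nor the helper `to_preference_pair` is confirmed by this page, and the example texts are invented for illustration rather than taken from the dataset.

```python
def to_preference_pair(question: str, human_like: str, formal: str) -> dict:
    """Build one DPO record (field names are assumed, not confirmed).

    The human-like answer is treated as preferred, the formal answer as
    dispreferred, matching the goal of more conversational output.
    """
    return {"prompt": question, "chosen": human_like, "rejected": formal}


# Invented example texts, not drawn from the actual dataset.
pair = to_preference_pair(
    question="What do you usually do on a rainy day?",
    human_like="Honestly? Tea, a blanket, and whatever book I've been putting off.",
    formal="On rainy days, common indoor activities include reading and watching films.",
)
print(pair["chosen"])
```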
📝 Citation
@misc{çalık2025enhancinghumanlikeresponseslarge,
title={Enhancing Human-Like Responses in Large Language Models},
author={Ethem Yağız Çalık and Talha Rüzgar Akkuş},
year={2025},
eprint={2501.05032},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2501.05032},
}
Evaluation results
- IFEval (0-shot), strict accuracy: 64.98 (Open LLM Leaderboard)
- BBH (3-shot), normalized accuracy: 28.01 (Open LLM Leaderboard)
- MATH Lvl 5 (4-shot), exact match: 8.46 (Open LLM Leaderboard)
- GPQA (0-shot), acc_norm: 0.78 (Open LLM Leaderboard)
- MuSR (0-shot), acc_norm: 2.00 (Open LLM Leaderboard)
- MMLU-PRO (5-shot), accuracy on the test set: 30.02 (Open LLM Leaderboard)