Grammar-Enhanced T5 Summarizer
This model is a fine-tuned version of T5-base for text summarization with grammar-enhanced inputs. It was trained to summarize historical texts annotated with an explicit analysis of their grammatical structure.
Model Description
- Base Model: T5-base
- Task: Text Summarization
- Training Data: Historical texts with grammar analysis
- Input Format: Structured text with grammar analysis (subjects, verbs, objects, relationships)
- Output Format: Concise summary
Usage
from transformers import T5ForConditionalGeneration, T5Tokenizer
# Load model and tokenizer
model = T5ForConditionalGeneration.from_pretrained("ambrosfitz/summarize-grammar")
tokenizer = T5Tokenizer.from_pretrained("ambrosfitz/summarize-grammar")
# Prepare input
text = "Your text here..."
input_text = f"summarize: {text}"
# Generate summary
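# Inputs longer than 512 tokens are truncated
inputs = tokenizer(input_text, return_tensors="pt", max_length=512, truncation=True)
# Beam search with 4 beams; the summary is capped at 150 tokens
outputs = model.generate(**inputs, max_length=150, num_beams=4, length_penalty=2.0)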
summary = tokenizer.decode(outputs[0], skip_special_tokens=True)
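Because the tokenizer call above truncates at 512 tokens, anything past that point in a long document is dropped before summarization. One possible workaround, sketched below, is to split the text into chunks and summarize each chunk separately. The helper name `split_into_chunks` and the 300-word chunk size are assumptions made for illustration and are not part of this model's documented setup; word count is only a rough proxy for token count.

```python
from typing import List


def split_into_chunks(text: str, max_words: int = 300) -> List[str]:
    """Split text into consecutive chunks of at most max_words words.

    Words are whitespace-separated; this is a hypothetical helper,
    not something shipped with the model.
    """
    if max_words <= 0:
        raise ValueError("max_words must be positive")
    words = text.split()
    return [
        " ".join(words[i:i + max_words])
        for i in range(0, len(words), max_words)
    ]


# Example: a 700-word document becomes three chunks of 300, 300 and 100 words.
chunks = split_into_chunks("word " * 700, max_words=300)
chunk_lengths = [len(chunk.split()) for chunk in chunks]
```

Each chunk can then be prefixed with `summarize: ` and passed through the same tokenize, generate, and decode steps shown above.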
Training Details
The model was fine-tuned on a dataset of historical texts with additional grammar analysis information. Each input includes:
- Main subjects
- Key verbs
- Objects
- Grammatical relationships
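This card does not specify how these fields are serialized into the input string, so the following is only a sketch of one plausible layout. The function name `build_grammar_input`, the field labels, and the ` | ` separator are all hypothetical; the training dataset is the place to check the actual format before relying on this.

```python
from typing import List


def build_grammar_input(
    text: str,
    subjects: List[str],
    verbs: List[str],
    objects: List[str],
    relationships: List[str],
) -> str:
    """Combine raw text and grammar annotations into one prompt string.

    The layout is a hypothetical illustration, not the documented
    training format.
    """
    parts = [
        f"summarize: {text}",
        f"subjects: {', '.join(subjects)}",
        f"verbs: {', '.join(verbs)}",
        f"objects: {', '.join(objects)}",
        f"relationships: {', '.join(relationships)}",
    ]
    return " | ".join(parts)


example = build_grammar_input(
    text="Lincoln signed the proclamation in 1863.",
    subjects=["Lincoln"],
    verbs=["signed"],
    objects=["the proclamation"],
    relationships=["Lincoln -> signed -> the proclamation"],
)
```

The resulting string can be tokenized in place of a plain `summarize: ...` prompt.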
The model achieved a validation loss of 0.8700 during training.
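For intuition, a cross-entropy loss can be converted to perplexity by exponentiation. The sketch below assumes the reported value is a mean per-token cross-entropy in nats, which is a common convention but is not stated in this card.

```python
import math

validation_loss = 0.8700  # value reported in this card

# Assumption: the loss is mean per-token cross-entropy in nats.
perplexity = math.exp(validation_loss)
print(f"perplexity ≈ {perplexity:.2f}")  # prints "perplexity ≈ 2.39"
```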
Limitations
This model works best with:
- Historical texts
- Formal writing
- English language content
- Texts that benefit from structural analysis
Citation
If you use this model, please cite:
@misc{grammar-t5-summarizer,
author = {repo_owner},
title = {Grammar-Enhanced T5 Summarizer},
year = {2024},
publisher = {Hugging Face},
journal = {Hugging Face Model Hub},
howpublished = {https://huggingface.co/ambrosfitz/summarize-grammar}
}
Evaluation results
Self-reported on ambrosfitz/grammar-summary:
- Validation Loss: 0.870
- Model Type: T5-base