This project provides tools to process and summarize large git diff changelog files by interacting with the Anthropic API. The repository contains two Python scripts:
process_input.py: This script processes a large git diff changelog file, splits it into smaller chunks, and sends these chunks to the Anthropic API to generate concise summaries or commit messages.process_json_text.py: After the summaries are generated, this script extracts the relevant text from the API responses and compiles them into a single output file.
- Chunk Processing: Handles large git diff changelog files by splitting them into manageable chunks.
- API Integration: Interacts with the Anthropic API to generate summaries of each chunk.
- Error Handling: Includes robust error handling to manage API failures.
- JSON Parsing: Extracts and compiles text data from JSON responses.
- Python 3.x
- Required Python packages:
requestsosjson
You can install the necessary packages using:
pip install requests-
Clone the repository:
git clone https://github.com/vic-cieslak/claude-large-git-diff-changelog-processor.git cd claude-large-git-diff-changelog-processor -
Set your Anthropic API key in the
process_input.pyfile:API_KEY = "your_api_key_here"
-
Place your
changelog.txtfile containing the git diff content in the project directory.
Run the process_input.py script to process the changelog.txt file and generate summarized chunks:
python process_input.pyThis will create an output_dir folder where the summarized chunks are saved as JSON files.
After processing, run the process_json_text.py script to extract the summarized text from the JSON files:
python process_json_text.pyThe extracted text will be saved in extracted_texts.txt.
Here's an example of how the output files are organized:
- Input: A large
changelog.txtfile containing git diff content. - Output: Multiple JSON files in the
changelog_chunk_processeddirectory, each containing a summary of a chunk of the original git diff content. - Final Output: An
extracted_texts.txtfile containing all the summarized texts compiled together.
- API Errors: If you encounter HTTP errors during API requests, the script will print the error details for debugging.
- JSON Parsing Errors: If the script fails to decode a JSON file, it will skip the file and notify you.
Contributions are welcome! Please submit a pull request or open an issue for any improvements or bug fixes.
Script was created ad hoc, prompt for generating git commit message could likely be improved. Chunking logic could be improved to not split in certain places of changelog.
This project is licensed under the MIT License. See the LICENSE file for more details.
For any questions or support, please open an issue on the GitHub repository.

