fix: Resolve XML validation error in OutputGitRepoXML function (#16)
* fix: Resolve XML validation error in OutputGitRepoXML function
- Fixed XML generation to properly handle special characters and CDATA sections
- Added protection against premature CDATA termination by escaping "]]>" sequences
- Improved XML formatting with consistent indentation and structure
- Simplified token placeholder replacement without breaking formatting
* feat: Add .gptinclude functionality for selective file inclusion
This commit adds support for a .gptinclude file, which allows users to
explicitly specify which files should be included in the repository export.
The feature complements the existing .gptignore functionality:
- When both .gptinclude and .gptignore exist, files are first filtered
by the include patterns, then any matching ignore patterns are excluded
- Added new command-line flag: -I/--include to specify a custom path
to the .gptinclude file
- Default behavior looks for .gptinclude in repository root
- Added comprehensive tests for the new functionality
- Updated README.md with documentation and examples
With this change, users gain more fine-grained control over which parts
of their repositories are processed by git2gpt, making it easier to focus
on specific areas when working with AI language models.
* fix: Properly handle CDATA sections in XML output
This commit fixes an issue where the XML export would fail with
"unexpected EOF in CDATA section" errors when file content contained
the CDATA end marker sequence ']]>'.
The fix implements a proper CDATA handling strategy that:
- Detects all occurrences of ']]>' in file content
- Splits the content around these markers
- Emits consecutive CDATA sections (CDATA cannot be nested) to preserve the original content
- Ensures all XML output is well-formed regardless of source content
This approach maintains the efficiency of CDATA for storing large code
blocks while ensuring compatibility with all possible file content.
Fixes the XML validation error that would occur when processing files
containing CDATA end marker sequences.
README.md (+39 -3)
@@ -24,18 +24,54 @@ To use the git2gpt utility, run the following command:
 git2gpt [flags] /path/to/git/repository
 ```
 
-### Ignoring Files
+### Including and Ignoring Files
 
-By default, your `.git` directory and your `.gitignore` files are ignored. Any files in your `.gitignore` are also skipped. If you want to change this behavior, you should add a `.gptignore` file to your repository. The `.gptignore` file should contain a list of files and directories to ignore, one per line. The `.gptignore` file should be in the same directory as your `.gitignore` file. Please note that this overwrites the default ignore list, so you should include the default ignore list in your `.gptignore` file if you want to keep it.
+By default, your `.git` directory and your `.gitignore` files are ignored. Any files in your `.gitignore` are also skipped. You can customize the files to include or ignore in several ways:
 
-### Flags
+### Including Only Specific Files (.gptinclude)
+
+Add a `.gptinclude` file to your repository to specify which files should be included in the output. Each line in the file should contain a glob pattern of files or directories to include. If a `.gptinclude` file is present, only files that match these patterns will be included.
+
+Example `.gptinclude` file:
+```
+# Include only these file types
+*.go
+*.js
+*.html
+*.css
+
+# Include specific directories
+src/**
+docs/api/**
+```
+
+### Ignoring Specific Files (.gptignore)
+
+Add a `.gptignore` file to your repository to specify which files should be ignored. This works similar to `.gitignore`, but is specific to git2gpt. The `.gptignore` file should contain a list of files and directories to ignore, one per line.
+
+Example `.gptignore` file:
+```
+# Ignore these file types
+*.log
+*.tmp
+*.bak
+
+# Ignore specific directories
+node_modules/**
+build/**
+```
+
+**Note**: When both `.gptinclude` and `.gptignore` files exist, git2gpt will first include files matching the `.gptinclude` patterns, and then exclude any of those files that also match `.gptignore` patterns.
+
+## Command Line Options
 
 * `-p`, `--preamble`: Path to a text file containing a preamble to include at the beginning of the output file.
 * `-o`, `--output`: Path to the output file. If not specified, will print to standard output.
 * `-e`, `--estimate`: Estimate the tokens of the output file. If not specified, does not estimate.
 * `-j`, `--json`: Output to JSON rather than plain text. Use with `-o` to specify the output file.
 * `-x`, `--xml`: Output to XML rather than plain text. Use with `-o` to specify the output file.
 * `-i`, `--ignore`: Path to the `.gptignore` file. If not specified, will look for a `.gptignore` file in the same directory as the `.gitignore` file.
+* `-I`, `--include`: Path to the `.gptinclude` file. If not specified, will look for a `.gptinclude` file in the repository root.
 * `-g`, `--ignore-gitignore`: Ignore the `.gitignore` file.
 * `-s`, `--scrub-comments`: Remove comments from the output file to save tokens.