code issue #30

Closed
xiaohei417 opened this issue May 9, 2024 · 18 comments
Labels
bug (Something isn't working), help wanted (Extra attention is needed)

Comments

@xiaohei417

I want to add a watermark, or open a template file and write it out as a new file.

@xiaohei417
Author

xiaohei417 commented May 9, 2024

I want to read xxx.docx and write some text into it, but there is no change when I reopen the file.
Please help check my code, thanks.

package main

import (
	"fmt"
	"os"

	"github.com/fumiama/go-docx"
)

func main() {
	readFile, err := os.Open("xxx.docx")
	if err != nil {
		panic(err)
	}
	fileinfo, err := readFile.Stat()
	if err != nil {
		panic(err)
	}
	size := fileinfo.Size()
	doc, err := docx.Parse(readFile, size)
	if err != nil {
		panic(err)
	}
	
	d := docx.LoadBodyItems(doc.Document.Body.Items, nil)
	d.AddParagraph().AddText("ashfjklhsjakdhfkhas").Size("20")
	d.AddParagraph().AddText("ashfjklhsjakdhfkhas").Size("20")
	d.AddParagraph().AddText("ashfjklhsjakdhfkhas").Size("20")
	d.AddParagraph().AddText("ashfjklhsjakdhfkhas").Size("20")
	d.AddParagraph().AddText("ashfjklhsjakdhfkhas").Size("20")
	d.AddParagraph().AddText("ashfjklhsjakdhfkhas").Size("20")
	d.AddParagraph().AddText("ashfjklhsjakdhfkhas").Size("20")
	d.AddParagraph().AddText("ashfjklhsjakdhfkhas").Size("20")
	d.AddParagraph().AddText("ashfjklhsjakdhfkhas").Size("20")
	d.AddParagraph().AddText("ashfjklhsjakdhfkhas").Size("20")
	d.AddParagraph().AddText("ashfjklhsjakdhfkhas").Size("20")
	_, err = d.WriteTo(readFile)
	if err != nil {
		panic(err)
	}
	
	fmt.Println("Plain text:")
	for _, it := range d.Document.Body.Items {
		switch it.(type) {
		case *docx.Paragraph, *docx.Table: // printable
			fmt.Println(it)
		}
	}
}

@xiaohei417 xiaohei417 changed the title from "I want to add watermark" to "add watermark" May 9, 2024
@xiaohei417
Author

xiaohei417 commented May 9, 2024

I adjusted the code slightly, and now a new error appears:

package main

import (
	"fmt"
	"os"

	"github.com/fumiama/go-docx"
)

func main() {
	readFile, err := os.Open("xxx.docx")
	if err != nil {
		panic(err)
	}
	fileinfo, err := readFile.Stat()
	if err != nil {
		panic(err)
	}
	size := fileinfo.Size()
	doc, err := docx.Parse(readFile, size)
	if err != nil {
		panic(err)
	}

	doc.AddParagraph().AddText("ashfjklhsjakdhfkhas").Size("20")
	doc.AddParagraph().AddText("ashfjklhsjakdhfkhas").Size("20")
	doc.AddParagraph().AddText("ashfjklhsjakdhfkhas").Size("20")
	doc.AddParagraph().AddText("ashfjklhsjakdhfkhas").Size("20")
	doc.AddParagraph().AddText("ashfjklhsjakdhfkhas").Size("20")
	doc.AddParagraph().AddText("ashfjklhsjakdhfkhas").Size("20")
	doc.AddParagraph().AddText("ashfjklhsjakdhfkhas").Size("20")
	doc.AddParagraph().AddText("ashfjklhsjakdhfkhas").Size("20")
	doc.AddParagraph().AddText("ashfjklhsjakdhfkhas").Size("20")
	doc.AddParagraph().AddText("ashfjklhsjakdhfkhas").Size("20")
	doc.AddParagraph().AddText("ashfjklhsjakdhfkhas").Size("20")

	fmt.Println("Plain text:")
	for _, it := range doc.Document.Body.Items {
		switch it.(type) {
		case *docx.Paragraph, *docx.Table: // printable
			fmt.Println(it)
		}
	}

	f, err := os.Create("xxx2.docx")
	if err != nil {
		panic(err)
	}

	_, err = doc.WriteTo(f)
	if err != nil {
		panic(err)
	}
	f.Close()
}

image

@xiaohei417 xiaohei417 changed the title from "add watermark" to "code issue" May 9, 2024
@fumiama
Owner

fumiama commented May 9, 2024

This means your docx contains some content that this library does not fully support yet. See #26 for details.

@fumiama fumiama added the bug and help wanted labels May 9, 2024
@xiaohei417
Author

xiaohei417 commented May 10, 2024

> This means your docx contains some content that this library does not fully support yet. See #26 for details.

But my docx file contains only a few lines of text. My computer is an Intel Mac.

@fumiama
Owner

fumiama commented May 10, 2024

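> > This means your docx contains some content that this library does not fully support yet. See #26 for details.
>
> But my docx file contains only a few lines of text. My computer is an Intel Mac.

Please post a link to your file, and I will try it when I have time.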

@xiaohei417
Author

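> > > This means your docx contains some content that this library does not fully support yet. See #26 for details.
> >
> > But my docx file contains only a few lines of text. My computer is an Intel Mac.
>
> Please post a link to your file, and I will try it when I have time.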

xxx.docx

@xiaohei417
Author

@fumiama Were you able to reproduce the problem on your side?

@willthrom

willthrom commented May 29, 2024

Hi @fumiama
I think I have the same problem.
I even went further: I went to https://www.microsoft365.com/?auth=1 and created a blank Word docx document.
I downloaded it and ran a simple test that parses it and writes it back out. The new file does not open in Microsoft Word or Google Docs.
I can open it with WordPad, though.

Here is the code:

func main() {
	readFile, err := os.Open("blank.docx")
	if err != nil {
		panic(err)
	}
	fileinfo, err := readFile.Stat()
	if err != nil {
		panic(err)
	}
	size := fileinfo.Size()
	doc, err := docx.Parse(readFile, size)
	if err != nil {
		panic(err)
	}

	f, err := os.Create("xxx3.docx")
	if err != nil {
		panic(err)
	}

	_, err = doc.WriteTo(f)
	if err != nil {
		panic(err)
	}
	f.Close()
}

Source file:
blank.docx
Generated file:
xxx3.docx

@willthrom

willthrom commented May 29, 2024

I took a look at the parsed document and its items, after testing with unidoc to see what happens when it reads and writes the file. Unidoc created lots of sections.

If I check the last item in my document and remove it when it is a section, the generated Word file is valid and readable by Microsoft Office.

	// Sections seem to have a bug, so we need to remove the last item if it is a section
	// interface {}(*github.com/fumiama/go-docx.SectPr) *{XMLName: encoding/xml.Name {Space: "", Local: ""}, PgSz: *github.com/fumiama/go-docx.PgSz {W: (*"encoding/xml.Attr")(0xc000234780), H: (*"encoding/xml.Attr")(0xc0002347b0)}}
	if len(doc.Document.Body.Items) > 0 {
		if _, ok := doc.Document.Body.Items[len(doc.Document.Body.Items)-1].(*docx.SectPr); ok {
			doc.Document.Body.Items = doc.Document.Body.Items[:len(doc.Document.Body.Items)-1]
		}
	}

@xiaohei417 xiaohei417 closed this as not planned Sep 11, 2024
@willthrom

Closed as not planned?
As mentioned, this library in its current state does not generate documents that the latest version of Microsoft Office can read.

When the library parses a document that contains a "section" component (which is always the case for Microsoft Office 365 documents), it adds an extra item with malformed XML attributes. PageW

That is why the generated Word file does not open, and why it works in Microsoft Office 365 once I remove the last item in the body.

We had to add this to our code after parsing the doc.

image
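
The workaround described in this thread (dropping the last body item when it is a section) could be wrapped in a small helper. The sketch below is an assumption-laden illustration: sectPr and paragraph are local stand-ins for the library's *docx.SectPr and *docx.Paragraph so that the code runs on its own, and dropTrailingSection is a hypothetical name.

```go
package main

import "fmt"

// sectPr and paragraph are local stand-ins for the go-docx types,
// used here only so the sketch runs without the library.
type sectPr struct{}
type paragraph struct{ text string }

// dropTrailingSection returns items without its last element when that
// element is a *sectPr; otherwise it returns items unchanged.
func dropTrailingSection(items []interface{}) []interface{} {
	n := len(items)
	if n == 0 {
		return items
	}
	if _, ok := items[n-1].(*sectPr); ok {
		return items[:n-1]
	}
	return items
}

func main() {
	items := []interface{}{&paragraph{text: "hello"}, &sectPr{}}
	items = dropTrailingSection(items)
	fmt.Println("items left:", len(items))
}
```

With the real library, the type assertion would target *docx.SectPr and the result would be assigned back to doc.Document.Body.Items. Dropping the section also drops its page size, so this is a stopgap rather than a fix.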

@fumiama
Owner

fumiama commented Sep 11, 2024

I would like to solve it, but recently I do not have time to debug. This issue is closed by its author, but I will still fix this problem when I am free. Since you have pointed out the real problem, I will try to fix that.

@willthrom

Thanks @fumiama, I will try to help when I have some free time by providing the exact header of the failing section.

@fumiama
Owner

fumiama commented Sep 17, 2024

> Hi @fumiama I think I have the same problem. I even went further: I went to https://www.microsoft365.com/?auth=1 and created a blank Word docx document. I downloaded it and ran a simple test that parses it and writes it back out. The new file does not open in Microsoft Word or Google Docs. I can open it with WordPad, though.

This blank test seems OK.

@fumiama
Owner

fumiama commented Sep 17, 2024

> I adjusted the code slightly, and now a new error appears:

I tested with the latest code and your xxx.docx, and everything works.

@willthrom

willthrom commented Sep 28, 2024

Microsoft Office Online 365
image
Download file
TestDocx 1.docx
Manipulating file open and save:

	// wordBytes holds the raw bytes of the downloaded docx
	readFile := bytes.NewReader(wordBytes)
	size := int64(len(wordBytes))
	doc, err := docx.Parse(readFile, size)
	if err != nil {
		panic(err)
	}

	name := "godup_" + "test" + ".docx"
	f, err := os.Create(name)
	if err != nil {
		panic(err)
	}

	// Save the parsed document to a new file
	_, err = doc.WriteTo(f)
	if err != nil {
		panic(err)
	}

	// Close the file
	f.Close()

End result file:
godup_test.docx

Trying to open the file in Microsoft Office

image

But if before saving the file I remove the last child:

	// Sections seems to have a bug, so we need to remove the last 1 item if it is a section
	// interface {}(*github.com/fumiama/go-docx.SectPr) *{XMLName: encoding/xml.Name {Space: "", Local: ""}, PgSz: *github.com/fumiama/go-docx.PgSz {W: (*"encoding/xml.Attr")(0xc000234780), H: (*"encoding/xml.Attr")(0xc0002347b0)}}
	if len(doc.Document.Body.Items) > 0 {
		if _, ok := doc.Document.Body.Items[len(doc.Document.Body.Items)-1].(*docx.SectPr); ok {
			doc.Document.Body.Items = doc.Document.Body.Items[:len(doc.Document.Body.Items)-1]
		}
	}

The written file works flawlessly.

@willthrom

willthrom commented Sep 29, 2024

I can see this section in the final file:

		<w:sectPr>
			<w:pgSz w:w="11906" w:w="16838"/>
		</w:sectPr>

This is the item I removed after parsing. With it removed, the written file opens successfully; if I leave it in, the file fails to open with the error shown in my previous message.

	// Sections seems to have a bug, so we need to remove the last 1 item if it is a section
	// interface {}(*github.com/fumiama/go-docx.SectPr) *{XMLName: encoding/xml.Name {Space: "", Local: ""}, PgSz: *github.com/fumiama/go-docx.PgSz {W: (*"encoding/xml.Attr")(0xc000234780), H: (*"encoding/xml.Attr")(0xc0002347b0)}}
	if len(doc.Document.Body.Items) > 0 {
		if _, ok := doc.Document.Body.Items[len(doc.Document.Body.Items)-1].(*docx.SectPr); ok {
			doc.Document.Body.Items = doc.Document.Body.Items[:len(doc.Document.Body.Items)-1]
		}
	}

The original section in my source file is:

		<w:sectPr>
			<w:headerReference r:id="rId7" w:type="default"/>
			<w:headerReference r:id="rId8" w:type="first"/>
			<w:headerReference r:id="rId9" w:type="even"/>
			<w:footerReference r:id="rId10" w:type="default"/>
			<w:footerReference r:id="rId11" w:type="first"/>
			<w:footerReference r:id="rId12" w:type="even"/>
			<w:pgSz w:h="16838" w:w="11906" w:orient="portrait"/>
			<w:pgMar w:bottom="1134" w:top="1417" w:left="1701" w:right="991" w:header="426" w:footer="272"/>
			<w:pgNumType w:start="1"/>
		</w:sectPr>

But focus on the w:pgSz tag:

the source has w:h, whereas the output has a second w:w in its place.

w:pgSz w:h="16838" w:w="11906"
w:pgSz w:w="11906" w:w="16838"

@willthrom

I just took a look at the latest version of your code, which is not what I was running, and you fixed this a month ago...

image

@fumiama
Owner

fumiama commented Oct 1, 2024

> I just took a look at the latest version of your code, which is not what I was running, and you fixed this a month ago...

image

I haven't introduced version tags because this lib is not stable enough yet. But I am happy to see that your problem is solved.
