I am trying to replace a link with a string built from its href, using goquery's function ReplaceWithHtml(). The href contains query parameters, one of which is &region=. Somehow, goquery interprets the leading &reg as the HTML entity for the registered trademark sign ®, although the terminating semicolon (;) is missing:
got : <html><head></head><body>\href{http://www.nytimes.com/action=keypress®ion=FixedLeft}{Text}</body></html>
                                                                        ---^---
want: <html><head></head><body>\href{http://www.nytimes.com/action=keypress&region=FixedLeft}{Text}</body></html>
Minimal working example:
package main

import (
	"fmt"
	"strings"

	"github.com/PuerkitoBio/goquery"
	"golang.org/x/net/html"
)

func main() {
	test := `<a href="http://www.nytimes.com/action=keypress&region=FixedLeft">Text</a>`
	node, _ := html.Parse(strings.NewReader(test))
	doc := goquery.NewDocumentFromNode(node)

	convertLink(doc)

	got, _ := doc.Html()
	want := `<html><head></head><body>\href{http://www.nytimes.com/action=keypress&region=FixedLeft}{Text}</body></html>`
	fmt.Println("got : " + got)
	fmt.Println("want: " + want)
}

// convertLink replaces every <a> element with a LaTeX-style \href{url}{text} string.
func convertLink(doc *goquery.Document) {
	//html, _ := doc.Html()
	//fmt.Println("Before : " + html)

	doc.Find("a").Each(func(_ int, s *goquery.Selection) {
		href, _ := s.Attr("href")
		text := s.Text()
		// The backslash has to be escaped inside an interpreted string literal.
		replace := "\\href{" + href + "}{" + text + "}"
		//fmt.Println("Replace: " + replace)
		s.ReplaceWithHtml(replace)
	})

	//html, _ = doc.Html()
	//fmt.Println("After : " + html)
	//fmt.Println("")
}
If you uncomment the debug print statements, you can see that the string replace is still correct. Only after the call to ReplaceWithHtml does the document doc contain the trademark sign instead of &reg.
What am I doing wrong?
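
For what it's worth, the same substitution can be reproduced without goquery. The sketch below assumes that ReplaceWithHtml runs its argument through an HTML parser, and that this parser treats character references in text the same way html.UnescapeString from Go's standard library does, where a legacy name such as &reg is decoded even without a trailing semicolon. The helper names decodeAsText and escapeThenDecode are made up for this illustration and are not part of goquery.

```go
package main

import (
	"fmt"
	"html"
)

// decodeAsText mimics what an HTML parser does with character references
// in text content (hypothetical helper, not part of goquery).
func decodeAsText(s string) string {
	return html.UnescapeString(s)
}

// escapeThenDecode escapes s first, so that "&" survives the decoding step
// as a literal ampersand (hypothetical helper, not part of goquery).
func escapeThenDecode(s string) string {
	return decodeAsText(html.EscapeString(s))
}

func main() {
	replace := `\href{http://www.nytimes.com/action=keypress&region=FixedLeft}{Text}`

	// Unescaped: "&reg" is treated as a legacy character reference.
	fmt.Println(decodeAsText(replace))

	// Escaped first: "&" becomes "&amp;" and decodes back to a plain "&".
	fmt.Println(escapeThenDecode(replace))
}
```

If that assumption holds, passing an escaped string to ReplaceWithHtml should keep &region intact as text; golang.org/x/net/html, which the example already imports, also exports EscapeString. The serialized output of doc.Html() would then show &amp;region, which is how a literal ampersand is written in HTML text. The href attribute itself is presumably unaffected because the HTML parsing rules leave an unterminated reference alone inside an attribute value when it is followed by = or an alphanumeric character.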