A Simple HTML Parser.
go get github.com/HarrisonKawagoe3960X/GABAHTMLParser
Then add "github.com/HarrisonKawagoe3960X/GABAHTMLParser" to your imports, like this:
package main

import (
	"fmt"

	"github.com/HarrisonKawagoe3960X/GABAHTMLParser" // add this
)

func main() {
	htmlobject := GABAHTMLParser.GetHTMLfromURL("someurl", false)
	results := htmlobject.Find("tag = 'a'")
	for _, result := range results {
		fmt.Println(result.InnerHTML)
	}
}
htmlobject := GABAHTMLParser.GetHTMLfromURL("someurl",false)
If the site you are parsing uses Shift-JIS encoding, change false to true:
htmlobject := GABAHTMLParser.GetHTMLfromURL("someurl",true)
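To parse HTML source that you already have instead of fetching it from a URL, pass it to ParseHTML:
htmlobject := GABAHTMLParser.ParseHTML(strarray)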
After parsing the HTML, you can extract the data from the fields of an Element:
InnerHTML: HTML code under the current HTML Element.
Tag: Tag name of the current HTML Element.
Child: Child objects of the current HTML Element.
Parent: Parent object of the current HTML Element.
Attr: Properties of the current HTML Element.
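The fields listed above suggest a tree that can be walked recursively. The following is only a sketch: Element here is a local stand-in type, and its field types (string, map, pointer, slice) as well as the CollectTags helper are assumptions made for illustration, not the library's actual definitions.

```go
package main

import "fmt"

// Element is a local stand-in that mirrors the field names listed above.
// The field types are assumptions; the library's real types may differ.
type Element struct {
	Tag       string
	InnerHTML string
	Attr      map[string]string
	Parent    *Element
	Child     []*Element
}

// CollectTags (hypothetical helper) walks the tree depth-first and returns
// every tag name, each parent before its children.
func CollectTags(e *Element) []string {
	if e == nil {
		return nil
	}
	tags := []string{e.Tag}
	for _, c := range e.Child {
		tags = append(tags, CollectTags(c)...)
	}
	return tags
}

func main() {
	root := &Element{Tag: "div"}
	link := &Element{
		Tag:       "a",
		InnerHTML: "Home",
		Attr:      map[string]string{"href": "/"},
		Parent:    root,
	}
	root.Child = append(root.Child, link)

	fmt.Println(CollectTags(root)) // prints [div a]
}
```

With the real library, the same kind of walk would start from an Element returned by Find, provided Child turns out to be iterable in this way.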
results := htmlobject.Find("tag = 'a'")
You can combine conditions with &&:
results := htmlobject.Find("tag = 'a' && class = 'hanshin'")
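To make the query syntax concrete, here is a rough sketch of how a string such as the one above breaks down into key/value conditions. SplitQuery and Condition are hypothetical names invented for this illustration; this is not how the library parses queries internally, and the sketch does not handle values that themselves contain &&.

```go
package main

import (
	"fmt"
	"strings"
)

// Condition is one `key = 'value'` pair taken from a query string.
type Condition struct {
	Key   string
	Value string
}

// SplitQuery (hypothetical helper) splits a query on "&&" and then splits
// each part on the first "=", trimming spaces and single quotes.
func SplitQuery(query string) ([]Condition, error) {
	var conds []Condition
	for _, part := range strings.Split(query, "&&") {
		key, value, ok := strings.Cut(part, "=")
		if !ok {
			return nil, fmt.Errorf("missing '=' in %q", strings.TrimSpace(part))
		}
		conds = append(conds, Condition{
			Key:   strings.TrimSpace(key),
			Value: strings.Trim(strings.TrimSpace(value), "'"),
		})
	}
	return conds, nil
}

func main() {
	conds, err := SplitQuery("tag = 'a' && class = 'hanshin'")
	if err != nil {
		panic(err)
	}
	for _, c := range conds {
		fmt.Printf("%s -> %s\n", c.Key, c.Value)
	}
	// prints:
	// tag -> a
	// class -> hanshin
}
```

Read this way, the combined query asks for elements whose tag is a and whose class is hanshin.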