A declarative struct-tag-based HTML unmarshaling or scraping package for Go built on top of the htmlquery library

azlotnikov/goxtag

Latest commit

1798777 · Feb 26, 2023

goxtag

This package is an XPath counterpart to github.com/andrewstuart/goq: it fills struct fields from HTML using XPath expressions in struct tags, where goq uses CSS selectors.

Install

go get -u github.com/azlotnikov/goxtag

Example

package main

import (
    "log"
    "net/http"

    "github.com/azlotnikov/goxtag"
)

// example is a structured representation of the GitHub file-name table.
type example struct {
    Title string `xpath:"//h1"`
    Files []string `xpath:".//table[contains(concat(' ',normalize-space(@class),' '),' files ')]//tbody//tr[contains(concat(' ',normalize-space(@class),' '),' js-navigation-item ')]//td[contains(concat(' ',normalize-space(@class),' '),' content ')]"`
}

func main() {
    res, err := http.Get("https://github.com/azlotnikov/goxtag")
    if err != nil {
        log.Fatal(err)
    }
    defer res.Body.Close()

    var ex example

    // Decode the response body into ex according to its xpath tags.
    if err := goxtag.NewDecoder(res.Body).Decode(&ex); err != nil {
        log.Fatal(err)
    }

    log.Println(ex.Title, ex.Files)
}
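
The Files expression in the example repeats one XPath idiom three times: contains(concat(' ',normalize-space(@class),' '),' NAME ') matches an element whose class attribute contains NAME as a whole token, roughly what the CSS selector .NAME does. The sketch below rebuilds the same expression from a helper to make that structure visible. The name hasClass is hypothetical and is not part of goxtag.

```go
package main

import "fmt"

// hasClass is a hypothetical helper (not part of goxtag). It returns an
// XPath predicate body that matches elements whose class attribute
// contains the given class as a whole, space-separated token.
func hasClass(class string) string {
	return fmt.Sprintf("contains(concat(' ',normalize-space(@class),' '),' %s ')", class)
}

func main() {
	// Rebuild the Files expression used in the example struct.
	files := ".//table[" + hasClass("files") + "]" +
		"//tbody//tr[" + hasClass("js-navigation-item") + "]" +
		"//td[" + hasClass("content") + "]"
	fmt.Println(files)
}
```

Go struct tags must be string literals, so a helper like this cannot be called inside a tag. It is only useful for generating or checking the expression that is then pasted into the xpath tag.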

Details

  • See unmarshal-error.go for details on CannotUnmarshalError
  • Use xpath_required:"false" to suppress the "node not found in document" error when a field's XPath matches no nodes
  • Use xpath:"-" to ignore a field
  • Use Unmarshal(b []byte, v interface{}) error for custom unmarshaling
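
To show how the tags above fit together on one struct, here is a minimal sketch that reads them with the standard reflect package. It is not goxtag's implementation. The page struct, the fieldRule type, and the rules function are hypothetical, and two behaviors are assumptions: that a field is required unless tagged xpath_required:"false", and that untagged fields are skipped.

```go
package main

import (
	"fmt"
	"reflect"
)

// page is a hypothetical struct using the tags described above.
type page struct {
	Title    string `xpath:"//h1"`                          // required by default (assumption)
	Subtitle string `xpath:"//h2" xpath_required:"false"` // a missing node is not an error
	Internal string `xpath:"-"`                             // ignored
}

// fieldRule describes how a tag-driven decoder might treat one field.
type fieldRule struct {
	Name     string
	XPath    string
	Required bool
}

// rules inspects the struct tags of v (a struct value) and returns one
// rule per field that carries an xpath tag other than "-".
func rules(v interface{}) []fieldRule {
	t := reflect.TypeOf(v)
	var out []fieldRule
	for i := 0; i < t.NumField(); i++ {
		f := t.Field(i)
		expr, ok := f.Tag.Lookup("xpath")
		if !ok || expr == "-" {
			continue // untagged (assumed skipped) or explicitly ignored
		}
		// Assumption: required unless the tag is exactly "false".
		required := f.Tag.Get("xpath_required") != "false"
		out = append(out, fieldRule{Name: f.Name, XPath: expr, Required: required})
	}
	return out
}

func main() {
	for _, r := range rules(page{}) {
		fmt.Printf("%s: xpath=%s required=%t\n", r.Name, r.XPath, r.Required)
	}
}
```

In real use the same struct would be passed to the package's decoder rather than to a helper like rules.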
