Make JSON schema available for verification under https:// URIs
After updating gojsonschema to include
xeipuuv/gojsonschema#171 , tests fail with
> unable to validate: Could not read schema from HTTP, response status is 404 Not Found

Before that gojsonschema change, "$ref" links were interpreted by taking
the current schema source file's URI as a base, and treating "$ref"
as relative to this.

For example, starting with the [file://]/image-manifest-schema.json
URI, as used by Validator.Validate (based on the "specs" map), the
>  "$ref": "content-descriptor.json"
reference used to evaluate to file:///content-descriptor.json.
gojsonschema.jsonReferenceLoader would then load these file:///*.json
URIs via _escFS.

After the gojsonschema change, "$ref" links are evaluated relative to
a URI base specified by the "id" attribute inside the schema source,
regardless of the "external" URI passed to the gojsonschema.JSONLoader.

This is consistent with
http://json-schema.org/latest/json-schema-core.html#rfc.section.8 and
http://json-schema.org/latest/json-schema-core.html#rfc.section.9.2
(apart from the "id" vs. "$id" attribute name).

In the same example, the schema at the [file://]/image-manifest-schema.json URI contains
>  "id": "https://opencontainers.org/schema/image/manifest",
so the same
>  "$ref": "content-descriptor.json"
now evaluates to
"https://opencontainers.org/schema/image/content-descriptor.json",
which is not found by gojsonschema.jsonReferenceLoader (it uses
_escFS only for file:/// URIs), resulting in the 404 quoted above.

This is a minimal fix, making the schema files available to
gojsonschema at the https:// URIs, while continuing to read them from
_escFS.

Because gojsonschema.jsonReferenceLoader can only use the provided fs
for file:/// URIs, we are forced to implement our own
gojsonschema.JSONLoaderFactory and gojsonschema.JSONLoader; something
like this might be more generally useful and should therefore instead
be provided by the gojsonschema library.

This particular JSONLoader{Factory,} implementation, though, is
image-spec specific because it locally works around various
inconsistencies in the image-spec JSON schemas, and thus is not suitable
for gojsonschema as is.

Namely, the specs/*.json schema files use URIs with two URI path prefixes,
https://opencontainers.org/schema/{,image/}
in the top-level "id" attributes, and the nested "id" attributes along
with "$ref" references use _several more_ URI path prefixes, e.g.
>       "id": "https://opencontainers.org/schema/image/manifest/annotations",
>      "$ref": "defs-descriptor.json#/definitions/annotations"
in image-manifest-schema.json specifies the
https://opencontainers.org/schema/image/manifest/defs-descriptor.json
URI.

In fact, defs-descriptor.json references use all of the following URIs:
> https://opencontainers.org/schema/defs-descriptor.json
> https://opencontainers.org/schema/image/defs-descriptor.json
> https://opencontainers.org/schema/image/descriptor/defs-descriptor.json
> https://opencontainers.org/schema/image/index/defs-descriptor.json
> https://opencontainers.org/schema/image/manifest/defs-descriptor.json

So, this commit introduces a loader which preserves the original _escFS
layout by recognizing and stripping all of these prefixes, and using
the same /*.json paths for _escFS lookups as before; this is clearly
unsuitable for gojsonschema inclusion.

Finally, the reason this commit uses such a fairly hacky loader is that merely
changing the _escFS structure would still not be sufficient to get a consistent
schema: the schema/*.json paths in this repository, and the "$ref" values,
do not match the "id" values inside the schemas at all.  E.g.
image-manifest-schema.json refers to
https://opencontainers.org/schema/image/manifest/content-descriptor.json ,
while content-descriptor.json identifies itself as
https://opencontainers.org/schema/descriptor , matching neither the path prefix
nor the file name.

Overall, it is completely unclear to me which of the URIs is the canonical URI
of the "content descriptor" schema, and the owner of the URI namespace
needs to decide on the canonical schema URIs.  Only afterwards can the
code be cleanly modified to match the specification; until then, this
commit at least keeps the tests passing, and the validator usable
by external callers who want to use the public
image-spec/schema.ValidateMediaType*.Validate() API.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
mtrmac committed Feb 6, 2018
1 parent f2b7079 commit 1ed8b65
Showing 3 changed files with 157 additions and 7 deletions.
126 changes: 126 additions & 0 deletions schema/loader.go
@@ -0,0 +1,126 @@
// Copyright 2018 The Linux Foundation
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

package schema

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"io/ioutil"
	"net/http"
	"strings"

	"github.com/xeipuuv/gojsonreference"
	"github.com/xeipuuv/gojsonschema"
)

// fsLoaderFactory implements gojsonschema.JSONLoaderFactory by reading files under the specified namespaces from the root of fs.
type fsLoaderFactory struct {
	namespaces []string
	fs         http.FileSystem
}

// newFSLoaderFactory returns a fsLoaderFactory reading files under the specified namespaces from the root of fs.
func newFSLoaderFactory(namespaces []string, fs http.FileSystem) *fsLoaderFactory {
	return &fsLoaderFactory{
		namespaces: namespaces,
		fs:         fs,
	}
}

func (factory *fsLoaderFactory) New(source string) gojsonschema.JSONLoader {
	return &fsLoader{
		factory: factory,
		source:  source,
	}
}

// refContents returns the contents of ref, if available in fsLoaderFactory.
func (factory *fsLoaderFactory) refContents(ref gojsonreference.JsonReference) ([]byte, error) {
	refStr := ref.String()
	path := ""
	for _, ns := range factory.namespaces {
		if strings.HasPrefix(refStr, ns) {
			path = "/" + strings.TrimPrefix(refStr, ns)
			break
		}
	}
	if path == "" {
		// Report the unmatched reference; path is always empty on this branch.
		return nil, fmt.Errorf("Schema reference %#v unexpectedly not available in fsLoaderFactory with namespaces %#v", refStr, factory.namespaces)
	}

	f, err := factory.fs.Open(path)
	if err != nil {
		return nil, err
	}
	defer f.Close()

	return ioutil.ReadAll(f)
}

// fsLoader implements gojsonschema.JSONLoader by reading the document named by source from a fsLoaderFactory.
type fsLoader struct {
	factory *fsLoaderFactory
	source  string
}

// JsonSource implements gojsonschema.JSONLoader.JsonSource. The "Json" capitalization needs to be maintained to conform to the interface.
func (l *fsLoader) JsonSource() interface{} { // nolint: golint
	return l.source
}

func (l *fsLoader) LoadJSON() (interface{}, error) {
	// Based on gojsonschema.jsonReferenceLoader.LoadJSON.
	reference, err := gojsonreference.NewJsonReference(l.source)
	if err != nil {
		return nil, err
	}

	refToURL := reference
	refToURL.GetUrl().Fragment = ""

	body, err := l.factory.refContents(refToURL)
	if err != nil {
		return nil, err
	}

	return decodeJSONUsingNumber(bytes.NewReader(body))
}

// decodeJSONUsingNumber returns JSON parsed from an io.Reader
func decodeJSONUsingNumber(r io.Reader) (interface{}, error) {
	// Copied from gojsonschema.
	var document interface{}

	decoder := json.NewDecoder(r)
	decoder.UseNumber()

	err := decoder.Decode(&document)
	if err != nil {
		return nil, err
	}

	return document, nil
}

// JsonReference implements gojsonschema.JSONLoader.JsonReference. The "Json" capitalization needs to be maintained to conform to the interface.
func (l *fsLoader) JsonReference() (gojsonreference.JsonReference, error) { // nolint: golint
	return gojsonreference.NewJsonReference(l.JsonSource().(string))
}

func (l *fsLoader) LoaderFactory() gojsonschema.JSONLoaderFactory {
	return l.factory
}
36 changes: 30 additions & 6 deletions schema/schema.go
@@ -35,13 +35,37 @@ var (
 	// having the OCI JSON schema files in root "/".
 	fs = _escFS(false)

-	// specs maps OCI schema media types to schema files.
+	// schemaNamespaces is a set of URI prefixes which are treated as containing the schema files of fs.
+	// This is necessary because *.json schema files in this directory use "id" and "$ref" attributes which evaluate to such URIs, e.g.
+	// ./image-manifest-schema.json URI contains
+	//   "id": "https://opencontainers.org/schema/image/manifest",
+	// and
+	//   "$ref": "content-descriptor.json"
+	// which evaluates as a link to
+	//   "https://opencontainers.org/schema/image/content-descriptor.json",
+	//
+	// To support such links without accessing the network (and trying to load content which is not hosted at these URIs),
+	// fsLoaderFactory accepts any URI starting with one of the schemaNamespaces below,
+	// and uses _escFS to load them from the root of its in-memory filesystem tree.
+	//
+	// (Note that this must contain subdirectories before its parent directories for fsLoaderFactory.refContents to work.)
+	schemaNamespaces = []string{
+		"https://opencontainers.org/schema/image/descriptor/",
+		"https://opencontainers.org/schema/image/index/",
+		"https://opencontainers.org/schema/image/manifest/",
+		"https://opencontainers.org/schema/image/",
+		"https://opencontainers.org/schema/",
+	}
+
+	// specs maps OCI schema media types to schema URIs.
+	// These URIs are expected to be used only by fsLoaderFactory (which trims schemaNamespaces defined above)
+	// and should never cause a network access.
 	specs = map[Validator]string{
-		ValidatorMediaTypeDescriptor:   "content-descriptor.json",
-		ValidatorMediaTypeLayoutHeader: "image-layout-schema.json",
-		ValidatorMediaTypeManifest:     "image-manifest-schema.json",
-		ValidatorMediaTypeImageIndex:   "image-index-schema.json",
-		ValidatorMediaTypeImageConfig:  "config-schema.json",
+		ValidatorMediaTypeDescriptor:   "https://opencontainers.org/schema/content-descriptor.json",
+		ValidatorMediaTypeLayoutHeader: "https://opencontainers.org/schema/image/image-layout-schema.json",
+		ValidatorMediaTypeManifest:     "https://opencontainers.org/schema/image/image-manifest-schema.json",
+		ValidatorMediaTypeImageIndex:   "https://opencontainers.org/schema/image/image-index-schema.json",
+		ValidatorMediaTypeImageConfig:  "https://opencontainers.org/schema/image/config-schema.json",
 	}
 )

2 changes: 1 addition & 1 deletion schema/validator.go
@@ -67,7 +67,7 @@ func (v Validator) Validate(src io.Reader) error {
 		}
 	}

-	sl := gojsonschema.NewReferenceLoaderFileSystem("file:///"+specs[v], fs)
+	sl := newFSLoaderFactory(schemaNamespaces, fs).New(specs[v])
 	ml := gojsonschema.NewStringLoader(string(buf))

 	result, err := gojsonschema.Validate(sl, ml)
