No Facts Found In Loaded Document #56

Closed
JeffFerguson opened this issue Dec 10, 2024 · 1 comment

@JeffFerguson
Owner

The document at https://www.sec.gov/Archives/edgar/data/1688568/000168856818000036/csc-20170331.xml loads into Gepsio properly, but no facts are available once the document is loaded.

JeffFerguson added the "bug report" label Dec 10, 2024
JeffFerguson self-assigned this Dec 10, 2024
@JeffFerguson
Owner Author

The root cause of this issue is that the document is loaded via a stream (because it is loaded from the sec.gov domain) and the document's schema reference uses a relative path. That combination is not supported by the 2.1.0.18 version of Gepsio (the version being used to debug the issue), as noted in the documentation for XbrlDocument.Load(Stream):

Gepsio supports streams using the .NET Stream base class, which means that any type of stream supported by .NET and having the Stream class as a base class will be supported by Gepsio. This means that code like this will work:

// Requires: using System.IO; using System.Net;
var webClient = new WebClient();
// Download the raw bytes so that non-ASCII characters in the instance are preserved.
byte[] byteArray = webClient.DownloadData("http://www.xbrl.org/taxonomy/int/fr/ias/ci/pfs/2002-11-15/SampleCompany-2002-11-15.xml");
// Wrap the bytes in a stream; any Stream-derived type can be passed to Load().
MemoryStream memStream = new MemoryStream(byteArray);
var newDoc = new XbrlDocument();
newDoc.Load(memStream);

Schema references found in streamed XBRL instances must specify an absolute location, and not a relative location. For example, this schema reference is fine:

xsi:schemaLocation=http://www.xbrlsolutions.com/taxonomies/iso4217/2002-06-30/iso4217.xsd

However, this one is not:

<xbrll:schemaRef xlink:href="msft-20141231.xsd" ... '>

The reason behind this restriction is that Gepsio must load schema references using an absolute location, and uses the location of the XBRL document instance as the reference path when resolving relative schema paths to an absolute location. A schema reference without a path, for example, says "find this schema in the same location as the XBRL document instance referencing the schema". When the XBRL document instance is located through a file path or URL, the location is known, and the schema reference can be found. When the XBRL document instance is passed in as a stream, however, the instance has no location, per se. Since it has no location, there is no "location starting point" for resolving schema locations using relative paths.
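
As a rough illustration of the resolution step described above, the following sketch uses the .NET System.Uri class to combine a known instance location with a schema reference. The SchemaPathSketch class and its ResolveSchema method are hypothetical names used only for this example; they are not part of Gepsio and do not claim to mirror its internals.

```csharp
using System;

public static class SchemaPathSketch
{
    // Hypothetical helper (not Gepsio code): resolves a schema reference
    // against the location of the instance document that contains it.
    // Returns null when the reference is relative and no location is known.
    public static Uri ResolveSchema(string instanceLocation, string schemaHref)
    {
        var href = new Uri(schemaHref, UriKind.RelativeOrAbsolute);
        if (href.IsAbsoluteUri)
        {
            // Absolute references need no base location.
            return href;
        }
        if (string.IsNullOrEmpty(instanceLocation))
        {
            // A stream has no location, so there is nothing to resolve against.
            return null;
        }
        // Combine the instance location with the relative reference.
        return new Uri(new Uri(instanceLocation), href);
    }
}
```

Calling ResolveSchema with the msft-20141231.xml URL used in the example below and the relative reference "msft-20141231.xsd" would yield a URI in the same directory as the instance, while passing a null location for the same reference would yield null. The second case is the situation a stream-loaded instance is in.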

If you try to load an XBRL document instance through a stream, and that stream references a schema through a relative path, then the document will be marked as invalid when the Load() method returns. This code, for example, will load an invalid document instance, since the XBRL document instance references a schema through a relative path:

// Requires: using System.IO; using System.Net;
var webClient = new WebClient();
// Download the raw bytes so that non-ASCII characters in the instance are preserved.
byte[] byteArray = webClient.DownloadData("http://www.sec.gov/Archives/edgar/data/789019/000119312515020351/msft-20141231.xml");
// The stream carries the document's contents but not its original location.
MemoryStream memStream = new MemoryStream(byteArray);
var newDoc = new XbrlDocument();
newDoc.Load(memStream);
// newDoc.IsValid property will be FALSE here. Sad face.

The document's ValidationErrors collection will contain a SchemaValidationError object, which will contain a message similar to the following:

"The XBRL schema at msft-20141231.xsd could not be read because the file could not be found. Because the schema cannot be loaded, some validations will not be able to be performed. Other validation errors reported against this instance may stem from the fact that the schema cannot be loaded. More information on the "file not found" condition is available from the validation error object's inner exception property."

XBRL document instances loaded through a stream which use absolute paths for schema references will be valid (assuming that all of the other XBRL semantics in the instance are correct).

However, this seems restrictive. The Load(Stream) method should take, as an optional parameter, a URI specifying the source of the stream's contents, so that relative schema references can be resolved when the source URI of the stream is actually known. This is not a breaking change, as the proposed stream source parameter would be optional, as in the following:

public void Load(Stream dataStream, string streamSourceUri = null)
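
To make the proposal more concrete, here is a minimal sketch of how an optional source URI could be used to make relative schema references absolute before further processing. It is not Gepsio's implementation: the StreamSourceSketch class and LoadWithSource method are hypothetical, and the sketch only rewrites schemaRef elements with System.Xml.Linq, assuming the standard XBRL 2.1 linkbase and XLink namespaces.

```csharp
using System;
using System.IO;
using System.Xml.Linq;

public static class StreamSourceSketch
{
    // Standard XBRL 2.1 linkbase and XLink namespaces.
    private static readonly XNamespace Link = "http://www.xbrl.org/2003/linkbase";
    private static readonly XNamespace XLink = "http://www.w3.org/1999/xlink";

    // Hypothetical illustration of the proposed optional parameter.
    // When streamSourceUri is supplied, relative schemaRef locations are
    // rewritten as absolute URIs; when it is omitted, nothing changes.
    public static XDocument LoadWithSource(Stream dataStream, string streamSourceUri = null)
    {
        var document = XDocument.Load(dataStream);
        if (string.IsNullOrEmpty(streamSourceUri))
        {
            // No source known: relative references stay unresolved, as today.
            return document;
        }

        var baseUri = new Uri(streamSourceUri, UriKind.Absolute);
        foreach (var schemaRef in document.Descendants(Link + "schemaRef"))
        {
            var href = schemaRef.Attribute(XLink + "href");
            if (href == null)
            {
                continue;
            }
            // Uri(baseUri, string) keeps values that are already absolute.
            href.Value = new Uri(baseUri, href.Value).AbsoluteUri;
        }
        return document;
    }
}
```

Because the second parameter defaults to null, existing callers that pass only a stream would see the same behavior as before, which is the non-breaking property described above.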

JeffFerguson added a commit that referenced this issue Dec 31, 2024
…to report schema load errors and report that no facts are available in the instance. This fixes No Facts Found In Loaded Document #56