Merge pull request #46 from keith-hall/st4_fixes
ST4 fixes
rosshadden authored Nov 7, 2024
2 parents f4916d1 + 9e72e1d commit 01ae724
Showing 9 changed files with 519 additions and 296 deletions.
1 change: 1 addition & 0 deletions .python-version
@@ -0,0 +1 @@
3.8
31 changes: 31 additions & 0 deletions README.md
@@ -87,6 +87,37 @@ The recommended way to install the Sublime Text XPath plugin is via [Package Con

## Troubleshooting

### Context menu items disabled

This can happen if the `lxml` dependency failed to load. In that case, the Sublime Text (ST) console will show errors.

#### Mac
On a Mac with Apple silicon, the version of lxml installed by Package Control 4 doesn't seem to work.
In the ST console, we can see that ST build 4180 is using Python 3.8.12:
```python
import sys; sys.version_info
```
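
The same console can also confirm whether lxml loads at all. The snippet below is a minimal sketch rather than part of the plugin: `describe_environment` is a hypothetical helper name, and it only relies on `sys.version_info` and lxml's `LXML_VERSION` tuple.

```python
import importlib
import sys


def describe_environment():
    """Return (python_version, lxml_status) for the interpreter running this code."""
    python_version = "{}.{}.{}".format(*sys.version_info[:3])
    try:
        # Import by name so a missing or broken binary is reported, not raised.
        etree = importlib.import_module("lxml.etree")
    except Exception as exc:
        lxml_status = "failed to load: {!r}".format(exc)
    else:
        # LXML_VERSION is a tuple of integers.
        lxml_status = "loaded, version " + ".".join(map(str, etree.LXML_VERSION))
    return python_version, lxml_status


print(describe_environment())
```

If the second item starts with "failed to load", the context menu items are likely disabled for the reason described in this section.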
Because ST runs Python 3.8.12, you can build lxml manually with that same version. You will need to download the source code release asset, because lxml's git repository doesn't contain some of the `.c` files required for the build. lxml v5.1.1 is known to work:
- Navigate to https://github.com/lxml/lxml/releases/tag/lxml-5.1.1
- Download `lxml-5.1.1.tar.gz`
- Extract it (the commands below assume it was extracted into `~/Downloads`)

```sh
brew install pyenv
pyenv init # follow instructions

pyenv install 3.8.12
pyenv shell 3.8.12

cd ~/Downloads/lxml-5.1.1
python setup.py build
```

- Then copy `~/Downloads/lxml-5.1.1/build/lib.macosx-15.1-arm64-3.8/lxml` into `~/Library/Application Support/Sublime Text/Lib/python38/lxml`, overwriting anything already there. The name of the `lib.*` directory may differ depending on your macOS version and architecture.
- Restart ST.
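
The copy step can also be scripted. The following is a sketch under assumptions, not an official installer: `install_built_lxml` is a hypothetical helper, and it assumes the build produced exactly one `build/lib.*/lxml` directory.

```python
import glob
import os
import shutil


def install_built_lxml(source_dir, st_lib_dir):
    """Copy the freshly built lxml package into ST's python38 library folder."""
    # The build directory name depends on the platform, so match it with a
    # wildcard instead of hard-coding it.
    candidates = glob.glob(os.path.join(source_dir, "build", "lib.*", "lxml"))
    if len(candidates) != 1:
        raise RuntimeError(
            "expected exactly one built lxml package, found {}".format(len(candidates))
        )
    target = os.path.join(st_lib_dir, "lxml")
    # dirs_exist_ok (Python 3.8+) lets the copy overwrite files already present.
    shutil.copytree(candidates[0], target, dirs_exist_ok=True)
    return target


# Example call, using the paths from the steps above:
# install_built_lxml(
#     os.path.expanduser("~/Downloads/lxml-5.1.1"),
#     os.path.expanduser("~/Library/Application Support/Sublime Text/Lib/python38"),
# )
```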

(Downloading a prebuilt wheel such as `lxml-5.1.1-cp38-cp38-macosx_10_9_universal2.whl` and extracting it into ST's lib folder mentioned above does not work. When you restart ST, you are told that an `.so` file in the `Lib/python38/lxml/` folder isn't trusted, with no option to "allow" it. You can go to Mac Settings -> Privacy and Security, where it should show up with an option to allow it, but the `lxml` dependency still fails to load in ST. The only solution seems to be building from source on the Mac.)

### CDATA Nodes

When working with XML documents, you are probably used to the Document Object Model (DOM), where CDATA nodes are separate from text nodes. In XPath, a `text()` node is made up of all adjacent CDATA and text node siblings merged together.
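
The difference can be seen with Python's standard library alone. This is an illustrative sketch, not the plugin's code: `minidom` shows the DOM view, and ElementTree's `.text` stands in for the merged view (it is not an XPath engine, but it merges adjacent character data in the same way). The helper names and the sample string are made up for the example.

```python
import xml.etree.ElementTree as ET
from xml.dom import minidom
from xml.dom.minidom import Node

SAMPLE = "<root>abc<![CDATA[def]]>ghi</root>"


def dom_children(xml_text):
    """List the root's children as the DOM sees them: separate text and CDATA nodes."""
    root = minidom.parseString(xml_text).documentElement
    kinds = {Node.TEXT_NODE: "text", Node.CDATA_SECTION_NODE: "cdata"}
    return [(kinds[child.nodeType], child.data) for child in root.childNodes]


def merged_text(xml_text):
    """Return the root's leading character data with CDATA and text merged."""
    return ET.fromstring(xml_text).text


print(dom_children(SAMPLE))
print(merged_text(SAMPLE))
```

The DOM view reports three nodes (text, CDATA, text), while the merged view is the single string `abcdefghi`.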
56 changes: 28 additions & 28 deletions example_xml_ns.xml
@@ -1,31 +1,31 @@
<?xml version="1.0"?>
<test>
<hello xmlns="hello_ns">
<!-- from an element name standpoint, you could expect this element to be reachable by the path "/test/hello".
However, XPath 1.0 does not have the concept of a default namespace. If the XML document being queried defines a default namespace, the XPath expression should map the namespace to a prefix for easier access.
This is what this plugin does for you. Because this is the first prefix-less namespace declared in document order, it will become "default", assuming "default" is set as the default namespace prefix to use.
But, because there is at least one other prefix-less namespace in the document - which resolves to a different uri - it will become "default1" instead. -->

<!-- The scope of a default namespace declaration extends from the beginning of the start-tag in which it appears to the end of the corresponding end-tag, excluding the scope of any inner default namespace declarations. -->
<world xmlns="world_ns"><!-- second prefix-less namespace in doc order, becomes "default2" -->
<example /><!-- path here is "/test/default1:hello/default2:world/default2:example" -->
</world>
</hello>
<more xmlns="more_ns" xmlns:an="another_ns"><!-- third prefix-less namespace in doc order, becomes "default3". First "an" prefix declared in doc-order. Because there is more than one unique namespace uri declared with this prefix, it will become "an1" for query purposes. -->
<an:another></an:another><!-- path here is "/test/default3:more[1]/an1:another" -->
</more>
<more xmlns="more_ns" xmlns:an="yet_another_ns" xmlns:unique="single"><!-- this prefixless namespace has the same uri as one already used, "default3", so this is also "default3". Because the an prefix is being declared again to a different uri than the previous one, it becomes "an2". As the "unique" prefix is unique, it remains as "unique", with no numeric suffix. -->
<an:yet_another></an:yet_another><!-- path here is "/test/default3:more[2]/an2:yet_another" -->
<unique:example></unique:example><!-- path here is "/test/default3:more[2]/unique:example" -->
</more>
<foo xmlns="world_ns" /><!-- same uri as default2 used, path is therefore "/test/default2:foo" -->
<numeric xmlns:an="yans"><!-- this one becomes "an4", because "an" has been used twice already and "an3" is explicitly defined later in the document -->
<an:test /><!-- /test/numeric/an4:test -->
</numeric>
<numeric xmlns:an3="numbered_ns"><!-- specific namespace prefix with a numeric suffix declared -->
<an3:okay></an3:okay>
</numeric>
<text attr1="hello" attr2='world'>sample text<more some_value
=
"foobar" another_value = "super" xmlns:abc="abc" abc:another_value="value" /> lorem ipsum etc.</text>abc<![CDATA[def]]>ghi<hij><![CDATA[klm]]></hij><![CDATA[nop]]>
</test>
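
The first comment in this file, about XPath 1.0 having no default namespace, can be reproduced with Python's standard library. This is a minimal sketch, not the plugin's implementation: ElementTree supports only a subset of XPath, the document is a trimmed copy of the example above, and the `default1`/`default2` prefixes are mapped by hand here to mirror the names the comments describe.

```python
import xml.etree.ElementTree as ET

DOC = """<test>
  <hello xmlns="hello_ns">
    <world xmlns="world_ns"><example /></world>
  </hello>
</test>"""

# Prefix-to-URI mapping supplied by the caller; the prefixes exist only in the query.
NAMESPACES = {"default1": "hello_ns", "default2": "world_ns"}

root = ET.fromstring(DOC)

# An unprefixed step only matches elements in no namespace, so this finds nothing.
unprefixed = root.find("hello")

# With prefixes bound to the namespace URIs, the element is reachable.
prefixed = root.find("default1:hello/default2:world/default2:example", NAMESPACES)

print(unprefixed)
print(prefixed.tag)
```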
12 changes: 8 additions & 4 deletions lxml_parser.py
@@ -158,6 +158,11 @@ def element_start(self, tag, attrib=None, nsmap=None, location=None):

     def create_element(self, tag, attrib=None, nsmap=None):
         LocationAwareElement.TAG = tag
+        if nsmap: # a change made in lxml 3.8.0 / 3.5.0b1 requires us to pass None instead of an empty prefix string
+            if '' in nsmap:
+                nsmap[None] = nsmap['']
+                del nsmap['']
+
         return LocationAwareElement(attrib=attrib, nsmap=nsmap)

     def element_end(self, tag, location=None):
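
The added guard in `create_element` can be illustrated without lxml. The function below is a sketch with a hypothetical name, `normalise_default_prefix`. Unlike the committed code, it returns a copy instead of mutating the caller's dictionary.

```python
def normalise_default_prefix(nsmap):
    """Return a copy of nsmap with an empty-string prefix re-keyed as None."""
    if not nsmap:
        # None or an empty mapping needs no change.
        return nsmap
    fixed = dict(nsmap)
    if "" in fixed:
        # Move the default namespace from the '' key to the None key.
        fixed[None] = fixed.pop("")
    return fixed


print(normalise_default_prefix({"": "hello_ns", "an": "another_ns"}))
```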
@@ -299,11 +304,10 @@ def unique_namespace_prefixes(namespaces, replaceNoneWith = 'default', start = 1

 def get_results_for_xpath_query(query, tree, context = None, namespaces = None, **variables):
     """Given a query string and a document trees and optionally some context elements, compile the xpath query and execute it."""
-    nsmap = {}
-    if namespaces is not None:
+    nsmap = dict()
+    if namespaces:
         for prefix in namespaces.keys():
-            if namespaces[prefix][0] != '':
-                nsmap[prefix] = namespaces[prefix][0]
+            nsmap[prefix] = namespaces[prefix][0]

     xpath = etree.XPath(query, namespaces = nsmap)
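
The mapping logic in this hunk can be isolated in a small standalone function. This is a sketch under assumptions: `build_nsmap` is a hypothetical name, and the only thing taken from the source is that the first item of each `namespaces` value is the URI. The second tuple item in the example data is a placeholder.

```python
def build_nsmap(namespaces):
    """Build a prefix-to-URI dict from a mapping whose values start with the URI."""
    nsmap = dict()
    if namespaces:  # covers both None and an empty mapping
        for prefix in namespaces.keys():
            # After the change, every prefix is kept, including empty URIs.
            nsmap[prefix] = namespaces[prefix][0]
    return nsmap


example = {"default1": ("hello_ns", "placeholder"), "empty": ("", "placeholder")}
print(build_nsmap(example))
```

With the earlier `!= ''` check removed, a prefix mapped to an empty URI now stays in the mapping instead of being dropped.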

4 changes: 2 additions & 2 deletions sublime_lxml.py
@@ -3,8 +3,8 @@
 from .sublime_helper import get_scopes
 import re

-RE_TAG_NAME_END_POS = re.compile('[>\s/]')
-RE_TAG_ATTRIBUTES = re.compile('\s+((\w+(?::\w+)?)\s*=\s*(?:"([^"]*)"|\'([^\']*)\'))')
+RE_TAG_NAME_END_POS = re.compile(r'[>\s/]')
+RE_TAG_ATTRIBUTES = re.compile(r'\s+((\w+(?::\w+)?)\s*=\s*(?:"([^"]*)"|\'([^\']*)\'))')

 # TODO: consider subclassing etree.ElementBase and adding as methods to that
 def getNodeTagRegion(view, node, position_type):
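
The `r` prefix matters because escapes such as `\s` in a normal string literal are invalid escape sequences, which recent Python versions warn about. The compiled pattern itself is unchanged. The sketch below reuses the attribute pattern from this hunk to show what it captures. `parse_attributes` is a hypothetical helper, not a function from the plugin.

```python
import re

# Same pattern as in the hunk above, written as a raw string.
RE_TAG_ATTRIBUTES = re.compile(r'\s+((\w+(?::\w+)?)\s*=\s*(?:"([^"]*)"|\'([^\']*)\'))')


def parse_attributes(tag_text):
    """Return (name, value) pairs for each attribute found in an opening tag."""
    pairs = []
    for match in RE_TAG_ATTRIBUTES.finditer(tag_text):
        name = match.group(2)
        # Group 3 holds a double-quoted value, group 4 a single-quoted one.
        value = match.group(3) if match.group(3) is not None else match.group(4)
        pairs.append((name, value))
    return pairs


print(parse_attributes('<text attr1="hello" attr2=\'world\'>'))
```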