EndNote .xml import to JabRef: PDF links are not imported correctly. #6199

Closed
AtaZadehgol opened this issue Mar 29, 2020 · 8 comments · Fixed by #7667
Labels
component: external-files component: import-load good first issue An issue intended for project-newcomers. Varies in difficulty. [outdated] type: bug Confirmed bugs or reports that are very likely to be bugs

Comments

@AtaZadehgol

JabRef version <JabRef 5.0--2020-03-06--2e6f433
Windows 10 10.0 amd64
Java 13.0.2>

When importing an EndNote library in .xml format into JabRef, the PDF attachment links are not imported into JabRef correctly.

Steps to reproduce the behavior:

  1. I followed the instructions under "Importing an EndNote library into JabRef" at https://www.mcgill.ca/library/files/library/jabref_guide_2016.pdf and exported the EndNote library as an .xml file (attached here as "My EndNote Library-Converted_xml.txt"). The snapshot below (attached here as "EndNote Export settings = xml.PNG") shows the export configuration of EndNote version X9.3.1 (Bld 13758).

  2. I then imported the EndNote .xml library file into JabRef version 5.0.

  3. The EndNote .xml library is imported into JabRef; however, the PDF file attachments are not imported automatically.

  4. This appears to be a bug: JabRef expects the PDF URL to be wrapped in a <style> element, as the other URLs are. As a temporary workaround, you can wrap each filename.pdf in a <style> ... </style> tag yourself (i.e., filename.pdf --> <style> filename.pdf </style>); a "search and replace" in your favorite text editor does this semi-automatically. Also remember to replace the "internal-pdf://" prefix before filename.pdf with the correct path on your computer (e.g., "internal-pdf://path_folder1/filename.pdf" --> "C:/EndNote/PDFs_Folder/path_folder1/filename.pdf" on Windows).
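
For a large library, the search-and-replace workaround in step 4 could also be scripted. The following is only a minimal sketch, assuming the exported file contains entries of the form <url>internal-pdf://...</url>; the class name, the method name and the pdfBaseDir parameter are made up for illustration and are not part of JabRef or EndNote.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EndNotePdfUrlWorkaround {

    // Assumption: unstyled PDF links look like <url>internal-pdf://folder/file.pdf</url>.
    private static final Pattern BARE_PDF_URL =
            Pattern.compile("<url>\\s*internal-pdf://([^<]*?)\\s*</url>");

    // Wraps each bare PDF link in <style>...</style> and swaps the
    // internal-pdf:// prefix for pdfBaseDir (hypothetical parameter).
    static String wrapPdfUrls(String xml, String pdfBaseDir) {
        Matcher matcher = BARE_PDF_URL.matcher(xml);
        StringBuilder result = new StringBuilder();
        while (matcher.find()) {
            String replacement =
                    "<url><style>" + pdfBaseDir + "/" + matcher.group(1) + "</style></url>";
            // quoteReplacement keeps '$' and '\' in paths from being treated as group references
            matcher.appendReplacement(result, Matcher.quoteReplacement(replacement));
        }
        matcher.appendTail(result);
        return result.toString();
    }

    public static void main(String[] args) {
        String xml = "<pdf-urls><url>internal-pdf://path_folder1/filename.pdf</url></pdf-urls>";
        System.out.println(wrapPdfUrls(xml, "C:/EndNote/PDFs_Folder"));
    }
}
```

URLs that already contain a <style> element do not match the pattern and are left untouched.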

EndNote Export settings = xml

My EndNote Library-Converted_xml.txt

@tobiasdiez tobiasdiez added [outdated] type: bug Confirmed bugs or reports that are very likely to be bugs component: import-load labels Mar 29, 2020
@tobiasdiez
Member

Relevant code:

private Optional<String> getUrlValue(Url url) {
    return Optional.ofNullable(url)
                   .map(Url::getStyle)
                   .map(Style::getContent)
                   .map(this::clean);
}

@tobiasdiez tobiasdiez added the good first issue An issue intended for project-newcomers. Varies in difficulty. label Mar 29, 2020
@archeaopteryx

@tobiasdiez, thank you for the hint about the relevant code. I am working on a fix but have an additional question. Should I post it directly here, or can I send you an email?

@Siedlerchr
Member

@archeaopteryx Please ask here so we can all help. The best approach is to create a draft PR and then ask a specific question about the code.

@archeaopteryx

@Siedlerchr , thank you for the advice.
The problem is that I'm not sure how to get the URL content. As stated by the OP, it's fine to have an empty style span, but having no style span means that the unmarshaller returns a null object with no content.

My naive approach would be to change the behavior of the url class referenced when defining the unmarshaller context, but that isn't possible since the context is defined by generated files. Am I missing a better solution? And, if not, is it possible for me to change the generated files?
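
To illustrate the null problem described here, below is a small sketch. The Url and Style classes are hand-written stand-ins (an assumption), not the xjc-generated ones, and the clean step of the quoted method is left out. It only shows that the style-only lookup yields an empty Optional as soon as getStyle() returns null.

```java
import java.util.Optional;

class StyleOnlyLookupDemo {

    // Hypothetical stand-in for the generated Style class.
    static class Style {
        private final String content;
        Style(String content) { this.content = content; }
        String getContent() { return content; }
    }

    // Hypothetical stand-in for the generated Url class.
    static class Url {
        private final Style style;
        Url(Style style) { this.style = style; }
        Style getStyle() { return style; }
    }

    // Same shape as the getUrlValue method quoted earlier in the thread.
    static Optional<String> getUrlValue(Url url) {
        return Optional.ofNullable(url)
                       .map(Url::getStyle)      // a null style turns the Optional empty
                       .map(Style::getContent);
    }

    public static void main(String[] args) {
        Url styled = new Url(new Style("https://example.org/paper"));
        Url bare = new Url(null); // models <url>text</url> without a <style> child
        System.out.println(getUrlValue(styled).isPresent()); // true
        System.out.println(getUrlValue(bare).isPresent());   // false
    }
}
```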

@Siedlerchr
Member

I came across this issue, and it is indeed a problem with the generated unmarshaller. I am not an expert in XML and schema definitions, but in my opinion the XSD schema needs to be adjusted to add some kind of content attribute to the url element to represent the text without a style.

<urls>
    <related-urls>
        <url>
            <style face="normal" font="default" size="100%">&lt;Go to ISI&gt;://WOS:000377368200039</style>
        </url>
    </related-urls>
    <pdf-urls>
        <url>/Users/xxxx/_JABREFTEP/Menezes2018 - Map2Check Using LLVM and KLEE.pdf</url>
    </pdf-urls>
</urls>

The xsd file is here:
https://github.com/JabRef/jabref/blob/master/src/main/resources/xjc/endnote/endnote.xsd
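
As a side illustration of the two <url> forms above, here is a rough sketch using the JDK's DOM parser. This is not how JabRef imports EndNote files (it uses JAXB-generated classes), and the class and method names are invented for the example. It only shows that the link text is present in the XML in both forms and that a reader which does not depend on <style> sees both.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

class UrlTextContentDemo {

    // Returns the text of every <url> element, whether or not it has a <style> child.
    static List<String> urlTexts(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList urls = doc.getElementsByTagName("url");
            List<String> texts = new ArrayList<>();
            for (int i = 0; i < urls.getLength(); i++) {
                // getTextContent() concatenates the text of all descendants
                texts.add(urls.item(i).getTextContent().trim());
            }
            return texts;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<urls>"
                + "<related-urls><url><style face=\"normal\">"
                + "&lt;Go to ISI&gt;://WOS:000377368200039</style></url></related-urls>"
                + "<pdf-urls><url>/Users/xxxx/paper.pdf</url></pdf-urls>"
                + "</urls>";
        urlTexts(xml).forEach(System.out::println);
    }
}
```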

@devinluo27
Contributor

I am currently working on this issue, but I am a little lost now. Could you give me some hints or starting points for this issue, @Siedlerchr?

@Siedlerchr
Member

Siedlerchr commented Apr 23, 2021

@Aloofwisdom Thanks for your interest. JabRef uses the xjc tool (which uses JAXB) to automatically generate Java classes from an XSD/XML schema (see the Gradle task generateSource).

The problem is that in the XSD definition the element <pdf-urls> contains a list of <url> elements, each with a <style> child element. However, as you can see, the actual XML file can also contain a <url> without a <style> element.

To ease your work, I have already adjusted the xsd file for you. Replace the corresponding section in the endnote.xsd file under /src/main/resources/xjc/endnote/ with the following.
Then run ./gradlew clean and ./gradlew run (or ./gradlew generateSource first) to regenerate the EndNote classes.

Then you can adjust the method in the java code @tobiasdiez showed above to take the new fact into account.

<xs:element name="url">
    <xs:complexType>
        <xs:choice>
            <xs:sequence>
                <xs:element minOccurs="0" ref="style" />
            </xs:sequence>
            <xs:element name="url" type="xs:string" />
        </xs:choice>
        <xs:attribute name="has-ut">
            <xs:simpleType>
                <xs:restriction base="xs:token">
                    <xs:enumeration value="yes" />
                    <xs:enumeration value="no" />
                </xs:restriction>
            </xs:simpleType>
        </xs:attribute>
        <xs:attribute name="ppv-app" />
        <xs:attribute name="ppv-ref">
            <xs:simpleType>
                <xs:restriction base="xs:token">
                    <xs:enumeration value="yes" />
                    <xs:enumeration value="no" />
                </xs:restriction>
            </xs:simpleType>
        </xs:attribute>
        <xs:attribute name="ppv-ut" />
    </xs:complexType>
</xs:element>
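
One possible shape for the adjusted lookup is sketched below. It prefers the styled content and falls back to plain text when no <style> is present. The Url and Style classes, the getPlainText() accessor and the clean helper are hypothetical stand-ins; the accessor names that xjc actually generates from the schema above may differ, and the real clean method is not shown in this thread.

```java
import java.util.Optional;

class UrlValueFallbackSketch {

    // Hypothetical stand-in for the generated Style class.
    static class Style {
        private final String content;
        Style(String content) { this.content = content; }
        String getContent() { return content; }
    }

    // Hypothetical stand-in for the regenerated Url class.
    static class Url {
        private final Style style;
        private final String plainText; // assumed accessor for unstyled text
        Url(Style style, String plainText) {
            this.style = style;
            this.plainText = plainText;
        }
        Style getStyle() { return style; }
        String getPlainText() { return plainText; }
    }

    // Placeholder for the importer's clean method; here it only trims.
    static String clean(String input) {
        return input.trim();
    }

    static Optional<String> getUrlValue(Url url) {
        Optional<Url> maybeUrl = Optional.ofNullable(url);
        return maybeUrl.map(Url::getStyle)
                       .map(Style::getContent)
                       // fall back to the unstyled text when there is no <style>
                       .or(() -> maybeUrl.map(Url::getPlainText))
                       .map(UrlValueFallbackSketch::clean);
    }

    public static void main(String[] args) {
        System.out.println(getUrlValue(new Url(new Style(" https://example.org/a "), null)));
        System.out.println(getUrlValue(new Url(null, "/Users/xxxx/paper.pdf")));
        System.out.println(getUrlValue(null));
    }
}
```

Optional.or requires Java 9 or later, which matches the Java 13 runtime mentioned in the report.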

@devinluo27
Contributor

@Siedlerchr Your illustration is pretty clear, thanks. I will make a pull request as soon as I finish.

devinluo27 added a commit to devinluo27/jabref that referenced this issue Apr 24, 2021
devinluo27 added a commit to devinluo27/jabref that referenced this issue Apr 24, 2021
devinluo27 added a commit to devinluo27/jabref that referenced this issue Apr 25, 2021
…oid null check.

Co-authored-by: Christoph <siedlerkiller@gmail.com>
devinluo27 added a commit to devinluo27/jabref that referenced this issue Apr 25, 2021
Since list would be init with Collections.emptyList() if get Optional.empty().
devinluo27 added a commit to devinluo27/jabref that referenced this issue Apr 25, 2021
devinluo27 added a commit to devinluo27/jabref that referenced this issue Apr 25, 2021
devinluo27 added a commit to devinluo27/jabref that referenced this issue Apr 25, 2021
Siedlerchr added a commit that referenced this issue Apr 26, 2021
…imported corrected. (#7667)

* Fix for issue 6199: EndNote .xml import to JabRef: PDF links are not imported correctly. (#6199)

* fix issue #6199 finish checkstyle

Co-authored-by: Christoph <siedlerkiller@gmail.com>
@koppor koppor moved this to Done in Prioritization Nov 10, 2022