Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable Merging of BibDatabases #6689

Merged

Conversation

DominikVoigt
Copy link
Contributor

This PR adds a method to the BibDatabases that allows instances to be merged with other instances.
This merging will not introduce duplicates if an entry is contained in both databases.

  • Change in CHANGELOG.md described (if applicable)
  • Tests created for changes (if applicable)
  • Manually tested changed features in running JabRef (always required)
  • Screenshots added in PR description (for UI changes)
  • Checked documentation: Is the information available and up to date? If not created an issue at https://github.com/JabRef/user-documentation/issues or, even better, submitted a pull request to the documentation repository.

Copy link
Member

@tobiasdiez tobiasdiez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good that you have a look at the merge functionality as well.

We already have two merge methods:

public void importEntries(List<BibEntry> entriesToImport, boolean downloadFiles) {
// Check if we are supposed to warn about duplicates.
// If so, then see if there are duplicates, and warn if yes.
if (preferences.shouldWarnAboutDuplicatesForImport()) {
BackgroundTask.wrap(() -> entriesToImport.stream()
.anyMatch(this::hasDuplicate)).onSuccess(duplicateFound -> {
if (duplicateFound) {
boolean continueImport = dialogService.showConfirmationDialogWithOptOutAndWait(Localization.lang("Duplicates found"),
Localization.lang("There are possible duplicates (marked with an icon) that haven't been resolved. Continue?"),
Localization.lang("Continue with import"),
Localization.lang("Cancel import"),
Localization.lang("Disable this confirmation dialog"),
optOut -> preferences.setShouldWarnAboutDuplicatesForImport(!optOut));
if (!continueImport) {
dialogService.notify(Localization.lang("Import canceled"));
} else {
buildImportHandlerThenImportEntries(entriesToImport);
}
} else {
buildImportHandlerThenImportEntries(entriesToImport);
}
}).executeWith(Globals.TASK_EXECUTOR);
} else {
buildImportHandlerThenImportEntries(entriesToImport);
}
if (downloadFiles) {
for (BibEntry bibEntry : entriesToImport) {
for (LinkedFile linkedFile : bibEntry.getFiles()) {
LinkedFileViewModel linkedFileViewModel = new LinkedFileViewModel(linkedFile, bibEntry, databaseContext, taskExecutor, dialogService, preferences.getXMPPreferences(), preferences.getFilePreferences(), ExternalFileTypes.getInstance());
linkedFileViewModel.download();
}
}
}
NamedCompound namedCompound = new NamedCompound(Localization.lang("Import file"));
namedCompound.addEdit(new UndoableInsertEntries(databaseContext.getDatabase(), entriesToImport));
// merge strings into target database
for (BibtexString bibtexString : parserResult.getDatabase().getStringValues()) {
String bibtexStringName = bibtexString.getName();
if (databaseContext.getDatabase().hasStringByName(bibtexStringName)) {
String importedContent = bibtexString.getContent();
String existingContent = databaseContext.getDatabase().getStringByName(bibtexStringName).get().getContent();
if (!importedContent.equals(existingContent)) {
LOGGER.warn("String contents differ for {}: {} != {}", bibtexStringName, importedContent, existingContent);
// TODO: decide what to do here (in case the same string exits)
}
} else {
databaseContext.getDatabase().addString(bibtexString);
// FIXME: this prevents this method to be moved to logic - we need to implement a new undo/redo data model
namedCompound.addEdit(new UndoableInsertString(databaseContext.getDatabase(), bibtexString));
}
}
// copy content selectors to target database
MetaData targetMetada = databaseContext.getMetaData();
parserResult.getMetaData()
.getContentSelectorList()
.forEach(contentSelector -> targetMetada.addContentSelector(contentSelector));
// TODO undo of content selectors (currently not implemented)
// copy groups to target database
parserResult.getMetaData().getGroups().ifPresent(
newGroupsTreeNode -> {
if (targetMetada.getGroups().isPresent()) {
GroupTreeNode groupTreeNode = targetMetada.getGroups().get();
newGroupsTreeNode.moveTo(groupTreeNode);
namedCompound.addEdit(
new UndoableAddOrRemoveGroup(
new GroupTreeNodeViewModel(groupTreeNode),
new GroupTreeNodeViewModel(newGroupsTreeNode),
UndoableAddOrRemoveGroup.ADD_NODE));
} else {
// target does not contain any groups, so we can just use the new groups
targetMetada.setGroups(newGroupsTreeNode);
namedCompound.addEdit(
new UndoableAddOrRemoveGroup(
new GroupTreeNodeViewModel(newGroupsTreeNode),
new GroupTreeNodeViewModel(newGroupsTreeNode),
UndoableAddOrRemoveGroup.ADD_NODE));
}
}
);
namedCompound.end();
Globals.undoManager.addEdit(namedCompound);
JabRefGUI.getMainFrame().getCurrentBasePanel().markBaseChanged();
}

private ParserResult mergeImportResults(List<ImportFormatReader.UnknownFormatImport> imports) {
BibDatabase resultDatabase = new BibDatabase();
ParserResult result = new ParserResult(resultDatabase);
for (ImportFormatReader.UnknownFormatImport importResult : imports) {
if (importResult == null) {
continue;
}
ParserResult parserResult = importResult.parserResult;
List<BibEntry> entries = parserResult.getDatabase().getEntries();
resultDatabase.insertEntries(entries);
if (ImportFormatReader.BIBTEX_FORMAT.equals(importResult.format)) {
// additional treatment of BibTeX
// merge into existing database
// Merge strings
for (BibtexString bibtexString : parserResult.getDatabase().getStringValues()) {
String bibtexStringName = bibtexString.getName();
if (resultDatabase.hasStringByName(bibtexStringName)) {
String importedContent = bibtexString.getContent();
String existingContent = resultDatabase.getStringByName(bibtexStringName).get().getContent();
if (!importedContent.equals(existingContent)) {
LOGGER.warn("String contents differ for {}: {} != {}", bibtexStringName, importedContent, existingContent);
// TODO: decide what to do here (in case the same string exits)
}
} else {
resultDatabase.addString(bibtexString);
}
}
// Merge groups
// Adds the specified node as a child of the current root. The group contained in <b>newGroups </b> must not be of
// type AllEntriesGroup, since every tree has exactly one AllEntriesGroup (its root). The <b>newGroups </b> are
// inserted directly, i.e. they are not deepCopy()'d.
parserResult.getMetaData().getGroups().ifPresent(newGroups -> {
// ensure that there is always only one AllEntriesGroup in the resulting database
// "Rename" the AllEntriesGroup of the imported database to "Imported"
if (newGroups.getGroup() instanceof AllEntriesGroup) {
// create a dummy group
try {
// This will cause a bug if the group already exists
// There will be group where the two groups are merged
String newGroupName = importResult.parserResult.getFile().map(File::getName).orElse("unknown");
ExplicitGroup group = new ExplicitGroup("Imported " + newGroupName, GroupHierarchyType.INDEPENDENT,
Globals.prefs.getKeywordDelimiter());
newGroups.setGroup(group);
group.add(parserResult.getDatabase().getEntries());
} catch (IllegalArgumentException e) {
LOGGER.error("Problem appending entries to group", e);
}
}
result.getMetaData().getGroups().ifPresent(newGroups::moveTo);
});
for (ContentSelector selector : parserResult.getMetaData().getContentSelectorList()) {
result.getMetaData().addContentSelector(selector);
}
}
// TODO: collect errors into ParserResult, because they are currently ignored (see caller of this method)
}

It would be good to refactor and combine them (with your newly added method as well).

Side remark: depending on your envisioned applications, the ImportEntriesDialog might be helpful.

@DominikVoigt DominikVoigt force-pushed the feature/enable-merging-of-bibdatabases branch from 8523018 to 278a5f7 Compare July 19, 2020 21:47
@koppor koppor marked this pull request as draft July 27, 2020 22:17
@koppor
Copy link
Member

koppor commented Aug 1, 2020

The linked code refs #6488

Currently, I understand the new code better than the looong linked code.

We also have BibDatabaseDiff, which doesn't seem to be referenced in any code. (Example use: https://github.com/koppor/jabref/pull/442/files#diff-58f2d74f80b59a9ba6468243b879c428R68)

@koppor
Copy link
Member

koppor commented Aug 5, 2020

We merged the merge methods at the cose of "some" undo/redo (at the whole import 😟)

undo/redo refs koppor#453

@DominikVoigt DominikVoigt marked this pull request as ready for review August 25, 2020 08:28
@DominikVoigt DominikVoigt force-pushed the feature/enable-merging-of-bibdatabases branch from 7ca130e to 760498b Compare August 28, 2020 08:47
Signed-off-by: Dominik Voigt <dominik.ingo.voigt@gmail.com>
@DominikVoigt DominikVoigt force-pushed the feature/enable-merging-of-bibdatabases branch from 760498b to 000d773 Compare August 28, 2020 08:49
@koppor koppor added the status: ready-for-review Pull Requests that are ready to be reviewed by the maintainers label Aug 28, 2020
Copy link
Member

@koppor koppor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Commit issues resolved. LGTM.

Copy link
Member

@tobiasdiez tobiasdiez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. The code looks mostly good to me. I've a few comments and suggestions for improvement.

src/main/java/org/jabref/gui/importer/ImportAction.java Outdated Show resolved Hide resolved
@@ -0,0 +1,65 @@
package org.jabref.logic.bibtex;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this is not related to writing bibtex (as the rest of the package), I suggest to move it together with the duplication check to a new logic.database package in parallel to the existing model.database. @JabRef/developers opinions?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I moved it to the recommended package :).

void mergeAddsNonDuplicateEntries() {
// Entries 2 and 3 are identical
BibEntry entry1 = new BibEntry()
.withField(StandardField.AUTHOR, "Stephen Blaha")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please reduce these examples to a minimum. I don't think you need to have 4 entries with full information

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I reduced the number of entries and their information content.

* @param otherFilename the filename of the other library. Pass "unknown" if not known.
* @param allOtherEntries list of all other entries
*/
public void merge(MetaData other, String otherFilename, List<BibEntry> allOtherEntries) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please move these merge metadata methods to the database merger class. There you can also add a merge method operating on BibDatabaseContext objects, to have an easy way to merge two databases including all their metadata.

.collect(Collectors.toList());

assertEquals(List.of(targetString1.toString(), targetString2.toString()), resultStringsSorted);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please also include tests for the merged groups.

@tobiasdiez tobiasdiez added the status: changes required Pull requests that are not yet complete label Aug 28, 2020
Signed-off-by: Dominik Voigt <dominik.ingo.voigt@gmail.com>
Add DatabaseContext merging capability to DatabaseMerger

Signed-off-by: Dominik Voigt <dominik.ingo.voigt@gmail.com>
Signed-off-by: Dominik Voigt <dominik.ingo.voigt@gmail.com>
Signed-off-by: Dominik Voigt <dominik.ingo.voigt@gmail.com>
Signed-off-by: Dominik Voigt <dominik.ingo.voigt@gmail.com>
@DominikVoigt
Copy link
Contributor Author

Thanks for your comments :). I implemented the requested changes!

@koppor koppor removed the status: changes required Pull requests that are not yet complete label Aug 31, 2020
@tobiasdiez tobiasdiez merged commit 35f5078 into JabRef:master Sep 1, 2020
Siedlerchr added a commit that referenced this pull request Sep 1, 2020
* upstream/master:
  Enable Merging of BibDatabases (#6689)
  Refactor file preferences (#6779)
  Interrupt all running tasks during shutdown (#6118)
  Fixes #6705 , added icon for multiple identifiers (#6809)
  Apply css files correctly to dialogs (#6828)
  Fix link
  Make template more explicit
@DominikVoigt DominikVoigt deleted the feature/enable-merging-of-bibdatabases branch January 1, 2021 16:06
@DominikVoigt DominikVoigt restored the feature/enable-merging-of-bibdatabases branch January 1, 2021 16:06
@DominikVoigt DominikVoigt deleted the feature/enable-merging-of-bibdatabases branch January 1, 2021 16:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
status: ready-for-review Pull Requests that are ready to be reviewed by the maintainers
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants