Add fulltext search #6188

Closed
tobiasdiez opened this issue Mar 27, 2020 · 12 comments

Comments

@tobiasdiez
Member

tobiasdiez commented Mar 27, 2020

Index all files linked in some entry and provide full text search of their content.

There have already been two attempts to implement this: #2264 and #2838.
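
To make the request concrete, here is a minimal sketch of the idea: an in-memory inverted index that maps each word to the entries whose linked files contain it. It assumes the text of each linked file has already been extracted to a plain string. The class name `FulltextIndexSketch`, its methods, and the citation keys are hypothetical and only illustrate the technique; this is not JabRef's implementation.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration only; not how JabRef implements fulltext search.
class FulltextIndexSketch {
    // token -> citation keys of entries whose linked files contain that token
    private final Map<String, Set<String>> postings = new HashMap<>();

    // Adds the (already extracted) text of a linked file under the entry's citation key.
    void addDocument(String citationKey, String text) {
        for (String token : tokenize(text)) {
            postings.computeIfAbsent(token, t -> new TreeSet<>()).add(citationKey);
        }
    }

    // Returns the citation keys of entries that contain every query term (AND semantics).
    Set<String> search(String query) {
        Set<String> result = null;
        for (String token : tokenize(query)) {
            Set<String> hits = postings.getOrDefault(token, Collections.emptySet());
            if (result == null) {
                result = new TreeSet<>(hits);
            } else {
                result.retainAll(hits);
            }
        }
        return result == null ? new TreeSet<>() : result;
    }

    // Lower-cases the text and splits it on anything that is not a letter or digit.
    private static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }
}
```

A real implementation would also need text extraction from PDFs and a persistent index, which this sketch leaves out.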

@AEgit

AEgit commented Mar 27, 2020

Just as a reminder: If someone tackles this (definitely interesting) suggestion, please also consider the performance impact on large databases.

@AEgit

AEgit commented Mar 30, 2020

Actually, the more I think about it, the clearer it becomes that this will have a massive performance impact. If the feature is implemented, performance testing on large databases should therefore always be a priority.

At the moment I have a database containing >20,000 items. The Windows 10 search takes several days (!) to index the database from scratch. I know that because in rare cases (usually because of some Windows update) the search index got corrupted and I had to rebuild it. Once you have the search index, adding new items is usually not a big issue, since the index updates itself. But building it for the first time has a big performance impact. Unfortunately, the Windows 10 search indexer is currently single-threaded, and that single thread uses 100% of one CPU core over multiple days when indexing from scratch.

It might actually be better to rely on a separate search index tool (e.g., Recoll: https://www.lesbonscomptes.com/recoll/) and import that index instead of integrating such a performance-intensive feature in JabRef.

In any case, this feature probably should not be active as a default setting.

@ilippert
Contributor

ilippert commented May 3, 2020

Hmm, how about a "modular" solution that uses indexing tools already available on the OS? For instance, several Linux distributions come with Tracker, and JabRef could draw on that data.

So, JabRef could proceed by checking whether a known indexing tool is available:

  • if yes, use it
  • if no, propose another separate search index
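
The check described above could be sketched roughly as follows: look for a known indexer executable in a list of directories (for example, the entries of `PATH`) and fall back to proposing a separate tool if none is found. The class name `IndexerDetectionSketch` and the candidate executable names `recollindex` and `tracker3` are assumptions for illustration; how to actually query such an index is not covered here.

```java
import java.io.File;
import java.nio.file.Files;
import java.nio.file.InvalidPathException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.regex.Pattern;

// Hypothetical illustration only; names and candidates are assumptions.
class IndexerDetectionSketch {
    // Executable names to look for (assumed, not an authoritative list).
    static final List<String> CANDIDATES = Arrays.asList("recollindex", "tracker3");

    // Returns the first candidate found as an executable file in one of the directories.
    static Optional<Path> findIndexer(List<Path> searchDirs, List<String> candidates) {
        for (String name : candidates) {
            for (Path dir : searchDirs) {
                Path candidate = dir.resolve(name);
                if (Files.isRegularFile(candidate) && Files.isExecutable(candidate)) {
                    return Optional.of(candidate);
                }
            }
        }
        return Optional.empty();
    }

    // Splits the PATH environment variable into directories, skipping malformed entries.
    static List<Path> pathDirs() {
        List<Path> dirs = new ArrayList<>();
        String path = System.getenv("PATH");
        if (path == null) {
            return dirs;
        }
        for (String dir : path.split(Pattern.quote(File.pathSeparator))) {
            if (dir.isEmpty()) {
                continue;
            }
            try {
                dirs.add(Paths.get(dir));
            } catch (InvalidPathException e) {
                // Ignore entries that are not valid paths on this platform.
            }
        }
        return dirs;
    }
}
```

A caller would use `findIndexer(pathDirs(), CANDIDATES)` and, if the result is empty, suggest installing a separate search index tool.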

@github-actions github-actions bot added the status: stale Issues marked by a bot as "stale". All issues need to be investigated manually. label Dec 8, 2020
@tobiasdiez tobiasdiez removed the status: stale Issues marked by a bot as "stale". All issues need to be investigated manually. label Dec 8, 2020
@AEgit

AEgit commented Dec 10, 2020

As far as I am aware, this feature has not been added to JabRef yet:
JabRef 5.2--2020-12-09--d1fb9e2
Windows 10 10.0 amd64
Java 14.0.2

@github-actions github-actions bot added the status: stale Issues marked by a bot as "stale". All issues need to be investigated manually. label Jun 8, 2021
@tobiasdiez tobiasdiez removed the status: stale Issues marked by a bot as "stale". All issues need to be investigated manually. label Jun 8, 2021
@JabRef JabRef deleted a comment from github-actions bot Jun 8, 2021
@ilippert
Contributor

ilippert commented Jun 9, 2021

Another consideration for performance: if JabRef can detect whether it is in active use (e.g., some form of interaction within the last few seconds), indexing could be paused, so that it only runs while JabRef is idle.
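
A minimal sketch of this idea, under stated assumptions: the application records a timestamp on every user interaction, and the indexer checks before each file whether the idle threshold has passed. The class `IdleAwareIndexerSketch`, its methods, and the injected clock are hypothetical and only illustrate the technique; nothing here reflects how JabRef actually schedules indexing.

```java
import java.util.Deque;
import java.util.function.Consumer;
import java.util.function.LongSupplier;

// Hypothetical illustration only; not JabRef's scheduling logic.
class IdleAwareIndexerSketch {
    private final LongSupplier clockMillis;   // injected clock, e.g. System::currentTimeMillis
    private final long idleThresholdMillis;   // how long without interaction counts as idle
    private volatile long lastInteractionMillis;

    IdleAwareIndexerSketch(LongSupplier clockMillis, long idleThresholdMillis) {
        this.clockMillis = clockMillis;
        this.idleThresholdMillis = idleThresholdMillis;
        // Treat startup as an interaction so indexing does not begin immediately.
        this.lastInteractionMillis = clockMillis.getAsLong();
    }

    // To be called from UI event handlers (key press, mouse click, ...).
    void recordInteraction() {
        lastInteractionMillis = clockMillis.getAsLong();
    }

    boolean isIdle() {
        return clockMillis.getAsLong() - lastInteractionMillis >= idleThresholdMillis;
    }

    // Indexes pending files one by one and stops as soon as the user becomes active.
    // Returns the number of files processed; the rest stay queued for the next idle period.
    int indexWhileIdle(Deque<String> pendingFiles, Consumer<String> indexOneFile) {
        int processed = 0;
        while (!pendingFiles.isEmpty() && isIdle()) {
            indexOneFile.accept(pendingFiles.poll());
            processed++;
        }
        return processed;
    }
}
```

Checking between files keeps the sketch simple; a file that is already being indexed is finished before the indexer yields.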

@koppor koppor mentioned this issue Jul 13, 2021
6 tasks
@koppor
Member

koppor commented Aug 2, 2021

@AEgit We think that this is already fixed in our development version, so the change will be included in the next release.

We would like to ask you to use a development build from https://builds.jabref.org/main and report back if it works for you.

@koppor koppor closed this as completed Aug 2, 2021
@AEgit

AEgit commented Dec 11, 2021

JabRef 5.4--2021-12-10--eff8073
Windows 10 10.0 amd64
Java 16.0.2
JavaFX 17.0.1+1

As far as I can tell, the fulltext search works in the current development version. I have not tested the performance impact properly, since I only use it on a very reduced version of my actual work database. As long as #4237 is not implemented, I cannot switch from JabRef 3.8.2 to JabRef 5. Only once that is implemented will I be able to test the performance impact of this change.

@Accacio

Accacio commented Feb 11, 2022

Does this fulltext search work with text files linked to an entry? I use text files for my comments, and I cannot search the terms in the files.

@ThiloteE
Member

I just tried it with a .txt file on

JabRef 5.6--2022-01-30--8ca6b7f
Windows 10 10.0 amd64
Java 16.0.2
JavaFX 17.0.2-ea+3

and entering something into the JabRef search bar yielded zero results, so to answer your question, it does not seem to work with text files linked to an entry.

@Accacio

Accacio commented Feb 17, 2022

> I just tried it with a .txt file on
>
> JabRef 5.6--2022-01-30--8ca6b7f Windows 10 10.0 amd64 Java 16.0.2 JavaFX 17.0.2-ea+3
>
> and entering something into the JabRef search bar yielded zero results, so to answer your question, it does not seem to work with text files linked to an entry.

Thank you. Maybe this issue should be reopened, @koppor?

@Siedlerchr
Member

Please open a new issue, e.g. Support other file types for fulltext search

@Accacio

Accacio commented Feb 18, 2022

> Please open a new issue, e.g. Support other file types for fulltext search

Thank you, just did it

@JabRef JabRef deleted a comment from github-actions bot Feb 28, 2022