[Feature] Setting to turn off automatic reindexing of localDocs collections #2651
Comments
I would much rather add an "Are you sure?" dialog to both buttons, and add the "Update" button that we have been lacking for a while, which is like Rebuild but non-destructive. I cannot think of any reason to intentionally have a LocalDocs collection be inconsistent with what is actually on disk, which also outweighs the confusion that would likely be caused by such a situation (since you can't actually inspect the collection to see which files are and aren't in it).

e.g., if you are worried that your OneDrive might disconnect and the files will disappear temporarily, you should make a copy of the files instead. There are all manner of sync programs you can use to maintain a copy of a set of files. But trying to build this kind of sync functionality into GPT4All itself seems like unnecessary complexity.

If your use case is suited by e.g. leaving embeddings in cache for some duration in case files are moved or deleted but then restored in short order, I would also prefer that. They could even be cached indefinitely in the collection until you clear the cache. But I don't think GPT4All should ever reference files that currently do not exist at the specified path.
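To make the Rebuild versus "Update" distinction in the comment above concrete, here is a minimal sketch. It is not GPT4All's implementation: the function names `rebuild` and `update`, the in-memory index layout (path mapped to a content digest and an embedding), and the `embed` callback are all assumptions made for illustration.

```python
import hashlib
from typing import Callable, Dict, List, Tuple

# Hypothetical index layout: path -> (content digest, embedding vector).
Index = Dict[str, Tuple[str, List[float]]]
Embedder = Callable[[str], List[float]]


def digest(text: str) -> str:
    """Fingerprint file content so unchanged files can be recognised."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def rebuild(files: Dict[str, str], embed: Embedder) -> Index:
    """Destructive: throw the old index away and re-embed every file."""
    return {path: (digest(text), embed(text)) for path, text in files.items()}


def update(index: Index, files: Dict[str, str], embed: Embedder) -> Index:
    """Non-destructive: reuse embeddings of unchanged files, embed only
    new or modified files, and drop entries whose files are gone."""
    new_index: Index = {}
    for path, text in files.items():
        fingerprint = digest(text)
        old = index.get(path)
        if old is not None and old[0] == fingerprint:
            new_index[path] = old  # unchanged: keep the existing embedding
        else:
            new_index[path] = (fingerprint, embed(text))
    return new_index
```

Because `update` iterates over the files currently on disk, the resulting index never references a path that no longer exists, which matches the constraint stated in the comment, while the expensive embedding step runs only for files that are new or changed.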
From what I can see, the DB stores all the data that the files provide. It does not rely on the files existing in order to function. It is the program itself that requires the files to exist, and that requirement triggers actions the user may not want, e.g. an "update db" for each collection whenever the files or structure within the collection change, or whenever the settings that govern collections change. I want to choose when my collection is updated. I don't want to rebuild all of my collections because I chose to add a new filetype as a setting. I don't want to rebuild all of my collections because one of my collections needs a larger chunk size and fewer chunks. I don't want to rebuild when I make one small change to a volatile directory that is otherwise fine. If I have taken the time to embed for several hours, I want it protected now that it is done.
@cebtenzzre I think in the end this is about having a setting that turns off automatic re-indexing when we discover a change through QFileSystemWatcher... some users want to manually control re-indexing. Having that setting (not per collection) plus an 'ARE YOU SURE' dialog I think would get @3Simplex what he's after
After long debate I think we've settled on a simple option in localdocs that will turn off all automatic reindexing of localdocs collections.
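The agreed behaviour can be sketched roughly as follows. This is only an illustration of the idea, not GPT4All code: the class `CollectionIndexer`, its `auto_reindex` flag, and the `on_path_changed` / `update_now` methods are hypothetical names, and `on_path_changed` stands in for whatever slot the application connects to its QFileSystemWatcher notifications.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class CollectionIndexer:
    """Hypothetical stand-in for the LocalDocs indexing logic."""

    auto_reindex: bool = True                       # the proposed global setting
    indexed: Set[str] = field(default_factory=set)  # paths currently in the index
    pending: Set[str] = field(default_factory=set)  # changes seen but not applied

    def on_path_changed(self, path: str) -> None:
        """Called when the file watcher reports a change."""
        if self.auto_reindex:
            self._reindex(path)
        else:
            # Setting is off: remember the change, leave the index untouched.
            self.pending.add(path)

    def update_now(self) -> int:
        """Manual trigger: apply every queued change and return the count."""
        count = len(self.pending)
        for path in sorted(self.pending):
            self._reindex(path)
        self.pending.clear()
        return count

    def _reindex(self, path: str) -> None:
        # Placeholder for the expensive chunk-and-embed step.
        self.indexed.add(path)
```

With `auto_reindex=False`, a change reported by the watcher only lands in `pending`; nothing is re-embedded until the user calls `update_now()`, which is where an "Are you sure?" confirmation could sit.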
OLDER ORIGINAL REQUEST
Feature Request
The option to "Lock" a localDocs collection to prevent reindex would be useful to ensure that an important collection remains unchanged. (Larger collections take hours to index and embed.)