Filtering of Files on Dataset Page, Search using Solr on Dataset Page #5584
Comments
As we only index the latest published and draft versions ("present versions"), we'd only be able to use Solr for those versions. It had previously been decided that this was OK (at least as a first batch). We discussed this today at tech hours: we will generate a list of ids from either Solr (for present versions) or the db (for past versions). Either list could then be used the same way to get the details to display on the cards. The facets themselves would only be rendered on present versions.
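The id-list approach described above could be sketched roughly as follows. This is only an illustration in Python, not the project's actual code: `file_ids_for_version`, `facets_enabled`, and the two injected search callables are hypothetical names.

```python
from typing import Callable, List


def file_ids_for_version(
    version_id: int,
    is_present_version: bool,
    search_solr: Callable[[int], List[int]],  # hypothetical Solr-backed lookup
    search_db: Callable[[int], List[int]],    # hypothetical database lookup
) -> List[int]:
    """Return the file ids to display, picking the backend by version type."""
    if is_present_version:
        # Latest published and draft versions are indexed, so Solr can serve them.
        return search_solr(version_id)
    # Past versions are not indexed; fall back to the database.
    return search_db(version_id)


def facets_enabled(is_present_version: bool) -> bool:
    """Facets are only rendered for versions that Solr knows about."""
    return is_present_version
```

Whichever backend produced the list, the caller can use the ids the same way to load the details shown on the cards.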
The files table source code on the dataset page is being revised as part of "Enable the display of file hierarchy metadata on the dataset page" #5572, which will impact the same HTML files this issue touches. This will need some development coordination and manual merging to resolve the expected conflicts.
Added static placeholders for file table facets on the dataset page in a new branch.
Updated the latest from
(Just made the PR above, just to make it easier to look at the commits. It should stay in dev for now!)
- added an indexed flag, for the published files removed from the current draft; - backward compatibility, if talking to a solr server with an older schema; - added check for solr being down - reverting back to searching in the db if it is. (#5584)
…le that's been removed from the current draft. (#5584)
@mheppler I changed your rendering rules to show the Sort button for non-indexed versions too. But that really breaks your styling of the fragment, because without the facets, the button is now hanging in the middle of all that white space. I'm sure you can fix it... but I would seriously consider living without it for non-indexed versions (since they will be rarely used, and that's how the page is looking now anyway). But up to you. I'm still working on the back end.
Sort button should be working now.
Fixed the Sort button layout issue @landreev discovered in old versions with no facets. Fixed various other layout issues, including the checksums and UNFs shifting out of place in small browser windows due to the new flexbox layout used for the file thumbnail and metadata in each row. (See attached.) These fixes were added to the to-do list above, which outlines all the moving pieces.
Notes for the reviewer(s): In the process of working on this we realized it was impossible to accurately search for files in draft versions using our Solr index. This is because we do NOT index files that have not changed between the latest published version and the draft (to avoid having duplicate search cards for these files).

I solved this by adding another boolean to the Solr schema, "fileDeleted". It's set to true in the Solr document for a published file that's no longer found in the draft (if one exists). This way we can find all the files in the draft by adding 2 filter queries: "parentId: N" and "fileDeleted: false".

The disadvantage of this solution is having to force the Solr schema update (and the reindex). The search that serves the page is built with backward compatibility, so that it doesn't completely fail if it's talking to a Solr server that doesn't have the new schema yet (in this case it reverts to searching without the new flag, potentially showing higher numbers in the facets).

Is it worth it? We could alternatively just say that this is a known limitation: the facet numbers for draft versions may show higher counts, on account of the deleted files. After all, only the dataset owners will be seeing the draft versions. (That said, we will have to force a reindex of everything sometime soon anyway, to get all the improved file types indexed!)
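For reviewers, the two filter queries and the backward-compatible retry might look something like the sketch below. It is written in Python purely for illustration. `UnknownFieldError`, `draft_filter_queries`, `search_draft_files`, and the `run_query` callable are all hypothetical names, and how an older schema actually reports the missing field is an assumption.

```python
from typing import Callable, List


class UnknownFieldError(Exception):
    """Hypothetical error for a Solr server whose schema lacks a field."""


def draft_filter_queries(parent_id: int, use_deleted_flag: bool = True) -> List[str]:
    """Build the filter queries that select the files of a draft version."""
    fqs = [f"parentId:{parent_id}"]
    if use_deleted_flag:
        # Exclude published files that were removed from the current draft.
        fqs.append("fileDeleted:false")
    return fqs


def search_draft_files(
    parent_id: int,
    run_query: Callable[[List[str]], List[int]],  # hypothetical Solr call
) -> List[int]:
    """Search with the new flag; retry without it against an older schema."""
    try:
        return run_query(draft_filter_queries(parent_id))
    except UnknownFieldError:
        # Older schema: results may include deleted files, so counts run high.
        return run_query(draft_filter_queries(parent_id, use_deleted_flag=False))
```

The retry keeps the page usable before the schema update and reindex, at the cost of the inflated facet counts mentioned above.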
Notes for QA: This PR has a new Solr schema. If Solr is down, the page should still work: the search box should still be there, just without the facets. Aside from testing the searching and the facets for basic accuracy, the page should probably be tested some more with larger numbers of files.
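As a rough picture of the "Solr is down" behaviour QA should expect, here is a minimal Python sketch. The names `FileSearchResult` and `search_files`, and the use of `ConnectionError` to signal an unreachable Solr server, are assumptions made for the example and not the actual implementation.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

Facets = Dict[str, Dict[str, int]]


@dataclass
class FileSearchResult:
    file_ids: List[int]
    facets: Facets = field(default_factory=dict)
    facets_available: bool = True


def search_files(
    query: str,
    search_solr: Callable[[str], Tuple[List[int], Facets]],  # hypothetical
    search_db: Callable[[str], List[int]],                   # hypothetical
) -> FileSearchResult:
    """Prefer Solr; if it is unreachable, search the db and drop the facets."""
    try:
        ids, facets = search_solr(query)
        return FileSearchResult(ids, facets, True)
    except ConnectionError:
        # Search box keeps working, but there are no facet counts to show.
        return FileSearchResult(search_db(query), {}, False)
```

A page template could check `facets_available` to decide whether to render the facet panel at all.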
I'm going to try an experiment for a possible workflow change: once there's a PR for something, lock the discussion in the issue and move the discussion to the PR. I think this will work better with the flow on our new project board https://github.com/orgs/IQSS/projects/2. So, I'm going to "lock" this conversation and move the conversation to #5820. Like I said, an experiment.
On the dataset page, users should be able to search for AND filter files as depicted in the screenshot. This may involve some work with Solr on the page (or some representation of the Solr index that can be easily tapped into). Assigning to @scolapasta so that he can use tech hours to generate some ideas.