Brainstorming search results that are autosuggested and shown on results page #2421

ebarry · 2018-02-28T18:56:53Z

Update: this is a long conversation and there are some next steps being broken out. Please continue to use this issue for brainstorming! Thanks :)

Original issue continues below:

Please describe the problem

The system that chooses and ranks autosuggested content suggestions is mysterious; it seems like a black box.

Autosuggested results have a display limit of 15 assorted content types, but do not provide an overview of Public Lab resources on a topic.

What did you expect to see that you didn't?

I expect to understand what the results mean.

Please show us where to look

The Search box in the menu bar

jywarren · 2018-02-28T19:47:57Z

It is actually a black box! Full text search is a complex problem and we solve it with the "fulltext" module of MySQL, our database system; some pretty arcane (but thorough) documentation is here: https://dev.mysql.com/doc/refman/5.7/en/fulltext-search.html

It does seem we can tune/adjust it, though. There is, for example, a "natural language" option which attempts to algorithmically determine "relevance" -- https://dev.mysql.com/doc/refman/5.7/en/fulltext-natural-language.html

We use this fulltext feature on this line:

plots2/app/models/node.rb

Line 29 in 47b3004

    
    Revision.where('MATCH(node_revisions.body, node_revisions.title) AGAINST(?)', query)

It does look like we could "turn on" natural language mode by making that say:

    Revision.where('MATCH(node_revisions.body, node_revisions.title) AGAINST(? IN NATURAL LANGUAGE MODE)', query)

We may also need to then add ordering by relevance -- so, I /think/ that would be:

    # Quote the query before interpolating it into the SELECT clause to avoid SQL injection.
    Revision.select("node_revisions.body, node_revisions.title, MATCH(node_revisions.body, node_revisions.title) AGAINST(#{Revision.connection.quote(query.to_s)} IN NATURAL LANGUAGE MODE) AS score")
      .where('MATCH(node_revisions.body, node_revisions.title) AGAINST(? IN NATURAL LANGUAGE MODE)', query)
      .order('score DESC') # highest relevance first

It might take some testing out.

Would you like to try this out? I have to point out that I do NOT know what will happen. The documentation for "natural language" says:

Every correct word in the collection and in the query is weighted according to its significance in the collection or query. Thus, a word that is present in many documents has a lower weight, because it has lower semantic value in this particular collection. Conversely, if the word is rare, it receives a higher weight. The weights of the words are combined to compute the relevance of the row. This technique works best with large collections.
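To make the quoted weighting idea concrete, here is a toy sketch in plain Ruby. It is not MySQL's actual formula; it only assumes a simple "log of total documents over documents containing the word" weight, and the sample documents and helper names (`tokenize`, `word_weight`, `relevance`) are made up for illustration.

```ruby
# Toy illustration of rarity-based word weighting (NOT MySQL's real formula).
# Each document is a plain string; words are lowercased and split on non-word characters.
def tokenize(text)
  text.downcase.scan(/\w+/)
end

# Weight of a word: log(total documents / documents containing the word).
# A word found in every document gets weight 0; rarer words get higher weights.
def word_weight(word, docs)
  containing = docs.count { |doc| tokenize(doc).include?(word) }
  return 0.0 if containing.zero?
  Math.log(docs.size.to_f / containing)
end

# Relevance of one document: sum of the weights of the query words it contains.
def relevance(query, doc, docs)
  doc_words = tokenize(doc)
  tokenize(query).uniq.sum do |word|
    doc_words.include?(word) ? word_weight(word, docs) : 0.0
  end
end

# Made-up sample collection.
DOCS = [
  'balloon mapping kit for aerial photos',
  'spectrometer kit calibration notes',
  'open hour call about the spectrometer kit'
].freeze

# Most relevant documents first.
ranked = DOCS.sort_by { |doc| -relevance('spectrometer kit', doc, DOCS) }
```

In this sample, "kit" appears in every document, so it contributes nothing, and only the documents mentioning "spectrometer" rise above the balloon mapping one. As a side note, the MySQL manual states that `AGAINST()` with no modifier already runs in natural language mode, so adding the explicit modifier mainly documents intent.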

As to the second issue, --

...but do not provide an overview of Public Lab resources on a topic.

I expect to understand what the results mean.

How might we break this down a bit? Do you mean that you'd like to show a mix of types, or that you'd like to show explanatory information about what different types are?

Thanks!

jywarren · 2018-02-28T19:54:07Z

I tested the above query and it does run, although again, I'm not super clear on how it works. But it'd be pretty easy to put it into production if you'd like!

bronwen9 · 2018-02-28T21:37:04Z

What I'd like to see in the auto-suggest is a list of search terms based on weight (popular, busy pages first). On the results page I would like to see keyword results weighted by relevance (popularity, whether the word in question is included in a tag or a title, etc), and then sorted by type (note, profile, question, comment, etc). I would then like to be able to search within the keyword results (say, I'm interested in spectrometers, but would like to narrow down my search to find examples of how they've been used in schools)

jywarren · 2018-02-28T22:00:30Z

Hi, Bronwen, thanks. Let's break this into separate features:

auto-suggest search ordered by popularity (is this # of views, or likes, or another preference?)
results page ordered by relevance (popularity, whether the word in question is included in a tag or a title, etc)
results page displays each type (note, profile, question, comment, etc) separately -- like this, for example? https://publiclab.org/search/dynamic (that page doesn't work well yet)
ability to refine search within the keyword results (say, I'm interested in spectrometers, but would like to narrow down my search to find examples of how they've been used in schools) -- how would you specify this, do you think? Could you continue typing in the search input and see the results narrow more? Or is there another interface you'd like to suggest?

Thanks! This is super helpful.
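As a rough sketch of the third and fourth features above (showing each type separately, and narrowing within existing results), assuming results are already loaded in memory: the `Result` struct, `TYPE_ORDER`, and both helpers are hypothetical and do not reflect the app's actual models.

```ruby
# Hypothetical result record; the real app's models are not shown in this thread.
Result = Struct.new(:type, :title, :body)

# Assumed display order for result types; types not listed here are left out.
TYPE_ORDER = %w[wiki note question profile comment].freeze

# Group results by content type, keeping the preferred display order.
def group_by_type(results)
  grouped = results.group_by(&:type)
  TYPE_ORDER.each_with_object({}) do |type, out|
    out[type] = grouped[type] if grouped.key?(type)
  end
end

# Narrow an existing result set: keep results whose title or body
# mentions every one of the extra terms (case-insensitive substring match).
def refine(results, extra_terms)
  terms = extra_terms.downcase.split
  results.select do |r|
    text = "#{r.title} #{r.body}".downcase
    terms.all? { |t| text.include?(t) }
  end
end

results = [
  Result.new('note', 'Spectrometer in schools', 'Used a spectrometer in a classroom'),
  Result.new('wiki', 'Spectrometer', 'Overview of the spectrometer kit'),
  Result.new('comment', 'Re: spectrometer', 'Great calibration tips')
]

grouped  = group_by_type(results)     # keys in order: wiki, note, comment
narrowed = refine(results, 'schools') # only the "Spectrometer in schools" note
```

Continuing to type in the same search input could simply call something like `refine` on the results already shown, which is one way to answer the "how would you specify this" question without a second interface.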

jywarren · 2018-02-28T22:03:23Z

And for the second one up there, do you mean not "relevance" as is defined in my comment above about "natural language search" but a definition of popularity such as "likes" or "views"?

bronwen9 · 2018-03-01T15:01:12Z

I think we'd probably want to create a rubric for relevance that could include likes/views, but also weights results based on KIND of page (a wiki page with the search term in the title might always show up higher on a list than, say, a comment).

One example where we're struggling with kinds of results is a search for "open hour". On our website, this search brings up 15 research notes in the auto suggest, and two research notes on the keyword search, but none of them link to our Open Hour page. I do think a popularity ranking would help with this, and might be simpler than introducing a semantic search feature, but I can see either offering improvements.

When I perform the same search on Google (without boolean operators), I see a list of results that starts with our main open hour page, followed by items tagged with "openhour" and "open-hour", followed by links to pages for individual open hours. This would seem to be a sensible rubric for page-type sorting (provided that it's still possible to browse or narrow searches for all occurrences of a search term on our site).
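One way such a rubric might look, purely as a sketch: popularity (views and likes) is combined with a per-type weight and a bonus for a title match. The `Page` struct, the weights, and the bonus value are all invented for illustration and would need tuning against real searches.

```ruby
# Hypothetical page record and weights; none of these numbers come from the thread.
Page = Struct.new(:type, :title, :views, :likes)

TYPE_WEIGHT = { 'wiki' => 3.0, 'note' => 2.0, 'question' => 2.0, 'comment' => 1.0 }.freeze
TITLE_MATCH_BONUS = 5.0

def rubric_score(page, query)
  # Popularity grows slowly (log) so very busy pages don't drown out everything else.
  popularity = Math.log(1 + page.views) + Math.log(1 + page.likes)
  # Flat bonus when the search term appears in the title.
  title_bonus = page.title.downcase.include?(query.downcase) ? TITLE_MATCH_BONUS : 0.0
  # Unknown page types fall back to the lowest weight.
  TYPE_WEIGHT.fetch(page.type, 1.0) * (popularity + title_bonus)
end

# Made-up sample pages.
pages = [
  Page.new('note', 'Notes from last week', 900, 12),
  Page.new('wiki', 'Open Hour', 400, 3),
  Page.new('comment', 'See you at open hour!', 50, 0)
]

# Highest score first.
ranked = pages.sort_by { |page| -rubric_score(page, 'open hour') }
```

With these sample numbers, the wiki page titled "Open Hour" ranks first even though the note has more views, which is the behavior described for the "open hour" search.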

jywarren · 2018-03-01T15:08:46Z

Cool - super helpful. I think there's probably a way to do a more complex ranking (maybe not Google-level PageRank, but something). However, I wonder if we could take a few proposals, make them testable, and examine the results. For example, it'd be pretty easy to set up views-based or likes-based ordering, and not much harder to do natural language relevance as I outlined above. If we made an option to view results for a given search query in all three, we could see which seems to work better for us. If that sounds good, we can start those code changes and have something to look at in a week or so; what do you think of that as a next step? We could tackle this iteratively and look at more advanced search rubrics as a follow-up. Thanks!!


bronwen9 · 2018-03-07T17:01:49Z

Ah, sorry for the late response, but I think that it would be great to try some of these. I think at some point we're going to need the ability to work with boolean operators (whether that's through additional search fields or allowing for more than one word or phrase in the field), but I think any of these options would help get us closer to understanding where things are going haywire in the existing search. Plus-one to trying all three!

jywarren · 2018-03-20T22:58:44Z

Work now ongoing in #2518 -- this will result in:

https://publiclab.org/search/pumpkins?order=natural for natural language match ordering
https://publiclab.org/search/pumpkins?order=likes for ordering by likes
https://publiclab.org/search/pumpkins?order=views for ordering by views
https://publiclab.org/search/pumpkins for ordering by original page creation dates

Soon!

(update: now live on the site!)
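For readers curious how an `order` parameter like the ones above could select a sort, here is a minimal in-memory sketch. It is not the code from #2518; the `Hit` struct and its fields are assumptions, and "newest first" for the default creation-date ordering is a guess, since the direction isn't stated here.

```ruby
# Hypothetical in-memory stand-in for search results; field names are assumptions.
Hit = Struct.new(:title, :views, :likes, :score, :created_at)

# Map the `order` URL parameter to a sort key (negated so larger values come first).
SORTERS = {
  'natural' => ->(hit) { -hit.score },
  'likes'   => ->(hit) { -hit.likes },
  'views'   => ->(hit) { -hit.views }
}.freeze

# Unknown or missing values fall back to creation date, mirroring the plain
# /search/<query> URL listed above. Newest first is an assumption.
DEFAULT_SORTER = ->(hit) { -hit.created_at.to_f }

def sort_hits(hits, order)
  sorter = SORTERS.fetch(order.to_s, DEFAULT_SORTER)
  hits.sort_by(&sorter)
end

# Made-up sample data.
hits = [
  Hit.new('Pumpkin carving', 100, 5, 0.2, Time.utc(2016, 10, 1)),
  Hit.new('Pumpkin spectra', 40, 9, 0.9, Time.utc(2017, 10, 1)),
  Hit.new('Giant pumpkins', 300, 1, 0.5, Time.utc(2015, 10, 1))
]

by_views = sort_hits(hits, 'views') # Giant pumpkins, Pumpkin carving, Pumpkin spectra
by_date  = sort_hits(hits, nil)     # Pumpkin spectra, Pumpkin carving, Giant pumpkins
```

Looking the parameter up in a fixed table, rather than passing it into a query, also means an unexpected `order` value can only ever fall back to the default.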

jywarren · 2018-03-25T13:45:49Z

Hi, this needs some review and reorganization now that the above searches work -- @bronwen9 and @ebarry -- thanks for your help so far! Some additional steps might be:

create a button or set of links to change the sorting on pages like https://publiclab.org/search/oil-spill
choose one of these as the default sorting for the typeahead auto-complete suggestions

Also just cleaning up the lead of this issue a bit or starting a new one with our next steps clearly laid out would be helpful! Thanks!

jywarren · 2018-08-21T15:40:18Z

As the dynamic search work is upcoming (as per your original schedule), I'm not sure if this one is on your radar, @milaaraujo and @stefannibrasil -- what do you think?

stefannibrasil · 2018-08-22T02:50:23Z

We have a few things to finish this week; we are planning to start working on improving the search next week!

stefannibrasil · 2018-09-05T04:03:05Z

@ebarry @bronwen9 @jywarren we started addressing some of your concerns here in #3295. Please keep in mind that this PR is mostly on the front-end, but it will help with our planning! :)

stefannibrasil · 2018-09-05T04:03:59Z

I have some notes to share with you, but I need to organize them better first xD

jywarren · 2018-09-05T22:24:06Z

So I left some maybe not super helpful comments on #3286 -- and just pulling it back here, I want to highlight that one of the questions we try to answer may need to be:

What is the best default sorting AND default search type for /each result type/ -- acknowledging that the best ordering for nodes might not make sense for profiles.

Make sense?

ebarry added this to the API and search improvements milestone Feb 28, 2018

jywarren mentioned this issue Mar 20, 2018

Search sorting #2518

Merged

ebarry mentioned this issue Mar 26, 2018

Create links on /search/_____ for search sorting options #2544

Closed

6 tasks

ebarry changed the title ~~PLANNING ISSUE: Autosuggested search results~~ Brainstorming search results (autosuggested and on results page) Mar 26, 2018

ebarry changed the title ~~Brainstorming search results (autosuggested and on results page)~~ Brainstorming search results that are autosuggested and shown on results page Mar 26, 2018

stefannibrasil mentioned this issue Sep 5, 2018

Apply UX concepts to the search in navbar #3295

Closed

stefannibrasil mentioned this issue Sep 22, 2018

Search page #3357

Merged

stefannibrasil closed this as completed Sep 22, 2018
