Redesign "related notes" queries to be smarter #13

Closed
Fastie opened this issue Nov 13, 2013 · 10 comments
Labels
bug: the issue is regarding one of our programs which faces problems when a certain task is executed
enhancement: explains that the issue is to improve upon one of our existing features
outreach: issues involve community involvement and helping people who're stuck somewhere

Comments

@Fastie

Fastie commented Nov 13, 2013

On place pages (wikis) with "tabbed:notes" the "research" tab displays related research notes. These are not just notes that are tagged with that place name, and sometimes seemingly unrelated notes are included. For example, notes tagged with "chapter" always seem to be included for every place. The Jerusalem page includes notes from lots of places.

Just curious.

@jywarren
Member

jywarren commented Nov 13, 2013

yeah -- i think we need to distinguish "special tags" somehow or risk this
kind of generalization of relationships. Maybe it should be "type:chapter"
and/or related notes should ignore all powertags (those with colons).

@jywarren
Member

jywarren commented Apr 4, 2014

So to open this one back up -- the jerusalem page has lots of notes from other places because it's tagged with kite-mapping, so it gets all notes tagged kite-mapping. Related notes is "greedy" -- it accepts any note that matches any one of the page's tags. If we made it require all of the listed tags, we wouldn't get anything, and ranking posts by how similar they are, based on the number of shared tags, is a computationally intensive database query (Bryan, correct me if you know an efficient way).

So solutions could be:

  • remove kite-mapping tag
  • make Related Posts act differently for chapter pages, by a) recognizing chapter or some similar power tag, b) not including other content tagged "chapter"
  • make a related:foo power tag which overrides the default greedy "related" list with a short list of tags you'd prefer to grab, like related:jerusalem|israel|palestine or something
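
As a rough illustration of the third option combined with the earlier idea of ignoring power tags, here is a minimal Python sketch. The function names are hypothetical and this is not the plots2 implementation; it only assumes that a power tag is any tag containing a colon and that `related:` values are separated by `|`, as in the example above.

```python
def is_power_tag(tag: str) -> bool:
    """Treat any tag containing a colon (e.g. 'tabbed:notes') as a power tag."""
    return ":" in tag


def tags_to_match(page_tags: list[str]) -> list[str]:
    """Return the tag names a page's related-notes list should match on.

    Hypothetical behavior: a 'related:a|b|c' power tag overrides the default;
    otherwise every ordinary (non-power) tag is used, as in the greedy list.
    """
    for tag in page_tags:
        if tag.startswith("related:"):
            names = tag[len("related:"):].split("|")
            return [name for name in names if name]
    return [tag for tag in page_tags if not is_power_tag(tag)]
```

With this sketch, a page tagged `jerusalem`, `kite-mapping` and `tabbed:notes` would match on the first two tags only, while adding `related:jerusalem|israel|palestine` would replace that list with the three names given.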

@btbonval
Member

btbonval commented Apr 4, 2014

Ranking posts for how similar they are is a super fun and interesting
prospect. It doesn't necessarily have to be computationally intensive as a
database query depending upon what is being used for ranking and how. Quite
possibly a normal relational lookup would cut through the database and
yield the results just fine, but we might want to make what is called a
prepared statement or see if there's a way to turn it into a view.
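
To make the "normal relational lookup" idea concrete, here is a small sketch using Python's built-in `sqlite3` module. The `tags` and `tagnodes` table names come from this thread, but the column names (`tid`, `name`, `nid`) and the sample rows are assumptions for illustration, not the real plots2 schema. It ranks notes by how many of the page's tags they share using a single `GROUP BY` query.

```python
import sqlite3


def rank_related(conn, page_tags, limit=10):
    """Return (nid, shared_tag_count) rows, most shared tags first."""
    if not page_tags:
        return []
    placeholders = ",".join("?" for _ in page_tags)
    sql = f"""
        SELECT tn.nid, COUNT(DISTINCT t.name) AS shared
        FROM tagnodes AS tn
        JOIN tags AS t ON t.tid = tn.tid
        WHERE t.name IN ({placeholders})
        GROUP BY tn.nid
        ORDER BY shared DESC, tn.nid ASC
        LIMIT ?
    """
    return conn.execute(sql, [*page_tags, limit]).fetchall()


def demo_connection():
    """Build a tiny in-memory database with made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE tags (tid INTEGER PRIMARY KEY, name TEXT UNIQUE);
        CREATE TABLE tagnodes (tid INTEGER, nid INTEGER);
    """)
    conn.executemany(
        "INSERT INTO tags VALUES (?, ?)",
        [(1, "jerusalem"), (2, "kite-mapping"), (3, "chapter")],
    )
    conn.executemany(
        "INSERT INTO tagnodes VALUES (?, ?)",
        [(1, 101), (2, 101), (2, 102), (3, 103)],
    )
    return conn
```

In the demo data, note 101 carries both `jerusalem` and `kite-mapping`, so it ranks above note 102, which only shares `kite-mapping`. Whether a query of this shape is fast enough on the real tables would still need to be measured.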

Even if we came up with an intensive algorithm that doesn't belong in the
database, we could always schedule it to run during off peak times and
cache the similarities in the database. I'm pretty sure this is what
Netflix does when it comes up with ratings.

I'm not sure that I have been terribly helpful. Why don't we try to think
about the math which might be used to define relatedness in postings and
see how terrible it would really be?
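
One candidate for that math, offered only as an example and not something proposed in this thread, is the Jaccard index: the number of tags two posts share divided by the number of distinct tags across both. A minimal Python sketch, with a hypothetical function name:

```python
def jaccard(tags_a, tags_b):
    """Jaccard similarity of two tag collections: |A & B| / |A | B|."""
    a, b = set(tags_a), set(tags_b)
    if not a and not b:
        return 0.0  # two untagged posts are treated as unrelated here
    return len(a & b) / len(a | b)
```

Unlike a raw shared-tag count, this penalizes posts that carry many unrelated tags, so a broad tag such as `kite-mapping` alone would give a low score against a page with several specific tags.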

Alternatively, we could leave associations to the masses in true democratic
fashion and leave the machines out of it entirely. Something like the
related power tag, except that users generate them to relate notes to each
other.

@jywarren
Member

jywarren commented Apr 4, 2014

One thing I've thought about is flattening the tags tables to just tags
instead of tags and tagnodes. The new tags table would include tag names.
Bryan probably wouldn't like that but it'd save a join.

Anyhow yes there are lots of fun problems we could get better at... Related
content, but also optimized search!

Maybe we could start with a related tag but then do a longer term project
on algorithmic similarity?

@btbonval
Member

btbonval commented Apr 4, 2014

I'm not sure if saving the join necessarily yields an optimization (that
seems to be an assumption many people make which does not always hold).
It's easy enough to test, though. We can quickly plop in a new table,
populate it via a join of the two old tables, and then compare response
times of the old tables vs the new table using EXPLAIN.
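
A sketch of that experiment, using Python's built-in `sqlite3` as a stand-in for the production database. SQLite's `EXPLAIN QUERY PLAN` plays the role of `EXPLAIN` here; the column names, the `flat_tags` table name, and the generated rows are all assumptions for illustration, so any numbers it produces say nothing about the real tables.

```python
import sqlite3
import time

# Lookup against the two existing tables (assumed columns: tid, name, nid).
JOINED = ("SELECT tn.nid FROM tagnodes AS tn "
          "JOIN tags AS t ON t.tid = tn.tid WHERE t.name = ?")
# Lookup against the hypothetical flattened table.
FLAT = "SELECT nid FROM flat_tags WHERE name = ?"


def build(conn, n_tags=1000, n_links=50000):
    """Create the old tables with synthetic rows, then the flattened table."""
    conn.executescript("""
        CREATE TABLE tags (tid INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE tagnodes (tid INTEGER, nid INTEGER);
    """)
    conn.executemany("INSERT INTO tags VALUES (?, ?)",
                     [(i, f"tag{i}") for i in range(n_tags)])
    conn.executemany("INSERT INTO tagnodes VALUES (?, ?)",
                     [(n % n_tags, n) for n in range(n_links)])
    # Populate the new table via a join of the two old tables.
    conn.execute("""
        CREATE TABLE flat_tags AS
        SELECT tn.nid AS nid, t.name AS name
        FROM tagnodes AS tn JOIN tags AS t ON t.tid = tn.tid
    """)


def plan(conn, sql, params):
    """Return the human-readable steps of the query plan."""
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql, params)]


def timed(conn, sql, params, repeats=20):
    """Return the query's rows and its average wall-clock time per run."""
    start = time.perf_counter()
    for _ in range(repeats):
        rows = conn.execute(sql, params).fetchall()
    return rows, (time.perf_counter() - start) / repeats
```

Calling `plan(conn, JOINED, ("tag7",))` and `plan(conn, FLAT, ("tag7",))` shows how each lookup would be executed, and `timed` gives a rough response time for each. Indexes are deliberately left out of the sketch; adding them to either layout could change the comparison, which is part of why the assumption is worth testing.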

Are you proposing related tag in the sense that you originally said
(semi-automated) or the sense that I said (entirely user-driven)?
-Bryan

@jywarren
Member

jywarren commented Apr 4, 2014

I guess I meant user defined as the easier to implement first... followed
by automated if we can do better.

@jywarren added the "bug" label on Apr 1, 2015
@ebarry added the "outreach" label on Sep 8, 2015
@jywarren changed the title from 'The tag "chapter" has legs' to 'Redesign "related notes" queries to be smarter' on Mar 16, 2016
@ananyo2012 added the "enhancement" label on Nov 24, 2016
@ryzokuken
Member

@jywarren Let's round up all these older issues and check if they're still relevant considering the various upgrades and design changes plots2 has received over the years. (Nov 2013 :P)

@ryzokuken
Member

@jywarren PING

@jywarren
Member

Oops, sorry, missed this ping! I think we could propose:

However, #1706 means we may clean out the sidebars anyways. Will ask @ebarry what to do.

@jywarren
Member

This merging was completed by @milaaraujo and @stefannibrasil in #3107 and related issues in #1449 -- but we are removing the sidebar content in #2493 (or potentially making them default hidden, if someone is passionate about those -- we're hoping the new tag interface can help to perform this function)

However, we are looking for ways to find tags related to any given tag, so I'm going to close this and relate it to #3143 which has next steps and code hints. Thanks!!! (wow, closing #13... yipeee!)

6 participants