Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fuzzy Search #16586

Open
wants to merge 3 commits into
base: main
Choose a base branch
from
Open

Fuzzy Search #16586

wants to merge 3 commits into from

Conversation

e82eric
Copy link
Contributor

@e82eric e82eric commented Jan 22, 2024

Summary of the Pull Request

This pull request probably doesn't make sense to merge, but I found an open ticket for it and I had built a version for my personal build, so it felt like it was at least worth posting. If you do decide to implement something like this I would love to help work on it.

#5314

The fuzzy search implementation is from fzf-native. https://github.com/nvim-telescope/telescope-fzf-native.nvim

  • Adapted to work with ICU/UText

Searching Terminal Repository
WindowsTerminal_1LQ4mBJENo

Searching .net memory dump
Fuzzy-cdb

References and Relevant Issues

#5314

Detailed Description of the Pull Request / Additional comments

Validation Steps Performed

PR Checklist

  • Closes #xxx
  • Tests added/passed
  • Documentation updated
    • If checked, please file a pull request on our docs repo and link it here: #xxx
  • Schema updated (if necessary)

This comment has been minimized.

@zadjii-msft
Copy link
Member

but also, we're still wrapping up a couple internal commitments before we loop back to button up 1.20. I'm guessing just based on the scope of this delta that we may need to wait a bit before we can give this the review it very rightly deserves. Thanks for the patience here!

@DHowett
Copy link
Member

DHowett commented Jan 22, 2024

Holy s***!

@e82eric
Copy link
Contributor Author

e82eric commented Jan 22, 2024

Thanks! Totally understand, no rush on my end.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

@e82eric e82eric marked this pull request as draft February 29, 2024 08:34
Copy link

github-actions bot commented Mar 5, 2024

@check-spelling-bot Report

🔴 Please review

See the 📂 files view, the 📜action log, or 📝 job summary for details.

Unrecognized words (55)
Automator
Barbazz
CFdanco
CFnco
CFSo
CFsodc
conni
danco
eidx
Extention
fbb
foob
fzf
fzfunittests
icns
Ingnore
Initalized
MAXASCII
Mfc
noooo
obz
pchar
pidx
previewer
rdoc
samb
sidx
sodc
stdbool
stddef
stdint
strdup
Sufix
uappend
uascii
ubonus
ucalculate
UCALL
udelim
ufzf
uhas
uidx
uindex
uis
uleading
unormalize
ustr
ustrtok
utrailing
utrim
utry
UWhite
Wrappable
zshc
zshcompctl
Previously acknowledged words that are now absent DESTINATIONNAME 🫥
To accept these unrecognized words as correct and remove the previously acknowledged and now absent words, you could run the following commands

... in a clone of the git@github.com:e82eric/terminal.git repository
on the fuzzy-search branch (ℹ️ how do I use this?):

curl -s -S -L 'https://raw.githubusercontent.com/check-spelling/check-spelling/v0.0.22/apply.pl' |
perl - 'https://github.com/microsoft/terminal/actions/runs/8152048091/attempts/1'
Available 📚 dictionaries could cover words (expected and unrecognized) not in the 📘 dictionary

This includes both expected items (2256) from .github/actions/spelling/expect/04cdb9b77d6827c0202f51acd4205b017015bfff.txt
.github/actions/spelling/expect/alphabet.txt
.github/actions/spelling/expect/expect.txt
.github/actions/spelling/expect/web.txt and unrecognized words (55)

Dictionary Entries Covers Uniquely
cspell:k8s/dict/k8s.txt 153 2 2
cspell:swift/src/swift.txt 53 1 1
cspell:gaming-terms/dict/gaming-terms.txt 59 1 1
cspell:monkeyc/src/monkeyc_keywords.txt 123 1 1
cspell:cryptocurrencies/cryptocurrencies.txt 125 1 1

Consider adding them (in .github/workflows/spelling2.yml) for uses: check-spelling/check-spelling@v0.0.22 in its with:

      with:
        extra_dictionaries:
          cspell:k8s/dict/k8s.txt
          cspell:swift/src/swift.txt
          cspell:gaming-terms/dict/gaming-terms.txt
          cspell:monkeyc/src/monkeyc_keywords.txt
          cspell:cryptocurrencies/cryptocurrencies.txt

To stop checking additional dictionaries, add (in .github/workflows/spelling2.yml) for uses: check-spelling/check-spelling@v0.0.22 in its with:

check_extra_dictionaries: ''
Errors (2)

See the 📂 files view, the 📜action log, or 📝 job summary for details.

❌ Errors Count
❌ check-file-path 15
❌ ignored-expect-variant 3

See ❌ Event descriptions for more information.

✏️ Contributor please read this

By default the command suggestion will generate a file named based on your commit. That's generally ok as long as you add the file to your commit. Someone can reorganize it later.

If the listed items are:

  • ... misspelled, then please correct them instead of using the command.
  • ... names, please add them to .github/actions/spelling/allow/names.txt.
  • ... APIs, you can add them to a file in .github/actions/spelling/allow/.
  • ... just things you're using, please add them to an appropriate file in .github/actions/spelling/expect/.
  • ... tokens you only need in one place and shouldn't generally be used, you can add an item in an appropriate file in .github/actions/spelling/patterns/.

See the README.md in each directory for more information.

🔬 You can test your commits without appending to a PR by creating a new branch with that extra change and pushing it to your fork. The check-spelling action will run in response to your push -- it doesn't require an open pull request. By using such a branch, you can limit the number of typos your peers see you make. 😉

If the flagged items are 🤯 false positives

If items relate to a ...

  • binary file (or some other file you wouldn't want to check at all).

    Please add a file path to the excludes.txt file matching the containing file.

    File paths are Perl 5 Regular Expressions - you can test yours before committing to verify it will match your files.

    ^ refers to the file's path from the root of the repository, so ^README\.md$ would exclude README.md (on whichever branch you're using).

  • well-formed pattern.

    If you can write a pattern that would match it,
    try adding it to the patterns.txt file.

    Patterns are Perl 5 Regular Expressions - you can test yours before committing to verify it will match your lines.

    Note that patterns can't match multiline strings.

OpenConsole.sln Outdated
@@ -2806,6 +2810,86 @@ Global
{328729E9-6723-416E-9C98-951F1473BBE1}.Release|ARM64.ActiveCfg = Release|ARM64
{328729E9-6723-416E-9C98-951F1473BBE1}.Release|x64.ActiveCfg = Release|x64
{328729E9-6723-416E-9C98-951F1473BBE1}.Release|x86.ActiveCfg = Release|Win32
{A52785FE-216E-4B64-9634-84172E083DB0}.AuditMode|Any CPU.ActiveCfg = Debug|x64
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure how to keep visual studio from doing this.

@e82eric e82eric marked this pull request as ready for review March 5, 2024 08:06
@zadjii-msft
Copy link
Member

okay so what if we did this as a type of pane content, instead of as a part of the control itself? It could be like, the "Find All" pane or something (which is usually different in VsCode / sublime compared to the usual Find UI)

this has been shower thoughts with mike

@e82eric
Copy link
Contributor Author

e82eric commented Apr 29, 2024

okay so what if we did this as a type of pane content, instead of as a part of the control itself? It could be like, the "Find All" pane or something (which is usually different in VsCode / sublime compared to the usual Find UI)

this has been shower thoughts with mike

Trying to get my head around this, by pane do you mean something like the docked areas in visual studio?

@zadjii-msft
Copy link
Member

by pane do you mean something like the docked areas in visual studio

Sorta, yea. In the chain of PRs that terminated in #16914, we added support for the Terminal to have panes that included things that weren't just TermControl's - they could be anything.
image

So like, we could totally make a FuzzyFindPaneContent in the same manner. I'm sure there's some way we could like, stick a TermControl as a preview into that pane too.

(i cannot stress enough, this was a half baked shower thought)

@e82eric
Copy link
Contributor Author

e82eric commented Apr 29, 2024

by pane do you mean something like the docked areas in visual studio

Sorta, yea. In the chain of PRs that terminated in #16914, we added support for the Terminal to have panes that included things that weren't just TermControl's - they could be anything. image

So like, we could totally make a FuzzyFindPaneContent in the same manner. I'm sure there's some way we could like, stick a TermControl as a preview into that pane too.

(i cannot stress enough, this was a half baked shower thought)

got it, it should work, that is similar to how I had it implemented at first, it was a xaml row below the swap panel control, and I was scrolling the swap panel to see the selected match (the scrolling wasn't necessary it was just nice to see the context rows around the selected match while in the list box).

the main downside was that when I was done with the fuzzy search I wanted the screen space back so I was hiding the row after the search, which mostly worked but cdb and a few other TUI applications would do weird things with the cursor with all the reflows (The reflows wouldn't be an issue if the pane doesn't auto hide).

having it in another pane would would make the code a lot simpler (mostly not having the extra renderer/swap panel for the previewer).

@e82eric
Copy link
Contributor Author

e82eric commented Apr 30, 2024

@zadjii-msft not sure if it helps, this is what it looked like when I had the search control below the swap panel. I think visually that would be similar to a pane (without the auto show/hide stuff)

fuzzy_pane

@zadjii-msft
Copy link
Member

Honestly yea that looks almost exactly like what I was thinking. It's probably a lot of plumbing to get it up and into a separate Pane, but I'm thinking that might be a bit cleaner architecture-wise

@e82eric
Copy link
Contributor Author

e82eric commented Apr 30, 2024

Makes sense, I think the main thing is that the pane would need a reference to the Terminal instance (or some proxy/interface) to read the contents of the buffer, set the selected char and optionally perform scrolling to show the current item from the list box in the swap panel.

I think selecting the char had a dependency on the Renderer also.

@zadjii-msft
Copy link
Member

I don't think I'd be opposed to some winrt API's on TermControl.idl (and probably ControlCore.idl) to proxy through something like

struct SelectionRange {
  Windows.Foundation.Point Start;
  Windows.Foundation.Point End;
}

runtimeclass TermControl {
  // ...
  Vector<SelectionRange> GetPatternMatches(String pattern);

  void SelectRange(SelectionRange range); // basically just a projection of ControlCore::_selectSpan
}

@e82eric
Copy link
Contributor Author

e82eric commented May 1, 2024

@zadjii-msft

I can give it a try and see how it goes. Its probably a long shot that it would be something you would want to merge but it feels like the exercise would be worth while to test the Pane abstraction.

  • I would use ScratchPad as an example right?

Will this work for TermControl?

//HighlightedTextSegment is from the file below.  I am not sure if I would have access to it, if not I could create a similar set of clasess.
//I need the text spans to show the parts of the row that make up the fuzzy match
//https://github.com/microsoft/terminal/blob/32fbb16d43ddfcdd039a61c5af051a16b972faeb/src/cascadia/TerminalApp/HighlightedText.h#L11
struct FuzzySearchMetch {
  HighlightedTextSegment Text;
  //I think the start is the only part of match that is useful for selection since it is n number of spans across the row.  Selecting the first char in the match seems to get me where I want to go.  (This looked like how fzf and Telescope work)
  Windows.Foundation.Point Start;
}

runtimeclass TermControl {
  Vector<FuzzySearchMatch> GetPatternMatches(String pattern);
  void SelectPoint(SelectionPoint point);
}

For Vector<FuzzySearchMatch> GetPatternMatches(String pattern) These are the parts that I think I would need. (Would this work or would it be too invasive?)

  • Somewhere that has access to GetTextBuffer(). I think this would be ControlCore?
  • This iterates over all of the rows in the textbuffer and scores them.
    std::vector<FuzzySearchResultRow> Search::FuzzySearch(Microsoft::Console::Render::IRenderData& renderData, const std::wstring_view& needle) const
  • I needed this additional function in UTextAdapter to be able to support unicode, (before I was converting from wchar to char, I think this also keeps from copying the rows during the search)
    UText Microsoft::Console::ICU::UTextForWrappableRow(const TextBuffer& textBuffer, til::CoordType& row) noexcept
  • This is the library that I used for the native implementation of FZF, this may be a non starter for merging. I think pretty much everyone that uses neovim ends using this, so it should be battle tested, but I don't know how it would fit with Microsoft's security policies etc. https://github.com/nvim-telescope/telescope-fzf-native.nvim (I adapted it to work with ICU/UText)
    fzf_result_t ufzf_fuzzy_match_v2(bool case_sensitive, bool normalize,

@e82eric
Copy link
Contributor Author

e82eric commented May 7, 2024

@zadjii-msft I was able to get a really basic version of the fuzzy search working in the pane.

I forgot to ask, was the idea that the fuzzy search would search across all TermControls?

This is what the diff looked like.
e82eric@89a4e14

FuzzySearchPane

@zadjii-msft
Copy link
Member

image

@zadjii-msft
Copy link
Member

I forgot to ask, was the idea that the fuzzy search would search across all TermControls

Let's save that for a v2? Cause this is already awesome

@e82eric
Copy link
Contributor Author

e82eric commented May 13, 2024

@zadjii-msft Any chance you guys are thinking about adding the ability to float a pane? I was trying it out this weekend and it seemed pretty nice, for the fuzzy search it got out of the way after the search without having to reflow, for the TermControl it was nice to be able to do something temporary and have it get out the way (I actually think I found this more useful than the fuzzy search).

Fuzzy Search pane as float.
FloatFuzzy

TermControl pane as float.
FloatTerm

@zadjii-msft
Copy link
Member

Ha you've figured out my plan! I was absolutely planning that (in a bit longer term).

We've got this QueryPalette element we're using currently for Terminal Chat:

image

and part of my eventual goal is to unify that floating "pane" with the rest of our Pane code, s.t. arbitrary panes could "float" like that too, or start as a floating pane, then get pinned as one in the tree, or.... who knows what yet. We still need to better sort out how some of those UI flows would work, but it's definitely something I've been thinking about.

@e82eric
Copy link
Contributor Author

e82eric commented May 14, 2024

Awesome! I was surprised how nice it was having the termcontrol in a float windows.

I'm sure there's some way we could like, stick a TermControl as a preview into that pane too.

I was going to attempt adding a preview window using a TermControl just to see if it is possible or not, do I have it right that I would use the same instance of the ControlCore from main TermControl to create the preview TermControl?

@zadjii-msft
Copy link
Member

do I have it right that I would use the same instance of the ControlCore from main TermControl to create the preview TermControl

Yes, ish. Having multiple different TermControl's attached to the same ControlCore at the same time is not something the codebase is super well suited for right now. I actually experimented with something like this just a couple months ago. I left my very haphazard notes over in #16495 (comment). I'd warn that those notes are very rough and likely stale. But I think the basic idea - having a bunch of TermControl's that can all connect to the same ControlCore - is probably pretty similar.

(it's also probably way past overkill for a v0 of this PR, but something you could play with)


I'll probably have more thoughts to share on this PR in like two weeks (after Build is done next week). Sorry for the delays!

@e82eric
Copy link
Contributor Author

e82eric commented May 14, 2024

Thanks! That gets me pointed in the right direction.

No rush on reviewing the PR.

@e82eric
Copy link
Contributor Author

e82eric commented May 29, 2024

That worked. (Adding the screenshot for reference, agree that it is overkill for v0)
FuzzyPaneWithPreview

@zadjii-msft zadjii-msft mentioned this pull request May 31, 2024
20 tasks
@zadjii-msft
Copy link
Member

I was able to get a really basic version of the fuzzy search working in the pane.

I forgot to ask, was the idea that the fuzzy search would search across all TermControls?

This is what the diff looked like. e82eric@89a4e14

Looping back around - was there a reason this Pane version wasn't pushed as the tip of this PR? I think that's probably a lot closer to merging than all this IRenderData stuff.

(I'm collecting up other notes too, but I wanted to make sure that question was answered first)

@e82eric
Copy link
Contributor Author

e82eric commented Jun 3, 2024

I was able to get a really basic version of the fuzzy search working in the pane.
I forgot to ask, was the idea that the fuzzy search would search across all TermControls?
This is what the diff looked like. e82eric@89a4e14

Looping back around - was there a reason this Pane version wasn't pushed as the tip of this PR? I think that's probably a lot closer to merging than all this IRenderData stuff.

(I'm collecting up other notes too, but I wanted to make sure that question was answered first)

@zadjii-msft sorry, I got distracted by the float plane and preview window stuff. Will update the PR to use e82eric@89a4e14 tonight.

@zadjii-msft
Copy link
Member

sorry, I got distracted by the float plane and preview window stuff. Will update the PR to use e82eric@89a4e14 tonight.

No worries! Just wanted to make sure there wasn't something else I was missing.

I'll write up the rest of our notes in the meantime ☺️

@e82eric
Copy link
Contributor Author

e82eric commented Jun 4, 2024

sorry, I got distracted by the float plane and preview window stuff. Will update the PR to use e82eric@89a4e14 tonight.

No worries! Just wanted to make sure there wasn't something else I was missing.

I'll write up the rest of our notes in the meantime ☺️

Should be updated now.

@zadjii-msft
Copy link
Member

We talked about this as a team, and we're pretty excited about it. Especially the pane version, which seems a lot cleaner than before.

Our biggest concern right now is that we think that particular fzf implementation's license isn't quite right. The original fzf has a "Copyright (c) 2013-2024 Junegunn Choi" which is missing in the telescope-fzf-native.vim C port LICENSE. It just says "Copyright (c) 2021 Simon Hauser". Since it's a "derivative work" it's a license violation (it's a source port), because the 2nd paragraph in the MIT license states:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

We can't fix that violation ourselves for the same reason, because the 2nd paragraph compels us to reproduce the license exactly, which we can, but it's an invalid one.
----> We need to contact telescope-fzf-native.nvim to fix their LICENSE file.
The right way to do it is to simply have 2 "Copyright ..." lines. For reference: https://softwareengineering.stackexchange.com/a/386584

We also wanna make sure that the fzf implementation handles non-ascii chars correctly, which we're not totally sure about. Polishing our support for unicode is one of our highest priorities right now - I'm not sure that we'd be able to take this out of canary-only until we can make sure that it'll better support unicode. (this is something we'll need to do diligence on our side)

I think @lhecker will have additional feedback on the particular UText integration into fzf. E.g. ufzf_string_t should just be a UText* ; utextSlice_charAt may return U_SENTINEL (= 0xffffffff) which may be given to functions that can't deal with that = random behavior).

But big picture: this is super cool. Let's work together to get this across the finish line 😄

@e82eric
Copy link
Contributor Author

e82eric commented Jun 4, 2024

Awesome, I am excited to get to work on this. Side note I have learned a ton from this and even if it doesn't get merged or out of the canary build it will have been time well spent for me.

Totally understand on the license. Is there anything I can do to help?

For the unicode stuff, any chance there is a list of chars/scenerios that need to be tested? I had added a test project in the old PR that ported the telescope-native-fzf tests that I used to validate that stuff still worked after the Utext changes. I also added some tests to see if it worked with some chars that I think were unicode, (it looked like I mostly used \U0001F600: 😀 \u01C5: Dž \u00CF: Ĩ).

If that type of test works I can add that test project to the new PR and try to add tests for all of the unicode scenerios.
https://github.com/e82eric/terminal/blob/ae97bbd1819e5fa183125d30185a3d51dd343384/src/buffer/fzf.Unit.Tests/fzf.Unit.Tests.cpp#L124

Will start looking at the ufzf_string_t/UText* stuff tonight.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants