add natural language image search (embed img into vector and index) #89

Merged: 10 commits, Feb 10, 2024

@Antonoko Antonoko commented Jan 1, 2024

Search screen content using natural-language semantics

#20

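  • Share the same webui as global search (df table and video selection)
  • Link the vector database to the sqlite records (joined by ROWID)
  • Implement the automatic indexing mechanism (keep the iframe cache around longer and let the image-embedding step clean it up; the results of the iframe extraction method are read with opencv)
  • Provide a script for manually indexing previously recorded data
  • onboarding setting
  • i18n
  • Implement the core embedding functionality and the vector database transactions
  • Automatically install the matching inference environment for python 3.10 or 3.11
  • Switch to onnx models for inference (beyond my ability for now; will revisit once upstream updates)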
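
For reviewers, the ROWID link between the vector store and sqlite could look roughly like the sketch below. It is an illustration under assumptions, not the code in this PR: the `frames` table, its columns, `TinyVectorIndex`, `add_frame`, and `search_frames` are all hypothetical names, and a brute-force cosine search stands in for whatever vector database the PR actually uses.

```python
import math
import sqlite3


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class TinyVectorIndex:
    """Hypothetical stand-in for a vector database: maps a sqlite ROWID to an embedding."""

    def __init__(self):
        self._vectors = {}

    def add(self, rowid, vector):
        self._vectors[rowid] = list(vector)

    def search(self, query, k=3):
        # Rank every stored vector by similarity to the query, best first.
        ranked = sorted(
            self._vectors.items(),
            key=lambda item: cosine(query, item[1]),
            reverse=True,
        )
        return [rowid for rowid, _ in ranked[:k]]


# Hypothetical schema; the real table and column names may differ.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE frames (videofile_name TEXT, timestamp INTEGER)")
index = TinyVectorIndex()


def add_frame(videofile_name, timestamp, embedding):
    """Insert the metadata row, then store the embedding under the same ROWID."""
    cur = conn.execute(
        "INSERT INTO frames (videofile_name, timestamp) VALUES (?, ?)",
        (videofile_name, timestamp),
    )
    index.add(cur.lastrowid, embedding)
    return cur.lastrowid


def search_frames(query_embedding, k=3):
    """Return the metadata rows of the k nearest embeddings, best match first."""
    rows = []
    for rowid in index.search(query_embedding, k):
        row = conn.execute(
            "SELECT rowid, videofile_name, timestamp FROM frames WHERE rowid = ?",
            (rowid,),
        ).fetchone()
        if row is not None:  # skip vectors whose sqlite row has been deleted
            rows.append(row)
    return rows
```

Keying the vectors by ROWID keeps the vector store free of metadata: a search returns ids only, and everything shown in the df table still comes from sqlite.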

fixed: #98

tweak

  • Fix the pagination state not being reset on a new search
  • Add with st.spinner while searching
  • Add a lock around the indexing process
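
One way to picture the indexing lock is the sketch below. It assumes an in-process `threading.Lock`; the PR may equally use a lock file so that separate processes are covered. `run_indexing` and `index_one_batch` are hypothetical names.

```python
import threading

# A single module-level lock shared by every caller in this process.
_index_lock = threading.Lock()


def run_indexing(index_one_batch):
    """Run index_one_batch() unless an indexing pass is already in progress.

    Returns True if the batch ran, False if it was skipped because the
    lock was held by someone else.
    """
    # Non-blocking acquire: skip this pass instead of queueing behind the other one.
    if not _index_lock.acquire(blocking=False):
        return False
    try:
        index_one_batch()
        return True
    finally:
        # Always release, even if the batch raised.
        _index_lock.release()
```

Skipping rather than waiting suits an idle routine: the next idle tick simply tries again.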

@Antonoko Antonoko marked this pull request as ready for review January 21, 2024 14:46
@Antonoko Antonoko requested a review from jpswing January 21, 2024 14:46

Antonoko commented Jan 21, 2024

This is a huge change; I recommend reviewing only the final files changed. 🥴
All functional behavior has been tested over multiple rounds.

  • fix: the poetry env can't seem to enable pytorch cuda (uform uses cuda to accelerate embedding; cuda works in my PC's env but does not behave the same in the virtual env)
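
To compare the two environments, a small check like the following can be run once with the system interpreter and once inside the poetry env. It is only a diagnostic sketch; `cuda_status` is a hypothetical helper name, and it relies on `torch.cuda.is_available()` and `torch.version.cuda`, the latter being `None` on CPU-only builds.

```python
import importlib.util


def cuda_status():
    """Describe whether the torch build in the current environment can use CUDA."""
    if importlib.util.find_spec("torch") is None:
        return "torch is not installed in this environment"
    import torch

    built_for = torch.version.cuda  # None when torch is a CPU-only build
    if torch.cuda.is_available():
        return f"CUDA available (torch {torch.__version__}, CUDA {built_for})"
    return f"CUDA unavailable (torch {torch.__version__}, built for CUDA: {built_for})"


if __name__ == "__main__":
    print(cuda_status())
```

If the virtual env reports a build with no CUDA version, the likely cause is that a CPU-only torch wheel was resolved there, though that is a guess rather than something confirmed in this thread.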

@Antonoko Antonoko marked this pull request as draft January 22, 2024 09:54
@Antonoko Antonoko marked this pull request as ready for review January 22, 2024 12:42

Antonoko commented Jan 23, 2024

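Update:

Let's make this an optional install for now rather than a feature bundled by default.


The cuda-enabled torch build is far too large!! We may need to consider using onnx for inference 😔

edit: reply from the uform developers: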

We didn't attach the example of use in UForm's repo yet.  But it should be as easy as it is in the Docs (https://onnxruntime.ai/docs/api/python/api_summary.html)

Also, you can try User's PR (https://github.com/unum-cloud/uform/pull/57). But I will need to polish that before merging (probably it will be a part of our next release).

https://huggingface.co/unum-cloud/uform-coreml-onnx

@Antonoko Antonoko marked this pull request as draft January 23, 2024 06:19
@Antonoko Antonoko force-pushed the embed-image-vector-index branch 2 times, most recently from 3811343 to 4f306d6 Compare February 3, 2024 09:08
@Antonoko Antonoko marked this pull request as ready for review February 3, 2024 16:20
@Antonoko Antonoko force-pushed the embed-image-vector-index branch from 39df31b to d055123 Compare February 9, 2024 14:25
add natural language image search

tweak manager running logic

add: log

improve img emb search load and ux

add: lock during img embedding

tweak onboard setting

fix: handle error with no search results

Update embedding_img_for_all_videofiles.py

add i18n

add: webui img embed search

add manual img emb script

add idle routine

refactor video file on disk checking

eliminate unnecessary endswith

add: video file embedding process

remove random walk, tweak code, WIP, sorry.

add db_get_row_from_vid_filename

tweak webui

add image search webui

tweak search ux; add image embed lib

add install script

update onboard setting

add extension readme

Update languages.json
@Antonoko Antonoko force-pushed the embed-image-vector-index branch from d055123 to 8ab6320 Compare February 9, 2024 14:38
@Antonoko Antonoko merged commit 1036329 into main Feb 10, 2024
@Antonoko Antonoko deleted the embed-image-vector-index branch February 10, 2024 05:22
Development

Successfully merging this pull request may close these issues.

bug: the timing in global search is inaccurate and has no effect