Support /extract and /crawl for self-hosted #1137
This returns the job response from Redis rather than Supabase when DB auth is disabled (self-hosted mode).
@mogery or anyone else: can you share any thoughts on this self-hosting issue? I'm mainly wondering whether, in a self-hosted environment without DB authentication, returning from Redis (easier, and as implemented in this example) or from Supabase (much harder) would be preferred.
I personally agree that using Redis for self-hosted is much more desirable than Supabase. It keeps with the spirit of it being self-hosted.
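To make the choice under discussion concrete, here is a minimal sketch of routing a status lookup to Redis or Supabase depending on whether DB authentication is enabled. Every name in it (`ExtractStatus`, `JobStore`, `selectStore`, `getExtractStatus`, `memoryStore`, and the `useDbAuth` flag) is hypothetical and not taken from the PR. The in-memory store only stands in for a real Redis or Supabase client.

```typescript
// Hypothetical shape of a stored extract result; field names are illustrative.
interface ExtractStatus {
  id: string;
  status: "processing" | "completed" | "failed";
  data?: unknown;
}

// Any backend that can look up a job by ID.
interface JobStore {
  get(id: string): Promise<ExtractStatus | null>;
}

// Pick the backend: Supabase when DB auth is on, Redis when it is off
// (the self-hosted case discussed in this thread).
function selectStore(
  useDbAuth: boolean,
  redisStore: JobStore,
  supabaseStore: JobStore,
): JobStore {
  return useDbAuth ? supabaseStore : redisStore;
}

// Look up an extract job through whichever backend applies.
async function getExtractStatus(
  id: string,
  useDbAuth: boolean,
  redisStore: JobStore,
  supabaseStore: JobStore,
): Promise<ExtractStatus | null> {
  return selectStore(useDbAuth, redisStore, supabaseStore).get(id);
}

// In-memory stand-in for a real client, for illustration only.
function memoryStore(entries: Record<string, ExtractStatus>): JobStore {
  return { get: async (id) => entries[id] ?? null };
}
```

Keeping the branch in one small function means the status controllers would not need to know which backend answered, which may make the self-hosted path easier to review.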
This makes sense for self-hosted! What I'm more confused about is the manual retrieval via hgetall instead of using
You are absolutely right. I was mainly trying to understand the differences between getExtract and what I saw in the queue, and I had never come across this setup before. I think I will add I was having trouble, though, finding a type to describe the data from Supabase vs. the extract job returnValue. For example,
After creating
With these changes, calls like
I did go ahead and make use of getJob in crawl-status.ts and did not create the centralized completed-jobs.ts in lib. We can leave it as-is or I can make the other proposed changes. This could be merged as-is.
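For context on the hgetall question raised above, this sketch shows what manual retrieval involves: a raw Redis hash holds only strings, so the JSON-encoded fields have to be decoded by hand. The field names `name`, `data`, `returnvalue`, and `finishedOn` reflect my understanding of how BullMQ stores a job hash and should be treated as assumptions, and the `ParsedJob` shape and both helper functions are hypothetical. BullMQ's `Queue.getJob(id)` does this decoding for you, which is presumably why it is the simpler choice.

```typescript
// Raw result of HGETALL on a job key: every field value is a string.
type RawJobHash = Record<string, string>;

// Hypothetical decoded view of the fields we care about.
interface ParsedJob {
  name: string;
  data: unknown;
  returnValue: unknown;
  finishedOn: number | null;
}

// Decode a JSON-encoded hash field; fall back to the raw string if it is not JSON.
function parseJson(value: string | undefined): unknown {
  if (value === undefined || value === "") return null;
  try {
    return JSON.parse(value);
  } catch {
    return value;
  }
}

// Turn a raw hash into a typed object, or null when the key did not exist.
// Redis clients typically return an empty object from HGETALL for a missing key.
function parseJobHash(hash: RawJobHash): ParsedJob | null {
  if (Object.keys(hash).length === 0) return null;
  return {
    name: hash.name ?? "",
    data: parseJson(hash.data),
    returnValue: parseJson(hash.returnvalue),
    finishedOn: hash.finishedOn ? Number(hash.finishedOn) : null,
  };
}
```

The decoded `returnValue` is still `unknown` here, which mirrors the typing difficulty mentioned earlier in the thread: the caller has to narrow it before use.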
The goal of this branch is to support retrieving the crawl and extract data even when there is no Supabase for persistent storage and DB authentication.
Things I'm not sure of:
Error When Getting Extract by ID (after completed)
Extract Job in Redis
Here is a completed job stored in Redis underneath bull:{extractQueue}:extract:${extract_job_id}. I noticed the completed extract exists here, so I updated extract-status.ts to fetch this data when DB auth is set to false.
Example Response from /extract/{id} After My Changes