-
Notifications
You must be signed in to change notification settings - Fork 11
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
restructure responsibilities around indexing and generative content #317
Conversation
Southclaws
commented
Dec 14, 2024
- remove autotag/summarise from index consumer
- fix some issues with auto tag
- refactor node mutation logic
- remove autotag/summarise from index consumer - fix some issues with auto tag - refactor node mutation logic
The latest updates on your projects. Learn more about Vercel for Git ↗︎ 2 Skipped Deployments
|
📝 Walkthrough📝 WalkthroughWalkthroughThis pull request introduces significant changes to the node management and processing system. The modifications span multiple files and services, focusing on enhancing node creation, updating, and indexing processes. Key changes include adding an Changes
Possibly related PRs
Poem
Tip CodeRabbit's docstrings feature is now available as part of our Early Access Program! Simply use the command Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 4
🧹 Outside diff range and nitpick comments (4)
app/services/library/node_mutate/update.go (1)
58-58
: Logging failure to publish index post messageLogging the error when failing to publish the index post message is appropriate. However, consider whether additional error handling or retries are necessary to ensure indexing consistency.
app/services/library/node_semdex/autofill.go (1)
16-35
: Add documentation for the autofill functionThe function implements important business logic for node autofill but lacks documentation explaining its purpose and behavior.
Add a doc comment explaining the function's purpose:
+// autofill updates a node with optional summarization and auto-tagging. +// It delegates the actual updates to the node updater while maintaining +// separation of concerns between indexing and content generation. func (i *semdexer) autofill(ctx context.Context, id library.NodeID, summarise bool, autotag bool) error {app/services/library/node_semdex/node_semdex.go (1)
149-169
: Consider adding graceful shutdown handling for the subscription goroutine.The new subscription goroutine for
autoFillQueue
should be properly cleaned up when the service shuts down. Consider using a context cancellation or a done channel.lc.Append(fx.StartHook(func(_ context.Context) error { + ctx, cancel := context.WithCancel(ctx) sub, err := autoFillQueue.Subscribe(ctx) if err != nil { return err } go func() { + defer cancel() for msg := range sub { ctx = session.GetSessionFromMessage(ctx, msg) if err := re.autofill(ctx, msg.Payload.ID, msg.Payload.SummariseContent, msg.Payload.AutoTag); err != nil { l.Error("failed to autofill node", zap.Error(err)) } msg.Ack() } }() return nil }))app/transports/http/bindings/nodes.go (1)
246-248
: Consider adding validation for tag fill rule.The tag fill rule handling could benefit from additional validation before applying it to the partial struct.
if tfr, ok := tagFillRuleParam.Get(); ok { + // Validate the fill rule before applying + if err := validateTagFillRule(tfr); err != nil { + return nil, fault.Wrap(err, fctx.With(ctx)) + } partial.TagFill = opt.New(tag.TagFillCommand{FillRule: tfr}) }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (13)
app/resources/mq/message_types.go
(1 hunks)app/services/generative/generative.go
(1 hunks)app/services/generative/summary.go
(2 hunks)app/services/library/node_fill/filler.go
(3 hunks)app/services/library/node_mutate/create.go
(2 hunks)app/services/library/node_mutate/mutate.go
(5 hunks)app/services/library/node_mutate/options.go
(1 hunks)app/services/library/node_mutate/update.go
(3 hunks)app/services/library/node_semdex/autofill.go
(1 hunks)app/services/library/node_semdex/indexer.go
(2 hunks)app/services/library/node_semdex/node_semdex.go
(7 hunks)app/transports/http/bindings/nodes.go
(3 hunks)web/src/screens/library/LibraryPageScreen/useLibraryPageScreen.ts
(2 hunks)
🔇 Additional comments (23)
app/services/library/node_mutate/create.go (2)
9-9
: Importing opt
package is appropriate
The addition of github.com/Southclaws/opt
is necessary for handling optional values in the code.
40-44
: Verify that preMutation
fully replaces applyOpts
Ensure that the new preMutation
method encompasses all functionalities previously handled by applyOpts
, to prevent any regression or loss of functionality.
Run the following script to check for any remaining references to applyOpts
:
✅ Verification successful
preMutation
fully replaces and enhances the functionality of applyOpts
Based on the comprehensive code review, the preMutation
method not only replaces but significantly enhances the previous functionality:
- It handles all basic field updates (name, slug, content, etc.)
- Adds advanced features like:
- Asset source handling
- URL fetching and link creation
- Tag management (add/remove)
- Content summarization
- Parent node resolution
- Properly structures the mutation options into a result type with additional metadata
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify that `applyOpts` has been fully removed from the codebase.
# Test: Search for any occurrences of `applyOpts`.
rg 'applyOpts' -A 2 -B 2
Length of output: 24
Script:
#!/bin/bash
# Let's try a more comprehensive search strategy
# 1. Search for any historical references to applyOpts in git history
git log -p --all | rg "applyOpts"
# 2. Search for the current implementation of preMutation to verify its functionality
ast-grep --pattern 'func $_($_ *service) preMutation($$$) {
$$$
}'
# 3. Search for any test files that might contain references
rg -t go "preMutation|applyOpts" --glob "*_test.go" -A 2 -B 2
Length of output: 134
Script:
#!/bin/bash
# Let's try a different approach focusing on the current file and its implementation
# 1. Look for the file content to understand the implementation
cat app/services/library/node_mutate/create.go
# 2. Search for any other files in the same directory that might contain related code
fd . app/services/library/node_mutate/ -t f -x cat {}
# 3. Search for preMutation usage across the codebase
rg "preMutation" -A 3 -B 3
Length of output: 20873
app/services/library/node_mutate/update.go (3)
44-49
: Usage of preMutation
in Update
method
The Update
method now utilizes preMutation
to prepare options. This change improves consistency with the Create
method. Ensure that preMutation
handles all necessary option preparations.
73-73
: Assigning TagSuggestions
from pre.tags
Setting TagSuggestions
from pre.tags
ensures that any generated tag suggestions are included in the response. This is appropriate and maintains consistency with the updated tagging process.
24-24
: Verify calls to Update
method reflect the new signature
The Update
method no longer accepts the options
parameter. Ensure that all calls to Update
have been updated to match the new signature.
Run the following script to find calls to Update
that may still use the old signature:
✅ Verification successful
Let me search for direct calls to the Update
method of the Manager
type in the node_mutate
package to verify if there are any callers still using the old signature.
Let me try a different search pattern to find calls to the Update method in the node_mutate package.
Based on the search results, I can now provide a final response:
All calls to the Update method have the correct signature
The search results show all calls to the Update
method in the codebase, and they all use the new signature with three parameters (ctx, qk, partial). Here are the verified calls:
app/services/asset/analyse/pdf.go
: UsesUpdate(ctx, library.QueryKey{}, node_mutate.Partial{})
app/services/library/node_semdex/autofill.go
: UsesUpdate(ctx, qk, p)
app/services/library/node_mutate/update.go
: Internal implementation and a call withUpdate(ctx, qk, pre.opts...)
app/transports/http/bindings/nodes.go
: Multiple calls, all usingUpdate(ctx, deserialiseNodeMark(), partial)
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Find calls to `Update` with the old method signature.
# Test: Search for `Update` calls with four or more arguments.
rg 'Update\([^,]+,[^,]+,[^,]+,[^,]+\)' -A 2 -B 2
Length of output: 26902
Script:
#!/bin/bash
# Search for direct calls to Manager.Update in node_mutate package
ast-grep --pattern 'Manager.Update($$$)'
# Also search using ripgrep to catch any potential calls
rg 'node_mutate.*Manager.*Update\(' -A 2 -B 2
Length of output: 90
Script:
#!/bin/bash
# Search for files that import node_mutate and look for Update calls
rg -l "node_mutate" | xargs rg "Update\(" -A 2 -B 2
# Also search specifically in the library package
rg -l "library" | xargs rg "node_mutate.*Manager.*Update" -A 2 -B 2
Length of output: 4338
app/services/library/node_mutate/mutate.go (3)
42-42
: Addition of TagFill
to Partial
struct
Including TagFill
in the Partial
struct allows for more flexible tag management during node mutations. Ensure that all uses of Partial
handle this new field correctly.
56-56
: Introduction of summariser
in Manager
struct
Adding the summariser
to Manager
integrates summarization capabilities into node mutations. Confirm that summariser
is properly initialized and utilized where necessary.
Line range hint 71-85
: Verify constructors are updated with new summariser
parameter
The New
constructor now requires a summariser
parameter. Ensure that all instantiations of Manager
pass this new parameter to prevent runtime issues.
Run the following script to find any constructor calls missing the summariser
argument:
✅ Verification successful
No issues with the summariser parameter in constructors
The verification shows that the dependency injection is properly set up:
- The
summariser
is provided via the generative package's fx setup (app/services/generative/generative.go) where it's registered asfx.As(new(Summariser))
- The
node_mutate.New
constructor is injected through fx.Provide in app/services/library/library.go - Since this is using the fx dependency injection framework, the framework ensures all required dependencies are provided at startup time, and will fail fast if the summariser is not properly configured
The dependency injection framework guarantees that the new summariser
parameter will be properly passed to all Manager
instantiations. No manual verification of constructor calls is needed as they are handled automatically by the fx framework.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Find `node_mutate.New` calls lacking the `summariser` parameter.
# Test: Search for `node_mutate.New` calls with fewer than expected arguments.
rg 'node_mutate\.New\(' -A 5 | awk '/node_mutate\.New\(/ {c=5} c&&c--'
Length of output: 72
Script:
#!/bin/bash
# Let's try a different approach to find Manager instantiations
# First, find all files that might contain node_mutate.New or direct Manager struct initialization
rg -l "node_mutate|Manager{"
# Then let's look for actual instantiations with more context
rg -B 2 -A 5 "(?:node_mutate\.New|Manager{)"
Length of output: 321
Script:
#!/bin/bash
# Let's try a simpler approach with proper escaping
# First find files containing node_mutate.New
rg -l "node_mutate\.New"
# Then look for the actual instantiations with context
rg -B 2 -A 5 "node_mutate\.New"
# Also search for direct Manager struct initialization
rg -B 2 -A 5 "Manager\{"
Length of output: 6465
Script:
#!/bin/bash
# Let's examine the dependency injection setup since we found node_mutate.New is provided via fx.Provide
# First, let's find where the summariser is provided
rg -l "Summariser|summariser"
# Then look for the actual provider setup
rg -B 2 -A 5 "(?i)summariser"
# Also check for any wire/fx setup files
fd -e go -x grep -l "fx\." {} \;
Length of output: 7848
app/services/library/node_mutate/options.go (5)
38-149
: Review preMutation
method for comprehensive option handling
The preMutation
method centralizes option preparation for node mutations. Ensure that all fields in Partial
are correctly mapped to their corresponding options and that error handling is thorough.
75-87
: Queue assets for analysis when ContentFill
is specified
The logic correctly queues assets for analysis when ContentFill
is provided. Confirm that assets are processed as expected and that potential errors are properly managed.
97-102
: Ensure correct content selection for pre-mutation tasks
The method selects content from the Partial
or the current node for tasks like tag suggestion and summarization. Verify that this logic correctly prioritizes new content over existing when appropriate.
105-117
: Validate tag filling logic
The handling of TagFill
respects the specified fill rule. When Replace
is used, tags are replaced with suggestions; otherwise, existing tags are preserved or augmented. Ensure this aligns with the desired tag management strategy.
207-219
: Handle empty content in buildTagSuggestionOpts
The method gracefully returns when content is empty, avoiding unnecessary processing. Error handling for the tagging process is also appropriate. Confirm that this behavior meets expected requirements.
app/services/generative/generative.go (1)
18-18
: Interface change requires verification of all implementations
The parameter type change from datagraph.Item
to datagraph.Content
is more specific and aligned with the content processing responsibility. However, this is a breaking change.
✅ Verification successful
Let me try a different approach to find implementations of the Summariser interface.
Let me check the implementation in the generator struct.
Let me check the previous implementation to confirm the change.
Based on the search results, I can now provide a final response:
Interface change appears to be properly implemented
The Summariser
interface is implemented by the generator
struct in the generative package, and its implementation in summary.go
already uses the new datagraph.Content
parameter type. The interface is only used internally within the same package and by node_mutate
package as a dependency injection.
- ✓ Implementation in
generator.Summarise
matches the interface - ✓ All usages of
Summariser
are properly wired through dependency injection - ✓ No other implementations of the interface were found in the codebase
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Find all implementations of the Summariser interface to ensure they're updated
ast-grep --pattern 'type $_ struct {
$$$
}
func ($_ $_) Summarise(ctx context.Context, $_) (string, error) {
$$$
}'
Length of output: 122
Script:
#!/bin/bash
# Search for struct types that implement Summarise method
rg -U "type.*struct.*\n.*\n.*Summarise.*context\.Context.*\).*\(string,.*error\)" --type go
# Also search for the interface usage to find where it might be implemented
rg "Summariser" --type go
Length of output: 479
Script:
#!/bin/bash
# Check the generator struct implementation
rg -A 10 "type generator struct" --type go
# Check the actual Summarise method implementation
rg -B 2 -A 10 "func \(.*generator\) Summarise\(" --type go
Length of output: 1552
Script:
#!/bin/bash
# Check git history for the previous implementation of the Summarise method
git log -p -- app/services/generative/summary.go | rg -B 5 -A 5 "func.*Summarise"
Length of output: 0
app/services/library/node_semdex/indexer.go (1)
Line range hint 14-32
: Clean separation of indexing from content generation
The simplified index
method now properly focuses on its core responsibility of indexing, with content generation concerns moved elsewhere. This aligns well with the PR objectives.
✅ Verification successful
Let me verify the context of getSummary
usage to ensure it's not related to the indexing functionality.
Verified: Indexing functionality successfully separated from content generation
The getSummary
occurrences found are in app/resources/datagraph/content.go
and are part of content processing logic, completely unrelated to the indexing functionality. The old summarization and tagging logic has been successfully removed from the indexer, making it more focused on its core responsibility.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Verify that old summarization and tagging logic is completely removed
rg -g '!*.md' -i 'getSummary|generateTags|applyTags'
Length of output: 207
Script:
#!/bin/bash
# Check the context of getSummary usage in content.go
rg -A 5 -B 5 'getSummary' app/resources/datagraph/content.go
Length of output: 639
app/resources/mq/message_types.go (2)
20-21
: LGTM: IndexNode ID field addition
The addition of the ID field to IndexNode struct provides clear identification of nodes for indexing operations.
Line range hint 23-27
: LGTM: Well-structured AutoFillNode type
The new AutoFillNode type is well-designed with clear boolean flags for controlling summarization and auto-tagging features.
app/services/generative/summary.go (2)
Line range hint 15-30
: LGTM: Improved summarization prompt
The updated prompt template provides:
- Clear instructions for handling malformed content
- Better guidance for maintaining neutral tone
- Explicit output format requirements
Line range hint 32-46
: LGTM: Improved type safety in Summarise method
The change from datagraph.Item
to datagraph.Content
provides better type safety and clearer intent.
app/services/library/node_fill/filler.go (3)
28-31
: LGTM: Well-structured Filler type
The addition of autoFillQueue is well-integrated with existing queue fields.
Line range hint 77-88
: LGTM: Proper error handling for asset queue
The error handling for asset queue publishing is properly implemented with fault wrapping.
90-97
: LGTM: Well-structured auto-fill queue publishing
The auto-fill queue publishing is properly implemented with appropriate error handling.
app/transports/http/bindings/nodes.go (1)
233-244
: LGTM! Well-structured partial object construction.
The construction of the node_mutate.Partial
struct is clear and organized, with proper use of optional types.
// failing to publish the deletion message is worthy of an error. | ||
return nil, fault.Wrap(err, fctx.With(ctx)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Return error on failure to publish deletion message
By returning an error when failing to publish the deletion message, you ensure that this critical failure is propagated. Confirm that this change aligns with the overall error handling strategy and that upstream functions handle this error appropriately.
ID: library.NodeID(library.NodeID(cfr.TargetNodeID)), | ||
SummariseContent: true, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix redundant type conversion
There's a redundant type conversion in the ID field assignment.
Apply this fix:
-ID: library.NodeID(library.NodeID(cfr.TargetNodeID)),
+ID: library.NodeID(cfr.TargetNodeID),
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
ID: library.NodeID(library.NodeID(cfr.TargetNodeID)), | |
SummariseContent: true, | |
ID: library.NodeID(cfr.TargetNodeID), | |
SummariseContent: true, |
ctx = session.GetSessionFromMessage(ctx, msg) | ||
|
||
if err := re.autofill(ctx, msg.Payload.ID, msg.Payload.SummariseContent, msg.Payload.AutoTag); err != nil { | ||
l.Error("failed to autofill node", zap.Error(err)) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Enhance error handling for session context retrieval.
The session context retrieval and autofill operations should handle potential errors more gracefully.
-ctx = session.GetSessionFromMessage(ctx, msg)
+newCtx, err := session.GetSessionFromMessage(ctx, msg)
+if err != nil {
+ l.Error("failed to get session from message", zap.Error(err))
+ msg.Ack()
+ continue
+}
+ctx = newCtx
if err := re.autofill(ctx, msg.Payload.ID, msg.Payload.SummariseContent, msg.Payload.AutoTag); err != nil {
l.Error("failed to autofill node", zap.Error(err))
+ // Consider implementing a retry mechanism or dead letter queue
}
Committable suggestion skipped: line range outside the PR's diff.
await linkCreate( | ||
{ url: link.url }, | ||
{ | ||
content: link.description, | ||
primary_image_asset_id: link.primary_image?.id, | ||
content_fill_rule: "replace", | ||
node_content_fill_target: node.id, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Improve error handling in linkCreate operation.
The linkCreate operation should handle potential errors more explicitly and provide user feedback.
await linkCreate(
{ url: link.url },
{
content_fill_rule: "replace",
node_content_fill_target: node.id,
+ }).catch((error) => {
+ console.error('Failed to create link:', error);
+ throw new Error('Failed to import link. Please try again.');
},
);
Committable suggestion skipped: line range outside the PR's diff.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Nitpick comments (8)
internal/infrastructure/ai/ai.go (2)
15-15
: Consider adding documentation for the EmbeddingFunc methodThe new interface method would benefit from documentation explaining its purpose, expected behavior, and return value characteristics.
+// EmbeddingFunc returns a function that generates vector embeddings for text input. +// The returned function should be thread-safe and suitable for concurrent use. EmbeddingFunc() func(ctx context.Context, text string) ([]float32, error)
37-39
: Consider adding error documentation for disabled implementationThe nil return might be unexpected. Consider documenting the behavior or returning a specific error.
func (d *Disabled) EmbeddingFunc() func(ctx context.Context, text string) ([]float32, error) { - return nil + return func(ctx context.Context, text string) ([]float32, error) { + return nil, ErrDisabled + } }internal/infrastructure/ai/mock.go (2)
34-35
: Document the embedding generation formulaThe magic numbers (256, 3071, 65535) in the embedding generation formula need documentation explaining their significance.
+// Generate a deterministic embedding value using: +// - 256: byte range for character normalization +// - 3071: prime factor for distribution +// - 65535: maximum uint16 for range limitation c := ((float32(v % 256)) / 256) * float32(((i+1)*3071)%65535)
43-56
: Consider optimizing vector normalizationThe normalization function could be optimized by:
- Pre-computing the squared values
- Using a single allocation
func normalizeVector(vec []float32) []float32 { var n float32 + r := make([]float32, len(vec)) for _, v := range vec { n += v * v } n = float32(math.Sqrt(float64(n))) - r := make([]float32, len(vec)) for i, v := range vec { r[i] = v / n } return r }internal/infrastructure/ai/openai.go (2)
16-17
: Document thread safety requirements for ef fieldThe embedding function field should be documented regarding its thread safety guarantees.
type OpenAI struct { client *openai.Client + // ef is a thread-safe function for generating text embeddings ef func(ctx context.Context, text string) ([]float32, error) }
46-48
: Consider adding context validation in EmbeddingFunc wrapperThe EmbeddingFunc method could benefit from additional context validation.
func (o *OpenAI) EmbeddingFunc() func(ctx context.Context, text string) ([]float32, error) { - return o.ef + return func(ctx context.Context, text string) ([]float32, error) { + if ctx == nil { + ctx = context.Background() + } + return o.ef(ctx, text) + } }app/services/semdex/semdexer/semdexer.go (1)
13-13
: LGTM! Clean dependency injection implementation.The changes effectively decouple the semantic indexer from specific AI implementations by introducing dependency injection through the
ai.Prompter
interface. This improves modularity and testability.Consider documenting the expected behavior of the
ai.Prompter
interface in the package documentation to help implementers understand the requirements.Also applies to: 23-23, 31-31
app/services/semdex/semdexer/chromem_semdexer/chromem.go (1)
35-36
: Consider enhancing error message clarity.The error message could be more specific about why the language model provider is required.
Consider this alternative error message:
- return nil, fault.New("a language model provider must be enabled for the embedded semdexer to be enabled") + return nil, fault.New("embedded semdexer requires an enabled language model provider for generating embeddings")
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
app/services/semdex/semdexer/chromem_semdexer/chromem.go
(2 hunks)app/services/semdex/semdexer/semdexer.go
(2 hunks)internal/infrastructure/ai/ai.go
(2 hunks)internal/infrastructure/ai/mock.go
(1 hunks)internal/infrastructure/ai/openai.go
(2 hunks)
🔇 Additional comments (3)
internal/infrastructure/ai/ai.go (1)
23-25
: LGTM: Mock provider addition
The addition of the mock provider case is well-placed and follows the existing pattern.
app/services/semdex/semdexer/chromem_semdexer/chromem.go (2)
20-20
: LGTM! Clean interface implementation.
The addition of the ai.Prompter
parameter and its import aligns with the dependency injection pattern established in the parent package.
Also applies to: 29-29
39-39
: Verify error handling for embedding function.
The embedding function is retrieved but there's no explicit error handling if the function returns nil or fails.
Let's check if the EmbeddingFunc
method can return nil or error:
for _, v := range text { | ||
for i := range mockEmbeddingSize { | ||
c := ((float32(v % 256)) / 256) * float32(((i+1)*3071)%65535) | ||
embedding[i] = c | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix the range loop in embedding generation
There's a bug in the embedding generation loop. The range should be over indices, not the size itself.
-for i := range mockEmbeddingSize {
+for i := 0; i < mockEmbeddingSize; i++ {
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
for _, v := range text { | |
for i := range mockEmbeddingSize { | |
c := ((float32(v % 256)) / 256) * float32(((i+1)*3071)%65535) | |
embedding[i] = c | |
} | |
} | |
for _, v := range text { | |
for i := 0; i < mockEmbeddingSize; i++ { | |
c := ((float32(v % 256)) / 256) * float32(((i+1)*3071)%65535) | |
embedding[i] = c | |
} | |
} |
client := openai.NewClient(option.WithAPIKey(cfg.OpenAIKey)) | ||
return &OpenAI{client: client}, nil | ||
ef := chromem.NewEmbeddingFuncOpenAI(cfg.OpenAIKey, chromem.EmbeddingModelOpenAI3Large) | ||
return &OpenAI{client: client, ef: ef}, nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add error handling for embedding function initialization
The embedding function initialization should include error handling as it might fail.
func newOpenAI(cfg config.Config) (*OpenAI, error) {
client := openai.NewClient(option.WithAPIKey(cfg.OpenAIKey))
- ef := chromem.NewEmbeddingFuncOpenAI(cfg.OpenAIKey, chromem.EmbeddingModelOpenAI3Large)
- return &OpenAI{client: client, ef: ef}, nil
+ ef, err := chromem.NewEmbeddingFuncOpenAI(cfg.OpenAIKey, chromem.EmbeddingModelOpenAI3Large)
+ if err != nil {
+ return nil, fault.Wrap(err, fctx.With(ctx))
+ }
+ return &OpenAI{client: client, ef: ef}, nil
}
Committable suggestion skipped: line range outside the PR's diff.