MLE-22708 - Adds vector encoding/decoding utilities #111

Merged on Jul 11, 2025 (1 commit)

@BillFarber (Contributor) commented Jul 11, 2025

Implements utilities for encoding and decoding vectors to/from base64 strings, mirroring MarkLogic's vec:base64-encode and vec:base64-decode functions.

  • Enables interoperability between Python and MarkLogic for vector data.
  • Includes tests to verify encoding and decoding functionality in Python and against a MarkLogic server.

About 80% of this was generated by Copilot. I only had to fix the tests that send requests to MarkLogic.

@Copilot Copilot AI left a comment


Pull Request Overview

Implements Python utilities to encode and decode float vectors as base64 strings compatible with MarkLogic’s vec:base64-encode and vec:base64-decode functions, adds corresponding unit and integration tests, and updates the Docker Compose configuration for the test environment.

  • Introduces VectorUtil with base64_encode/base64_decode methods in marklogic/vector_util.py
  • Adds Python and server-backed tests in tests/test_vector_util.py
  • Updates test-app/docker-compose.yml to rename the test service and use the MarkLogic 12 image
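
For readers who want a feel for what such a utility involves, here is a minimal sketch, not the code merged in this PR. It assumes a binary layout of a little-endian 32-bit version (0), a little-endian 32-bit dimension count, and then one little-endian 32-bit float per element. That layout, and the function names `encode_vector_sketch` and `decode_vector_sketch`, are assumptions made for illustration; check marklogic/vector_util.py and the MarkLogic documentation for the actual format.

```python
import base64
import struct
from typing import List

# Assumption: the header is two little-endian int32 values (version, dimensions).
HEADER = struct.Struct("<ii")
# Assumption: version 0 is the only supported version.
SUPPORTED_VERSION = 0


def encode_vector_sketch(vector: List[float]) -> str:
    """Pack the header and float32 values, then base64-encode the bytes."""
    header = HEADER.pack(SUPPORTED_VERSION, len(vector))
    body = struct.pack(f"<{len(vector)}f", *vector)
    return base64.b64encode(header + body).decode("ascii")


def decode_vector_sketch(encoded: str) -> List[float]:
    """Reverse encode_vector_sketch, validating the header first."""
    raw = base64.b64decode(encoded)
    if len(raw) < HEADER.size:
        raise ValueError("payload is too short to contain a header")
    version, dimensions = HEADER.unpack_from(raw, 0)
    if version != SUPPORTED_VERSION:
        raise ValueError(f"unsupported vector version: {version}")
    if dimensions < 0 or len(raw) != HEADER.size + 4 * dimensions:
        raise ValueError("payload length does not match the dimension count")
    return list(struct.unpack_from(f"<{dimensions}f", raw, HEADER.size))
```

Values are stored as 32-bit floats in this sketch, so a round trip only returns bit-identical numbers when the inputs are exactly representable in float32. Otherwise, compare with a tolerance.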

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

File | Description
tests/test_vector_util.py | New tests for encoding/decoding vectors in Python and via server
test-app/docker-compose.yml | Renamed test service and updated MarkLogic image reference
marklogic/vector_util.py | Added base64 encoding/decoding utility for float vectors

Comments suppressed due to low confidence (2)

tests/test_vector_util.py:24

  • Consider adding a test case that passes a base64 string with a non-zero version header to base64_decode and asserts that a ValueError is raised, to cover the unsupported version path.
    for a, b in zip(decoded, VECTOR):
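
The suggested test could look roughly like the sketch below. It is self-contained, so it defines a stand-in decoder (`decode_vector_sketch`, a hypothetical name) instead of importing the PR's `VectorUtil`, and it assumes the header is a little-endian 32-bit version followed by a little-endian 32-bit dimension count. Both the layout and the error message are assumptions. In the real test suite the call would go to `base64_decode`, and `pytest.raises(ValueError)` would be the more idiomatic assertion.

```python
import base64
import struct
from typing import List


def decode_vector_sketch(encoded: str) -> List[float]:
    """Stand-in decoder; assumes an int32 version and int32 dimension header."""
    raw = base64.b64decode(encoded)
    version, dimensions = struct.unpack_from("<ii", raw, 0)
    if version != 0:
        raise ValueError(f"unsupported vector version: {version}")
    return list(struct.unpack_from(f"<{dimensions}f", raw, 8))


def test_nonzero_version_is_rejected() -> None:
    # Build a payload whose version field is 1 instead of 0.
    payload = struct.pack("<ii", 1, 2) + struct.pack("<2f", 1.0, 2.0)
    encoded = base64.b64encode(payload).decode("ascii")
    try:
        decode_vector_sketch(encoded)
    except ValueError as exc:
        assert "version" in str(exc)
    else:
        raise AssertionError("expected ValueError for a non-zero version")
```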

test-app/docker-compose.yml:1

  • [nitpick] The commented-out original service name is no longer needed; removing it will reduce confusion and clean up the configuration.
# name: docker-tests-marklogic_python

@BillFarber force-pushed the feature/vectorEncodeDecode branch from df422b6 to dba9788 on July 11, 2025 00:22

@rjrudin (Contributor) left a comment

Looks good, just a couple renaming things and then merge away.

@BillFarber force-pushed the feature/vectorEncodeDecode branch from dba9788 to 0d9c324 on July 11, 2025 14:09
@BillFarber BillFarber merged commit e53ef8f into marklogic:develop Jul 11, 2025
1 check passed
@BillFarber BillFarber deleted the feature/vectorEncodeDecode branch July 11, 2025 14:10