Commit 2c7b7b1

Manisha4, adchia, GyuminJack, Gyumin Lee, and bjfletcher authored
Msudhir/add vector update functionality (#14)
* ci: Add bigtable cleanup script Signed-off-by: Danny C <d.chiao@gmail.com> * fix: Missing Catalog argument in athena connector (feast-dev#3661) update Catalog argument in athena connector Signed-off-by: Gyumin Lee <t1100394@T1100394PM01.local> Co-authored-by: Gyumin Lee <t1100394@T1100394PM01.local> * ci: Disable flaky lambda materialization test Signed-off-by: Danny C <d.chiao@gmail.com> * fix: Broken non-root path with projects-list.json (feast-dev#3665) ensure correct precedence with the two operators Signed-off-by: Ben Fletcher <ben.fletcher@ft.com> * fix: Manage redis pipe's context (feast-dev#3655) Signed-off-by: Jiwon Park <bakjeeone@hotmail.com> * chore: Bump tough-cookie from 4.0.0 to 4.1.3 in /sdk/python/feast/ui (feast-dev#3677) Bumps [tough-cookie](https://github.com/salesforce/tough-cookie) from 4.0.0 to 4.1.3. - [Release notes](https://github.com/salesforce/tough-cookie/releases) - [Changelog](https://github.com/salesforce/tough-cookie/blob/master/CHANGELOG.md) - [Commits](salesforce/tough-cookie@v4.0.0...v4.1.3) --- updated-dependencies: - dependency-name: tough-cookie dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * chore: Bump tough-cookie from 4.0.0 to 4.1.3 in /ui (feast-dev#3676) Bumps [tough-cookie](https://github.com/salesforce/tough-cookie) from 4.0.0 to 4.1.3. - [Release notes](https://github.com/salesforce/tough-cookie/releases) - [Changelog](https://github.com/salesforce/tough-cookie/blob/master/CHANGELOG.md) - [Commits](salesforce/tough-cookie@v4.0.0...v4.1.3) --- updated-dependencies: - dependency-name: tough-cookie dependency-type: indirect ... 
Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * fix: For SQL registry, increase max data_source_name length to 255 (feast-dev#3630) * sql.py data_sources.data_source_name String(255) Extend the limit of the data_source_name field from 50 to 255. Signed-off-by: Ross Donnachie <code@radonn.co.za> * fix: Optimize bytes processed when retrieving entity df schema to 0 (feast-dev#3680) feat: Optimize bytes processed when retrieving entity df schema to 0 Signed-off-by: Hai Nguyen <quanghai.ng1512@gmail.com> * fix: Entityless fv breaks with `KeyError: __dummy` applying feature_store.plan() on python (feast-dev#3640) * fix! KeyError: __dummy on entityless fv Signed-off-by: williamfoschiera <william.foschiera@buser.com.br> * fix! join_keys typing. Signed-off-by: williamfoschiera <william.foschiera@buser.com.br> --------- Signed-off-by: williamfoschiera <william.foschiera@buser.com.br> Co-authored-by: williamfoschiera <william.foschiera@buser.com.br> * chore: Bump protobufjs from 7.1.1 to 7.2.4 in /ui (feast-dev#3674) Bumps [protobufjs](https://github.com/protobufjs/protobuf.js) from 7.1.1 to 7.2.4. - [Release notes](https://github.com/protobufjs/protobuf.js/releases) - [Changelog](https://github.com/protobufjs/protobuf.js/blob/master/CHANGELOG.md) - [Commits](protobufjs/protobuf.js@protobufjs-v7.1.1...protobufjs-v7.2.4) --- updated-dependencies: - dependency-name: protobufjs dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * chore: Bump protobufjs from 7.1.2 to 7.2.4 in /sdk/python/feast/ui (feast-dev#3675) Bumps [protobufjs](https://github.com/protobufjs/protobuf.js) from 7.1.2 to 7.2.4. 
- [Release notes](https://github.com/protobufjs/protobuf.js/releases) - [Changelog](https://github.com/protobufjs/protobuf.js/blob/master/CHANGELOG.md) - [Commits](protobufjs/protobuf.js@protobufjs-v7.1.2...protobufjs-v7.2.4) --- updated-dependencies: - dependency-name: protobufjs dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * chore: Bump semver from 6.3.0 to 6.3.1 in /ui (feast-dev#3678) Bumps [semver](https://github.com/npm/node-semver) from 6.3.0 to 6.3.1. - [Release notes](https://github.com/npm/node-semver/releases) - [Changelog](https://github.com/npm/node-semver/blob/v6.3.1/CHANGELOG.md) - [Commits](npm/node-semver@v6.3.0...v6.3.1) --- updated-dependencies: - dependency-name: semver dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * chore: Bump semver from 6.3.0 to 6.3.1 in /sdk/python/feast/ui (feast-dev#3679) Bumps [semver](https://github.com/npm/node-semver) from 6.3.0 to 6.3.1. - [Release notes](https://github.com/npm/node-semver/releases) - [Changelog](https://github.com/npm/node-semver/blob/v6.3.1/CHANGELOG.md) - [Commits](npm/node-semver@v6.3.0...v6.3.1) --- updated-dependencies: - dependency-name: semver dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * chore: Bump google.golang.org/grpc from 1.47.0 to 1.53.0 (feast-dev#3670) Bumps [google.golang.org/grpc](https://github.com/grpc/grpc-go) from 1.47.0 to 1.53.0. - [Release notes](https://github.com/grpc/grpc-go/releases) - [Commits](grpc/grpc-go@v1.47.0...v1.53.0) --- updated-dependencies: - dependency-name: google.golang.org/grpc dependency-type: direct:production ... 
Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * chore(release): release 0.32.0 # [0.32.0](feast-dev/feast@v0.31.0...v0.32.0) (2023-07-17) ### Bug Fixes * Added generic Feature store Creation for CLI ([feast-dev#3618](feast-dev#3618)) ([bf740d2](feast-dev@bf740d2)) * Broken non-root path with projects-list.json ([feast-dev#3665](feast-dev#3665)) ([4861af0](feast-dev@4861af0)) * Clean up snowflake to_spark_df() ([feast-dev#3607](feast-dev#3607)) ([e8e643e](feast-dev@e8e643e)) * Entityless fv breaks with `KeyError: __dummy` applying feature_store.plan() on python ([feast-dev#3640](feast-dev#3640)) ([ef4ef32](feast-dev@ef4ef32)) * Fix scan datasize to 0 for inference schema ([feast-dev#3628](feast-dev#3628)) ([c3dd74e](feast-dev@c3dd74e)) * Fix timestamp consistency in push api ([feast-dev#3614](feast-dev#3614)) ([9b227d7](feast-dev@9b227d7)) * For SQL registry, increase max data_source_name length to 255 ([feast-dev#3630](feast-dev#3630)) ([478caec](feast-dev@478caec)) * Implements connection pool for postgres online store ([feast-dev#3633](feast-dev#3633)) ([059509a](feast-dev@059509a)) * Manage redis pipe's context ([feast-dev#3655](feast-dev#3655)) ([48e0971](feast-dev@48e0971)) * Missing Catalog argument in athena connector ([feast-dev#3661](feast-dev#3661)) ([f6d3caf](feast-dev@f6d3caf)) * Optimize bytes processed when retrieving entity df schema to 0 ([feast-dev#3680](feast-dev#3680)) ([1c01035](feast-dev@1c01035)) ### Features * Add gunicorn for serve with multiprocess ([feast-dev#3636](feast-dev#3636)) ([4de7faf](feast-dev@4de7faf)) * Use string as a substitute for unregistered types during schema inference ([feast-dev#3646](feast-dev#3646)) ([c474ccd](feast-dev@c474ccd)) * fix: Redshift push ignores schema (feast-dev#3671) * Add fully-qualified-table-name Redshift prop Signed-off-by: Robin Neufeld <metavee@users.noreply.github.com> * pre-commit Signed-off-by: Robin 
Neufeld <metavee@users.noreply.github.com> * Docstring Signed-off-by: Robin Neufeld <metavee@users.noreply.github.com> * Test fully_qualified_table_name Signed-off-by: Robin Neufeld <metavee@users.noreply.github.com> * Simplify logic Signed-off-by: Robin Neufeld <metavee@users.noreply.github.com> * pre-commit Signed-off-by: Robin Neufeld <metavee@users.noreply.github.com> * pre-commit Signed-off-by: Robin Neufeld <metavee@users.noreply.github.com> * Test offline_write_batch Signed-off-by: Robin Neufeld <metavee@users.noreply.github.com> * Bump to trigger CI Signed-off-by: Robin Neufeld <metavee@users.noreply.github.com> * another bump for ci Signed-off-by: Robin Neufeld <metavee@users.noreply.github.com> --------- Signed-off-by: Robin Neufeld <metavee@users.noreply.github.com> * fix: Add aws-sts dependency in java sdk so that S3 client acquires IRSA role (feast-dev#3696) Add aws-sts dependency in java sdk Signed-off-by: harmeet-singh-discovery <harmeet_singh@discovery.com> * Adding initial update changes * Added formatting changes * Revert "Merge branch 'feast-dev:master' into msudhir/add-vector-update-functionality" This reverts commit 8487678, reversing changes made to 0578b9b. 
* Added more tests and functionality * updating tests * updated functionality and added more tests * correcting a test case * Making formatting corrections and changeing log * Improved tests and added functionality to convert feast schema to milvus readable schema * Added PR Review comments * Fixed failing test --------- Signed-off-by: Danny C <d.chiao@gmail.com> Signed-off-by: Gyumin Lee <t1100394@T1100394PM01.local> Signed-off-by: Ben Fletcher <ben.fletcher@ft.com> Signed-off-by: Jiwon Park <bakjeeone@hotmail.com> Signed-off-by: dependabot[bot] <support@github.com> Signed-off-by: Ross Donnachie <code@radonn.co.za> Signed-off-by: Hai Nguyen <quanghai.ng1512@gmail.com> Signed-off-by: williamfoschiera <william.foschiera@buser.com.br> Signed-off-by: Robin Neufeld <metavee@users.noreply.github.com> Signed-off-by: harmeet-singh-discovery <harmeet_singh@discovery.com> Co-authored-by: Danny C <d.chiao@gmail.com> Co-authored-by: 이규민 <32768535+GyuminJack@users.noreply.github.com> Co-authored-by: Gyumin Lee <t1100394@T1100394PM01.local> Co-authored-by: Ben Fletcher <bjfletcher@gmail.com> Co-authored-by: Jiwon Park <bakjeeone@hotmail.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Ross Donnachie <code@radonn.co.za> Co-authored-by: Harry <quanghai.ng1512@gmail.com> Co-authored-by: William Foschiera <wfoschiera@gmail.com> Co-authored-by: williamfoschiera <william.foschiera@buser.com.br> Co-authored-by: feast-ci-bot <feast-ci-bot@willem.co> Co-authored-by: Robin Neufeld <metavee@users.noreply.github.com> Co-authored-by: harmeet-singh-discovery <95894926+harmeet-singh-discovery@users.noreply.github.com> Co-authored-by: Manisha Sudhir <msudhir@expediagroup.com>
1 parent d421016 commit 2c7b7b1

5 files changed (+551, -29 lines changed)

sdk/python/docs/source/feast.protos.feast.core.rst

+16
@@ -164,6 +164,22 @@ feast.protos.feast.core.FeatureView\_pb2\_grpc module
    :undoc-members:
    :show-inheritance:
 
+feast.protos.feast.core.VectorFeatureView\_pb2 module
+-----------------------------------------------
+
+.. automodule:: feast.protos.feast.core.VectorFeatureView_pb2
+   :members:
+   :undoc-members:
+   :show-inheritance:
+
+feast.protos.feast.core.VectorFeatureView\_pb2\_grpc module
+-----------------------------------------------------
+
+.. automodule:: feast.protos.feast.core.VectorFeatureView_pb2_grpc
+   :members:
+   :undoc-members:
+   :show-inheritance:
+
 feast.protos.feast.core.Feature\_pb2 module
 -------------------------------------------
 
sdk/python/feast/expediagroup/vectordb/milvus_online_store.py

+162-3
@@ -1,14 +1,37 @@
+import logging
 from datetime import datetime
 from typing import Any, Callable, Dict, List, Optional, Sequence, Tuple
 
 from pydantic.typing import Literal
+from pymilvus import (
+    Collection,
+    CollectionSchema,
+    DataType,
+    FieldSchema,
+    connections,
+    utility,
+)
 
 from feast import Entity, RepoConfig
 from feast.expediagroup.vectordb.vector_feature_view import VectorFeatureView
 from feast.expediagroup.vectordb.vector_online_store import VectorOnlineStore
+from feast.field import Field
 from feast.protos.feast.types.EntityKey_pb2 import EntityKey as EntityKeyProto
 from feast.protos.feast.types.Value_pb2 import Value as ValueProto
 from feast.repo_config import FeastConfigBaseModel
+from feast.types import (
+    Array,
+    FeastType,
+    Float32,
+    Float64,
+    Int32,
+    Int64,
+    Invalid,
+    String,
+)
+from feast.usage import log_exceptions_and_usage
+
+logger = logging.getLogger(__name__)
 
 
 class MilvusOnlineStoreConfig(FeastConfigBaseModel):
@@ -17,13 +40,47 @@ class MilvusOnlineStoreConfig(FeastConfigBaseModel):
     type: Literal["milvus"] = "milvus"
     """Online store type selector"""
 
+    alias: str = "default"
+    """ alias for milvus connection"""
+
     host: str
     """ the host URL """
 
+    username: str
+    """ username to connect to Milvus """
+
+    password: str
+    """ password to connect to Milvus """
+
     port: int = 19530
     """ the port to connect to a Milvus instance. Should be the one used for GRPC (default: 19530) """
 
 
+class MilvusConnectionManager:
+    def __init__(self, online_config: RepoConfig):
+        self.online_config = online_config
+
+    def __enter__(self):
+        # Connecting to Milvus
+        logger.info(
+            f"Connecting to Milvus with alias {self.online_config.alias} and host {self.online_config.host} and default port {self.online_config.port}."
+        )
+        connections.connect(
+            host=self.online_config.host,
+            username=self.online_config.username,
+            password=self.online_config.password,
+            use_secure=True,
+        )
+
+    def __exit__(self, exc_type, exc_value, traceback):
+        # Disconnecting from Milvus
+        logger.info("Closing the connection to Milvus")
+        connections.disconnect(self.online_config.alias)
+        logger.info("Connection Closed")
+        if exc_type is not None:
+            logger.error(f"An exception of type {exc_type} occurred: {exc_value}")
+
+
 class MilvusOnlineStore(VectorOnlineStore):
     def online_write_batch(
         self,
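
Review note on the hunk above: `connections.connect(...)` is called without the configured `alias` or `port`, while `__exit__` disconnects `self.online_config.alias`. With a non-default alias the two calls may not refer to the same connection. A minimal sketch of the pattern with both calls sharing one alias follows; `FakeConnections`, `StoreConfig`, and the host name are hypothetical stand-ins so the example runs without a Milvus server, and none of them are pymilvus APIs.

```python
import logging
from dataclasses import dataclass

logger = logging.getLogger(__name__)


class FakeConnections:
    """Hypothetical in-memory stand-in for a client connection registry."""

    def __init__(self):
        self.open_aliases = {}

    def connect(self, alias, host, port):
        self.open_aliases[alias] = (host, port)

    def disconnect(self, alias):
        self.open_aliases.pop(alias, None)


@dataclass
class StoreConfig:
    """Hypothetical subset of the online store settings shown in the diff."""

    alias: str = "default"
    host: str = "localhost"
    port: int = 19530


class ConnectionManager:
    def __init__(self, registry, config):
        self.registry = registry
        self.config = config

    def __enter__(self):
        # Open the connection under the configured alias and port.
        self.registry.connect(
            alias=self.config.alias, host=self.config.host, port=self.config.port
        )
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Close the same alias that __enter__ opened.
        self.registry.disconnect(self.config.alias)
        if exc_type is not None:
            logger.error("An exception of type %s occurred: %s", exc_type, exc_value)
        return False  # let the exception propagate to the caller


registry = FakeConnections()
config = StoreConfig(alias="feature_repo", host="milvus.local")
with ConnectionManager(registry, config):
    inside = dict(registry.open_aliases)
after = dict(registry.open_aliases)
```

Returning `False` from `__exit__` keeps the original behaviour of logging the error without suppressing it.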
@@ -49,6 +106,7 @@ def online_read(
             "to be implemented in https://jira.expedia.biz/browse/EAPC-7972"
         )
 
+    @log_exceptions_and_usage(online_store="milvus")
     def update(
         self,
         config: RepoConfig,
@@ -58,9 +116,41 @@ def update(
         entities_to_keep: Sequence[Entity],
         partial: bool,
     ):
-        raise NotImplementedError(
-            "to be implemented in https://jira.expedia.biz/browse/EAPC-7970"
-        )
+        with MilvusConnectionManager(config.online_store):
+            for table_to_keep in tables_to_keep:
+                collection_available = utility.has_collection(table_to_keep.name)
+                try:
+                    if collection_available:
+                        logger.info(f"Collection {table_to_keep.name} already exists.")
+                    else:
+                        schema = self._convert_featureview_schema_to_milvus_readable(
+                            table_to_keep.schema,
+                            table_to_keep.vector_field,
+                            table_to_keep.dimensions,
+                        )
+
+                        collection = Collection(name=table_to_keep.name, schema=schema)
+                        logger.info(f"Collection name is {collection.name}")
+                        logger.info(
+                            f"Collection {table_to_keep.name} has been created successfully."
+                        )
+                except Exception as e:
+                    logger.error(f"Collection update failed due to {e}")
+
+            for table_to_delete in tables_to_delete:
+                collection_available = utility.has_collection(table_to_delete.name)
+                try:
+                    if collection_available:
+                        utility.drop_collection(table_to_delete.name)
+                        logger.info(
+                            f"Collection {table_to_delete.name} has been deleted successfully."
+                        )
+                    else:
+                        logger.warning(
+                            f"Collection {table_to_delete.name} does not exist or is already deleted."
+                        )
+                except Exception as e:
+                    logger.error(f"Collection deletion failed due to {e}")
 
     def teardown(
         self,
@@ -71,3 +161,72 @@ def teardown(
         raise NotImplementedError(
             "to be implemented in https://jira.expedia.biz/browse/EAPC-7974"
         )
+
+    def _convert_featureview_schema_to_milvus_readable(
+        self, feast_schema: List[Field], vector_field, vector_field_dimensions
+    ) -> CollectionSchema:
+        """
+        Converting a schema understood by Feast to a schema that is readable by Milvus so that it
+        can be used when a collection is created in Milvus.
+
+        Parameters:
+            feast_schema (List[Field]): Schema stored in VectorFeatureView.
+
+        Returns:
+            (CollectionSchema): Schema readable by Milvus.
+
+        """
+        boolean_mapping_from_string = {"True": True, "False": False}
+        field_list = []
+        dimension = None
+
+        for field in feast_schema:
+            if field.name == vector_field:
+                field_name = vector_field
+                dimension = vector_field_dimensions
+            else:
+                field_name = field.name
+
+            data_type = self._feast_to_milvus_data_type(field.dtype)
+
+            if field.tags:
+                description = field.tags.get("description", " ")
+                is_primary = boolean_mapping_from_string.get(
+                    field.tags.get("is_primary", "False")
+                )
+
+            # Appending the above converted values to construct a FieldSchema
+            field_list.append(
+                FieldSchema(
+                    name=field_name,
+                    dtype=data_type,
+                    description=description,
+                    is_primary=is_primary,
+                    dim=dimension,
+                )
+            )
+        # Returning a CollectionSchema which is a list of type FieldSchema.
+        return CollectionSchema(field_list)
+
+    def _feast_to_milvus_data_type(self, feast_type: FeastType) -> DataType:
+        """
+        Mapping for converting Feast data type to a data type compatible wih Milvus.
+
+        Parameters:
+            feast_type (FeastType): This is a type associated with a Feature that is stored in a VectorFeatureView, readable with Feast.
+
+        Returns:
+            DataType : DataType associated with what Milvus can understand and associate its Feature types to
+        """
+
+        return {
+            Int32: DataType.INT32,
+            Int64: DataType.INT64,
+            Float32: DataType.FLOAT,
+            Float64: DataType.DOUBLE,
+            String: DataType.STRING,
+            Invalid: DataType.UNKNOWN,
+            Array(Float32): DataType.FLOAT_VECTOR,
+            # TODO: Need to think about list of binaries and list of bytes
+            # FeastType.BYTES_LIST: DataType.BINARY_VECTOR
+        }.get(feast_type, None)
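
Review note on `_convert_featureview_schema_to_milvus_readable`: `description` and `is_primary` are assigned only when `field.tags` is non-empty, and `dimension` is never reset once the vector field has been seen, so fields that come after it would also receive `dim`. If the `append` call sits at loop level, as the blank line before it suggests, a tagless first field would hit an unbound name. The sketch below shows one way to give every field its own defaults. `SimpleField`, `SimpleFieldSchema`, and `convert_schema` are hypothetical plain-Python stand-ins, not Feast or pymilvus types, so the example runs without either library.

```python
from dataclasses import dataclass, field as dc_field
from typing import Dict, List, Optional


@dataclass
class SimpleField:
    """Hypothetical stand-in for a Feast schema field."""

    name: str
    dtype: str
    tags: Dict[str, str] = dc_field(default_factory=dict)


@dataclass
class SimpleFieldSchema:
    """Hypothetical stand-in for the target field schema."""

    name: str
    dtype: str
    description: str
    is_primary: bool
    dim: Optional[int]


def convert_schema(
    fields: List[SimpleField], vector_field: str, dimensions: int
) -> List[SimpleFieldSchema]:
    converted = []
    for f in fields:
        tags = f.tags or {}
        converted.append(
            SimpleFieldSchema(
                name=f.name,
                dtype=f.dtype,
                # Defaults apply even when the field carries no tags.
                description=tags.get("description", ""),
                is_primary=tags.get("is_primary", "False") == "True",
                # Only the vector field gets a dimension; it is recomputed per field.
                dim=dimensions if f.name == vector_field else None,
            )
        )
    return converted


schema = convert_schema(
    [
        SimpleField("id", "int64", {"is_primary": "True", "description": "row id"}),
        SimpleField("embedding", "float_vector"),
        SimpleField("score", "float"),
    ],
    vector_field="embedding",
    dimensions=128,
)
```

Computing `dim` inside the loop body, rather than carrying it in a variable declared before the loop, is what keeps the value from leaking into later fields.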

sdk/python/feast/expediagroup/vectordb/vector_feature_view.py

+2-1
@@ -50,6 +50,7 @@ class VectorFeatureView(BaseFeatureView):
 
     # inheriting from FeatureView wouldn't work due to issue with conflicting proto classes
     # therefore using composition instead
+    name: str
     feature_view: FeatureView
     vector_field: str
     dimensions: int
@@ -106,7 +107,7 @@ def __init__(
             tags=tags,
             owner=owner,
         )
-
+        self.name = name
         self.feature_view = feature_view
         self.vector_field = vector_field
         self.dimensions = dimensions
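
The comment in the first hunk says the class wraps a `FeatureView` by composition rather than inheriting from it, and this change exposes `name` directly on the wrapper so callers such as `update()` can read `table.name`. A minimal sketch of that pattern, assuming the wrapper builds the inner view from the same `name`; `InnerView` and `VectorView` are hypothetical stand-ins, not the Feast classes.

```python
from dataclasses import dataclass


@dataclass
class InnerView:
    """Hypothetical stand-in for the wrapped feature view."""

    name: str
    owner: str = ""


class VectorView:
    """Hypothetical wrapper that composes an inner view instead of inheriting."""

    def __init__(self, name: str, vector_field: str, dimensions: int, owner: str = ""):
        inner = InnerView(name=name, owner=owner)
        # Mirror the name on the wrapper so callers need not reach into the inner view.
        self.name = name
        self.feature_view = inner
        self.vector_field = vector_field
        self.dimensions = dimensions


view = VectorView("hotel_embeddings", vector_field="embedding", dimensions=128)
```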

sdk/python/feast/repo_config.py

+4
@@ -203,6 +203,8 @@ def __init__(self, **data: Any):
                 self._offline_config = "redshift"
             elif data["provider"] == "azure":
                 self._offline_config = "mssql"
+            elif data["provider"] == "milvus":
+                self._online_config = "milvus"
 
         self._online_store = None
         if "online_store" in data:
@@ -216,6 +218,8 @@ def __init__(self, **data: Any):
                 self._online_config = "dynamodb"
             elif data["provider"] == "rockset":
                 self._online_config = "rockset"
+            elif data["provider"] == "milvus":
+                self._online_config = "milvus"
 
         self._batch_engine = None
         if "batch_engine" in data:
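
Review note: the first added branch sits among the offline-store defaults but assigns `self._online_config`, which the second hunk then sets again. One way to make each provider's defaults harder to mix up is to keep them in two lookup tables. The sketch below is illustrative only: `resolve_defaults` is a hypothetical helper, only the `azure`, `rockset`, and `milvus` entries are taken from the diff, and the `"file"` and `"sqlite"` fallbacks are assumptions.

```python
# Illustrative tables; only the azure, rockset, and milvus entries appear in the diff.
OFFLINE_DEFAULTS = {"azure": "mssql"}
ONLINE_DEFAULTS = {"rockset": "rockset", "milvus": "milvus"}


def resolve_defaults(data, offline_fallback="file", online_fallback="sqlite"):
    """Return (offline, online) store types for a config dict.

    Explicit "offline_store" / "online_store" entries win; otherwise the
    provider picks the default, and the fallbacks (assumed values) apply
    when the provider has no entry in the table.
    """
    provider = data.get("provider")
    offline = data.get(
        "offline_store", OFFLINE_DEFAULTS.get(provider, offline_fallback)
    )
    online = data.get("online_store", ONLINE_DEFAULTS.get(provider, online_fallback))
    return offline, online


milvus_defaults = resolve_defaults({"provider": "milvus"})
explicit_online = resolve_defaults({"provider": "milvus", "online_store": "redis"})
```

With separate tables, a provider that only supplies an online store simply has no offline entry, so the offline fallback applies without an extra branch.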
