[Bug]: TypeError: ColbertIndex._build_index_from_nodes() got an unexpected keyword argument 'index_name' #14398
Comments
@gavinHuang Hello there! I'm here to assist you with any bugs, questions, or contributions. Let's work together to solve this issue.

To resolve this issue, you need to ensure that the rows of your data are converted to `BaseNode` objects before they are passed to `ColbertIndex`:

```python
import pandas as pd
from llama_index.core.schema import BaseNode
from llama_index.indices.managed.colbert import ColbertIndex

# Load data
df = pd.read_csv("./data/products.csv", sep=",").drop_duplicates()
rows = df.to_dict("records")

# Convert rows to BaseNode objects
nodes = []
for row in rows:
    node = BaseNode(
        node_id=row["ITEM_NUMBER"],
        content=row["DISPLAYNAME"],
        metadata={"DESCRIPTION": row["DESCRIPTION"]}
    )
    nodes.append(node)

# Create ColbertIndex using nodes
colBertIndex = ColbertIndex(nodes=nodes)
```

Ensure that you have the necessary dependencies installed, including the ColBERT integration package. The same approach works when you start from plain document strings:

```python
from llama_index.core.schema import BaseNode

# Assuming `docs` is a list of document contents
docs = ["Document content 1", "Document content 2"]

# Convert documents to BaseNode objects
nodes = [BaseNode(node_id=str(i), content=doc) for i, doc in enumerate(docs)]

# Create ColbertIndex using nodes
colBertIndex = ColbertIndex(nodes=nodes)
```

This should help you avoid the `TypeError`.
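(Side note on the snippets above: in current llama-index releases `BaseNode` is an abstract class, so instantiating it directly will not work; a concrete node type such as `TextNode` is what is normally constructed. A minimal sketch of the same conversion, assuming the hypothetical products.csv columns used above:)

```python
import pandas as pd
from llama_index.core.schema import TextNode

# Load the same hypothetical CSV used in the snippet above.
df = pd.read_csv("./data/products.csv", sep=",").drop_duplicates()

# Build concrete TextNode objects: `text` carries the content, `id_` the node id.
nodes = [
    TextNode(
        id_=str(row["ITEM_NUMBER"]),
        text=str(row["DISPLAYNAME"]),
        metadata={"DESCRIPTION": row["DESCRIPTION"]},
    )
    for row in df.to_dict("records")
]
```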
I understand it's a bot-generated post, but it didn't get the point.
The `TypeError` is raised because `_build_index_from_nodes()` does not accept an `index_name` keyword argument. To resolve this, you need to ensure that the method is only called with the arguments it declares. Its current definition is:

```python
def _build_index_from_nodes(self, nodes: Sequence[BaseNode]) -> IndexDict:
    """Generate a PLAID index from the ColBERT checkpoint via its hugging face model_name."""
    from colbert import Indexer, Searcher
    from colbert.infra import ColBERTConfig, Run, RunConfig

    index_struct = IndexDict()

    docs_list = []
    for i, node in enumerate(nodes):
        docs_list.append(node.get_content())
        self._docs_pos_to_node_id[i] = node.node_id
        index_struct.add_node(node, text_id=str(i))

    with Run().context(
        RunConfig(index_root=self.index_path, nranks=self.ranks, gpus=self.gpus)
    ):
        config = ColBERTConfig(
            doc_maxlen=self.doc_maxlen,
            query_maxlen=self.query_maxlen,
            nbits=self.nbits,
            kmeans_niters=self.kmeans_niters,
        )
        indexer = Indexer(checkpoint=self.model_name, config=config)
        indexer.index(name=self.index_name, collection=docs_list, overwrite=True)
        self.store = Searcher(
            index=self.index_name, collection=docs_list, checkpoint=self.model_name
        )
    return index_struct
```

As the signature shows, the method only accepts `nodes`. Additionally, when initializing the `ColbertIndex` class, note how `index_name` is passed on to `super().__init__()`:

```python
class ColbertIndex(BaseIndex[IndexDict]):
    def __init__(
        self,
        nodes: Optional[Sequence[BaseNode]] = None,
        objects: Optional[Sequence[IndexNode]] = None,
        index_struct: Optional[IndexDict] = None,
        storage_context: Optional[StorageContext] = None,
        model_name: str = "colbert-ir/colbertv2.0",
        index_name: str = "",
        show_progress: bool = False,
        nbits: int = 2,
        gpus: int = 0,
        ranks: int = 1,
        doc_maxlen: int = 120,
        query_maxlen: int = 60,
        kmeans_niters: int = 4,
        **kwargs: Any,
    ) -> None:
        self.model_name = model_name
        self.index_path = "storage/colbert_index"
        self.index_name = index_name
        self.nbits = nbits
        self.gpus = gpus
        self.ranks = ranks
        self.doc_maxlen = doc_maxlen
        self.query_maxlen = query_maxlen
        self.kmeans_niters = kmeans_niters
        self._docs_pos_to_node_id: Dict[int, str] = {}
        super().__init__(
            nodes=nodes,
            index_struct=index_struct,
            index_name=index_name,
            storage_context=storage_context,
            show_progress=show_progress,
            objects=objects,
            **kwargs,
        )
```

By ensuring that `index_name` is not forwarded as an unexpected keyword to `_build_index_from_nodes`, you should be able to avoid this error.
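A possible stop-gap until the package itself is updated is to keep the forwarded keyword from ever reaching the original builder. This is only a sketch under the assumption that absorbing `**kwargs` in an overriding `_build_index_from_nodes` is enough; `PatchedColbertIndex` is a made-up name, not part of llama-index:

```python
from typing import Any, Sequence

from llama_index.core.schema import BaseNode
from llama_index.indices.managed.colbert import ColbertIndex


class PatchedColbertIndex(ColbertIndex):
    """Subclass that drops keyword arguments the upstream builder does not accept."""

    def _build_index_from_nodes(self, nodes: Sequence[BaseNode], **kwargs: Any):
        # The base class forwards extra build kwargs (notably `index_name`) here;
        # swallow them and call the original implementation with `nodes` only.
        return super()._build_index_from_nodes(nodes)
```

Building the index via `PatchedColbertIndex(nodes=nodes)` should then reach the ColBERT indexer without the stray keyword.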
Bug Description
Trying to create a ColbertIndex via:
colBertIndex = ColbertIndex(nodes=nodes)
and
ColbertIndex.from_documents(documents=docs)
Both fail with:
TypeError: ColbertIndex._build_index_from_nodes() got an unexpected keyword argument 'index_name'
I checked BaseIndex at
llama-index-core/llama_index/core/indices/base.py
Neither its init function nor build_index_from_nodes has ever taken index_name as a parameter, in any version of its history.

Version
0.10.30
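To make the diagnosis above concrete, here is a dependency-free toy reproduction of the failure mode; the Toy* classes are illustrative stand-ins, not the actual llama-index code:

```python
class ToyBaseIndex:
    def __init__(self, nodes, **build_kwargs):
        # The base class forwards any unrecognized keyword arguments to the builder hook.
        self._index_struct = self._build_index_from_nodes(nodes, **build_kwargs)

    def _build_index_from_nodes(self, nodes, **build_kwargs):
        raise NotImplementedError


class ToyColbertIndex(ToyBaseIndex):
    def __init__(self, nodes=None, index_name="", **kwargs):
        # Passing index_name up to the base class puts it into **build_kwargs there ...
        super().__init__(nodes=nodes, index_name=index_name, **kwargs)

    def _build_index_from_nodes(self, nodes):  # accepts only `nodes`
        return list(nodes or [])


# ... so building the index fails with the same kind of error as in the title:
# TypeError: ToyColbertIndex._build_index_from_nodes() got an unexpected keyword argument 'index_name'
ToyColbertIndex(nodes=["doc 1", "doc 2"])
```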
Steps to Reproduce
Extra installation via conda: pytorch (channel: pytorch-cpu=1.13)
Run the following code
Relevant Logs/Tracebacks