The NeumVector is the object used to organize data extracted from a given data source. It is analogous to similar constructs (e.g. Document) used in frameworks like LangChain and LlamaIndex.

The goal of this interface is to abstract three properties:

  • id (str): A unique identifier for a given vector. The id is built up during pre-processing of the data and is used as the vector id within the vector database. During synchronization, it ensures that vectors are not re-computed or duplicated.
  • vector (List[float]): Vector embedding generated from the content in the NeumDocument.
  • metadata (dict): The metadata attached to a given document. This can include values extracted from the data source or loader, as well as any calculated values. The metadata also includes the content from the NeumDocument.
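
To picture how the three properties fit together, here is a minimal sketch using a stand-in class. It is not the neumai implementation: `SketchVector`, `build_vector`, and `toy_embed` are hypothetical names introduced for illustration, and the toy embedding is only a placeholder for a real embedding model.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class SketchVector:
    """Hypothetical stand-in that mirrors the three NeumVector properties."""
    id: str
    vector: List[float]
    metadata: Dict[str, Any] = field(default_factory=dict)


def toy_embed(text: str) -> List[float]:
    """Placeholder embedding (assumption): not a real model, just deterministic numbers."""
    return [float(len(text)), float(sum(map(ord, text)) % 97)]


def build_vector(doc_id: str, content: str, source_metadata: Dict[str, Any]) -> SketchVector:
    # The metadata carries the source values plus the document content itself.
    metadata = {**source_metadata, "text": content}
    return SketchVector(id=doc_id, vector=toy_embed(content), metadata=metadata)


example = build_vector("abc", "Hello", {"createdDate": "2023-01-01"})
```

In this sketch the content ends up under a `text` key, matching the key used in the usage example; whether the library always uses that key is an assumption.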

Usage

NeumVector
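from neumai.Shared.NeumVector import NeumVector

neum_vector = NeumVector(
    id='abc',  # unique id, also used as the vector id in the vector database
    vector=[.....],  # replace with the embedding values (List[float])
    metadata={'text': 'Hello', 'createdDate': '2023-01-01'}  # includes the document content
)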
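
Because the id doubles as the vector id in the vector database, a synchronization step can check it before doing any work. The following is a rough sketch of that idea, with an in-memory dict standing in for the vector database. `make_id`, `sync_chunk`, and `counting_embed` are hypothetical helpers, and the hash-based id scheme is an assumption: the documentation above does not say how ids are actually constructed.

```python
import hashlib
from typing import Callable, Dict, List, Tuple

# In-memory stand-in for a vector database: id -> (vector, metadata).
Store = Dict[str, Tuple[List[float], dict]]


def make_id(source_id: str, chunk_index: int) -> str:
    """Deterministic id (assumed scheme): same source and chunk give the same id."""
    raw = f"{source_id}:{chunk_index}"
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()[:16]


def sync_chunk(store: Store, vec_id: str, content: str,
               embed: Callable[[str], List[float]]) -> bool:
    """Embed and store the chunk only if its id is not already present.

    Returns True when a new vector was written, False when it was skipped.
    """
    if vec_id in store:
        return False  # already synchronized: no re-computation, no duplicate
    store[vec_id] = (embed(content), {"text": content})
    return True


embed_calls: List[str] = []


def counting_embed(text: str) -> List[float]:
    """Toy embedding that records each call so skipped work is visible."""
    embed_calls.append(text)
    return [float(len(text))]


store: Store = {}
vec_id = make_id("file-1", 0)
first = sync_chunk(store, vec_id, "Hello", counting_embed)   # writes the vector
second = sync_chunk(store, vec_id, "Hello", counting_embed)  # skipped, id exists
```

After both calls the store holds a single entry and the embedding function has run once, which is the behavior the id is meant to guarantee at synchronization.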