Learn about how data is extracted and pre-processed.
Data is extracted and pre-processed by a Source Connector. Each Source Connector can be individually configured to extract data from a given data source and process that data. The main pre-processing units supported are Loaders and Chunkers. The goal of the pre-processing steps is to clean and organize your data so that it is ready for embedding and ingestion. It comes down to one key question: what data do I want to embed (i.e. content), and what data should I use to augment the vector to improve retrieval (i.e. metadata)?
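To make the content versus metadata split concrete, here is a minimal sketch in Python. It is not the product's API: the `Chunk` class and the `load` and `chunk` functions, along with the `max_words` parameter and the metadata keys, are hypothetical names chosen for illustration only. The loader step cleans the raw text, and the chunker step splits it into pieces, each of which carries the text to embed plus metadata that could help with retrieval.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Chunk:
    """Hypothetical record: what gets embedded, plus data stored alongside it."""
    content: str                                   # text that would be embedded
    metadata: dict = field(default_factory=dict)   # extra fields to aid retrieval


def load(raw: str) -> str:
    """Hypothetical loader step: collapse runs of whitespace into single spaces."""
    return " ".join(raw.split())


def chunk(text: str, source: str, max_words: int = 5) -> list[Chunk]:
    """Hypothetical chunker step: fixed-size word windows, each tagged with metadata."""
    words = text.split()
    return [
        Chunk(
            content=" ".join(words[i:i + max_words]),
            metadata={"source": source, "position": i // max_words},
        )
        for i in range(0, len(words), max_words)
    ]


raw_document = "Source   Connectors extract data.\n\nLoaders and Chunkers  pre-process it."
chunks = chunk(load(raw_document), source="example.txt")

for c in chunks:
    print(c.content, c.metadata)
```

Only `content` would be passed to an embedding model in this sketch; `metadata` travels with the resulting vector so that results can be filtered or traced back to their source.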
Data extraction
Data loading
Data chunking