Chunkers
CharacterChunker
This class is responsible for chunking text data into smaller pieces based on character count, with optional overlapping between chunks.
The CharacterChunker class breaks large text documents into smaller, more manageable chunks. Chunk length is measured in characters and is set by the user.
Properties
Required properties:
- None
Optional properties:
- chunk_size: The number of characters each chunk should contain.
- chunk_overlap: The number of characters that can overlap between consecutive chunks.
- batch_size: The number of chunks to process in one batch.
- separator: The character used to separate chunks.
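To illustrate how chunk_size and chunk_overlap interact, here is a minimal sketch of character-based chunking in plain Python. It is not the CharacterChunker implementation: the function name chunk_by_characters is hypothetical, the default values are arbitrary, and the sketch assumes that each new chunk starts chunk_size - chunk_overlap characters after the previous one. The batch_size and separator properties are left out because their exact behavior is not described here.

```python
def chunk_by_characters(text, chunk_size=100, chunk_overlap=20):
    """Split text into chunks of at most chunk_size characters.

    Hypothetical helper for illustration only. Consecutive chunks share
    chunk_overlap characters (assumed sliding-window behavior).
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")

    step = chunk_size - chunk_overlap  # distance between chunk start positions
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a chunk reaches the end of the text, so no trailing
        # chunk is emitted that only repeats the overlap.
        if start + chunk_size >= len(text):
            break
    return chunks


# Usage: 10 characters, chunks of 4, overlapping by 1.
print(chunk_by_characters("abcdefghij", chunk_size=4, chunk_overlap=1))
# ['abcd', 'defg', 'ghij']
```

In this sketch a larger chunk_overlap produces more chunks for the same input, since the window advances by fewer characters each step.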