`Chunk`

The `Chunk` model is closely tied to the `File` model: it stores a file's content as a sequence of chunks, so that Large Language Models can process the data in smaller sections.

The `embedding` field stores the chunk's text embedding, which the vector search functionality depends on.
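To illustrate why the embedding matters for vector search, here is a minimal sketch that ranks chunks by cosine similarity to a query embedding. It uses only the standard library; the function names `cosine_similarity` and `rank_chunks` are hypothetical and not part of redbox, and the similarity measure is an assumption, since this page does not say which metric the search backend uses.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # A zero vector has no direction; treat it as unrelated.
        return 0.0
    return dot / (norm_a * norm_b)


def rank_chunks(
    query_embedding: list[float],
    chunk_embeddings: dict[str, list[float]],
) -> list[str]:
    """Return chunk ids ordered from most to least similar to the query.

    `chunk_embeddings` maps a chunk identifier to its embedding vector
    (a hypothetical stand-in for reading `Chunk.embedding`).
    """
    return sorted(
        chunk_embeddings,
        key=lambda cid: cosine_similarity(query_embedding, chunk_embeddings[cid]),
        reverse=True,
    )
```

A real deployment would normally delegate this ranking to a vector store rather than compute it in Python; the sketch only shows what the stored vectors are used for.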

Each chunk references the `File` it belongs to through its `parent_file_uuid` field.
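The relationship between a file and its chunks can be sketched as follows. This is a standard-library stand-in rather than the redbox Pydantic model: `ChunkSketch` and `reassemble` are hypothetical names, and only the fields `uuid`, `parent_file_uuid`, `index`, and `text` are taken from the reference below.

```python
from dataclasses import dataclass, field
from uuid import UUID, uuid4


@dataclass
class ChunkSketch:
    """Simplified stand-in for Chunk (assumption: not the real model)."""

    parent_file_uuid: UUID  # id of the file this text came from
    index: int              # relative position of the chunk in that file
    text: str               # the chunk's text
    uuid: UUID = field(default_factory=uuid4)


def reassemble(chunks: list[ChunkSketch], file_uuid: UUID) -> str:
    """Join the text of one file's chunks in their original order."""
    own = [c for c in chunks if c.parent_file_uuid == file_uuid]
    return "".join(c.text for c in sorted(own, key=lambda c: c.index))
```

For example, given chunks from two files stored in arbitrary order, `reassemble(chunks, file_uuid)` selects only the chunks whose `parent_file_uuid` matches and orders them by `index`.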

redbox.models.file.Chunk

Bases: PersistableModel

Chunk of a File

uuid `class-attribute` `instance-attribute`

uuid = Field(default_factory=uuid4)

created_datetime `class-attribute` `instance-attribute`

created_datetime = Field(default_factory=utcnow)

creator_user_uuid `instance-attribute`

creator_user_uuid

model_type `property`

model_type

Return the name of the model class.

| Returns | Description |
| --- | --- |
| `str` | The name of the model class. |
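A property with this behaviour can be sketched as below. The class names `PersistableSketch` and `ChunkLike` are hypothetical, and the actual `PersistableModel` implementation may differ; the sketch only shows that a subclass inherits the property and reports its own class name.

```python
class PersistableSketch:
    """Hypothetical base class standing in for PersistableModel."""

    @property
    def model_type(self) -> str:
        # Resolve the name from the instance's concrete class, so
        # subclasses report their own name without overriding anything.
        return self.__class__.__name__


class ChunkLike(PersistableSketch):
    """Hypothetical subclass standing in for Chunk."""
```

With these definitions, `ChunkLike().model_type` evaluates to `"ChunkLike"`.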

parent_file_uuid `class-attribute` `instance-attribute`

parent_file_uuid = Field(description='id of the original file which this text came from')

index `class-attribute` `instance-attribute`

index = Field(description='relative position of this chunk in the original file')

text `class-attribute` `instance-attribute`

text = Field(description='chunk of the original text')

metadata `class-attribute` `instance-attribute`

metadata = Field(description='subset of the unstructured Element.Metadata object', default=None)

embedding `class-attribute` `instance-attribute`

embedding = Field(description='the vector representation of the text', default=None)

text_hash `property`

text_hash

token_count `property`

token_count

`ChunkStatus`

The `Chunk` model has a companion `ChunkStatus` model that tracks the processing status of a chunk, including the state of the embedding process.

redbox.models.file.ChunkStatus

Bases: BaseModel

Status of a chunk of a file.

chunk_uuid `instance-attribute`

chunk_uuid

embedded `instance-attribute`

embedded
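As a rough illustration of how such status records could be used, the sketch below reports which chunks still need embedding and how far along a file is. It assumes `embedded` is a boolean flag, which this page does not state; `ChunkStatusSketch`, `pending_chunks`, and `embedding_progress` are hypothetical names, not redbox APIs.

```python
from dataclasses import dataclass
from uuid import UUID, uuid4


@dataclass
class ChunkStatusSketch:
    """Simplified stand-in for ChunkStatus (assumption: not the real model)."""

    chunk_uuid: UUID
    embedded: bool  # assumed to be a boolean flag


def pending_chunks(statuses: list[ChunkStatusSketch]) -> list[UUID]:
    """Return the ids of chunks that have not been embedded yet."""
    return [s.chunk_uuid for s in statuses if not s.embedded]


def embedding_progress(statuses: list[ChunkStatusSketch]) -> float:
    """Fraction of chunks already embedded, or 0.0 when there are none."""
    if not statuses:
        return 0.0
    return sum(s.embedded for s in statuses) / len(statuses)
```

A caller could poll these values to decide when all of a file's chunks are ready for vector search.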
