Transactions and Schema
Transaction Sequences
Below are the transaction sequences for the APIs. The transaction sequences are generated using the mermaid library.
File APIs
See below for some of the main File
routes.
POST /file
---
title: Transaction sequence - POST /file
---
sequenceDiagram
Django->>S3: file key, content
Django->>Core: file key
Core->>Workers: file key
Core->>Elastic: file key
S3->>Workers: file content
Workers->>Elastic: chunk key, content
Chat APIs
POST /chat/vanilla
---
title: Transaction sequence - POST /chat/vanilla
---
sequenceDiagram
Django->>Core: ChatHistory.messages[]
Core->>LLM API: ChatHistory.messages[]
POST /chat/rag
---
title: Transaction sequence - POST /chat/rag
---
sequenceDiagram
Django->> Core: ChatHistory.messages[], File[].uuid
Elastic->>Core: File[].Chunk[].embeddings
Core->>LLM API: ChatHistory.messages[].embeddings, File[].Chunk[].embeddings
Schema
Django Schema
The Django schema is a simple schema that stores the users, user groups, chat messages, chat histories, and file records. We keep all the business logic isolated here to keep the Core API simple. For any organisations with more complex business logic, this is where you would add it in your own version.
---
title: Django schema
---
erDiagram
User }|--|{ "UserGroup(django.models.Group)" : "UserGroup.users"
User {
UUID uuid
string name
}
"UserGroup(django.models.Group)" {
UUID uuid
string name
UUID[] users
}
FileRecord }|--|| "UserGroup(django.models.Group)": "FileRecord.owner"
FileRecord {
UUID uuid
UUID owner
string key
}
ChatMessage {
UUID uuid
UUID chat_history
string text
string role
}
ChatMessage }|--|| ChatHistory: "ChatMessage.chat_history"
"UserGroup(django.models.Group)" ||--|{ ChatHistory: "ChatHistory.owner"
ChatHistory {
UUID uuid
string name
UUID owner
UUID[] files_received
UUID[] files_retrieved
}
ChatHistory }|--o{ FileRecord: "ChatHistory.files_received"
ChatHistory }|--o{ FileRecord: "ChatHistory.files_retrieved"
Elastic Schema
Keeping things simple is the primary ethos here. We are storing the UUID of the parent file in the chunk. This allows us to easily query for all chunks of a file. We are also storing the text of the chunk, the metadata of the chunk, and the embedding of the chunk. The embedding is a float array that is generated by the embedding API.
---
title: Elastic schema
---
erDiagram
File ||--o{ Chunk : "File.uuid"
File {
UUID uuid
}
Chunk {
UUID uuid
UUID parent_file_uuid
int index
str text
dict metadata
float[] embedding
}