AI Chat, fully equipped
From RAG pipelines to conversation memory, everything you need to build intelligent chatbots.
Retrieval-Augmented Generation
RAG Pipeline
Accurate answers from your knowledge base. TUTUR combines retrieval and generation to produce contextual and factual responses.
- Hybrid search — combines vector similarity and keyword matching for maximum coverage
- Pipeline search — multi-stage retrieval with filtering, scoring, and reranking
- Chunk management — automatic chunking with overlap for complete context
- Score threshold — only highly relevant documents are included in the prompt
- Source attribution — every answer includes source document references
- Configurable top-k — set the number of relevant documents per query as needed
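The score-threshold and top-k bullets above can be sketched as a small filter step. This is an illustrative sketch, not TUTUR's actual code; the `Hit` type and `select_context` name are hypothetical, and the default values are assumptions.

```python
from dataclasses import dataclass

@dataclass
class Hit:
    """One retrieval result: a document id and its similarity score in [0, 1]."""
    doc_id: str
    score: float

def select_context(hits, top_k=3, score_threshold=0.75):
    """Keep only hits at or above the threshold, then take the best top_k."""
    relevant = [h for h in hits if h.score >= score_threshold]
    relevant.sort(key=lambda h: h.score, reverse=True)
    return relevant[:top_k]
```

With a threshold of 0.75, a weakly related document never reaches the prompt even if fewer than top-k strong matches exist, which is what keeps answers grounded.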
Context-Aware Conversations
4-Layer Memory System
A chatbot that truly remembers. Four memory layers work together to deliver a coherent conversation experience.
- Session memory — conversation history within a session, auto-summarized when it grows long
- Semantic memory — user facts and preferences auto-extracted from conversations
- Temporal memory — time awareness, so "yesterday I said..." is resolved correctly
- Entity extraction — names, locations, preferences auto-detected and stored
- Memory decay — older information gradually decreases in relevance, like human memory
- Cross-session recall — users can continue context from previous conversations
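Memory decay as described above is commonly modeled as an exponential half-life. A minimal sketch, assuming a half-life model; the function name and the 30-day default are illustrative, not TUTUR's actual parameters.

```python
def decayed_relevance(base_score, age_days, half_life_days=30.0):
    """Down-weight a stored memory by its age: every half_life_days,
    its effective relevance halves, mimicking human forgetting."""
    return base_score * 0.5 ** (age_days / half_life_days)
```

A fact extracted yesterday outranks an equally scored fact from two months ago, so cross-session recall naturally prefers fresher information.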
Provider-Agnostic Architecture
Multi-Provider AI
Not locked into one provider. Choose the best model for each use case, switch anytime without code changes.
- OpenAI — GPT-4o, GPT-4o-mini for general purpose and reasoning
- DeepSeek — DeepSeek V3/R1 for coding and technical tasks at competitive pricing
- Groq — ultra-fast inference for low-latency use cases
- Streaming built-in — Server-Sent Events for real-time responses across all providers
- Per-tenant config — each tenant can choose their own provider and model
- Fallback chain — automatic failover to another provider if the primary is down
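A fallback chain like the one above boils down to trying providers in priority order and returning the first success. A sketch under the assumption that each provider is wrapped in a plain callable; `chat_with_fallback` is a hypothetical name, not TUTUR's API.

```python
def chat_with_fallback(prompt, providers):
    """Try each (name, call) pair in order; return the first successful reply.

    providers: list of (provider_name, callable) tuples, highest priority first.
    Raises RuntimeError only if every provider in the chain fails.
    """
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:
            errors.append((name, exc))  # record and fall through to the next
    raise RuntimeError(f"all providers failed: {errors}")
```

Because the chain is just an ordered list, per-tenant configuration can reorder it, e.g. DeepSeek first for a coding-focused tenant with OpenAI as backup.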
One Platform, Many Tenants
Multi-Tenant Architecture
One TUTUR deployment serves many tenants. Each tenant is fully isolated with independent configuration.
- Tenant isolation — data, knowledge base, and sessions are completely separate
- Per-tenant LLM config — provider, model, temperature, max tokens, all configurable
- Per-tenant RAG config — strategy, top-k, score threshold configurable per tenant
- Feature flags — enable/disable memory, streaming, summarization per tenant
- API key management — each tenant has their own API keys with granular permissions
- Usage tracking — monitor usage per tenant for billing and capacity planning
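Per-tenant overrides on top of platform defaults can be pictured as a sectioned merge. The keys and default values below are illustrative assumptions, not TUTUR's actual schema.

```python
# Hypothetical platform-wide defaults; every section can be overridden per tenant.
DEFAULTS = {
    "llm": {"provider": "openai", "model": "gpt-4o-mini", "temperature": 0.7},
    "rag": {"top_k": 4, "score_threshold": 0.75},
    "features": {"memory": True, "streaming": True, "summarization": False},
}

def tenant_config(overrides):
    """Merge a tenant's partial overrides onto the defaults, section by section."""
    merged = {}
    for section, values in DEFAULTS.items():
        merged[section] = {**values, **overrides.get(section, {})}
    return merged
```

A tenant that only sets `{"llm": {"provider": "groq"}}` keeps every other default, which is what makes feature flags and RAG settings independently tunable.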
Your Data, AI-Ready
Knowledge Base
Upload documents and TUTUR automatically processes them into an AI-ready knowledge base.
- Document upload — supports various formats (PDF, TXT, Markdown, and more)
- Auto-chunking — documents split into optimal chunks with overlap
- Vector embedding — text-embedding-3-small from OpenAI for semantic representation
- Qdrant vector DB — high-performance vector database for similarity search
- Metadata filtering — filter documents by tag, source, or custom metadata
- Real-time indexing — new documents immediately searchable without rebuild
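Auto-chunking with overlap, as listed above, can be sketched as a sliding window over the text. Character-based windows and the default sizes here are simplifying assumptions; production chunkers usually respect token and sentence boundaries.

```python
def chunk_text(text, chunk_size=200, overlap=50):
    """Split text into fixed-size chunks where adjacent chunks share
    `overlap` characters, so no sentence is cut without context."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]
```

The shared overlap is what lets a retrieved chunk carry enough surrounding context for the generator, at the cost of some storage duplication.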
REST API + Streaming
Developer-Friendly API
Clean and consistent API. Integrate TUTUR into your application in minutes.
- RESTful API — intuitive and well-documented endpoints with OpenAPI spec
- SSE streaming — Server-Sent Events for real-time chat responses
- Tenant-scoped — all endpoints scoped per tenant via API key
- Session management — create, list, get, delete sessions via API
- Knowledge CRUD — upload, search, and manage knowledge base via API
- Rate limiting — configurable per API key for fair usage
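On the wire, SSE streaming is just line-based events. A minimal parser sketch for the event-stream format (the function name is illustrative; real clients should use an SSE library and handle `event:`/`id:` fields too):

```python
def parse_sse(stream_lines):
    """Minimal Server-Sent Events reader: accumulate `data:` lines and
    yield one payload per blank-line-terminated event."""
    data = []
    for line in stream_lines:
        if line.startswith("data:"):
            data.append(line[5:].lstrip())
        elif line == "" and data:  # blank line ends the current event
            yield "\n".join(data)
            data = []
```

Feeding this the decoded lines of a streaming chat response yields each token delta as the server emits it, which is how real-time chat UIs render partial answers.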