1 change: 1 addition & 0 deletions backend/database/attachment_db.py
@@ -272,6 +272,7 @@ def get_content_type(file_path: str) -> str:
'.html': 'text/html',
'.htm': 'text/html',
'.json': 'application/json',
'.epub': 'application/epub+zip',
'.xml': 'application/xml',
'.zip': 'application/zip',
'.rar': 'application/x-rar-compressed',
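
For reference, the lookup that this hunk extends can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the project's code: `CONTENT_TYPES`, `DEFAULT_CONTENT_TYPE`, and `guess_content_type` are hypothetical names, and the table holds only a few of the entries visible in the hunk. The sketch uses `application/epub+zip`, which is the registered media type for EPUB.

```python
import os

# Hypothetical subset of the extension-to-MIME table touched by this hunk.
CONTENT_TYPES = {
    ".html": "text/html",
    ".htm": "text/html",
    ".json": "application/json",
    ".epub": "application/epub+zip",  # registered media type for EPUB
    ".xml": "application/xml",
    ".zip": "application/zip",
}

# Assumed fallback for extensions that are not in the table.
DEFAULT_CONTENT_TYPE = "application/octet-stream"


def guess_content_type(file_path: str) -> str:
    """Return the MIME type for file_path based on its lowercased extension."""
    _, ext = os.path.splitext(file_path)
    return CONTENT_TYPES.get(ext.lower(), DEFAULT_CONTENT_TYPE)
```

Lowercasing the extension before the lookup keeps `book.EPUB` and `book.epub` on the same path.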
6 changes: 3 additions & 3 deletions doc/docs/en/sdk/data-process.md
@@ -43,10 +43,10 @@ def file_process(self,

## 📁 Supported File Formats

- **Text files**: .txt, .md, .csv
- **Documents**: .pdf, .docx, .pptx
- **Text files**: .txt, .md, .csv, .json
- **Documents**: .pdf, .docx, .pptx, .epub
- **Images**: .jpg, .png, .gif (with OCR)
- **Web content**: HTML, URLs
- **Web content**: HTML, URLs, XML
- **Archives**: .zip, .tar

## 💡 Usage Examples
4 changes: 3 additions & 1 deletion doc/docs/en/user-guide/knowledge-base.md
@@ -26,12 +26,14 @@ Create and manage knowledge bases, upload documents, and generate summaries. Kno
### Supported File Formats

Nexent supports multiple file formats, including:
- **Text:** .txt, .md
- **Text:** .txt, .md, .csv, .json
- **PDF:** .pdf
- **Word:** .docx
- **PowerPoint:** .pptx
- **EPUB:** .epub
- **Excel:** .xlsx
- **Data files:** .csv
- **Web content:** .html, .xml

## 📊 Knowledge Base Summary

4 changes: 2 additions & 2 deletions doc/docs/en/user-guide/start-chat.md
@@ -79,8 +79,8 @@ You can upload files during a chat so the agent can reason over their content:
- Or drag files directly into the chat area

2. **Supported File Formats**
- **Documents:** PDF, Word (.docx), PowerPoint (.pptx), Excel (.xlsx)
- **Text:** Markdown (.md), Plain text (.txt)
- **Documents:** PDF, Word (.docx), PowerPoint (.pptx), Excel (.xlsx), EPUB (.epub), HTML (.html), XML (.xml)
- **Text & Data:** Markdown (.md), Plain text (.txt), JSON (.json), CSV (.csv)
- **Images:** JPG, PNG, GIF, and other common formats

3. **File Processing Flow**
3 changes: 3 additions & 0 deletions doc/docs/zh/sdk/data-process.md
@@ -98,6 +98,9 @@ def file_process(self,
- `.odt` - OpenDocument文本
- `.pptx` - PowerPoint 2007及更高版本
- `.ppt` - PowerPoint 97-2003版本
- `.xml` - XML数据文件
- `.json` - JSON数据文件
- `.csv` - 逗号分隔值文件

## 💡 使用示例

4 changes: 3 additions & 1 deletion doc/docs/zh/user-guide/knowledge-base.md
@@ -26,12 +26,14 @@

Nexent支持多种文件格式,包括:

- **文本**: .txt, .md文件
- **文本**: .txt, .md, .json文件
- **PDF**: .pdf文件
- **Word**: .docx文件
- **PowerPoint**: .pptx文件
- **Excel**: .xlsx文件
- **EPUB**: .epub文件
- **数据文件**: .csv文件
- **Web content**: .html, .xml文件

## 📊 知识库总结

4 changes: 2 additions & 2 deletions doc/docs/zh/user-guide/start-chat.md
@@ -80,8 +80,8 @@ Nexent支持语音输入功能,让您可以通过语音与智能体交互。
- 或直接将文件拖拽到对话区域

2. **支持的文件格式**
- **文档类**:PDF、Word (.docx)、PowerPoint (.pptx)、Excel (.xlsx)
- **文本类**:Markdown (.md)、纯文本 (.txt)
- **文档类**:PDF、Word (.docx)、PowerPoint (.pptx)、Excel (.xlsx)、EPUB (.epub)、HTML (.html)、XML (.xml)
- **文本类**:Markdown (.md)、纯文本 (.txt)、JSON (.json)、CSV (.csv)
- **图片类**:JPG、PNG、GIF 等常见图片格式

3. **文件处理流程**
@@ -233,7 +233,7 @@ const UploadArea = forwardRef<UploadAreaRef, UploadAreaProps>(
fileList,
onChange: handleChange,
customRequest: handleCustomRequest,
accept: ".pdf,.docx,.pptx,.xlsx,.md,.txt,.csv",
accept: ".pdf,.docx,.pptx,.xlsx,.md,.txt,.csv,.json,.epub,.xml,.html",
showUploadList: true,
disabled: disabled,
progress: {
9 changes: 5 additions & 4 deletions frontend/const/chatConfig.ts
@@ -9,6 +9,7 @@ export const chatConfig = {
"application/json",
"application/xml",
"text/markdown",
"text/csv",
],

// Supported text file extensions
@@ -36,10 +37,10 @@ export const chatConfig = {
imageExtensions: ["jpg", "jpeg", "png", "gif", "webp", "svg", "bmp"],

// Supported document file extensions
documentExtensions: ["pdf", "doc", "docx", "xls", "xlsx", "ppt", "pptx"],
documentExtensions: ["pdf", "doc", "docx", "xls", "xlsx", "ppt", "pptx", "epub", "html", "xml"],

// Supported text document extensions
supportedTextExtensions: ["md", "markdown", "txt"],
supportedTextExtensions: ["md", "markdown", "txt", "csv", "json"],

// File icon mapping configuration
fileIcons: {
@@ -50,7 +51,7 @@ export const chatConfig = {
word: ["doc", "docx"],

// Plain text files
text: ["txt"],
text: ["txt", "epub"],

// Markdown files
markdown: ["md"],
@@ -62,7 +63,7 @@ export const chatConfig = {
powerpoint: ["ppt", "pptx"],

// HTML files
html: ["html", "htm"],
html: ["html", "htm", "xml"],

// Code files
code: ["css", "js", "ts", "jsx", "tsx", "php", "py", "java", "c", "cpp", "cs"],
47 changes: 31 additions & 16 deletions frontend/const/knowledgeBase.ts
@@ -113,26 +113,36 @@ export const NOTIFICATION_TYPES = {

// File extension constants
export const FILE_EXTENSIONS = {
PDF: "pdf",
DOC: "doc",
DOCX: "docx",
XLS: "xls",
XLSX: "xlsx",
PPT: "ppt",
PPTX: "pptx",
TXT: "txt",
MD: "md",
PDF: 'pdf',
DOC: 'doc',
DOCX: 'docx',
XLS: 'xls',
XLSX: 'xlsx',
PPT: 'ppt',
PPTX: 'pptx',
TXT: 'txt',
MD: 'md',
EPUB: 'epub',
CSV: 'csv',
HTML: 'html',
XML: 'xml',
JSON: 'json'
} as const;

// File type constants
export const FILE_TYPES = {
PDF: "PDF",
WORD: "Word",
EXCEL: "Excel",
POWERPOINT: "PowerPoint",
TEXT: "Text",
MARKDOWN: "Markdown",
UNKNOWN: "Unknown",
PDF: 'PDF',
WORD: 'Word',
EXCEL: 'Excel',
POWERPOINT: 'PowerPoint',
TEXT: 'Text',
MARKDOWN: 'Markdown',
EPUB: 'EPUB',
CSV: 'CSV',
JSON: 'JSON',
HTML: 'HTML',
XML: 'XML',
UNKNOWN: 'Unknown'
} as const;

// File extension to type mapping
@@ -146,4 +156,9 @@ export const EXTENSION_TO_TYPE_MAP = {
[FILE_EXTENSIONS.PPTX]: FILE_TYPES.POWERPOINT,
[FILE_EXTENSIONS.TXT]: FILE_TYPES.TEXT,
[FILE_EXTENSIONS.MD]: FILE_TYPES.MARKDOWN,
[FILE_EXTENSIONS.CSV]: FILE_TYPES.CSV,
[FILE_EXTENSIONS.JSON]: FILE_TYPES.JSON,
[FILE_EXTENSIONS.HTML]: FILE_TYPES.HTML,
[FILE_EXTENSIONS.XML]: FILE_TYPES.XML,
[FILE_EXTENSIONS.EPUB]: FILE_TYPES.EPUB
} as const;
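
To illustrate how a map like `EXTENSION_TO_TYPE_MAP` is typically consumed, here is a small Python sketch of the same idea. It is an assumption-laden illustration rather than the frontend's actual helper: `EXTENSION_TO_TYPE` and `file_type_for` are hypothetical names, the table covers only the entries visible in this diff, and the `"Unknown"` fallback mirrors `FILE_TYPES.UNKNOWN`.

```python
# Hypothetical Python mirror of the map entries visible in this diff.
EXTENSION_TO_TYPE = {
    "pptx": "PowerPoint",
    "txt": "Text",
    "md": "Markdown",
    "csv": "CSV",
    "json": "JSON",
    "html": "HTML",
    "xml": "XML",
    "epub": "EPUB",
}


def file_type_for(filename: str) -> str:
    """Map a file name to a display type, falling back to "Unknown"."""
    # rpartition yields an empty separator when the name contains no dot.
    _, dot, ext = filename.rpartition(".")
    if not dot:
        return "Unknown"
    return EXTENSION_TO_TYPE.get(ext.lower(), "Unknown")
```

Only the final extension is considered, so a name such as `archive.tar.gz` resolves through `gz` and falls back to `"Unknown"`.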
8 changes: 4 additions & 4 deletions frontend/public/locales/en/common.json
@@ -75,10 +75,10 @@
"chatInput.thisFileTypeCannotBePreviewed": "This file type cannot be previewed",
"chatInput.fileCountExceedsLimit": "File count exceeds limit. Maximum {{count}} files allowed",
"chatInput.fileSizeExceedsLimit": "File \"{{name}}\" exceeds size limit. Maximum 10MB per file",
"chatInput.unsupportedFileType": "File \"{{name}}\" is not a supported file type. Supported formats: images, documents (PDF, Word, Excel, PPT), text files, CSV/TSV, Markdown",
"chatInput.unsupportedFileType": "File \"{{name}}\" is not a supported file type. Supported formats: images, documents (PDF, Word, Excel, PPT, EPUB), text files, CSV/TSV, Markdown, JSON, HTML, XML",
"chatInput.unsupportedFileTypeSimple": "Unsupported file type",
"chatInput.dragAndDropFilesHere": "Drag and drop files here to upload",
"chatInput.supportedFileFormats": "Supported formats: images, documents (PDF, Word, Excel, PPT), text files, CSV/TSV, Markdown",
"chatInput.supportedFileFormats": "Supported formats: images, documents (PDF, Word, Excel, PPT, EPUB), text files, CSV/TSV, Markdown, JSON, HTML, XML",
"chatInput.sendMessageTo": "Send message to {{appName}}",
"chatInput.stopRecording": "Stop Recording",
"chatInput.startRecording": "Start Recording",
@@ -451,13 +451,13 @@
"knowledgeBase.hint.selectFirst": "Please select a knowledge base to upload files",
"knowledgeBase.hint.changeName": "Please modify the knowledge base name to continue",
"knowledgeBase.upload.dragHint": "Click or drag files to this area to upload and add knowledge to the knowledge base",
"knowledgeBase.upload.supportedFormats": "Supports PDF, Word, PPT, Excel, MD, TXT file formats",
"knowledgeBase.upload.supportedFormats": "Supports PDF, Word, PPT, Excel, MD, TXT, EPUB, CSV, JSON, HTML, XML file formats",
"knowledgeBase.upload.completed": "Upload completed",
"knowledgeBase.upload.fileCount": "{{count}} files",
"knowledgeBase.upload.status.uploading": "Uploading",
"knowledgeBase.upload.status.completed": "Completed",
"knowledgeBase.upload.status.failed": "Upload failed",
"knowledgeBase.upload.invalidFileType": "Only PDF, Word, PPT, Excel, MD, TXT, CSV file formats are supported!",
"knowledgeBase.upload.invalidFileType": "Only PDF, Word, PPT, Excel, MD, TXT, CSV, JSON, EPUB, HTML, XML file formats are supported!",
"knowledgeBase.check.nameError": "Failed to check knowledge base name",
"knowledgeBase.fetch.error": "Failed to fetch knowledge base information",
"knowledgeBase.fetch.retryError": "Failed to fetch knowledge base information, please try again later",
8 changes: 4 additions & 4 deletions frontend/public/locales/zh/common.json
@@ -75,10 +75,10 @@
"chatInput.thisFileTypeCannotBePreviewed": "此文件类型无法预览",
"chatInput.fileCountExceedsLimit": "文件数量超过限制,最多只能上传{{count}}个文件",
"chatInput.fileSizeExceedsLimit": "文件\"{{name}}\"超过大小限制,单个文件最大10MB",
"chatInput.unsupportedFileType": "文件\"{{name}}\"不是支持的文件类型,支持的格式包括:图片、文档(PDF、Word、Excel、PPT)、纯文本、CSV/TSV、Markdown",
"chatInput.unsupportedFileType": "文件\"{{name}}\"不是支持的文件类型,支持的格式包括:图片、文档(PDF、Word、Excel、PPT、EPUB)、纯文本、CSV/TSV、Markdown、JSON、HTML、XML",
"chatInput.unsupportedFileTypeSimple": "不支持的文件类型",
"chatInput.dragAndDropFilesHere": "文件拖动到此处即可上传",
"chatInput.supportedFileFormats": "支持的格式包括:图片、文档(PDF、Word、Excel、PPT)、纯文本、CSV/TSV、Markdown",
"chatInput.supportedFileFormats": "支持的格式包括:图片、文档(PDF、Word、Excel、PPT、EPUB)、纯文本、CSV/TSV、Markdown、JSON、HTML、XML",
"chatInput.sendMessageTo": "给 {{appName}} 发送消息",
"chatInput.stopRecording": "停止录音",
"chatInput.startRecording": "开始录音",
@@ -454,13 +454,13 @@
"knowledgeBase.hint.selectFirst": "请先选择一个知识库以上传文件",
"knowledgeBase.hint.changeName": "请修改知识库名称后继续",
"knowledgeBase.upload.dragHint": "点击或拖拽文件到此区域上传,为知识库添加知识",
"knowledgeBase.upload.supportedFormats": "支持 PDF、Word、Excel、PPT、纯文本、CSV、TSV、Markdown 文件格式",
"knowledgeBase.upload.supportedFormats": "支持 PDF、Word、Excel、PPT、纯文本、CSV、TSV、Markdown、JSON、EPUB、HTML、XML 文件格式",
"knowledgeBase.upload.completed": "上传完成",
"knowledgeBase.upload.fileCount": "{{count}} 个文件",
"knowledgeBase.upload.status.uploading": "上传中",
"knowledgeBase.upload.status.completed": "已完成",
"knowledgeBase.upload.status.failed": "上传失败",
"knowledgeBase.upload.invalidFileType": "只支持 PDF、Word、PPT、Excel、MD、TXT、CSV 文件格式!",
"knowledgeBase.upload.invalidFileType": "只支持 PDF、Word、PPT、Excel、MD、TXT、CSV、JSON、EPUB、HTML、XML 文件格式!",
"knowledgeBase.check.nameError": "检查知识库名称失败",
"knowledgeBase.fetch.error": "获取知识库信息失败",
"knowledgeBase.fetch.retryError": "获取知识库信息失败,请稍后重试",
8 changes: 7 additions & 1 deletion frontend/services/uploadService.ts
@@ -57,7 +57,13 @@ export const validateFileType = (file: File, t: TFunction, message: any): boolea
'text/markdown',
'text/plain',
'text/csv',
'application/csv'
'application/csv',
'application/epub',
'application/epub+zip',
'text/html',
'application/json',
'application/xml',
'text/xml'
];

// First check MIME type
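
The comment above suggests a two-stage check: MIME type first, then some fallback. The rest of `validateFileType` is not shown in this diff, so the following Python sketch is only one plausible reading of that pattern. `ALLOWED_MIME_TYPES`, `ALLOWED_EXTENSIONS`, and `is_supported_upload` are hypothetical names; the MIME entries are the ones visible in this hunk, and the extensions are taken from the `accept` string changed earlier in this PR.

```python
# MIME types visible in this hunk (the full list in the source file is longer).
ALLOWED_MIME_TYPES = {
    "text/markdown", "text/plain", "text/csv", "application/csv",
    "application/epub", "application/epub+zip", "text/html",
    "application/json", "application/xml", "text/xml",
}

# Extensions from the upload component's accept string in this PR.
ALLOWED_EXTENSIONS = {
    "pdf", "docx", "pptx", "xlsx", "md", "txt",
    "csv", "json", "epub", "xml", "html",
}


def is_supported_upload(filename: str, mime_type: str) -> bool:
    """Accept a file if its MIME type or, failing that, its extension is allowed."""
    # Stage 1: trust the reported MIME type when it is on the allow list.
    if mime_type in ALLOWED_MIME_TYPES:
        return True
    # Stage 2 (assumed): fall back to the extension, since the reported
    # MIME type can be empty or generic for some file kinds.
    _, dot, ext = filename.rpartition(".")
    return bool(dot) and ext.lower() in ALLOWED_EXTENSIONS
```

An extension fallback of this kind matters mostly for formats whose MIME type is reported as an empty string, which can happen with `.md` or `.epub` files depending on the platform.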
6 changes: 5 additions & 1 deletion sdk/nexent/data_process/core.py
@@ -17,7 +17,7 @@ class DataProcessCore:

Supported file types:
- Excel files: .xlsx, .xls
- Generic files: .txt, .pdf, .docx, .doc, .html, .htm, .md, .rtf, .odt, .pptx, .ppt
- Generic files: .txt, .pdf, .docx, .doc, .html, .htm, .md, .rtf, .odt, .pptx, .ppt, .epub, .xml, .csv, .json

Supported input methods:
- In-memory byte data
@@ -147,6 +147,10 @@ def get_supported_file_types(self) -> Dict[str, List[str]]:
".odt",
".pptx",
".ppt",
".epub",
".json",
".xml",
".csv",
]

return {"excel": list(self.EXCEL_EXTENSIONS), "generic": generic_formats}
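
A caller could use the dictionary returned by `get_supported_file_types` to decide which processing path a file takes. The sketch below is standalone and hedged: it re-creates the return shape with module-level constants instead of the real `DataProcessCore` class, the extension lists are copied from the docstring in this diff, and `processor_for` is a hypothetical helper that does not appear in the PR.

```python
import os
from typing import Dict, List, Optional

# Extension lists copied from the DataProcessCore docstring in this diff.
EXCEL_EXTENSIONS = (".xlsx", ".xls")
GENERIC_EXTENSIONS = (
    ".txt", ".pdf", ".docx", ".doc", ".html", ".htm", ".md", ".rtf",
    ".odt", ".pptx", ".ppt", ".epub", ".xml", ".csv", ".json",
)


def get_supported_file_types() -> Dict[str, List[str]]:
    """Standalone stand-in for the method shown above; same return shape."""
    return {"excel": list(EXCEL_EXTENSIONS), "generic": list(GENERIC_EXTENSIONS)}


def processor_for(filename: str) -> Optional[str]:
    """Return "excel" or "generic" for a supported file, otherwise None."""
    ext = os.path.splitext(filename)[1].lower()
    for kind, extensions in get_supported_file_types().items():
        if ext in extensions:
            return kind
    return None
```

With the four extensions added in this PR, `.epub`, `.json`, `.xml`, and `.csv` files would all resolve to the generic path.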