# SaaS-PDF — Tool Inventory & Competitive Gap Analysis
> Generated: March 7, 2026
> Branch: `feature/critical-maintenance-and-editor`
---
## 1. Platform Infrastructure
| Component | Technology | Status |
|---|---|---|
| Backend | Flask + Gunicorn | ✅ Production-ready |
| Frontend | React + Vite + TypeScript + Tailwind | ✅ Production-ready |
| Task Queue | Celery + Redis | ✅ 3 queues (default, image, pdf_tools) |
| Scheduler | Celery Beat | ✅ Expired-file cleanup every 30 min |
| Database | SQLite | ✅ Users, API keys, history, usage events |
| Storage | Local + S3 (optional) | ✅ Presigned URLs |
| Auth | Session-based + API Key (B2B) | ✅ Free & Pro plans |
| Security | Talisman CSP, rate limiting, CORS, input sanitization | ✅ |
| i18n | react-i18next (en, ar, fr) | ✅ All tools translated |
| Monetization | Google AdSense slots | ✅ Integrated |
| Email | SMTP (password reset) | ✅ |
| Docker | docker-compose (dev + prod) | ✅ |
| Nginx | Reverse proxy + SSL | ✅ |
### Plans & Quotas
| | Free | Pro |
|---|---|---|
| Web requests/month | 50 | 500 |
| API requests/month | — | 1,000 |
| Max file size | 50 MB | 100 MB |
| History retention | 25 | 250 |
| API key access | ❌ | ✅ |
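
The quotas above can be expressed as a small lookup that a request handler consults before queuing work. This is a minimal sketch under stated assumptions: the `Plan` class, the `check_upload` helper, and the reason strings are hypothetical names for illustration, not the project's actual code. Only the numeric limits come from the table.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Plan:
    web_requests_per_month: int
    api_requests_per_month: Optional[int]  # None means the plan has no API access
    max_file_size_mb: int
    history_retention: int


# Limits copied from the Plans & Quotas table.
PLANS = {
    "free": Plan(50, None, 50, 25),
    "pro": Plan(500, 1000, 100, 250),
}


def check_upload(plan_name: str, used_this_month: int, file_size_mb: float,
                 via_api: bool = False) -> Optional[str]:
    """Return None if the request is allowed, otherwise a short reason string."""
    plan = PLANS[plan_name]
    limit = plan.api_requests_per_month if via_api else plan.web_requests_per_month
    if limit is None:
        return "api_access_not_included"
    if used_this_month >= limit:
        return "monthly_quota_exceeded"
    if file_size_mb > plan.max_file_size_mb:
        return "file_too_large"
    return None
```

For example, `check_upload("free", 50, 10)` returns `"monthly_quota_exceeded"`, while `check_upload("pro", 999, 100, via_api=True)` returns `None`.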
### Registered Blueprints: 18
| Blueprint | Prefix | Purpose |
|---|---|---|
| `health_bp` | `/api` | Health check |
| `auth_bp` | `/api/auth` | Login, register, forgot/reset password |
| `account_bp` | `/api/account` | Profile, API keys, usage |
| `admin_bp` | `/api/internal/admin` | Plan management |
| `convert_bp` | `/api/convert` | PDF ↔ Word |
| `compress_bp` | `/api/compress` | PDF compression |
| `image_bp` | `/api/image` | Image convert & resize |
| `video_bp` | `/api/video` | Video to GIF |
| `history_bp` | `/api` | User history |
| `pdf_tools_bp` | `/api/pdf-tools` | Merge, split, rotate, watermark, etc. |
| `flowchart_bp` | `/api/flowchart` | AI flowchart extraction |
| `tasks_bp` | `/api/tasks` | Task status polling |
| `download_bp` | `/api/download` | Secure file download |
| `v1_bp` | `/api/v1` | B2B API (all tools) |
| `config_bp` | `/api/config` | Dynamic limits |
| `ocr_bp` | `/api/ocr` | OCR text extraction |
| `removebg_bp` | `/api/remove-bg` | Background removal |
| `pdf_editor_bp` | `/api/pdf-editor` | PDF text annotations |
---
## 2. Existing Tools — Complete Inventory (21 tools)
### 2.1 PDF Tools (14)
| # | Tool | Endpoint | Service | Task | Component | Route | i18n | B2B API |
|---|---|---|---|---|---|---|---|---|
| 1 | **Compress PDF** | `POST /api/compress/pdf` | `compress_service` | `compress_pdf_task` | `PdfCompressor.tsx` | `/tools/compress-pdf` | ✅ | ✅ |
| 2 | **PDF to Word** | `POST /api/convert/pdf-to-word` | `pdf_service` | `convert_pdf_to_word` | `PdfToWord.tsx` | `/tools/pdf-to-word` | ✅ | ✅ |
| 3 | **Word to PDF** | `POST /api/convert/word-to-pdf` | `pdf_service` | `convert_word_to_pdf` | `WordToPdf.tsx` | `/tools/word-to-pdf` | ✅ | ✅ |
| 4 | **Merge PDF** | `POST /api/pdf-tools/merge` | `pdf_tools_service` | `merge_pdfs_task` | `MergePdf.tsx` | `/tools/merge-pdf` | ✅ | ✅ |
| 5 | **Split PDF** | `POST /api/pdf-tools/split` | `pdf_tools_service` | `split_pdf_task` | `SplitPdf.tsx` | `/tools/split-pdf` | ✅ | ✅ |
| 6 | **Rotate PDF** | `POST /api/pdf-tools/rotate` | `pdf_tools_service` | `rotate_pdf_task` | `RotatePdf.tsx` | `/tools/rotate-pdf` | ✅ | ✅ |
| 7 | **PDF to Images** | `POST /api/pdf-tools/pdf-to-images` | `pdf_tools_service` | `pdf_to_images_task` | `PdfToImages.tsx` | `/tools/pdf-to-images` | ✅ | ✅ |
| 8 | **Images to PDF** | `POST /api/pdf-tools/images-to-pdf` | `pdf_tools_service` | `images_to_pdf_task` | `ImagesToPdf.tsx` | `/tools/images-to-pdf` | ✅ | ✅ |
| 9 | **Watermark PDF** | `POST /api/pdf-tools/watermark` | `pdf_tools_service` | `watermark_pdf_task` | `WatermarkPdf.tsx` | `/tools/watermark-pdf` | ✅ | ✅ |
| 10 | **Protect PDF** | `POST /api/pdf-tools/protect` | `pdf_tools_service` | `protect_pdf_task` | `ProtectPdf.tsx` | `/tools/protect-pdf` | ✅ | ✅ |
| 11 | **Unlock PDF** | `POST /api/pdf-tools/unlock` | `pdf_tools_service` | `unlock_pdf_task` | `UnlockPdf.tsx` | `/tools/unlock-pdf` | ✅ | ✅ |
| 12 | **Add Page Numbers** | `POST /api/pdf-tools/page-numbers` | `pdf_tools_service` | `add_page_numbers_task` | `AddPageNumbers.tsx` | `/tools/page-numbers` | ✅ | ✅ |
| 13 | **PDF Editor** | `POST /api/pdf-editor/edit` | `pdf_editor_service` | `edit_pdf_task` | `PdfEditor.tsx` | `/tools/pdf-editor` | ✅ | ❌ |
| 14 | **PDF Flowchart** | `POST /api/flowchart/extract` + 3 | `flowchart_service` | `extract_flowchart_task` | `PdfFlowchart.tsx` | `/tools/pdf-flowchart` | ✅ | ✅ |
### 2.2 Image Tools (4)
| # | Tool | Endpoint | Service | Task | Component | Route | i18n | B2B API |
|---|---|---|---|---|---|---|---|---|
| 15 | **Image Converter** | `POST /api/image/convert` | `image_service` | `convert_image_task` | `ImageConverter.tsx` | `/tools/image-converter` | ✅ | ✅ |
| 16 | **Image Resize** | `POST /api/image/resize` | `image_service` | `resize_image_task` | `ImageResize.tsx` | `/tools/image-resize` | ✅ | ✅ |
| 17 | **OCR** | `POST /api/ocr/image` + `/pdf` | `ocr_service` | `ocr_image_task` / `ocr_pdf_task` | `OcrTool.tsx` | `/tools/ocr` | ✅ | ❌ |
| 18 | **Remove Background** | `POST /api/remove-bg` | `removebg_service` | `remove_bg_task` | `RemoveBackground.tsx` | `/tools/remove-background` | ✅ | ❌ |
### 2.3 Video Tools (1)
| # | Tool | Endpoint | Service | Task | Component | Route | i18n | B2B API |
|---|---|---|---|---|---|---|---|---|
| 19 | **Video to GIF** | `POST /api/video/to-gif` | `video_service` | `create_gif_task` | `VideoToGif.tsx` | `/tools/video-to-gif` | ✅ | ✅ |
### 2.4 Text Tools — Client-Side Only (2)
| # | Tool | Backend | Component | Route | i18n |
|---|---|---|---|---|---|
| 20 | **Word Counter** | None (JS) | `WordCounter.tsx` | `/tools/word-counter` | ✅ |
| 21 | **Text Cleaner** | None (JS) | `TextCleaner.tsx` | `/tools/text-cleaner` | ✅ |
### Feature Flags
| Flag | Default | Controls |
|---|---|---|
| `FEATURE_EDITOR` | `false` | OCR, Remove Background, PDF Editor routes (403 when off) |
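
The gating behaviour in the table (routes answer 403 while the flag is off) can be sketched as follows. This assumes the flag is read from an environment variable and that common truthy spellings are accepted. Both are assumptions, and `flag_enabled` and `editor_route_status` are hypothetical helper names.

```python
import os
from typing import Mapping

# Spellings treated as "on"; anything else, including an unset flag, is "off".
TRUTHY = {"1", "true", "yes", "on"}


def flag_enabled(name: str, env: Mapping[str, str] = os.environ) -> bool:
    """Read a boolean flag, defaulting to `false` as in the table."""
    return env.get(name, "false").strip().lower() in TRUTHY


def editor_route_status(env: Mapping[str, str] = os.environ) -> int:
    """HTTP status a gated route would answer with before doing any work."""
    return 200 if flag_enabled("FEATURE_EDITOR", env) else 403
```

In a Flask app the same check would typically sit in a `before_request` hook on the gated blueprints and call `abort(403)` when the flag is off.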
---
## 3. Test Coverage
| Category | Test Files | Tests |
|---|---|---|
| Auth | `test_auth.py` | 5 |
| Config | `test_config.py` | 3 |
| Password reset | `test_password_reset.py` | 8 |
| Maintenance | `test_maintenance_tasks.py` | 8 |
| Compress | `test_compress.py`, `test_compress_service.py`, `test_compress_tasks.py` | 6 |
| Convert | `test_convert.py`, `test_convert_tasks.py` | 6 |
| Image | `test_image.py`, `test_image_service.py`, `test_image_tasks.py` | ~18 |
| Video | `test_video.py`, `test_video_service.py`, `test_video_tasks.py` | ~12 |
| PDF tools | `test_pdf_tools.py`, `test_pdf_tools_service.py`, `test_pdf_tools_tasks.py` | ~50 |
| Flowchart | `test_flowchart_tasks.py` | ~6 |
| OCR | `test_ocr.py`, `test_ocr_service.py` | 12 |
| Remove BG | `test_removebg.py` | 3 |
| PDF Editor | `test_pdf_editor.py` | 7 |
| Infra | `test_download.py`, `test_health.py`, `test_history.py`, `test_rate_limiter.py`, `test_sanitizer.py`, `test_storage_service.py`, `test_file_validator.py`, `test_utils.py`, `test_tasks_route.py` | ~36 |
| **TOTAL** | **32 files** | **180 ✅** |
---
## 4. Missing Tools — Competitive Gap Analysis
Comparison against: iLovePDF, SmallPDF, TinyWow, PDF24, Adobe Acrobat Online.
### 4.1 HIGH PRIORITY — Core tools competitors all have
| # | Tool | Category | Complexity | Dependencies | Notes |
|---|---|---|---|---|---|
| 1 | **Compress Image** | Image | Low | Pillow (exists) | JPEG/PNG/WebP quality reduction + resize. Pillow already installed. |
| 2 | **PDF to Excel** | PDF → Office | Medium | `camelot-py` or `tabula-py` | Table extraction from PDFs — high user demand. |
| 3 | **PDF to PowerPoint** | PDF → Office | Medium | `python-pptx` | Convert PDF pages to PPTX slides (images per slide or OCR). |
| 4 | **Excel to PDF** | Office → PDF | Medium | LibreOffice CLI | Same pattern as Word to PDF. |
| 5 | **PowerPoint to PDF** | Office → PDF | Medium | LibreOffice CLI | Same pattern as Word to PDF. |
| 6 | **HTML to PDF** | Web → PDF | Low | `weasyprint` or `playwright` | Input URL or HTML snippet → PDF. |
| 7 | **Reorder / Rearrange Pages** | PDF | Low | PyPDF2 (exists) | Drag-and-drop page reorder UI → backend rebuilds PDF. |
| 8 | **Extract Pages** | PDF | Low | PyPDF2 (exists) | Similar to Split but with visual page picker. Already partially covered by Split tool. |
| 9 | **Sign PDF** | PDF | Medium | ReportLab + canvas | Draw/upload signature → overlay onto PDF page. |
| 10 | **PDF Repair** | PDF | Low | PyPDF2 (exists) | Read → rewrite to fix broken xref tables. |
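
Row 7 (Reorder / Rearrange Pages) describes the backend step as rebuilding the PDF in a new page order. The sketch below illustrates that step, assuming PyPDF2 2.0 or later and 0-based page indices sent by the UI. `validate_page_order` and `reorder_pdf` are hypothetical names, not existing project functions.

```python
from typing import List, Sequence


def validate_page_order(order: Sequence[int], page_count: int) -> List[int]:
    """Check that `order` lists every index in 0..page_count-1 exactly once."""
    if sorted(order) != list(range(page_count)):
        raise ValueError("order must list every page index exactly once")
    return list(order)


def reorder_pdf(src_path: str, dst_path: str, order: Sequence[int]) -> None:
    """Write a copy of the PDF at src_path with its pages in the given order."""
    # Imported lazily so the validation helper works without PyPDF2 installed.
    from PyPDF2 import PdfReader, PdfWriter  # assumption: PyPDF2 >= 2.0

    reader = PdfReader(src_path)
    new_order = validate_page_order(order, len(reader.pages))
    writer = PdfWriter()
    for index in new_order:
        writer.add_page(reader.pages[index])
    with open(dst_path, "wb") as out:
        writer.write(out)
```

Validating the permutation first keeps malformed UI input from silently dropping or duplicating pages. Extract Pages (row 8) could reuse the same loop with a relaxed check that allows a subset of indices.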
### 4.2 MEDIUM PRIORITY — Differentiators present on 2-3 competitors
| # | Tool | Category | Complexity | Dependencies | Notes |
|---|---|---|---|---|---|
| 11 | **PDF to PDF/A** | PDF | Medium | Ghostscript (exists) | Archival format conversion. |
| 12 | **Flatten PDF** | PDF | Low | PyPDF2 (exists) | Remove form fields / annotations → flat page. |
| 13 | **Crop PDF** | PDF | Medium | PyPDF2 (exists) | Crop margins / adjust page boundaries. |
| 14 | **Compare PDFs** | PDF | High | `diff-match-patch` + PyPDF2 | Side-by-side visual diff of two documents. |
| 15 | **QR Code Generator** | Utility | Low | `qrcode` + Pillow | Text/URL → QR image. Can run client-side, but a backend route is needed for the API. |
| 16 | **Barcode Generator** | Utility | Low | `python-barcode` | Generate Code128, EAN, UPC barcodes. |
| 17 | **Image Crop** | Image | Low | Pillow (exists) | Visual cropping UI → backend Pillow crop. |
| 18 | **Image Rotate / Flip** | Image | Low | Pillow (exists) | 90°/180°/270° + horizontal/vertical flip. |
| 19 | **Image Filters** | Image | Low | Pillow (exists) | Grayscale, sepia, blur, sharpen, brightness, contrast. |
### 4.3 LOW PRIORITY — Advanced / niche (1-2 competitors, premium features)
| # | Tool | Category | Complexity | Dependencies | Notes |
|---|---|---|---|---|---|
| 20 | **AI Chat with PDF** | AI | High | OpenRouter (exists) | Upload PDF → ask questions. Flowchart service has partial foundation. |
| 21 | **AI PDF Summarizer** | AI | Medium | OpenRouter (exists) | Extract text → prompt LLM for summary. |
| 22 | **AI PDF Translator** | AI | Medium | OpenRouter (exists) | Extract text → translate via LLM → overlay or return translated doc. |
| 23 | **PDF Form Filler** | PDF | High | ReportLab + PyPDF2 | Detect form fields → UI to fill → save. |
| 24 | **Redact PDF** | PDF | Medium | ReportLab + PyPDF2 | Blackout sensitive text regions. |
| 25 | **PDF Metadata Editor** | PDF | Low | PyPDF2 (exists) | Edit title, author, subject, keywords. |
| 26 | **eSign / Digital Signature** | PDF | High | `cryptography` + PKCS#7 | Cryptographic digital signatures (different from visual sign). |
| 27 | **Batch Processing** | All | Medium | Existing tasks | Upload multiple files → apply same operation to all. |
| 28 | **GIF to Video** | Video | Medium | ffmpeg (exists) | Reverse of Video to GIF. |
| 29 | **Video Compress** | Video | Medium | ffmpeg (exists) | Reduce video file size. |
| 30 | **Audio Extract** | Video | Low | ffmpeg (exists) | Extract audio track from video → MP3/WAV. |
| 31 | **Screenshot to PDF** | Utility | Low | Pillow (exists) | Paste screenshot → generate PDF (similar to Images to PDF). |
| 32 | **Markdown to PDF** | Utility | Low | `markdown` + WeasyPrint | Render Markdown → PDF. |
| 33 | **JSON / CSV Viewer** | Utility | Low | Client-side | Pretty-print structured data. |
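
Row 30 (Audio Extract) comes down to a single ffmpeg invocation. The sketch below only builds the argument list. `build_audio_extract_cmd` is a hypothetical helper, and it assumes the installed ffmpeg build includes the `libmp3lame` encoder for MP3 output.

```python
from typing import List

# Audio encoder per output format.
AUDIO_CODECS = {"mp3": "libmp3lame", "wav": "pcm_s16le"}


def build_audio_extract_cmd(src: str, dst: str, fmt: str = "mp3") -> List[str]:
    """Build an ffmpeg command that copies only the audio track into dst."""
    if fmt not in AUDIO_CODECS:
        raise ValueError(f"unsupported format: {fmt}")
    return [
        "ffmpeg",
        "-y",        # overwrite dst if it already exists
        "-i", src,   # input video file
        "-vn",       # drop the video stream
        "-c:a", AUDIO_CODECS[fmt],
        dst,
    ]
```

A task could pass the result to `subprocess.run(cmd, check=True)`. Building the command as a list rather than a shell string avoids quoting problems with user-supplied file names.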
---
## 5. Implementation Readiness Matrix
Tools grouped by the effort required, starting with those whose backend dependencies are already present in the project:
### Ready to build (dependencies exist: PyPDF2, Pillow, Ghostscript, ffmpeg)
| Tool | Effort | Reuses |
|---|---|---|
| Compress Image | ~2h | `image_service.py` + Pillow |
| Reorder Pages | ~3h | `pdf_tools_service.py` + PyPDF2 |
| Extract Pages | ~2h | Split tool pattern |
| PDF Repair | ~2h | PyPDF2 read/write |
| Flatten PDF | ~2h | PyPDF2 |
| Crop PDF | ~3h | PyPDF2 MediaBox |
| Image Crop | ~2h | Pillow |
| Image Rotate/Flip | ~2h | Pillow |
| Image Filters | ~3h | Pillow ImageFilter |
| PDF Metadata Editor | ~2h | PyPDF2 |
| PDF to PDF/A | ~2h | Ghostscript (exists in Dockerfile) |
| QR Code Generator | ~2h | `qrcode` pip package |
| AI PDF Summarizer | ~3h | `ai_chat_service.py` + OpenRouter |
| GIF to Video | ~2h | ffmpeg |
| Audio Extract | ~2h | ffmpeg |
### Need new dependencies or new integration work (mostly 1 pip package)
| Tool | New Dependency | Effort |
|---|---|---|
| PDF to Excel | `camelot-py[cv]` or `tabula-py` | ~4h |
| PDF to PowerPoint | `python-pptx` | ~4h |
| Excel to PDF | LibreOffice CLI (exists) | ~3h |
| PowerPoint to PDF | LibreOffice CLI (exists) | ~3h |
| HTML to PDF | `weasyprint` or `playwright` | ~4h |
| Sign PDF | ReportLab (exists) + canvas overlay | ~6h |
| Barcode Generator | `python-barcode` | ~2h |
| Markdown to PDF | `markdown` + `weasyprint` | ~3h |
### Requires significant new architecture
| Tool | Complexity | Effort |
|---|---|---|
| AI Chat with PDF | RAG pipeline or full-doc prompt | ~8h |
| AI PDF Translator | OCR + LLM + overlay | ~8h |
| PDF Form Filler | Field detection + fill engine | ~10h |
| Redact PDF | Region detection + blackout overlay | ~6h |
| Compare PDFs | Diff algorithm + visual rendering | ~10h |
| eSign / Digital Signature | PKCS#7 cryptographic signing | ~10h |
| Batch Processing | Queue orchestration for multi-file | ~6h |
| Video Compress | ffmpeg transcoding | ~4h |
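
For AI Chat with PDF, the table names two options: a RAG pipeline or a full-document prompt. If the RAG route is taken, a common first step is splitting the extracted text into overlapping chunks. The sketch below is one simple way to do that. The character-based window, the default sizes, and the `chunk_text` name are all assumptions for illustration.

```python
from typing import List


def chunk_text(text: str, size: int = 1000, overlap: int = 200) -> List[str]:
    """Split text into windows of `size` characters that share `overlap` characters."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    if not text:
        return []
    step = size - overlap
    # Stop once the remaining tail is already covered by the previous window.
    last_start = max(len(text) - overlap, 1)
    return [text[i:i + size] for i in range(0, last_start, step)]
```

For example, `chunk_text("abcdefghij", size=4, overlap=2)` returns `["abcd", "cdef", "efgh", "ghij"]`. A document short enough to fit in one chunk is effectively the full-document prompt case.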
---
## 6. Summary
| Metric | Count |
|---|---|
| **Existing tools** | 21 |
| **Missing HIGH priority** | 10 |
| **Missing MEDIUM priority** | 9 |
| **Missing LOW priority** | 14 |
| **Total gap** | 33 |
| **Backend tests** | 180 ✅ |
| **Frontend build** | ✅ Clean |
| **Blueprints** | 18 |
| **Celery task modules** | 10 |
| **Service files** | 15 |
| **i18n languages** | 3 (en, ar, fr) |
### Competitor Parity Score
| Competitor | Their tools | We match | Coverage |
|---|---|---|---|
| iLovePDF | ~25 core | ~16 | 64% |
| SmallPDF | ~21 core | ~15 | 71% |
| TinyWow | ~50+ (many AI) | ~14 | 28% |
| PDF24 | ~30 core | ~17 | 57% |
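
The Coverage column is the matched count divided by the competitor's tool count, rounded to a whole percent. A short sketch that reproduces the column from the approximate counts in the table:

```python
# Approximate counts from the table: (their tools, tools we match).
PARITY = {
    "iLovePDF": (25, 16),
    "SmallPDF": (21, 15),
    "TinyWow": (50, 14),
    "PDF24": (30, 17),
}


def coverage_percent(theirs: int, matched: int) -> int:
    """Share of a competitor's tools that we match, as a whole percent."""
    return round(100 * matched / theirs)


for name, (theirs, matched) in PARITY.items():
    print(f"{name}: {coverage_percent(theirs, matched)}%")
# iLovePDF: 64%
# SmallPDF: 71%
# TinyWow: 28%
# PDF24: 57%
```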
### Recommended Next Sprint
**Highest ROI — 6 tools to reach 80%+ parity with SmallPDF/iLovePDF:**
1. Compress Image (Pillow — already installed)
2. PDF to Excel (`camelot-py`)
3. HTML to PDF (`weasyprint`)
4. Sign PDF (ReportLab overlay)
5. Reorder Pages (PyPDF2 — already installed)
6. PDF to PowerPoint (`python-pptx`)
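
The 80%+ target can be sanity-checked with the parity counts from the Competitor Parity Score table. The sketch assumes every tool added in the sprint matches a tool in the competitor's core set, so the results are upper bounds. `tools_needed` and `coverage_after` are hypothetical helper names.

```python
def tools_needed(theirs: int, matched: int, target_percent: int = 80) -> int:
    """Smallest number of additional matching tools that reaches the target."""
    required = -(-target_percent * theirs // 100)  # integer ceiling of target% of theirs
    return max(required - matched, 0)


def coverage_after(theirs: int, matched: int, added: int) -> int:
    """Best-case coverage in whole percent after adding `added` matching tools."""
    return min(100, round(100 * (matched + added) / theirs))


for name, theirs, matched in [("iLovePDF", 25, 16), ("SmallPDF", 21, 15)]:
    print(f"{name}: needs {tools_needed(theirs, matched)} more for 80%, "
          f"up to {coverage_after(theirs, matched, 6)}% after 6 tools")
# iLovePDF: needs 4 more for 80%, up to 88% after 6 tools
# SmallPDF: needs 2 more for 80%, up to 100% after 6 tools
```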