Files
Marlin/docs/vision.md
thePR0M3TH3AN f6fca2c0dd update
2025-05-18 16:02:48 -04:00

14 KiB
Raw Blame History

Marlin MetadataDriven File Explorer

Version 2 12 May 2025


1  Key Features & Functionality

Feature Area Capabilities
Tagging System • Unlimited, hierarchical or flat tags.
• Alias/synonym support with precedence rules (admindefined canonical name).
Bulk tag editing via multiselect context menu.
• FoldertoTag import with optional watch & sync mode so new subfolders inherit tags automatically.
Custom Metadata Attributes • Userdefined fields (text, number, date, enum, boolean).
• Pertemplate Custom Metadata Schemas (e.g. PhotoDate, Location).
File Relationships • Typed, directional or bidirectional links (related to, duplicate of, cites…).
• Plugin API can register new relationship sets.
Version Control for Metadata • Every change logged; unlimited rollback.
• Sidebyside diff viewer and blame panel showing who/when/what.
• Offline edits stored locally and merged (Gitstyle optimistic merge with conflict prompts).
Advanced Search & Smart Folders • Structured query syntax: tag:ProjectX AND author:Alice.
• Naturallanguage search ("files Alice edited last month") with toggle to exact mode.
• Visual Query Builder showing live query string.
• Saved queries appear as virtual “smart folders” that update in realtime.
User Interface • Sidebar: tags, attributes, relationships.
• Draganddrop tagging; inline metadata editor.
• Search bar with autocomplete (Bloom filter backed).
Dual View Mode metadata vs traditional folder; remembers preference per location.
Interactive 60second tour on first launch plus contextual tooltip help.
Collaboration • Realtime metadata sync across devices via cloud or selfhosted relay.
• Conflict handling as per Version Control.
• Rolebased permissions (read / write / admin) on tags & attributes.
Performance & Scale • Sharded/distributed index optional for >1 M files.
• Query cache with LRU eviction.
• Target metrics (100 k files): cold start ≤ 3 s, complex query ≤ 150 ms (stretch 50 ms).
Backup & Restore • Scheduled encrypted backups; export to JSON / XML.
• Oneclick restore from any pointintime snapshot.
Extensibility • Plugin system (TypeScript/JS) see §2.4.
• Python scripting hook for automation and batch tasks.
• REST/IPC API for external tools.

2  Technical Implementation

2.1  Core Stack

Component Primary Choice Notes
File Manager Dolphin (KDE) KIObased plugins GTK users can install a Nautilus extension (featureparity subset).
Metadata Store SQLite + FTS5 (singleuser) → optional LiteFS/Postgres for replication & multiuser scale. Perrow AESGCM encryption for sensitive fields; keys stored in OS keyring.
Indexer Daemon Rust service using notify (inotify on Linux, FSEvents on macOS). 100 ms debounce batches, async SQLite writes.
Cache Inmemory LRU + Bloom filter for autocomplete.

2.2  Database Schema (simplified)

files(id PK, path, inode, size, mtime, ctime, hash)
tags(id PK, name, parent_id, canonical_id)
file_tags(file_id FK, tag_id FK)
attributes(id PK, file_id FK, key, value, value_type)
relationships(id PK, src_file_id FK, dst_file_id FK, rel_type, direction)
change_log(change_id PK, object_table, object_id, op, actor, ts, payload_json)

2.3  Sync & Conflict Resolution

  1. Each client appends to change_log (CRDTcompatible delta).
  2. Delta sync via WebSocket; server merges and rebroadcasts.
  3. Conflicts → Conflict Queue UI (choose theirs / mine / merge).

2.4  Plugin API (TypeScript)

export interface MarlinPlugin {
  onInit(ctx: CoreContext): void;
  extendSchema?(db: Database): void;    // e.g. add new relationship table
  addCommands?(ui: UIContext): void;    // register menus, actions
}

Plugins run in a sandboxed process with whitelisted IPC calls.


3  UX & Accessibility

  • Keyboardonly workflow audit (Tab / ShiftTab / Space toggles).
  • Highcontrast theme; adheres to WCAG 2.1 AA.
  • Ctrl+Alt+V toggles Dual View.
  • Generated query string shown live under Visual Builder educates power users.

4  Performance Budget

Metric MVP Stretch
Cold start (100 k files) ≤ 3 s 1 s
Complex AND/OR query ≤ 150 ms 50 ms
Sustained inserts 5 k ops/s 20 k ops/s

Benchmarks run nightly; regressions block merge.


5  Security & Privacy

  • Rolebased ACL on tags/attributes.
  • Perchange audit trail; logs rotated to cold storage (≥ 90 days online).
  • Plugins confined by seccomp/AppArmor; no direct disk/network unless declared.

6  Packaging & Distribution

  • Flatpak (GNOME/KDE) and AppImage for portable builds.
  • Background service runs as a systemd user unit: --user marlin-indexerd.service.
  • CLI (marlin-cli) packaged for headless servers & CI.

7  Roadmap

Milestone Scope Timeline
M1 Tagging, attributes, virtual folders, SQLite, Dolphin plugin 6 weeks
M2 Sync service, version control, CLI +6 weeks
M3 NLP search, Visual Builder, distributed index prototype +6 weeks
M4 Plugin marketplace, enterprise auth (LDAP/OIDC), mobile companion (viewonly) +8 weeks

8  Branding

  • Name: Marlin fast, precise.
  • Icon: stylised sailfish fin forming a folder corner.
  • Tagline: “Cut through clutter.”
  • Domain: marlinexplorer.io (availability checked 20250512).

9  QuickWin Checklist (Sprint 0)

  • Implement bulk metadata editor UI
  • Write conflictresolution spec & unit tests
  • Build diff viewer prototype
  • Keyboardonly navigation audit
  • Establish performance CI with sample 100 k file corpus


10 Development Plan (Outline)

10.1 Process & Methodology

  • Framework – 2week Scrum sprints with Jira backlog, GitHub Projects mirror for public issues.
  • Branching – Trunkbased: feature branches → PR → required CI & codereview approvals (2).Main autodeploys nightly Flatpak.
  • Definition of Done – Code + unit tests + docs + passing CI + demo video (for UI work).
  • CI/CD – GitHub Actions matrix (Ubuntu22.04, KDENeon, Fedora39) → Flatpak / AppImage artefacts, cargo clippy, coverage gate ≥ 85%.

10.2 Team & Roles (FTEequivalent)

Role Core Skills Allocation
Lead Engineer Rust, Qt/Kirigami, KIO 1.0
Backend Engineer Rust, LiteFS/Postgres, WebSocket 1.0
Fullstack / Plugin Engineer TypeScript, Node, IPC 0.8
UX / QA Figma, accessibility, Playwright 0.5
DevOps (fractional) CI, Flatpak, security hardening 0.2

10.3 Roadmap → Sprintlevel Tasks

Sprint Goal Key Tasks Exit Criteria
S0 (2 wks) Project bootstrap • Repo + CI skeleton
• SQLite schema + migrations
marlin-cli init & basic scan
• Hyperfine perf baseline
CLI scans dir; tests pass; artefact builds
S13 (M1, 6 wks) Tagging + virtual folders MVP • Indexer daemon in Rust
• CRUD tags/attributes via CLI & DB
• Dolphin plugin: sidebar + tag view
• KIO tags:// virtual folder
• Bulkedit dialog
100kfile corpus coldstart ≤3s; user can tag files & navigate tags://Urgent
S46 (M2, 6 wks) Sync & version control • Changelog table + diff viewer
• LiteFS replication PoC
• WebSocket delta sync
• Conflict queue UI + lastwritewins fallback
Two devices sync metadata in <1s roundtrip; rollback works
S79 (M3, 6 wks) NLP search & Visual Builder • Integrate Tantivy FTS + ONNX intent model
• Toggle exact vs natural search
• QML Visual Builder with live query string
NL query "docs Alice edited last week" returns expected set in ≤300 ms
S1013 (M4, 8 wks) Plugin marketplace & mobile companion • IPC sandbox + manifest spec
• Sample plugins (image EXIF autotagger)
• Flutter readonly client
• LDAP/OIDC enterprise auth
First external plugin published; mobile app lists smart folders

10.4 Tooling & Infrastructure

  • Issue tracking – Jira → labels component/indexer, component/ui.
  • Docs – mkdocsmaterial hosted on GitHub Pages; automatic diagram generation via cargo doc + Mermaid.
  • Nightly Perf Benchmarks – Run in CI against 10k, 100k, 1M synthetic corpora; fail build if P95 query > target.
  • Security – Dependabot, Trivy scans, optional SLSA level 2 provenance for releases.

10.5 Risks & Mitigations

Risk Impact Mitigation
CRDT complexity Delays M2 Ship LWW first; schedule CRDT refactor postlaunch
File system event overflow Index corruption Debounce & autofallback to full rescan; alert user
Crossdistro packaging pain Adoption drops Stick to Flatpak; AppImage only for power users; collect telemetry (optin)
Scaling >1M files on slow HDD Perf complaints Offer "index on SSD" wizard; tune FTS page cache

10.6 Budget & Timeline Snapshot

  • Total dev time ≈ 30 weeks.
  • Buffer +10 % (3 weeks) for holidays & unknowns → 33 weeks (~8 months).
  • Rough budget (3 FTE avg × 33 wks × $150k/yr) ≈ $285k payroll + $15k ops / tooling.