thePR0M3TH3AN
2025-05-16 22:24:55 -04:00
parent 45d4f57733
commit 9674611990
3 changed files with 186 additions and 169 deletions

144
README.md

@@ -2,21 +2,24 @@
# Marlin # Marlin
**Marlin** is a lightweight, metadata-driven file indexer that runs 100 % on your computer. It scans folders, stores paths and file stats in SQLite, lets you attach hierarchical **tags** and **custom attributes**, takes automatic snapshots, and offers instant full-text search via FTS5. **Marlin** is a lightweight, metadata-driven file indexer that runs **100 % on your computer**.
It scans folders, stores paths and file stats in SQLite, lets you attach hierarchical **tags** and **custom attributes**, keeps timestamped **snapshots**, and offers instant full-text search via FTS5.
_No cloud, no telemetry: your data never leaves the machine._ _No cloud, no telemetry: your data never leaves the machine._
--- ---
## Feature highlights ## Feature highlights
| Area | What you get | | Area | What you get |
| -------------- | --------------------------------------------------------------------------------- | | ------------------- | ----------------------------------------------------------------------------------------------------- |
| **Safety** | Timestamped backups (`marlin backup`) and one-command restore (`marlin restore`) | | **Safety** | Timestamped backups (`marlin backup`) and one-command restore (`marlin restore`) |
| **Resilience** | Versioned, idempotent schema migrations for zero-downtime upgrades | | **Resilience** | Versioned, idempotent schema migrations for zero-downtime upgrades |
| **Indexing** | Fast multi-path scanner with SQLite WAL concurrency | | **Indexing** | Fast multi-path scanner with SQLite WAL concurrency |
| **Metadata** | Hierarchical tags (`project/alpha`) & key-value attributes (`reviewed=yes`) | | **Metadata** | Hierarchical tags (`project/alpha`) & key-value attributes (`reviewed=yes`) |
| **Search** | Prefix-aware FTS5 across paths, tags, and attributes; optional `--exec` per match | | **Relations** | Typed file ↔ file links (`marlin link`) with backlinks viewer |
| **DX / Logs** | Structured tracing (`RUST_LOG=debug`) for every operation | | **Collections / Views** | Named playlists (`marlin coll`) & saved searches (`marlin view`) for instant recall |
| **Search** | Prefix-aware FTS5 across paths, tags, attrs & links; optional `--exec` per match <br>(grep-style context snippets coming Q3) |
| **DX / Logs** | Structured tracing (`RUST_LOG=debug`) for every operation |
--- ---
@@ -26,11 +29,11 @@ _No cloud, no telemetry: your data never leaves the machine._
┌──────────────┐ marlin scan ┌─────────────┐ ┌──────────────┐ marlin scan ┌─────────────┐
│ your files │ ─────────────────────▶│ SQLite │ │ your files │ ─────────────────────▶│ SQLite │
│ (any folder) │ │ files/tags │ │ (any folder) │ │ files/tags │
└──────────────┘ tag / attr │ attrs / FTS │ └──────────────┘ tag / attr / link │ attrs / FTS │
▲ search / exec └──────┬──────┘ search / exec └──────┬──────┘
└────────── backup / restore ▼ └────────── backup / restore ▼
timestamped snapshots timestamped snapshots
``` ````
--- ---
@@ -38,7 +41,7 @@ _No cloud, no telemetry: your data never leaves the machine._
| Requirement | Why | | Requirement | Why |
| ------------------ | ----------------------------- | | ------------------ | ----------------------------- |
| **Rust** ≥ 1.77 | Build toolchain (`rustup.rs`) | | **Rust ≥ 1.77** | Build toolchain (`rustup.rs`) |
| C build essentials | Builds bundled SQLite (Linux) | | C build essentials | Builds bundled SQLite (Linux) |
macOS & Windows users: let the Rust installer pull the matching build tools. macOS & Windows users: let the Rust installer pull the matching build tools.
@@ -48,78 +51,63 @@ macOS & Windows users: let the Rust installer pull the matching build tools.
## Build & install ## Build & install
```bash ```bash
git clone https://github.com/yourname/marlin.git git clone https://github.com/PR0M3TH3AN/Marlin.git
cd marlin cd Marlin
cargo build --release cargo build --release
# (Optional) Install the binary into your PATH: # (Optional) install into your PATH
sudo install -Dm755 target/release/marlin /usr/local/bin/marlin sudo install -Dm755 target/release/marlin /usr/local/bin/marlin
``` ```
---
## Quick start ## Quick start
For a concise walkthrough, see [Quick start & Demo](marlin_demo.md). For a concise walkthrough—including **links, collections and views**—see
[**Quick start & Demo**](marlin_demo.md).
## Testing ---
## Testing
Below is a **repeatable 3-step flow** you can use **every time you pull fresh code**. Below is a **repeatable 3-step flow** you can use **every time you pull fresh code**.
--- ### 0 Prepare once
### 0 Prepare once
```bash ```bash
# Run once (or add to ~/.bashrc) so debug + release artefacts land # Put build artefacts in one place (faster incremental builds)
# in the same predictable place. Speeds-up future builds.
export CARGO_TARGET_DIR=target export CARGO_TARGET_DIR=target
``` ```
--- ### 1 Build the new binary
### 1 Build the new binary
```bash ```bash
git pull # grab the latest commit git pull
cargo build --release cargo build --release
sudo install -Dm755 target/release/marlin /usr/local/bin/marlin sudo install -Dm755 target/release/marlin /usr/local/bin/marlin
``` ```
* `cargo build --release` builds the optimised binary. ### 2 Run the smoke-test suite
* `install …` copies it into your `$PATH` so `marlin` on the CLI is the fresh one.
---
### 2 Run the smoke-test suite
```bash ```bash
# Runs the end-to-end test we added in tests/e2e.rs
cargo test --test e2e -- --nocapture cargo test --test e2e -- --nocapture
``` ```
* `--test e2e` compiles and runs **only** `tests/e2e.rs`; other unit-tests are skipped (add them later if you like). *Streams CLI output live; exit-code 0 = all good.*
* `--nocapture` streams stdout/stderr so you can watch each CLI step in real time.
* Exit-code **0** ➜ everything passed.
Any non-zero exit or a red ✗ line means a step failed; the assertion diff shows the command and its output.
--- ### 3 (Optionally) run **all** tests
### 3 (Optionally) run all tests
```bash ```bash
cargo test --all -- --nocapture cargo test --all -- --nocapture
``` ```
This will execute: This now covers:
* unit tests in `src/**` * unit tests in `src/**`
* every file in `tests/` * positive & negative integration suites (`tests/pos.rs`, `tests/neg.rs`)
* doc-tests * doc-tests
If you wire **“cargo test --all”** into CI (GitHub Actions, GitLab, etc.), pushes that break a workflow will be rejected automatically. #### One-liner helper
---
#### One-liner helper (copy/paste)
```bash ```bash
git pull && cargo build --release && git pull && cargo build --release &&
@@ -127,15 +115,19 @@ sudo install -Dm755 target/release/marlin /usr/local/bin/marlin &&
cargo test --test e2e -- --nocapture cargo test --test e2e -- --nocapture
``` ```
Stick that in a shell alias (`alias marlin-ci='…'`) and you've got a 5-second upgrade-and-verify loop. Alias it as `marlin-ci` for a 5-second upgrade-and-verify loop.
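The same chain can also live in a shell function, which is easier to read than a long alias string and still stops at the first failing step. A minimal sketch, assuming bash or zsh; `marlin_ci` is a hypothetical name (an underscore is used because hyphens are not portable in function names):

```bash
# Hypothetical helper: the one-liner's steps, one per line.
# Each step runs only if the previous one succeeded (&& chaining).
marlin_ci() {
  git pull &&
  cargo build --release &&
  sudo install -Dm755 target/release/marlin /usr/local/bin/marlin &&
  cargo test --test e2e -- --nocapture
}
```

Put it in `~/.bashrc` (or `~/.zshrc`) and run `marlin_ci` from the repository root.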
---
### Database location ### Database location
* **Linux** `~/.local/share/marlin/index.db` | OS | Default path |
* **macOS** `~/Library/Application Support/marlin/index.db` | ----------- | ----------------------------------------------- |
* **Windows** `%APPDATA%\marlin\index.db` | **Linux** | `~/.local/share/marlin/index.db` |
| **macOS** | `~/Library/Application Support/marlin/index.db` |
| **Windows** | `%APPDATA%\marlin\index.db` |
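For scripts that need to locate the database, the defaults listed here and the `MARLIN_DB_PATH` override can be folded into one helper. This is a sketch under assumptions: `marlin_db_path` is a hypothetical function, not a Marlin command, and it only mirrors the paths documented in this README (it does not consult any other environment variables Marlin might honour).

```bash
# Hypothetical helper (not part of Marlin): print the database path
# this README documents for the current platform.
marlin_db_path() {
  # An explicit override always wins.
  if [ -n "${MARLIN_DB_PATH:-}" ]; then
    printf '%s\n' "$MARLIN_DB_PATH"
    return
  fi
  case "$(uname -s)" in
    Darwin)
      printf '%s\n' "$HOME/Library/Application Support/marlin/index.db" ;;
    MINGW*|MSYS*|CYGWIN*)
      # Git Bash / MSYS2 / Cygwin shells on Windows.
      printf '%s\n' "${APPDATA:-}\\marlin\\index.db" ;;
    *)
      printf '%s\n' "$HOME/.local/share/marlin/index.db" ;;
  esac
}
```

Example: `cp "$(marlin_db_path)" /tmp/index-copy.db` copies the index without hard-coding the location.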
Override with: Override:
```bash ```bash
export MARLIN_DB_PATH=/path/to/custom.db export MARLIN_DB_PATH=/path/to/custom.db
@@ -148,55 +140,57 @@ export MARLIN_DB_PATH=/path/to/custom.db
```text ```text
marlin <COMMAND> [ARGS] marlin <COMMAND> [ARGS]
init create / migrate database init create / migrate DB **and perform an initial scan of the cwd**
scan <PATHS>... walk directories & index files scan <PATHS>... walk directories & (re)index files
tag "<glob>" <tag_path> add hierarchical tag tag "<glob>" <tag_path> add hierarchical tag
attr set <pattern> <key> <value> manage custom attributes attr set <pattern> <key> <val> set or update custom attribute
attr ls <path> attr ls <path> list attributes
search <query> [--exec CMD] FTS5 query, optionally run CMD on each hit link add|rm|list|backlinks manage typed file-to-file relations
backup create timestamped snapshot in backups/ coll create|add|list manage named collections (“playlists”)
restore <snapshot.db> replace DB with snapshot view save|list|exec save and run smart views (saved queries)
completions <shell> generate shell completions search <query> [--exec CMD] FTS5 query; optionally run CMD per hit
backup create timestamped snapshot in `backups/`
restore <snapshot.db> replace DB with snapshot
completions <shell> generate shell completions
``` ```
### Attribute subcommands ### Attribute sub-commands
| Command | Example | | Command | Example |
| ---------- | ---------------------------------------------- | | ----------- | ------------------------------------------------ |
| `attr set` | `marlin attr set ~/Docs/**/*.pdf reviewed yes` | | `attr set` | `marlin attr set ~/Docs/**/*.pdf reviewed yes` |
| `attr ls` | `marlin attr ls ~/Docs/report.pdf` | | `attr ls` | `marlin attr ls ~/Docs/report.pdf` |
| JSON output | `marlin --format=json attr ls ~/Docs/report.pdf` |
--- ---
## Backups & restore ## Backups & restore
**Create snapshot**
```bash ```bash
marlin backup marlin backup
# → ~/.local/share/marlin/backups/backup_2025-05-14_22-15-30.db # → ~/.local/share/marlin/backups/backup_2025-05-14_22-15-30.db
``` ```
**Restore snapshot**
```bash ```bash
marlin restore ~/.local/share/marlin/backups/backup_2025-05-14_22-15-30.db marlin restore ~/.local/share/marlin/backups/backup_2025-05-14_22-15-30.db
``` ```
Marlin also takes an **automatic safety backup before every non-init command**. > Marlin also creates an **automatic safety backup before every non-`init` command.**
> *Auto-prune (`backup --prune <N>`) lands in Q2.*
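Until auto-prune lands, old snapshots can be trimmed by hand. A minimal sketch, assuming snapshots follow the `backup_YYYY-MM-DD_HH-MM-SS.db` naming shown in the example and sit directly in one directory; `prune_backups` is a hypothetical helper, not a Marlin command:

```bash
# Hypothetical helper (not a Marlin command): keep the newest N snapshots.
# Relies on the timestamped file names, for which a plain lexical sort
# is also a chronological sort. File names containing newlines are not handled.
prune_backups() {
  local dir=$1 keep=$2 total excess snapshot
  total=$(find "$dir" -maxdepth 1 -type f -name 'backup_*.db' | wc -l)
  excess=$((total - keep))
  # Nothing to do if we already have N or fewer snapshots.
  [ "$excess" -gt 0 ] || return 0
  # Oldest first; delete only the surplus at the front of the list.
  find "$dir" -maxdepth 1 -type f -name 'backup_*.db' | sort | head -n "$excess" |
    while IFS= read -r snapshot; do
      rm -- "$snapshot"
    done
}
```

Usage: `prune_backups ~/.local/share/marlin/backups 10` keeps the ten most recent snapshots.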
--- ---
## Upgrading ## Upgrading
```bash ```bash
cargo install --path . --force # rebuild & replace installed binary cargo install --path . --force # rebuild & replace installed binary
``` ```
The versioned migration system preserves your data across upgrades. Versioned migrations preserve your data across upgrades.
--- ---
## License ## License
MIT; see `LICENSE` MIT; see [`LICENSE`](LICENSE).


@@ -1,6 +1,8 @@
# Marlin Demo # Marlin Demo 🚀
Below is the **“hello-world” demo** that matches the current master branch (auto-scan on `marlin init`, no more forced-migration noise, and cleaner build). Below is a **“hello-world” walk-through** that matches the current `main`
branch (auto-scan on `marlin init`, no more forced-migration chatter, cleaner
build). Everything runs offline on a throw-away directory under `~/marlin_demo`.
--- ---
@@ -8,11 +10,11 @@ Below is the **“hello-world” demo** that matches the current master branch (
```bash ```bash
# inside the repo # inside the repo
cargo build --release # build the new binary export CARGO_TARGET_DIR=target # <-- speeds up future builds (once)
cargo build --release # build the new binary
sudo install -Dm755 target/release/marlin /usr/local/bin/marlin sudo install -Dm755 target/release/marlin /usr/local/bin/marlin
``` # (cargo install --path . --locked --force works too)
````
*(`cargo install --path . --locked --force` works too if you prefer.)*
--- ---
@@ -21,96 +23,74 @@ sudo install -Dm755 target/release/marlin /usr/local/bin/marlin
```bash ```bash
rm -rf ~/marlin_demo rm -rf ~/marlin_demo
mkdir -p ~/marlin_demo/{Projects/{Alpha,Beta,Gamma},Logs,Reports,Scripts,Media/Photos} mkdir -p ~/marlin_demo/{Projects/{Alpha,Beta,Gamma},Logs,Reports,Scripts,Media/Photos}
# (zsh users: quote the pattern or enable braceexpand first)
# Projects # ── Projects ───────────────────────────────────────────────────
cat <<EOF > ~/marlin_demo/Projects/Alpha/draft1.md cat <<EOF > ~/marlin_demo/Projects/Alpha/draft1.md
# Alpha draft 1 # Alpha draft 1
- [ ] TODO: outline architecture - [ ] TODO: outline architecture
- [ ] TODO: write tests - [ ] TODO: write tests
EOF EOF
cat <<EOF > ~/marlin_demo/Projects/Alpha/draft2.md cat <<EOF > ~/marlin_demo/Projects/Alpha/draft2.md
# Alpha draft 2 # Alpha draft 2
- [x] TODO: outline architecture - [x] TODO: outline architecture
- [ ] TODO: implement feature X - [ ] TODO: implement feature X
EOF EOF
cat <<EOF > ~/marlin_demo/Projects/Beta/notes.md cat <<EOF > ~/marlin_demo/Projects/Beta/notes.md
Beta meeting notes: Beta meeting notes:
- decided on roadmap - decided on roadmap
- ACTION: follow up with design team - ACTION: follow-up with design team
EOF EOF
cat <<EOF > ~/marlin_demo/Projects/Beta/final.md cat <<EOF > ~/marlin_demo/Projects/Beta/final.md
# Beta Final # Beta Final
All tasks complete. Ready to ship! All tasks complete. Ready to ship!
EOF EOF
cat <<EOF > ~/marlin_demo/Projects/Gamma/TODO.txt cat <<EOF > ~/marlin_demo/Projects/Gamma/TODO.txt
Gamma tasks: Gamma tasks:
TODO: refactor module Y TODO: refactor module Y
EOF EOF
# Logs # ── Logs & Reports ─────────────────────────────────────────────
echo "2025-05-15 12:00:00 INFO Starting app" > ~/marlin_demo/Logs/app.log echo "2025-05-15 12:00:00 INFO Starting app" > ~/marlin_demo/Logs/app.log
echo "2025-05-15 12:01:00 ERROR Oops, crash" >> ~/marlin_demo/Logs/app.log echo "2025-05-15 12:01:00 ERROR Oops, crash" >> ~/marlin_demo/Logs/app.log
echo "2025-05-15 00:00:00 INFO System check OK" > ~/marlin_demo/Logs/system.log echo "2025-05-15 00:00:00 INFO System check OK" > ~/marlin_demo/Logs/system.log
printf "Q1 financials\n" > ~/marlin_demo/Reports/Q1_report.pdf
# Reports # ── Scripts & Media ────────────────────────────────────────────
printf "Q1 financials
" > ~/marlin_demo/Reports/Q1_report.pdf
# Scripts
cat <<'EOF' > ~/marlin_demo/Scripts/deploy.sh cat <<'EOF' > ~/marlin_demo/Scripts/deploy.sh
#!/usr/bin/env bash #!/usr/bin/env bash
echo "Deploying version $1..." echo "Deploying version $1"
EOF EOF
chmod +x ~/marlin_demo/Scripts/deploy.sh chmod +x ~/marlin_demo/Scripts/deploy.sh
# Media
echo "JPEGDATA" > ~/marlin_demo/Media/Photos/event.jpg echo "JPEGDATA" > ~/marlin_demo/Media/Photos/event.jpg
``` ```
*(copy the file-creation block from your original instructions — nothing about the files needs to change)*
--- ---
## 2 Initialise **and** index (one step) ## 2 Initialise **and** index (one step)
`marlin init` now performs a first-time scan of whatever directory you run it in.
So just:
```bash ```bash
cd ~/marlin_demo # <-- important: run init from the folder you want indexed cd ~/marlin_demo # run init from the folder you want indexed
marlin init marlin init # • creates or migrates DB
# • runs *first* full scan of this directory
``` ```
That will: Add more directories later with `marlin scan <dir>`.
1. create/upgrade the DB,
2. run all migrations exactly once,
3. walk the current directory and ingest every file it finds.
Need to add more paths later? Use `marlin scan <dir>` exactly as before.
--- ---
## 3 Tagging examples ## 3 Tagging examples
```bash ```bash
# Tag all project markdown as project/md # Tag all project markdown as project/md
marlin tag "~/marlin_demo/Projects/**/*.md" project/md marlin tag '~/marlin_demo/Projects/**/*.md' project/md
# Tag your logs # Tag your logs
marlin tag "~/marlin_demo/Logs/**/*.log" logs/app marlin tag '~/marlin_demo/Logs/**/*.log' logs/app
# Tag everything under Projects/Beta as project/beta # Tag everything under Beta as project/beta
marlin tag "~/marlin_demo/Projects/Beta/**/*" project/beta marlin tag '~/marlin_demo/Projects/Beta/**/*' project/beta
``` ```
--- ---
@@ -118,8 +98,8 @@ marlin tag "~/marlin_demo/Projects/Beta/**/*" project/beta
## 4 Set custom attributes ## 4 Set custom attributes
```bash ```bash
marlin attr set "~/marlin_demo/Projects/Beta/final.md" status complete marlin attr set '~/marlin_demo/Projects/Beta/final.md' status complete
marlin attr set "~/marlin_demo/Reports/*.pdf" reviewed yes marlin attr set '~/marlin_demo/Reports/*.pdf' reviewed yes
``` ```
--- ---
@@ -129,19 +109,19 @@ marlin attr set "~/marlin_demo/Reports/*.pdf" reviewed yes
```bash ```bash
marlin search TODO marlin search TODO
marlin search tag:project/md marlin search tag:project/md
marlin search "tag:logs/app AND ERROR" marlin search 'tag:logs/app AND ERROR'
marlin search "attr:status=complete" marlin search 'attr:status=complete'
marlin search "attr:reviewed=yes AND pdf" marlin search 'attr:reviewed=yes AND pdf'
marlin search "attr:reviewed=yes" --exec 'xdg-open {}' marlin search 'attr:reviewed=yes' --exec 'xdg-open {}'
marlin --format=json search 'attr:status=complete' # machine-readable output
``` ```
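The `--exec` examples use `{}` as a stand-in for each matching path, in the style of `find -exec`. The sketch below illustrates that idea only; it is not Marlin's implementation, and `run_per_hit` is a hypothetical helper that reads paths from stdin (bash-specific because of the `${var//…}` substitution).

```bash
# Illustration only, NOT Marlin's implementation: run a command template
# once per path read from stdin, with {} standing for the path.
run_per_hit() {
  local template=$1 arg='"$1"' cmd path
  # Turn every {} into "$1" so the path is passed as an argument and is
  # never re-parsed as shell code (safe for spaces and quotes in names).
  cmd=${template//\{\}/$arg}
  while IFS= read -r path; do
    # </dev/null keeps the command from swallowing the remaining paths.
    sh -c "$cmd" sh "$path" </dev/null
  done
}
```

For example, `printf '%s\n' 'a b.pdf' 'c.pdf' | run_per_hit 'echo got {}'` runs `echo` twice, once per path.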
--- ---
## 6 JSON output & verbose mode ## 6 Verbose mode
```bash ```bash
marlin --format=json attr ls ~/marlin_demo/Projects/Beta/final.md marlin --verbose scan ~/marlin_demo # watch debug logs stream by
marlin --verbose scan ~/marlin_demo # re-scan to see debug logs
``` ```
--- ---
@@ -150,25 +130,43 @@ marlin --verbose scan ~/marlin_demo # re-scan to see debug logs
```bash ```bash
snap=$(marlin backup | awk '{print $NF}') snap=$(marlin backup | awk '{print $NF}')
rm ~/.local/share/marlin/index.db # simulate disaster rm ~/.local/share/marlin/index.db # simulate disaster
marlin restore "$snap" marlin restore "$snap"
marlin search TODO # should still work marlin search TODO # still works
``` ```
*(Reminder: Marlin also makes an **auto-backup** before every non-`init`
command, so manual snapshots are extra insurance.)*
--- ---
## 8 Linking demo ## 8 Linking demo
```bash ```bash
touch ~/marlin_demo/foo.txt ~/marlin_demo/bar.txt touch ~/marlin_demo/foo.txt ~/marlin_demo/bar.txt
marlin scan ~/marlin_demo # index the new files marlin scan ~/marlin_demo # index the new files
foo=~/marlin_demo/foo.txt foo=~/marlin_demo/foo.txt
bar=~/marlin_demo/bar.txt bar=~/marlin_demo/bar.txt
marlin link add "$foo" "$bar" # create link marlin link add "$foo" "$bar" --type references # create typed link
marlin link list "$foo" # outgoing links from foo marlin link list "$foo" # outgoing links from foo
marlin link backlinks "$bar" # incoming links to bar marlin link backlinks "$bar" # incoming links to bar
```
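To make the difference between `link list` (outgoing) and `link backlinks` (incoming) concrete, here is a toy model in plain shell. It is an assumption-laden illustration, not how Marlin stores links: each link is one tab-separated `src`, `dst`, `type` line in a temporary file, and the three function names are hypothetical.

```bash
# Toy model, not Marlin's storage: one link per line as src<TAB>dst<TAB>type.
links_file=$(mktemp)

link_add() { printf '%s\t%s\t%s\n' "$1" "$2" "$3" >> "$links_file"; }

# Outgoing links: rows whose source is the given file.
link_list() {
  awk -F '\t' -v f="$1" '$1 == f { print $2 " (" $3 ")" }' "$links_file"
}

# Backlinks: rows whose destination is the given file.
link_backlinks() {
  awk -F '\t' -v f="$1" '$2 == f { print $1 " (" $3 ")" }' "$links_file"
}
```

After `link_add foo.txt bar.txt references`, `link_list foo.txt` reports `bar.txt`, while `link_backlinks bar.txt` reports `foo.txt`; the same row is simply read from the other end.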
---
## 9 Collections & smart views
```bash
# Collection
marlin coll create SetA
marlin coll add SetA '~/marlin_demo/Projects/**/*.md'
marlin coll list SetA
# Saved view (smart folder)
marlin view save tasks 'attr:status=complete OR TODO'
marlin view exec tasks
``` ```
--- ---
@@ -176,8 +174,10 @@ marlin link backlinks "$bar" # incoming links to bar
### Recap ### Recap
* `cargo build --release` + `sudo install …` is still the build path. * `cargo build --release` + `sudo install …` is still the build path.
* **`cd` to the folder you want indexed and run `marlin init`** — first scan happens automatically. * **`marlin init`** scans the **current working directory** on first run.
* Subsequent scans (`marlin scan …`) are only needed for *new* directories you add later. * Scan again only when you add *new* directories (`marlin scan …`).
* No more “forcing reapplication of migration 4” banner, and the unused-import warnings are gone. * Auto-backups happen before every non-`init` command; manual `marlin backup` gives you extra restore points.
Happy organising! Happy organising!
```


@@ -1,36 +1,59 @@
Here's a slimmed-down, re-organized roadmap that groups related work into bigger milestones, highlights key deliverables (including a “demo” command and grep-style context snippets), and stages integrations for maximal developer velocity: # Marlin Roadmap 2025 → 2026 📜
| Phase / Sprint | Timeline | Focus & Rationale | Key Deliverables | This document outlines the **official delivery plan** for Marlin over the next four quarters.
| ---------------------------------------- | ---------------------- | --------------------------------------------------------------------------------------------------------- | ---------------------------------------------------------------------------- | Every work-item below is *time-boxed, testable,* and traceable back to an end-user benefit.
| **Sprint α: Bedrock & Metadata Domains** | **2025-Q2 (now → 6/01)** | Stabilize core schema, migrations, CI, and introduce the first metadata domains **with discoverability**. | - **CI:** `cargo test` + SQL migration dry-run coverage<br>- **Migrations:** |
* `links(src,dst,type)` > **Legend**
* `collections(name)+collection_files` > ✅ = item added/clarified in the latest planning round
* `views(name,query)`<br>- **CLI stubs** for `marlin link|unlink|list|backlinks`, `marlin coll`, `marlin view`<br>- **`marlin demo`** command that walks you through a mini-drive-by tutorial of link/coll/view flows | > Δ = new sub-deliverable (wasn't in the previous version)
\| **Epic 1: Scale & Reliability** | **2025-Q2** | Prototype “dirty-row” FTS to avoid per-row triggers, benchmark at 100 k files, and shore up CI for edge cases. | - Dirty-flag + `scan --dirty` reindex only changed rows<br>- Replace per-row triggers with periodic FTS rebuild<br>- End-to-end benchmarks on \~100 k files<br>- CI tests for dirty-scan edge-cases |
\| **Epic 2: Live Mode & Self-Pruning Backups** | **2025-Q2** | Automate continuous indexing & backup hygiene so Marlin “just works” in a real workspace. | - `marlin watch [dir]` via `notify` crate (create/modify/delete/rename)<br>- `backup --prune <N>` flag + post-scan auto-prune to latest N<br>- Daily/pr-merge auto-prune automation in CI |
\| **Phase 3: Content FTS & Annotations** | **2025-Q3** | Go beyond metadata—index file bodies, provide grep-style context, and add annotation support. | - New `files.content` column + migration<br>- Extend `files_fts` to include `content` and emit context snippets (`-C` style)<br>- `annotations` table + FTS triggers<br>- CLI: `marlin annotate add|list` |
\| **Phase 4: Versioning & Deduplication** | **2025-Q3** | Enable history, diffs, and duplicate detection with content hashing. | - Compute & store `files.hash` (SHA256)<br>- `scan --rehash` to refresh hashes<br>- CLI: `marlin version diff <file>` to show changes between revisions |
\| **Phase 5: Tag Aliases & Semantic Enhancements** | **2025-Q3** | Tame tag sprawl and seed AI-powered suggestions via embeddings. | - Enforce `canonical_id` on `tags`; CLI: `marlin tag alias add|ls|rm`<br>- `embeddings` table + `scan --embed` to generate vectors<br>- CLI: `marlin tag suggest`, `marlin summary <file>`, `marlin similarity scan` |
\| **Phase 6: Search DSL v2 & Smart Views** | **2025-Q4** | Offer a robust query grammar and “virtual folders” powered by views. | - Swap ad-hoc parser for a `nom`-based grammar with `AND`, `OR`, parentheses, wildcards…<br>- CLI: `marlin view save|list|exec` with aliases and pagination |
\| **Phase 7: Structured Workflows** | **2025-Q4** | Unlock full task, state, reminder & event workflows directly on files. | - `templates` + `template_fields` + validation engine<br>- CLI:
* `marlin state set|transitions add|state log`
* `marlin task scan|task list`
* `marlin remind set <file> <ts> "<msg>"`
* `marlin event add <file> <date> "<desc>"` + `marlin timeline` |
\| **Phase 8: Lightweight Integrations** | **2026-Q1** | Surface Marlin inside your editor/terminal before diving into a full GUI. | - **VSCode & terminal UI extension**: file-tree sidebar showing tags/attrs/links/annotations |
\| **Phase 9: Dolphin Sidebar Plugin (MVP)** | **2026-Q1** | Prototype a read-only Qt sidebar for Linux file managers—peek metadata without leaving your file browser. | - Qt plugin showing tags, attributes, links, and annotations alongside files |
\| **Phase 10: Full Edit UI & Multi-Device Sync** | **2026-Q2** | Ship an in-place metadata editor and optional sync layer for distributed workflows. | - Tag & view editors, task/reminder/event dialogs in GUI<br>- Choose/implement sync backend (rqlite, Litestream or custom) for optional read-only remote mounts |
--- ---
### Why this order? ## 1 Bird's-eye Table
1. **Lock down core schema & domains** (links, collections, views) **with a “demo” helper** so users can explore right away. | Phase / Sprint | Timeline | Focus & Rationale | Key Deliverables (Δ = new) |
2. **Scale & CI** unlocks safe indexing at volume, then | --- | --- | --- | --- |
3. **Live mode & auto-prune** keep your index fresh without manual steps. | **Sprint α: Bedrock & Metadata Domains** | **2025-Q2 (now → 6 Jun)** | Stabilise schema & CI; land first metadata domains with discoverability. | Δ CI: `cargo test` + SQL dry-run<br>Δ Unit tests (`determine_scan_root`, `escape_fts`)<br>Δ Coverage: e2e `attr --format=json`<br>Δ Refactor: move `naive_substring_search` to shared util<br>Migrations: `links`, `collections`, `views`<br>CLI stubs: `link`, `coll`, `view`<br>`marlin demo` walkthrough |
4. **Content FTS + annotations** builds on an efficient, reliable plumbing layer; you'll love grep-style context snippets. | **Epic 1: Scale & Reliability** | 2025-Q2 | Keep scans fast; bullet-proof CI at 100 k files. | Δ Dirty-flag column + `scan --dirty`<br>Benchmarks: full vs dirty scan (100 k)<br>Replace per-row triggers with periodic rebuild<br>CI edge-case tests |
5. **Versioning & semantic layers** ride atop a stable full-text index and annotation system. | **Epic 2: Live Mode & Self-Pruning Backups** | 2025-Q2 | Continuous indexing & hygiene; Marlin “just works”. | Δ `marlin watch [dir]` (notify/FSEvents)<br>Δ `backup --prune <N>` + auto-prune post-scan<br>Daily / PR-merge prune in CI |
6. **Advanced queries & workflows** expand power users' toolsets before branching into GUIs and sync. | **Phase 3: Content FTS & Annotations** | 2025-Q3 | Index file bodies, grep-style context, inline notes. | `files.content` + migration<br>Extend `files_fts` (context snippets `-C`)<br>`annotations` table + triggers<br>CLI `annotate add\|list` |
| **Phase 4: Versioning & Deduplication** | 2025-Q3 | History, diffs & duplicate detection. | `files.hash` (SHA-256)<br>`scan --rehash` refresh<br>CLI `version diff <file>` |
| **Phase 5: Tag Aliases & Semantic Booster** | 2025-Q3 | Tame tag sprawl; seed AI-powered suggestions. | `canonical_id` on `tags`; CLI `tag alias …`<br>`embeddings` table + `scan --embed`<br>CLI `tag suggest`, `similarity scan`, `summary <file>` |
| **Phase 6: Search DSL v2 & Smart Views** | 2025-Q4 | Robust grammar + virtual folders. | Replace parser with **`nom`** grammar (`AND`, `OR`, `()` …)<br>CLI `view save\|list\|exec` with aliases & paging |
| **Phase 7: Structured Workflows** | 2025-Q4 | First-class task / state / reminder / event life-cycles. | ✅ State engine (`files.state`, `state_changes`)<br>CLI `state set\|transitions add\|log`<br>✅ Task extractor (`tasks` table) + CLI<br>`templates` + validation<br>CLI `remind …`, `event …`, `timeline` |
| **Phase 8: Lightweight Integrations** | 2026-Q1 | Surface Marlin in editors / terminal. | VS Code & TUI extension (tags / attrs / links / notes) |
| **Phase 9: Dolphin Sidebar Plugin (MVP)** | 2026-Q1 | Read-only Qt sidebar for Linux file managers. | Qt plug-in: tags, attrs, links, annotations |
| **Phase 10: Full Edit UI & Multi-Device Sync** | 2026-Q2 | In-place metadata editor & optional sync layer. | GUI editors (tags, views, tasks, reminders, events)<br>Pick/implement sync backend (rqlite, Litestream, …) |
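Phase 4 plans a `files.hash` (SHA-256) column for duplicate detection. The underlying idea can be tried today with standard tools. This is a sketch only, unrelated to how Marlin will implement it; `find_duplicates` is a hypothetical helper, and it assumes GNU coreutils `sha256sum` is available (on macOS, `shasum -a 256` is the usual substitute).

```bash
# Sketch only: print every file whose content duplicates an earlier file.
# Assumes GNU coreutils `sha256sum`; unusual file names (newlines,
# backslashes) are not handled.
find_duplicates() {
  find "$1" -type f -exec sha256sum {} + |
    sort |
    # Each line is a 64-char hash, two separator chars, then the path,
    # so the path starts at column 67. Print it when the hash was seen before.
    awk 'seen[$1]++ { print substr($0, 67) }'
}
```

Running `find_duplicates ~/marlin_demo` lists the second and later copies of any identical files; the first copy of each group is left out of the output.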
This grouping ensures every new layer rests on a solid, tested foundation—maximizing both developer speed and user delight. ---
## 2 Narrative & Dependencies
1. **Lock down core schema & demo** *(Sprint α).*
Developers get immediate feedback via the `marlin demo` command while CI ensures migrations never regress.
2. **Scale & Live Mode** *(Epics 1-2).*
Dirty scanning, file-watching and auto-pruned backups guarantee snappy, hands-off operation even on six-figure corpora.
3. **Richer Search** *(Phases 3-6).*
Body-content FTS + grep-style snippets lay the groundwork; `nom` grammar then elevates power-user queries and smart views.
4. **Workflow Layers** *(Phase 7).*
State transitions, tasks and reminders turn Marlin from a passive index into an active workflow engine.
5. **UX Expansions** *(Phases 8-10).*
Start lightweight (VS Code / TUI), graduate to a read-only Dolphin plug-in, then ship full editing & sync for multi-device teams.
Every outer milestone depends only on the completion of the rows above it, **so shipping discipline in early sprints de-risks the headline features down the line.**
---
## 3 Next Steps
* **Sprint α kickoff:** break deliverables into stories, estimate, assign.
* **Add roadmap as `docs/ROADMAP.md`** (this file).
* Wire a **Checklist issue** on GitHub: one task per Δ bullet for instant tracking.
---
*Last updated · 2025-05-16*