
* Rename `views` table references to `saved_views` for consistency. * Update migration step 3 to create `saved_views` instead of `views`. * Add a note that tag aliases (`canonical_id`) are deferred to DP-006 (v1.5). * Ensure all PRAGMA statements (`foreign_keys`, `journal_mode=WAL`) apply to every migration. * Adjust ER diagram section to reflect `saved_views` entity and its relationship.
6.5 KiB
DP-001: Schema v1.1 – Core Metadata Domains
Status: Proposed
Authors: @carol
Date: 2025-05-17
1. Context
We’ve landed a basic SQLite-backed files
table and a contentless FTS5 index. Before we build out higher-level features, we need to lock down our v1.1 metadata schema for:
- Hierarchical tags (
tags
+file_tags
) - Custom attributes (
attributes
) - File-to-file relationships (
links
) - Named collections (
collections
+collection_files
) - Saved views (
saved_views
)
Locking this schema now lets downstream CLI & GUI work against a stable model and ensures our migrations stay easy to reason about.
Note: Tag aliases and their canonical_id
support are deferred to DP-006 (v1.5).
2. Decision
Each migration will begin by enabling foreign-key enforcement and WAL journaling:
PRAGMA foreign_keys = ON;
PRAGMA journal_mode = WAL;
All foreign keys use ON DELETE CASCADE
so deleting a file, tag, etc. automatically cleans up dependents.
-
Bump to schema version 1.1 in our
schema_version
table. -
Provide four migration scripts, applied in order:
- 0001_initial_schema.sql – create core tables (
files
,tags
,file_tags
,attributes
), a contentless FTS5 table (files_fts
), core FTS triggers, and performance-critical indexes. - 0002_update_fts_and_triggers.sql – replace old tag/attr FTS triggers with
INSERT OR REPLACE
semantics for full-row refresh. - 0003_create_links_collections_saved_views.sql – introduce
links
,collections
,collection_files
, andsaved_views
tables. - 0004_fix_hierarchical_tags_fts.sql – refine FTS triggers to index full hierarchical tag-paths via a recursive CTE.
- 0001_initial_schema.sql – create core tables (
-
Expose this schema through our library (
libmarlin::db::open
) so any client sees a v1.1 store.
3. ER Diagram
Below is the updated entity-relationship diagram (PlantUML):
@startuml
entity files {
* id : INTEGER <<PK>>
--
path : TEXT
size : INTEGER
mtime : INTEGER
hash : TEXT
}
entity tags {
* id : INTEGER <<PK>>
--
name : TEXT
parent_id : INTEGER <<FK>>
}
entity file_tags {
* file_id : INTEGER <<FK>>
* tag_id : INTEGER <<FK>>
}
entity attributes {
* id : INTEGER <<PK>>
--
file_id : INTEGER <<FK>>
key : TEXT
value : TEXT
}
entity links {
* id : INTEGER <<PK>>
--
src_file_id : INTEGER <<FK>>
dst_file_id : INTEGER <<FK>>
type : TEXT
}
entity collections {
* id : INTEGER <<PK>>
--
name : TEXT
}
entity collection_files {
* collection_id : INTEGER <<FK>>
* file_id : INTEGER <<FK>>
}
entity saved_views {
* id : INTEGER <<PK>>
--
name : TEXT
query : TEXT
}
files ||--o{ file_tags
tags ||--o{ file_tags
files ||--o{ attributes
files ||--o{ links : "src_file_id"
files ||--o{ links : "dst_file_id"
collections ||--o{ collection_files
files ||--o{ collection_files
saved_views ||..|| files : "exec via FTS"
@enduml
Or in plain-ASCII:
┌────────┐ ┌────────────┐ ┌───────┐
│ files │1────*──│ file_tags │*────1─│ tags │
└────────┘ └────────────┘ └───────┘
│
│1
*
┌────────────┐
│ attributes │
└────────────┘
┌────────┐ ┌────────┐ ┌────────┐
│ files │1──*──│ links │*───1──│ files │
└────────┘ └────────┘ └────────┘
┌─────────────┐ ┌──────────────────┐ ┌────────┐
│ collections │1──*─│ collection_files │*──1─│ files │
└─────────────┘ └──────────────────┘ └────────┘
┌─────────────┐
│ saved_views │
│ (exec FTS) │
└─────────────┘
4. Migration Summary
File | Purpose |
---|---|
0001_initial_schema.sql | Core tables + contentless FTS + core triggers + indexes |
0002_update_fts_and_triggers.sql | Full-row FTS refresh on tag/attr changes |
0003_create_links_collections_saved_views.sql | Add links , collections , collection_files , saved_views |
0004_fix_hierarchical_tags_fts.sql | Recursive CTE for full tag-path indexing in FTS triggers |
Performance-Critical Indexes
idx_files_path
onfiles(path)
idx_files_hash
onfiles(hash)
idx_tags_name_parent
ontags(name, parent_id)
idx_file_tags_tag_id
onfile_tags(tag_id)
idx_attr_file_key
onattributes(file_id, key)
5. Example CLI Session
$ marlin init
Database initialised at ~/.local/share/marlin/index_*.db
Initial scan complete – indexed/updated 42 files
$ marlin link add ./todo.md ./projects/plan.md
Linked './todo.md' → './projects/plan.md'
$ marlin coll create "MyDocs"
Created collection 'MyDocs'
$ marlin view save tasks "tag:project AND TODO"
Saved view 'tasks' = tag:project AND TODO
$ marlin view list
tasks: tag:project AND TODO
$ marlin view exec tasks
~/Projects/Alpha/draft1.md
~/Projects/Beta/final.md
6. Consequences
- Backward compatibility: older v1.0 stores will be migrated on first open.
- Stability: downstream features (TUI, VS Code, web UI) can depend on a stable v1.1 schema.
- Simplicity: by consolidating metadata domains now, future migrations remain small and focused.
- Performance: v1.1 schema meets our cold-start P95 ≤ 3 s on a 100 k-file corpus (with CI-enforced benchmarks and the indexes above).
End of DP-001