echo-pdf docs
Concept

Semantic structure is a separate layer.

get_semantic_document_structure() writes semantic-structure.json alongside the page index. It adds heading and section structure without changing the shape of pages[].

Primary path

Agent-first extraction using the locally configured provider and model.

detector = agent-structured-v1

Fallback path

Conservative heuristic fallback when no model is configured or extraction fails.

detector = heading-heuristic-v1
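
The two paths above could be wired together roughly as follows. This is a minimal sketch, not echo-pdf's implementation: `extract_structure`, `simple_heading_heuristic`, and the `sections`, `title`, and `page` keys are hypothetical names chosen for illustration. Only the two detector strings come from this page.

```python
from typing import Callable, Optional

AGENT_DETECTOR = "agent-structured-v1"
HEURISTIC_DETECTOR = "heading-heuristic-v1"


def simple_heading_heuristic(pages: list[str]) -> list[dict]:
    """Hypothetical conservative heuristic: short all-caps lines are headings."""
    sections = []
    for page_number, text in enumerate(pages, start=1):
        for line in text.splitlines():
            stripped = line.strip()
            if stripped and len(stripped) <= 60 and stripped.isupper():
                sections.append({"title": stripped, "page": page_number})
    return sections


def extract_structure(
    pages: list[str],
    agent_extract: Optional[Callable[[list[str]], list[dict]]],
    heuristic_extract: Callable[[list[str]], list[dict]] = simple_heading_heuristic,
) -> dict:
    """Try the agent path first; fall back when it is missing or fails."""
    if agent_extract is not None:
        try:
            return {"detector": AGENT_DETECTOR, "sections": agent_extract(pages)}
        except Exception:
            # Agent extraction failed: fall through to the heuristic path.
            pass
    return {"detector": HEURISTIC_DETECTOR, "sections": heuristic_extract(pages)}
```

Passing `agent_extract=None` stands in for "no model is configured". Either way, the result records which path produced it, so a consumer never has to guess.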

Downstream rule

Read detector and strategy metadata before assuming semantic richness or cache reuse.
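
A consumer might apply this rule along the following lines. The sketch assumes that `detector` and `strategyKey` are top-level keys in semantic-structure.json, which this page does not confirm. The helper names are hypothetical.

```python
import json
from pathlib import Path

AGENT_DETECTOR = "agent-structured-v1"


def load_semantic_artifact(path: str) -> dict:
    """Read a semantic-structure.json file from disk."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def is_agent_structured(artifact: dict) -> bool:
    """True only when the agent path produced the artifact."""
    return artifact.get("detector") == AGENT_DETECTOR


def can_reuse(artifact: dict, current_strategy_key: str) -> bool:
    """A cached artifact is reusable only if its strategy key still matches."""
    return artifact.get("strategyKey") == current_strategy_key
```

An artifact from the heuristic path fails `is_agent_structured`, so downstream code can fall back to page-level processing instead of assuming rich section structure.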

| Field | Why it exists |
| --- | --- |
| detector | identifies which semantic extraction path produced the artifact |
| strategyKey | changes when provider, model, or extraction budget changes enough to invalidate reuse |
| pageIndexArtifactPath | links the semantic layer back to the stable page index |
| pageArtifactPath | lets section nodes point back to the originating page artifact |
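
To illustrate why a key like strategyKey invalidates reuse, here is one way such a key could be derived. This is an assumption for illustration only: the page does not say how echo-pdf computes strategyKey, and the `strategy_key` function and its `max_pages` budget parameter are hypothetical.

```python
import hashlib
import json


def strategy_key(provider: str, model: str, max_pages: int) -> str:
    """Hypothetical key: a short hash over the settings that affect extraction."""
    payload = json.dumps(
        {"provider": provider, "model": model, "maxPages": max_pages},
        sort_keys=True,  # stable field order, so equal settings hash equally
        separators=(",", ":"),
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:16]
```

With a derivation like this, any change to provider, model, or budget yields a different key, so a cached artifact with a stale key is recomputed rather than reused. The wording "changes enough" suggests the real key may tolerate small budget differences, which this sketch does not model.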

Still not domain logic.

The semantic layer is general document structure. It should not encode datasheet-specific, EDA-specific, or other downstream product semantics.