YAML Frontmatter
YAML frontmatter: structured metadata that drives flat-file CMS and Docs-as-Code
YAML frontmatter is a block of structured metadata at the head of a Markdown file, fenced between two lines of three hyphens, that separates machine-readable data from the human-written content.
A Markdown file is plain text, and plain text has no fields. But the moment a system needs to know what a page is called, when it was published, which layout it uses or whether it is live, it needs a place for such details. Frontmatter is that place: a short YAML section that tells the generator how to treat the content below it. This page describes the format, what it is used for, and which YAML quirks it regularly breaks on.
The format
Frontmatter sits at the very start of the file and is delimited by two separator lines of three hyphens each. Between them lies a YAML document; below them follows the actual content in Markdown:
---
title: Getting Started
date: 2026-06-17
tags: [guide, intro]
published: true
---
# Getting Started
The actual text of the page begins here.
YAML stands for "YAML Ain't Markup Language" and is a data-serialisation format designed to be readable by humans. It maps the common building blocks: scalar values such as strings and numbers, key-value pairs (mappings) and lists (sequences). Structure comes from indentation with spaces, not from brackets. These two traits, the separation of data from text and the indentation-based syntax, are exactly what makes frontmatter so widespread and at the same time so error-prone.
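To make these building blocks concrete, the example frontmatter above corresponds roughly to the following Python structure once parsed. This is only an illustration: the name `page_meta` is made up here, and the exact types are an assumption, since some parsers return the date as a calendar date object while others keep it as a plain string.

```python
import datetime

# Hypothetical result of parsing the example frontmatter.
page_meta = {
    "title": "Getting Started",          # scalar: string
    "date": datetime.date(2026, 6, 17),  # scalar: date (a string in some parsers)
    "tags": ["guide", "intro"],          # sequence: list of strings
    "published": True,                   # scalar: boolean
}

# The whole block is a mapping: keys on the left, values on the right.
for key, value in page_meta.items():
    print(f"{key}: {type(value).__name__}")
```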
How a generator reads the file
The build step of a flat-file system splits each file into two halves. The frontmatter block is parsed as YAML into structured data; the rest is rendered as Markdown into HTML. Both halves then feed a template that assembles the finished page:
flowchart TD
F["Markdown file<br/>--- YAML --- + content"] --> S{"Split<br/>at the --- lines"}
S -->|"upper block"| Y["Parse YAML<br/>title, date, tags ..."]
S -->|"lower block"| M["Render Markdown<br/>to HTML"]
Y --> T["Template<br/>data + content"]
M --> T
T --> P["finished page"]
In this model the data is part of the document rather than offloaded into a separate database. A page is a file, versionable in Git and readable without a running service. That is the core of the flat-file approach as Grav CMS implements it, and the reason frontmatter is the natural home for page metadata.
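The split step described above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not how Grav or any other generator actually implements it; the function name `split_frontmatter` is hypothetical, and the YAML parsing and Markdown rendering that would follow are left out.

```python
from typing import Tuple

DELIMITER = "---"


def split_frontmatter(text: str) -> Tuple[str, str]:
    """Return (frontmatter, body); frontmatter is "" when the file has none."""
    lines = text.splitlines(keepends=True)
    # Frontmatter only counts if the very first line is the delimiter.
    if not lines or lines[0].rstrip() != DELIMITER:
        return "", text
    # Look for the closing delimiter after the opening one.
    for index in range(1, len(lines)):
        if lines[index].rstrip() == DELIMITER:
            frontmatter = "".join(lines[1:index])
            body = "".join(lines[index + 1:])
            return frontmatter, body
    # No closing delimiter: treat the whole file as content.
    return "", text


source = """---
title: Getting Started
published: true
---
# Getting Started
The actual text of the page begins here.
"""

frontmatter, body = split_frontmatter(source)
print(frontmatter)  # raw YAML text, ready to hand to a YAML parser
print(body)         # Markdown text, ready to hand to a renderer
```

Treating a missing closing delimiter as "no frontmatter" is a design choice of this sketch; a real build step might prefer to stop with an error instead.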
Data in the document
The value of frontmatter shows in concrete fields. Common ones are a title and a description for the page itself and for search engines, a date for sorting and archiving, a reference to a layout template, a publish switch, and a list of tags for categorisation. Which fields are valid is defined by each system: Grav, Jekyll and Hugo each know their own keys, but the principle is the same.
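Because each system defines its own valid keys, a build step can check the parsed metadata against a small schema before rendering. The sketch below assumes the frontmatter has already been parsed into a dictionary; the schema, the field names and the function `check_metadata` are hypothetical and do not reflect the actual key set of Grav, Jekyll or Hugo.

```python
import datetime
from typing import Any, Dict, List

# Hypothetical schema: field name -> (expected type, required?).
# A parser that keeps dates as strings would need str for "date".
SCHEMA = {
    "title": (str, True),
    "description": (str, False),
    "date": (datetime.date, False),
    "layout": (str, False),
    "published": (bool, False),
    "tags": (list, False),
}


def check_metadata(meta: Dict[str, Any]) -> List[str]:
    """Return a list of human-readable problems; empty means the metadata passes."""
    problems = []
    for name, (expected, required) in SCHEMA.items():
        if name not in meta:
            if required:
                problems.append(f"missing required field: {name}")
            continue
        if not isinstance(meta[name], expected):
            actual = type(meta[name]).__name__
            problems.append(f"{name}: expected {expected.__name__}, got {actual}")
    # Keys outside the schema are often typos, so report them as well.
    for name in meta:
        if name not in SCHEMA:
            problems.append(f"unknown field: {name}")
    return problems


print(check_metadata({"title": "Getting Started", "published": "yes"}))
```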
Because this data lives in the document itself, it belongs to the content and travels with it through every pipeline. A pull request changes text and metadata in one move, a review sees both side by side, and the history shows when a title or a publish status changed. That makes frontmatter a load-bearing part of Docs-as-Code, the approach of treating documentation like source code: versioned and delivered through automation via GitOps and Reconciliation. The same idea, keeping rule-based policy as versioned files, also underpins Compliance as Code.
The YAML pitfalls
YAML is convenient to write and unforgivingly precise to parse. The most common defects in frontmatter come not from the content but from the syntax.
- Indentation with tabs. YAML permits only spaces for structure, never tab characters. A single tab, often inserted unnoticed by the editor, leads to a parse error. Inconsistent indentation also silently shifts which key a value belongs to.
- Implicit typing. Parsers infer the data type from an unquoted value. So `9.30` becomes the number `9.3`, and a version string loses its trailing zero. When a string is intended, it belongs in quotes.
- The Norway problem. Older parsers following YAML 1.1 read many words as a boolean, among them `yes`, `no`, `on` and `off`. The country code `NO` for Norway thus becomes the boolean `false`. YAML 1.2 narrowed this list to `true` and `false`, yet many widely used parsers still operate on the old rules. Quoting the value stops the guessing.
- Colons and special characters. YAML plain scalars mainly break on a colon-space or context-sensitive indicators. Values with a `: ` (colon-space), a `#` comment or a leading `-` belong in quotes.
The consistent countermeasure is simple and monotonous: when in doubt, quote, and keep indentation consistent with spaces. Stricter tools such as StrictYAML go further and treat every value as a string by default until a schema asks for something else.
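The pitfalls listed above can be caught before the build with a small check over the raw frontmatter text. The following is a heuristic sketch using only the standard library; it is not a YAML parser, the function name `lint_frontmatter` is hypothetical, and the word list covers only the four examples named above rather than the full YAML 1.1 set.

```python
import re
from typing import List

# Words that YAML 1.1 parsers may read as booleans (only the examples above).
BOOLEAN_WORDS = {"yes", "no", "on", "off"}
# Decimals with a trailing zero, such as 9.30, lose that zero when read as numbers.
TRAILING_ZERO = re.compile(r"^[-+]?\d+\.\d*0$")


def lint_frontmatter(frontmatter: str) -> List[str]:
    """Return warnings for lines that are likely to be misread by a YAML parser."""
    warnings = []
    for number, line in enumerate(frontmatter.splitlines(), start=1):
        indent = line[: len(line) - len(line.lstrip())]
        if "\t" in indent:
            warnings.append(f"line {number}: tab in indentation")
        stripped = line.strip()
        if not stripped or stripped.startswith("#") or ":" not in stripped:
            continue
        _key, _, value = stripped.partition(":")
        value = value.strip()
        # Skip empty values, quoted strings, flow collections and block scalars.
        if not value or value[0] in "\"'[{|>":
            continue
        if value.lower() in BOOLEAN_WORDS:
            warnings.append(f"line {number}: unquoted '{value}' may be read as a boolean")
        elif TRAILING_ZERO.match(value):
            warnings.append(f"line {number}: unquoted '{value}' may be read as a number")
        elif ": " in value or " #" in value or value.startswith("- "):
            warnings.append(f"line {number}: value contains a YAML indicator, quote it")
    return warnings


sample = 'country: NO\nversion: 9.30\nsummary: Part 1: basics\nsafe: "NO"\n'
for warning in lint_frontmatter(sample):
    print(warning)
```

A check like this only flags candidates for quoting; whether a value such as `no` was meant as a boolean or as a string is still a decision for the author.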
Where it sits
Frontmatter is a convention, not a standard of its own: it combines Markdown and YAML without being specified itself. The separator line of three hyphens is not arbitrary; it matches the YAML document separator. Some systems also allow TOML or JSON instead of YAML in the frontmatter, but the pattern stays the same. Related but differently positioned is the Model Context Protocol, which does not store data in the document but connects AI models to data sources at runtime. Frontmatter solves the static question, MCP the dynamic one.
References
- Grav Page Headers (Frontmatter). Documentation of the frontmatter fields and taxonomy supported by Grav. (2026). learn.getgrav.org/17/content/headers
- Jekyll Front Matter. Describes frontmatter as a YAML block between two triple-hyphen lines at the start of a file, and the predefined variables. (2026). jekyllrb.com/docs/front-matter/
- StrictYAML The Norway Problem. Explains why implicit typing is dangerous, using the example of the country code NO parsed as false. (2023). hitchdev.com/strictyaml/why/implicit-typing-removed/
- YAML.org YAML Specification 1.2.2. The current revision of the language specification; since version 1.2 the core schema recognises only true and false as booleans. (2021). yaml.org/spec/1.2.2/
Related topics
- Grav CMS, the flat-file CMS that keeps page metadata as frontmatter.
- GitOps and Reconciliation, the delivery of versioned files as the source of truth.
- Compliance as Code, the same idea applied to rule-based policy.
- MCP (Model Context Protocol), the dynamic binding of data sources at runtime.