mdtoc – Specification (v1)
1. Purpose and core principles
mdtoc is a deterministic CLI tool for processing individual Markdown documents.
Functions:
- generation of a table of contents (ToC)
- consistent heading numbering
- generation of stable anchor IDs and ToC link targets according to a selected slug profile
- removal of all artifacts generated by mdtoc
- generated-output validation of a document for CI
Core principles:
- The visible heading text is the only semantic source of truth.
- Heading numbers are derived and not persistent.
- Inline anchor IDs are computed from the unnumbered title according to slug.
- ToC link targets use the same slug profile, independently of whether inline anchors are enabled.
- Generated content is fully reconstructible.
- mdtoc changes a document only on the basis of a clearly defined managed structure.
- The tool is idempotent.
Note: In this document, “formal” only means “clear enough for parsers, tests, and later code generation”. It does not mean a large architecture, but a small, robust contract framework.
2. Scope and non-goals
mdtoc intentionally processes only a small, unambiguous Markdown subset.
Supported in v1:
- single Markdown file
- ATX headings from # to ######
- defined ToC markers
- defined config block
- defined inline anchor form in headings
Not supported in v1:
- Setext headings
- GUI automation
- PDF generation
- multi-file processing
- a complete Markdown AST as a specification subject of mdtoc itself
- partial processing such as --toc-only or --anchors-only
Note: The restriction to a small Markdown subset is intentional. It keeps the parser, test cases, and debugging simple.
3. Explicit document structure
A document managed by mdtoc uses exactly this container structure:
<!-- mdtoc -->
[TOC CONTENT]
<!-- numbering=true min=2 max=4 slug=github anchor=true link=true toc=true bullets=auto -->
<!-- /mdtoc -->
Rules:
- The outer container consists of start marker, ToC area, optional config block, and end marker.
- If present, the config block must appear immediately before <!-- /mdtoc -->.
- <!-- mdtoc --> may occur at most once.
- <!-- /mdtoc --> may occur at most once.
- The config block may occur at most once.
- If the config block is absent, all default config values apply.
- If none of the outer markers is present, generate inserts the complete container at the beginning of the file.
- If only one of the outer markers is present, or if the start marker appears after the end marker, this is a parsing error.
- Everything between <!-- mdtoc --> and the beginning of the config block is the managed ToC area.
- If there is no config block, everything between <!-- mdtoc --> and <!-- /mdtoc --> is the managed ToC area.
- Foreign content in the ToC area is not deleted by generate, but preserved as an HTML comment.
Note: The user can determine where the table of contents should appear by moving the ToC area.
Explanation: The complete container is the managed area. toc=off does not mean “no container”, but “an empty managed ToC area”.
Note: The explicit container structure is intentionally easier to read than implicit marker logic. It makes the area managed by mdtoc immediately visible.
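The marker rules above can be sketched in Go as follows. This is a minimal illustration, not the actual implementation: findContainer is a hypothetical helper, and it assumes that each marker stands alone on its line and that lines inside ignored regions have already been filtered out by the caller.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

const (
	startMarker = "<!-- mdtoc -->"
	endMarker   = "<!-- /mdtoc -->"
)

// findContainer returns the line indexes of the start and end markers.
// found is false when neither marker is present (unmanaged document).
// Assumption: markers occupy a whole line; surrounding whitespace is ignored.
func findContainer(lines []string) (start, end int, found bool, err error) {
	start, end = -1, -1
	for i, line := range lines {
		switch strings.TrimSpace(line) {
		case startMarker:
			if start >= 0 {
				return 0, 0, false, errors.New("start marker occurs more than once")
			}
			start = i
		case endMarker:
			if end >= 0 {
				return 0, 0, false, errors.New("end marker occurs more than once")
			}
			end = i
		}
	}
	switch {
	case start < 0 && end < 0:
		return 0, 0, false, nil // no container: generate may insert one
	case start < 0 || end < 0:
		return 0, 0, false, errors.New("only one outer marker is present")
	case start > end:
		return 0, 0, false, errors.New("start marker appears after end marker")
	}
	return start, end, true, nil
}

func main() {
	doc := []string{"<!-- mdtoc -->", "* [Intro](#intro)", "<!-- /mdtoc -->", "## Intro"}
	fmt.Println(findContainer(doc))
}
```

Returning found=false without an error keeps the "no container yet" case separate from the parsing errors, which matches the distinction the rules draw between the two.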
4. Parsing rules
4.1 Principle
The specification describes managed behavior in a line- and position-oriented way.
An implementation MAY internally use a Markdown parser as long as the external behavior matches this specification exactly.
Explanation: For implementation in Go, an internal parser such as goldmark is useful, even though the managed rewrite rules remain described in a line-oriented way.
Current implementation note: The current implementation uses a self-contained line parser plus a small inline-text extractor; an alternative parser is still allowed as long as the external behavior remains identical.
4.2 Ignored regions
These regions are ignored when detecting markers and headings:
- Fenced code blocks with backticks:
  - Start: a backtick fence according to the supported Markdown parser or supported v1 subset (a line beginning with three backticks)
  - End: the corresponding closing backtick fence (the next line beginning with three backticks)
- Fenced code blocks with tilde:
  - Start: a tilde fence according to the supported Markdown parser or supported v1 subset (a line beginning with three tildes (~~~))
  - End: the corresponding closing tilde fence (the next line beginning with three tildes (~~~))
- Inline code spans:
  - region between two backticks on the same line
- HTML comments:
  - <!-- ... -->
  - exception: <!-- mdtoc -->, <!-- /mdtoc -->, <!-- mdtoc off -->, and <!-- mdtoc on -->
Not ignored:
- Blockquotes
Blockquotes are normal input lines.
They are not treated as a special region.
Practical consequence:
- A blockquote line begins with optional spaces and then >.
- A heading recognized by mdtoc must begin with a hashes prefix directly in column 1.
- Therefore, blockquotes cannot match the heading syntax and need no special treatment.
Interpretation:
- “Do not ignore blockquotes” explicitly does not mean that headings are created from them.
- It only means that mdtoc does not need a dedicated blockquote mode.
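As an illustration of the fenced-code rule in this section, the following Go sketch marks the lines that belong to a fenced block so that later steps can skip them. It is a simplification under the v1 subset described above: fencedLines and backtickFence are hypothetical names, a fence is any line beginning with three backticks or three tildes, and only a fence of the same kind closes the block. Inline code spans and HTML comments are not handled here.

```go
package main

import (
	"fmt"
	"strings"
)

// backtickFence is built with Repeat only to keep this listing readable.
var backtickFence = strings.Repeat("`", 3)

// fencedLines reports, per line, whether the line is part of a fenced code
// block. The opening and closing fence lines count as part of the block.
func fencedLines(lines []string) []bool {
	inside := make([]bool, len(lines))
	open := "" // "" outside a fence, otherwise the three fence characters
	for i, line := range lines {
		switch {
		case open == "" && (strings.HasPrefix(line, backtickFence) || strings.HasPrefix(line, "~~~")):
			open = line[:3]
			inside[i] = true
		case open != "" && strings.HasPrefix(line, open):
			open = ""
			inside[i] = true
		default:
			inside[i] = open != ""
		}
	}
	return inside
}

func main() {
	doc := []string{"# A", backtickFence + "go", "# not a heading", backtickFence, "## B"}
	fmt.Println(fencedLines(doc))
}
```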
4.3 Parsing order
Processing logically runs in this order:
- Determine ignored regions or Markdown context.
- Recognize the outer mdtoc container and config block only outside ignored regions.
- Recognize headings only outside ignored regions.
- Semantically normalize managed artifacts.
- Derive the target state.
- Render the output.
Explanation:
- Without this order, markers or headings inside a code fence would be ambiguous.
- This exact ambiguity is intentionally excluded here.
5. Heading syntax
5.1 Candidates for headings
Only lines that begin directly at the start of the line with one of the following prefixes are headings for mdtoc at all:
hashes := "# " | "## " | "### " | "#### " | "##### " | "###### "
This also means:
- exactly one space must follow the # characters
- no spaces may appear before the # characters
Note: The space is intentionally part of hashes here. This simplifies the parser: after the prefix, either the number, the anchor, or the title begins immediately.
5.2 Structure of a managed heading
Managed headings use exactly this schema:
heading_line := hashes [number SP] [anchor] title
number := DIGIT+ ("." DIGIT+)* "."
anchor := "<a id=\"anchor_id\"></a>"
title := NONEMPTY_TEXT
SP := exactly one U+0020 space
Additional rules:
- number is optional.
- If number occurs, it appears directly after hashes and is followed by exactly one space.
- anchor is optional.
- If anchor occurs, it appears directly after hashes or directly after number SP.
- There is no space between </a> and the first character of the title.
- Inside the title, spaces and characters remain unchanged.
- Only headings that exactly follow this positional logic may be rewritten by mdtoc.
Explanation:
- The missing space between </a> and the title is intentionally preserved because it is part of the managed render format.
- The motivation is no longer dumeng compatibility, but an unambiguous and idempotent render schema.
Examples of valid managed headings:
# Title
## 1. Introduction
## <a id="introduction"></a>Introduction
### 2.1. <a id="api-overview"></a>API Overview
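Examples that mdtoc does not treat as a managed structure (a leading space, two spaces after the hashes, a number without its trailing dot, and a space after </a>):
 # Title
##  1. Introduction
### 1.2 Introduction
### <a id="x"></a> Introduction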
5.3 Meaning of the syntax
- ### 2024 roadmap is not a number because the first token does not end with a dot (.).
- ### 3D graphics is not a number because the first token is not a pure x.y.z. pattern.
- ### 2.1. API is a managed numbering syntax.
Note: The pattern ### 2.1. API is therefore intentionally reserved for mdtoc. Anyone writing a free heading in exactly this format is using the same syntax as the tool.
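The positional grammar from section 5.2 maps naturally onto a single regular expression. The Go sketch below is one possible reading, not the implementation's parser: the names managedHeading, Heading, and parseHeading are illustrative, and the title is assumed to start with a non-space character. A number or anchor that does not fit the managed pattern simply stays part of the title, which follows the conservative spirit of the rules.

```go
package main

import (
	"fmt"
	"regexp"
)

// managedHeading mirrors: heading_line := hashes [number SP] [anchor] title
var managedHeading = regexp.MustCompile(
	`^(#{1,6}) (?:(\d+(?:\.\d+)*\.) )?(?:<a id="([^"]*)"></a>)?(\S.*)$`)

// Heading is an illustrative result type.
type Heading struct {
	Level  int
	Number string // managed number including the trailing dot, or ""
	Anchor string // managed anchor ID, or ""
	Title  string // title as written, without managed number and anchor
}

// parseHeading returns ok=false for lines that are not heading candidates.
func parseHeading(line string) (Heading, bool) {
	m := managedHeading.FindStringSubmatch(line)
	if m == nil {
		return Heading{}, false
	}
	return Heading{Level: len(m[1]), Number: m[2], Anchor: m[3], Title: m[4]}, true
}

func main() {
	for _, line := range []string{
		`### 2.1. <a id="api-overview"></a>API Overview`,
		"### 2024 roadmap",
		"### 3D graphics",
	} {
		h, ok := parseHeading(line)
		fmt.Printf("%v %+v\n", ok, h)
	}
}
```

With this pattern, 2024 roadmap and 3D graphics come back with an empty Number, because their first token does not match the dotted form with a trailing dot.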
5.4 Supported Markdown subset
mdtoc is not a general Markdown parser.
For headings, v1 supports:
- ATX headings only
- only the heading syntax defined above
- no Setext headings
- no implicit or ambiguous special cases
The practical prefilter is therefore at least:
^#{1,6}
And the actual rewrite logic applies only to lines that also satisfy the remaining positional logic.
6. Small formal model
This section describes the minimal internal view that is helpful for clean implementation and tests.
6.1 Managed heading
Internally, this model is sufficient for a managed heading:
ManagedHeading
- line_index
- level
- title_markup // Title area as it appears in the document, but without managed numbering and without the managed inline anchor
- title_text // Plain-text interpretation of title_markup; source for ToC link text and anchor ID
- number // derived or empty
- anchor_id // derived or empty
Semantically important are only:
- level
- title_markup
- title_text
Derived from these are:
- number
- anchor_id
Explanation:
- The distinction between title_markup and title_text remains useful even though title_text is derived separately from the raw line.
- In the current implementation, title_text is derived by a self-contained inline-text extraction step.
- The authoritative value is the deterministic result of that extraction logic, not the output of an external Markdown renderer.
6.2 Document state
For mdtoc, a document is practically in one of these states:
- unmanaged: No valid mdtoc container is present.
- managed: A valid mdtoc container is present. The container may include a config block; if it does not, defaults apply.
- generated target: The document byte-for-byte matches the output that regen would produce from the current content and container config.
mdtoc does not persist a state field. A stripped document is still a managed document, but it does not match the generated target and check therefore returns a mismatch until regen is run.
6.3 Processing pipeline
Processing always follows the same simple pattern:
parse -> normalize -> derive -> render
This means:
- parse: recognize container, config, and headings
- normalize: semantically remove managed numbers and managed anchors
- derive: recalculate numbers, anchor IDs, and ToC
- render: write the document back deterministically
Note: This is not meant to force a large AST architecture. It only defines which pieces of information are semantically relevant and which are render artifacts only.
6.4 Validity range of min and max
This version uses the following simple, easy-to-understand rule:
min and max in config, and --min-level and --max-level in the CLI, filter the same set of headings for:
- ToC generation
- numbering
- anchor generation
Practical consequence:
- During generate, all managed numbers and managed anchors are first removed from all managed headings.
- Then numbers and anchors are only re-applied to headings within the active level range.
- Headings outside the range remain in the document unchanged, but are no longer actively managed.
Cross-reference:
- The same rule is used again in section 10 for the ToC.
- This is intentional; both places describe the same contract from two different perspectives.
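To make the level filter concrete, here is a small Go sketch that derives numbers only for headings inside the active range. The counting scheme is an assumption inferred from the examples in section 5 (a level-2 heading rendered as 1. and a level-3 heading as 2.1. under the default min=2); the text does not define what happens when a level is skipped, and this sketch would simply emit a 0 component in that case. numberHeadings is a hypothetical name.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// numberHeadings returns one number per heading, in document order.
// Headings outside [minLevel, maxLevel] get "" and do not affect counters.
func numberHeadings(levels []int, minLevel, maxLevel int) []string {
	counters := make([]int, maxLevel-minLevel+1)
	out := make([]string, len(levels))
	for i, level := range levels {
		if level < minLevel || level > maxLevel {
			continue
		}
		depth := level - minLevel
		counters[depth]++
		for d := depth + 1; d < len(counters); d++ {
			counters[d] = 0 // a new parent restarts all deeper counters
		}
		parts := make([]string, depth+1)
		for d := 0; d <= depth; d++ {
			parts[d] = strconv.Itoa(counters[d])
		}
		out[i] = strings.Join(parts, ".") + "."
	}
	return out
}

func main() {
	// Levels of six consecutive headings, with min=2 and max=4.
	fmt.Printf("%q\n", numberHeadings([]int{1, 2, 3, 3, 2, 5}, 2, 4))
}
```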
7. Config block
The config block is an optional HTML comment placed directly before <!-- /mdtoc -->.
It stores whitespace-separated key=value fields:
<!-- numbering=true min=2 max=4 slug=github anchor=true link=true toc=true bullets=auto -->
The same config may be written across multiple lines:
<!--
numbering=true min=2 max=4
slug=github anchor=true link=true toc=true bullets=auto
-->
Rules:
- The config block may be deleted completely; then all defaults apply.
- All fields use key=value.
- Field order is arbitrary.
- Unknown keys are ignored so newer generated configs remain readable by older versions.
- Duplicate known keys are invalid.
- Invalid known values are invalid.
- Boolean values accept true|false|on|off; normalized output writes true|false.
- Allowed known keys:
  - numbering=true|false|on|off
  - min=<N>
  - max=<N>
  - slug=github|gitlab|crossnote
  - anchor=true|false|on|off
  - link=true|false|on|off
  - toc=true|false|on|off
  - bullets=auto|*|-|+
- min and max are positive integers.
- min must not be greater than max.
- max must not be greater than 6.
- anchor is strictly Boolean. It only controls whether managed inline anchors are rendered.
- slug defines the anchor/link slug algorithm globally, independently of anchor.
- link controls whether ToC entries are Markdown links or plain text list items.
- generate writes all generator options into the config block when the output has no prior container, when a config block already existed, or when non-default config must be persisted.
- If a managed container has no config block and the effective config is still default, rewrites preserve the absent config block.
- --file, --help, --version, --verbose, and --raw are not persisted.
- strip keeps the config block if it exists.
- strip --raw removes the complete container, including any config block.
- toc=off means: the managed ToC area remains part of the container, but is rendered empty.
There is no state field and no container-version field. Legacy <!-- mdtoc-config ... --> blocks are not part of this specification.
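The config rules can be sketched as a small Go parser. This is an illustration under stated assumptions rather than the real code: Config, defaultConfig, and parseConfig are hypothetical names, the defaults are taken from the generate options table in section 8.2, and a field without = is treated as an error (the text only says that all fields use key=value).

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// Config holds the known keys; field names are illustrative.
type Config struct {
	Numbering, Anchor, Link, TOC bool
	Min, Max                     int
	Slug, Bullets                string
}

func defaultConfig() Config {
	return Config{Numbering: true, Anchor: true, Link: true, TOC: true,
		Min: 2, Max: 4, Slug: "github", Bullets: "auto"}
}

func parseBool(v string) (bool, error) {
	switch v {
	case "true", "on":
		return true, nil
	case "false", "off":
		return false, nil
	}
	return false, fmt.Errorf("invalid boolean %q", v)
}

func oneOf(v string, allowed ...string) (string, error) {
	for _, a := range allowed {
		if v == a {
			return v, nil
		}
	}
	return "", fmt.Errorf("invalid value %q", v)
}

// parseConfig reads the text between "<!--" and "-->" of a config block.
func parseConfig(body string) (Config, error) {
	cfg := defaultConfig()
	seen := map[string]bool{}
	for _, field := range strings.Fields(body) {
		key, value, ok := strings.Cut(field, "=")
		if !ok {
			return Config{}, fmt.Errorf("field %q is not key=value", field)
		}
		var err error
		switch key {
		case "numbering":
			cfg.Numbering, err = parseBool(value)
		case "anchor":
			cfg.Anchor, err = parseBool(value)
		case "link":
			cfg.Link, err = parseBool(value)
		case "toc":
			cfg.TOC, err = parseBool(value)
		case "min":
			cfg.Min, err = strconv.Atoi(value)
		case "max":
			cfg.Max, err = strconv.Atoi(value)
		case "slug":
			cfg.Slug, err = oneOf(value, "github", "gitlab", "crossnote")
		case "bullets":
			cfg.Bullets, err = oneOf(value, "auto", "*", "-", "+")
		default:
			continue // unknown keys are ignored
		}
		if err != nil {
			return Config{}, fmt.Errorf("%s: %w", key, err)
		}
		if seen[key] {
			return Config{}, fmt.Errorf("duplicate key %q", key)
		}
		seen[key] = true
	}
	if cfg.Min < 1 || cfg.Max > 6 || cfg.Min > cfg.Max {
		return Config{}, fmt.Errorf("invalid level range %d..%d", cfg.Min, cfg.Max)
	}
	return cfg, nil
}

func main() {
	fmt.Println(parseConfig("numbering=off min=2 max=3 slug=gitlab future=x"))
}
```

Because strings.Fields splits on any whitespace, the single-line and multi-line forms of the config block parse identically.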
8. CLI interface
8.1 Commands
| Command | Description |
|---|---|
| mdtoc --version | Prints short version information. |
| mdtoc --version --verbose | Prints detailed version information. |
| mdtoc --help | Prints short help text. |
| mdtoc --help --verbose | Prints long help text. |
| mdtoc [--file <name>\|<name>] [GENERATE OPTIONS] | Root mode: uses regen for valid managed input without generate overrides, otherwise generate. |
| mdtoc [GENERATE OPTIONS] < INPUT.md | Root mode on stdin; same dispatch rule as above. |
| mdtoc generate [--verbose] [OPTIONS] | Generates or updates ToC, numbers, and anchors. |
| mdtoc generate --help | Prints long help text specifically for generate. |
| mdtoc regen [--verbose] | Regenerates from the persisted container config. |
| mdtoc refresh [--verbose] | Alias for regen. |
| mdtoc regen --help | Prints long help text specifically for regen. |
| mdtoc refresh --help | Prints the same help text as regen. |
| mdtoc strip [--verbose] [--raw] | Removes ToC, numbers, anchors, and optionally config. |
| mdtoc strip --help | Prints long help text specifically for strip. |
| mdtoc check [--verbose] | Checks whether the document matches regenerated output. |
| mdtoc check --help | Prints long help text specifically for check. |
8.2 Options for generate
| Option | Default | Meaning |
|---|---|---|
| --numbering <on\|off\|true\|false> | on | enable or disable heading numbering |
| --min-level <N> | 2 | minimum managed heading level (>=1) |
| --max-level <N> | 4 | maximum managed heading level (<=6) |
| --slug <github\|gitlab\|crossnote> | github | select the slug algorithm for inline anchors and ToC link targets |
| --anchor <on\|off\|true\|false> | on | enable or disable managed inline anchors |
| --link <on\|off\|true\|false> | on | render ToC entries as Markdown links when enabled |
| --toc <on\|off\|true\|false> | on | render the managed ToC area when on, leave it empty when off |
| --bullets <auto\|*\|-\|+> | auto | choose the generated unordered-list bullet style |
| --file <name> | – | read and overwrite file |
| --verbose | off | diagnostic and progress messages on stderr |
| --help | – | show help |
Input form rules:
- File-backed commands accept either a positional file argument or --file <name>.
- The positional-file shorthand applies both to explicit subcommands and to root mode.
- Exactly one input source is allowed per invocation.
Short forms:
| Option | Short form |
|---|---|
| --numbering | -n |
| --anchor | -a |
| --bullets | -b |
| --file | -f |
| --verbose | -v |
| --help | -h |
Compatibility note:
- The current CLI also tolerates the Go flag package’s one-dash long-option spellings such as -toc, -anchor, -slug, or -numbering.
- These are accepted as compatibility aliases for the documented double-dash generate option forms.
- Documentation and examples should still prefer the canonical double-dash form.
8.3 I/O and logging behavior
- With a positional file or --file, the file is read and overwritten.
- Without file input, document input comes from stdin and document output goes to stdout.
- If neither file input nor piped stdin is provided, the command fails with an input-source error.
- If more than one input source is provided, the command fails with an input-source conflict error.
- Successful commands produce no output except for --help, --version, or --verbose.
- Errors and diagnostic messages are written exclusively to stderr.
- Collected warnings are only printed in verbose mode.
Root-mode dispatch rules:
- If the input contains a valid managed container and no generate-control flags are explicitly set, root mode behaves like regen.
- If the input does not contain a valid managed container, root mode behaves like generate.
- If at least one generate-control flag is explicitly set, root mode behaves like generate even when a valid managed container exists.
Generate-control flags:
- --numbering, -n
- --min-level
- --max-level
- --slug
- --anchor, -a
- --link
- --toc
- --bullets, -b
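The three root-mode dispatch rules reduce to a single condition. The Go sketch below shows that reduction; rootCommand is a hypothetical function name, and the two booleans are assumed to be computed elsewhere (container validation and flag parsing).

```go
package main

import "fmt"

// rootCommand returns the command that root mode behaves like.
// generateFlagSet is true when at least one generate-control flag
// was explicitly set on the command line.
func rootCommand(validContainer, generateFlagSet bool) string {
	if validContainer && !generateFlagSet {
		return "regen"
	}
	return "generate"
}

func main() {
	fmt.Println(rootCommand(true, false))  // managed input, no overrides
	fmt.Println(rootCommand(true, true))   // managed input with overrides
	fmt.Println(rootCommand(false, false)) // unmanaged input
}
```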
9. Commands
9.1 generate
Behavior:
- Parse the document.
- If no managed container is present, create the complete container at the beginning of the file.
- If marker structure or config is invalid: error and no change.
- Semantically remove existing managed artifacts:
- ToC content
- managed heading numbers
- managed inline anchors
- Determine relevant headings.
- Recalculate numbers if numbering=true.
- Recalculate anchor_id for all relevant headings using slug.
- Render managed inline anchors only if anchor=true.
- Re-render the ToC if toc=true; otherwise render the managed ToC area empty.
- Re-render headings.
- Re-render config according to section 7.
- Write the document back.
Additional rules:
- Numbering and anchor ID are strictly decoupled.
- Inline anchor IDs are computed from the unnumbered title.
- Duplicate IDs are resolved deterministically.
- Foreign content in the ToC area is not deleted, but preserved as an HTML comment.
- --anchor on and --anchor true enable inline anchors; normalized config writes anchor=true.
- --anchor off and --anchor false disable inline anchors; normalized config writes anchor=false.
- On success, the result is idempotent.
Example of a rendered heading:
### 4.1. <a id="open-source"></a>Open source
Explanation:
- Additional user-defined inline elements in the title are not fundamentally forbidden.
- For the normative derivation of anchor_id, however, it is not the raw markup that counts, but title_text according to section 6 and section 11.
9.2 strip
Behavior:
- requires a valid managed container
- removes managed ToC content
- removes managed heading numbers
- removes managed inline anchors
- keeps the outer container
- keeps the config block if it exists
After strip, this structure is still valid:
<!-- mdtoc -->
<!-- numbering=true min=2 max=4 slug=github anchor=true link=true toc=true bullets=auto -->
<!-- /mdtoc -->
Error case:
- no valid managed container -> error
- no implicit repair
9.3 strip --raw
Behavior:
- first attempts the normal structural parse
- if that succeeds, it removes the complete managed container, if present:
  - <!-- mdtoc -->
  - ToC content
  - optional config block
  - <!-- /mdtoc -->
- additionally removes managed heading numbers
- additionally removes managed inline anchors
- if strict parsing fails, it falls back to locating only the outer managed markers and removes the container by marker bounds
- after fallback container removal, heading normalization is attempted again on the remaining body text
Conservative rule:
- If it cannot be determined with certainty whether a number or an inline anchor is managed, the content remains unchanged.
- If fallback parsing was needed, a warning is collected; this warning is only emitted in verbose mode.
- If fallback container removal succeeds but heading parsing still fails afterward, strip --raw returns the original parsing error.
Use cases:
- damaged config
- migration
- complete removal of mdtoc management
- tests
9.4 regen
Behavior:
- requires a valid managed container
- reads the persisted normalized config from the existing managed container, or uses defaults when no config block is present
- regenerates the document into the generated target state
- writes the updated normalized config according to section 7
- refresh is a supported command alias with the same behavior
Error case:
- no valid managed container -> error
9.5 check
Behavior:
- requires a valid managed container
- reconstructs the target state from the current document content and config
- compares target state and actual state byte-for-byte
- returns 0 if both are identical
- returns exit code 2 on mismatch
No side effects:
- check never modifies the document
Interpretation:
- check always reconstructs the generated target state.
- A stripped managed document is therefore valid input but fails check until regen is run.
Note: “byte-for-byte” sounds more formal than it is in practice. It means that check computes the same text that regen would write and compares exactly that text with the current document.
10. ToC rules
The ToC is based on all managed headings within min to max, inclusive.
Render rules:
- One heading produces exactly one ToC entry.
- The hierarchy follows the heading level.
- Each additional level relative to min is indented by two spaces.
- Every entry is a Markdown list item.
- If link=true, the list item text is a Markdown link.
- If link=false, the list item contains plain text only.
- The list marker is chosen according to bullets.
Example:
* [1. Introduction](#introduction)
  * [1.1. API](#api)
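The render rules can be sketched in Go as follows. Entry and renderToC are hypothetical names, the entries are assumed to be already filtered to the min..max range (a level below min would make the indentation count negative), and the link target is assumed to have been computed according to section 11.

```go
package main

import (
	"fmt"
	"strings"
)

// Entry is an illustrative ToC input record.
type Entry struct {
	Level  int
	Number string // for example "1.1.", or "" when numbering=false
	Title  string
	Target string // link target without the leading "#"
}

// renderToC writes one list item per entry, indented by two spaces for each
// level below minLevel. With link=false the item is plain text.
func renderToC(entries []Entry, minLevel int, bullet string, link bool) string {
	var b strings.Builder
	for _, e := range entries {
		text := e.Title
		if e.Number != "" {
			text = e.Number + " " + e.Title
		}
		if link {
			text = fmt.Sprintf("[%s](#%s)", text, e.Target)
		}
		indent := strings.Repeat("  ", e.Level-minLevel)
		fmt.Fprintf(&b, "%s%s %s\n", indent, bullet, text)
	}
	return b.String()
}

func main() {
	entries := []Entry{
		{Level: 2, Number: "1.", Title: "Introduction", Target: "introduction"},
		{Level: 3, Number: "1.1.", Title: "API", Target: "api"},
	}
	fmt.Print(renderToC(entries, 2, "*", true))
}
```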
Bullet selection:
- with bullets=*, -, or +, the configured marker is used exactly
- with bullets=auto, mdtoc counts unordered-list markers in the body text outside fences, generic HTML comments, and excluded regions
- recognized body list markers are *, -, and + followed by one space
- if one marker has the highest count, it is used
- ties are resolved in the fixed order * > - > +
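The bullets=auto selection with its tie-break order can be sketched like this. autoBullet is a hypothetical helper; it assumes the caller passes only body lines outside fences, comments, and excluded regions, and it allows leading spaces before the marker so that nested items count too, which the text above does not state explicitly.

```go
package main

import (
	"fmt"
	"strings"
)

// autoBullet counts "*", "-", and "+" list markers (marker followed by one
// space) and returns the most frequent one. Ties fall back to the fixed
// order * > - > +, so an input without any list items yields "*".
func autoBullet(bodyLines []string) string {
	counts := map[string]int{}
	for _, line := range bodyLines {
		item := strings.TrimLeft(line, " ")
		if len(item) >= 2 && strings.ContainsRune("*-+", rune(item[0])) && item[1] == ' ' {
			counts[item[:1]]++
		}
	}
	best := "*"
	for _, marker := range []string{"-", "+"} {
		if counts[marker] > counts[best] { // strictly greater keeps the tie order
			best = marker
		}
	}
	return best
}

func main() {
	fmt.Println(autoBullet([]string{"- a", "  - b", "* c", "plain text"}))
}
```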
Displayed in the link text:
- with numbering=true: number + title
- with numbering=false: title only
Link target when link=true:
- with anchor=true: # + anchor_id
- with anchor=false and numbering=false: # + slug(title_source)
- with anchor=false and numbering=true: # + slug(number + " " + title_source)
- collision handling follows the same per-document slugger behavior described in section 11
Behavior of anchor:
- with anchor=true, mdtoc renders a managed inline anchor
- with anchor=false, mdtoc does not render a managed inline anchor; ToC links target the renderer-derived heading ID based on the rendered heading text
- anchor does not select the slug algorithm
Behavior of slug:
- slug=github uses the GitHub-compatible rules in section 11.3
- slug=gitlab uses the GitLab rules in section 11.7
- slug=crossnote uses the Crossnote / Markdown Preview Enhanced rules in section 11.8
- For slug=github and slug=gitlab, title_source is title_text.
- For slug=crossnote, title_source is title_markup.
- ATX closing hash markers are stripped from title_text, but remain visible to the Crossnote title_markup slug path. Therefore ## An ATX title with closing hash markers #### targets #an-atx-title-with-closing-hash-markers-- when slug=crossnote, anchor=false, and link=true.
Explanation:
- anchor=false is therefore renderer-dependent because there is no managed inline anchor to pin the target.
- Fully portable and renderer-independent ToC links are guaranteed only with anchor=true.
Cross-reference:
- The actual norm for anchor_id appears exclusively in section 11.
- Section 10 only describes the use of the already computed ID in the ToC.
11. Slug and anchor ID specification
slug selects the algorithm used for managed inline anchor IDs and generated ToC link targets.
Inline anchor IDs are deterministically derived from the unnumbered title. ToC targets with anchor=false are derived from the rendered heading source, including a managed number prefix when numbering is enabled.
11.1 Goal
The generated values should be:
- stable
- deterministic
- readable
- compatible with the selected renderer/profile
- identical for inline anchors and ToC links when anchor=true
11.2 Input for the derivation
For every managed heading, the following applies:
slug_source := title_text // github, gitlab
slug_source := title_markup // crossnote
anchor_id := slugify(slug_source)
The following also applies:
- title_text is not the raw title string from the line.
- title_text is the plain-text interpretation of title_markup.
- Managed numbering and the managed inline anchor are not part of title_text.
- ATX closing hash markers are stripped when deriving title_text.
- The current implementation derives title_text with these inline rules:
  - backtick code spans contribute only their visible content
  - Markdown links and images contribute only their visible label or alt text
  - HTML tags are removed
  - inline formatting markers *, _, and ~ are removed
  - whitespace is collapsed to single spaces and trimmed at the ends
Explanation:
- The current implementation intentionally keeps this extraction logic small and self-contained.
- Only this keeps slug/anchor generation, ToC link text, and profile behavior consistent.
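The inline rules listed in this section could be approximated with a few regular expressions, as in the Go sketch below. This is not the implementation's extractor: titleText and the pattern names are hypothetical, only inline links and images of the form [label](target) are handled, and formatting characters inside code spans are stripped as well. Keeping an underscore between two word characters is an assumption drawn from the foo_bar example in section 11.7.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
	"unicode"
)

var (
	closingHashes = regexp.MustCompile(`\s+#+\s*$`)
	imageOrLink   = regexp.MustCompile(`!?\[([^\]]*)\]\([^)]*\)`)
	codeSpan      = regexp.MustCompile("`([^`]*)`")
	htmlTag       = regexp.MustCompile(`<[^>]+>`)
)

func isWord(r rune) bool { return unicode.IsLetter(r) || unicode.IsDigit(r) }

// titleText derives a plain-text title from title_markup.
func titleText(titleMarkup string) string {
	s := closingHashes.ReplaceAllString(titleMarkup, "") // ATX closing hashes
	s = imageOrLink.ReplaceAllString(s, "$1")            // keep label or alt text
	s = codeSpan.ReplaceAllString(s, "$1")               // keep code span content
	s = htmlTag.ReplaceAllString(s, "")                  // drop HTML tags
	runes := []rune(s)
	var b strings.Builder
	for i, r := range runes {
		switch r {
		case '*', '~':
			continue // formatting markers
		case '_':
			// Assumption: an underscore inside a word is text, not emphasis.
			intraWord := i > 0 && i+1 < len(runes) && isWord(runes[i-1]) && isWord(runes[i+1])
			if !intraWord {
				continue
			}
		}
		b.WriteRune(r)
	}
	return strings.Join(strings.Fields(b.String()), " ") // collapse and trim
}

func main() {
	fmt.Println(titleText("This'll be a _Helpful_ Section"))
	fmt.Println(titleText("[API](https://example.com) `code` <b>bold</b> ##"))
}
```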
11.3 GitHub-compatible basic rules
The function slugify MUST perform at least these steps:
- Input is title_text.
- Letters are converted to lowercase using Unicode lowercasing.
- Markdown formatting characters and inline markup do not contribute literal characters to the slug; only their visible text content counts.
- Unicode letters and Unicode decimal digits are preserved.
- Runs of whitespace and punctuation between preserved text parts are normalized to exactly one -.
- Leading and trailing runs of whitespace or punctuation do not create a leading or trailing -.
- If the resulting slug already exists in the same document, -1, -2, -3, … is appended.
Interpretation:
- These rules follow GitHub’s documented basic rules in a form that is explicitly testable for mdtoc.
- For edge cases not documented there, mdtoc makes additional decisions in the following subsections.
11.4 Explicit decisions for edge cases
Additionally, the following applies in mdtoc v1:
- Symbols, emojis, and other non-letter/non-digit characters are removed.
- Runs of whitespace/punctuation are not rendered as multiple --, ---, etc., but collapsed to exactly one -.
- Collision resolution starts at the second occurrence with -1.
- If the normalized slug becomes empty, mdtoc uses the fallback section.
- Further collisions on this fallback are resolved as section-1, section-2, …
Explanation:
- The fallback section is a deliberate mdtoc decision.
- GitHub’s public basic rules do not explicitly describe this empty-slug edge case.
11.5 Relationship to inline anchor syntax
If anchor=true, mdtoc renders exactly this form:
<a id="anchor_id"></a>
The following applies:
- the string in id="..." MUST exactly match the anchor_id computed according to this section
- with anchor=true, anchor_id and the ToC link target are therefore the same string
Cross-reference:
- Section 5 only defines the position and render format of the inline anchor.
- The string inside id="..." is normalized exclusively here in section 11.
11.6 Examples
Example 1
### Open source
→
open-source
Example 2
### This'll be a _Helpful_ Section About the Greek Letter Θ!
→
thisll-be-a-helpful-section-about-the-greek-letter-θ
Example 3
### Übergrößenträger & naïve façade – déjà vu!
→
übergrößenträger-naïve-façade-déjà-vu
Example 4
### 中文 русский عربى
→
中文-русский-عربى
Example 5
### 🚀 !!!
→
section
Example 6
Two identical headings ### API result in:
api
api-1
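The following Go sketch shows one way to reproduce several of the examples above with the GitHub-compatible profile. It is an interpretation, not the reference implementation. slugifyGitHub and slugger are hypothetical names, and three choices are assumptions: apostrophes are dropped without creating a separator (inferred from Example 2), ASCII symbols such as + count as punctuation (inferred from the A+B example in section 11.7), and non-ASCII symbols, emojis, and combining marks are removed.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// slugifyGitHub expects title_text as input and returns the slug before
// collision handling.
func slugifyGitHub(titleText string) string {
	var b strings.Builder
	pendingSep := false
	for _, r := range strings.ToLower(titleText) {
		switch {
		case unicode.IsLetter(r) || unicode.IsDigit(r):
			if pendingSep && b.Len() > 0 {
				b.WriteByte('-') // one separator per run, never leading
			}
			pendingSep = false
			b.WriteRune(r)
		case r == '\'' || r == '’':
			// Assumption from Example 2: apostrophes vanish entirely.
		case unicode.IsSpace(r) || unicode.IsPunct(r) || r <= unicode.MaxASCII:
			pendingSep = true // whitespace, punctuation, ASCII symbols
		default:
			// Non-ASCII symbols and emojis are removed.
		}
	}
	if b.Len() == 0 {
		return "section"
	}
	return b.String()
}

// slugger resolves collisions within one document.
type slugger struct{ seen map[string]int }

func newSlugger() *slugger { return &slugger{seen: map[string]int{}} }

// unique appends -1, -2, ... starting at the second occurrence.
func (s *slugger) unique(base string) string {
	candidate := base
	for {
		if _, taken := s.seen[candidate]; !taken {
			break
		}
		s.seen[base]++
		candidate = fmt.Sprintf("%s-%d", base, s.seen[base])
	}
	s.seen[candidate] = 0
	return candidate
}

func main() {
	s := newSlugger()
	for _, title := range []string{"Open source", "API", "API", "🚀 !!!"} {
		fmt.Println(s.unique(slugifyGitHub(title)))
	}
}
```

The loop in unique also skips a suffix that is already taken by a literal heading, so a heading titled API 1 and a second API heading would not end up with the same ID.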
11.7 GitLab slug profile
If slug=gitlab, mdtoc MUST derive IDs according to the GitLab heading-ID rules documented for GLFM.
The GitLab profile applies these steps:
- Input is title_text.
- All text is converted to lowercase.
- All non-word text is removed.
- Spaces are converted to -.
- Two or more adjacent hyphens are collapsed to one.
- If the resulting ID already exists in the same document, -1, -2, -3, … is appended.
For mdtoc, the GitLab profile is interpreted as follows:
- Unicode letters and Unicode decimal digits are preserved.
- _ is preserved as part of a word.
- Existing - characters are preserved and then normalized by the hyphen-collapse step.
- Punctuation between preserved text parts is removed, not converted to a separator.
- Leading and trailing hyphens are trimmed after normalization.
- If the normalized ID becomes empty, mdtoc uses the fallback section.
The GitLab profile therefore differs from the GitHub-compatible profile in important edge cases:
- 3.5 becomes 35 in GitLab mode, but 3-5 in GitHub mode.
- A+B becomes ab in GitLab mode, but a-b in GitHub mode.
- foo_bar stays foo_bar in GitLab mode, but becomes foo-bar in GitHub mode.
Examples:
## Version 3.5
## A+B
## foo_bar baz
In GitLab mode, these headings yield:
version-35
ab
foo_bar-baz
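A compact Go sketch of the GitLab profile as interpreted in this section is shown below. slugifyGitLab is a hypothetical name, the input is assumed to be title_text, and collision handling is left out because it follows the same per-document scheme as the other profiles.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
	"unicode"
)

var hyphenRun = regexp.MustCompile(`-{2,}`)

// slugifyGitLab keeps letters, digits, "_" and "-", turns whitespace into
// "-", removes everything else, then collapses and trims hyphens.
func slugifyGitLab(titleText string) string {
	var b strings.Builder
	for _, r := range strings.ToLower(titleText) {
		switch {
		case unicode.IsLetter(r) || unicode.IsDigit(r) || r == '_' || r == '-':
			b.WriteRune(r)
		case unicode.IsSpace(r):
			b.WriteByte('-')
		}
		// Any other rune (punctuation, symbols) is dropped without a separator.
	}
	slug := strings.Trim(hyphenRun.ReplaceAllString(b.String(), "-"), "-")
	if slug == "" {
		return "section"
	}
	return slug
}

func main() {
	for _, title := range []string{"Version 3.5", "A+B", "foo_bar baz"} {
		fmt.Println(slugifyGitLab(title))
	}
}
```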
11.8 Crossnote / Markdown Preview Enhanced slug profile
If slug=crossnote, mdtoc derives IDs with the Crossnote / Markdown Preview Enhanced style used by the github-slugger plus uslug pipeline.
For mdtoc, this profile is interpreted as follows:
- The input is title_markup, not title_text.
- Text is trimmed and lowercased.
- ~ and 。 are removed before slugging.
- Whitespace is converted to a temporary separator before punctuation stripping.
- Unicode letters, Unicode decimal digits, combining marks, _, -, and the temporary separator are preserved.
- Other punctuation is removed.
- Repeated - characters are collapsed.
- The temporary separator is rendered as -.
- If the normalized ID becomes empty, mdtoc uses the fallback section.
- Collision handling appends -1, -2, -3, … starting at the second occurrence.
Important examples:
## 1.1. API
## An ATX title with closing hash markers ####
In Crossnote mode, these headings yield:
11-api
an-atx-title-with-closing-hash-markers--
12. Error behavior, logging, and exit codes
Error cases:
- missing or incomplete mdtoc container
- invalid config block
- parsing error
- invalid options
Basic rules:
- Errors are written to stderr.
- On errors, there is no implicit repair except for the explicitly allowed creation of a new container by generate when no mdtoc management exists yet.
- Successful commands write no status messages to stdout.
Recommended exit codes:
- 0 -> success
- 1 -> parsing, config, or CLI error
- 2 -> check found a mismatch
13. Idempotence
Idempotence is part of the contract.
Examples:
mdtoc generate
mdtoc generate
=> no further change on the second run
mdtoc strip
mdtoc strip
=> no further change on the second run
mdtoc strip --raw
mdtoc strip --raw
=> no further change on the second run
Cross-reference:
- Idempotence is already defined in section 1 as a core principle and in section 9 as command semantics.
- Section 13 intentionally repeats the contract once again in test form.
14. Extensibility
Possible later extensions:
- alternative anchor styles
- alternative ToC formats
- versioning in the config block
- additional output formats
Note: These points are explicitly extensions. They should not make v1 unnecessarily complex.
15. Current implementation basis (informative)
The current Go implementation is intentionally self-contained.
Current basis:
- a line-oriented parser for
  - the managed container
  - fenced code blocks
  - generic HTML comments
  - exclusion regions
  - heading candidates
- a small inline-text extractor for deriving title_text
- an internal slugger implementation for the GitHub, GitLab, and Crossnote slug profiles
Current implementation notes:
- heading recognition is intentionally restricted to the explicit ATX subset from section 5
- the normative slug and anchor rules from section 11 are implemented directly in mdtoc
- no external Markdown renderer is the normative source of anchor_id
- the current code does not require a full Markdown AST to preserve the documented behavior
Alternative implementations:
- Another implementation MAY use a Markdown parser library internally.
- Such an implementation MUST still preserve the external behavior defined by this specification.
- In particular, title_text, slug generation, collision handling, ignored regions, and marker/config handling must remain behavior-compatible with the current implementation.
Explanation:
- The current implementation favors a narrow, deterministic parser over a full Markdown dependency.
- The actual domain logic of mdtoc remains small and explicit: finding containers, normalizing managed headings, deriving numbers and IDs, and rendering deterministically.