Monolith
An OCaml library for producing a single, monolithic Markdown document by intelligently parsing, detecting, and inlining linked Markdown files.
This library provides AST-based processing using the Cmarkit library to parse and traverse Markdown documents. It can automatically detect Table of Contents (TOC) sections and recursively inline referenced files, making it ideal for generating comprehensive documentation from distributed source files.
dedupe is enabled
allow_remote is disabled
dedupe is enabled, the library does not attempt to intelligently resolve or restructure circular references
max_depth, which may leave some nested files un-inlined if the depth limit is too restrictive
dedupe is true), subsequent references create placeholder text and a backreference to the original inlined file, rather than duplicating content
open Markdown_monolith
let () =
  (* Create custom configuration *)
  let config = {
    default_config with
    allow_remote = false; (* Keep remote fetching disabled for security *)
    max_depth = 5;        (* Limit recursion depth *)
    dedupe = true;        (* Prevent duplicate inlining *)
  } in
  (* Process a file *)
  match monolith_of_file ~config "index.md" with
  | Ok doc ->
    (* Render the resulting Cmarkit AST back to CommonMark text *)
    let output = Cmarkit_commonmark.of_doc doc in
    print_endline output
  | Error msg ->
    prerr_endline ("Error: " ^ msg)
type config = {
allow_remote : bool;
If true, follow and inline remote links (HTTP/HTTPS URLs).
Default: false
Warning: Enabling this option may introduce security risks and excessive network usage. Remote content is fetched synchronously and there are no built-in rate limits or timeouts beyond the HTTP client defaults. Only enable this if you trust the remote sources.
Behavior: When false, attempting to inline a remote link will result in an error or the link being skipped.
max_depth : int;
Maximum recursion depth for inlining files.
Default: 10
Constraint: Must be positive. Setting this to a very high value may cause stack overflow or excessive processing time.
Behavior: When the depth limit is reached during recursion, the processing fails with an error indicating the maximum depth was exceeded.
Example: A max_depth of 3 allows the root document (depth 0) to inline files (depth 1), which can inline files (depth 2), which can inline files (depth 3), but no deeper.
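The depth accounting in the example above can be illustrated with a small, self-contained sketch. It does not use the library: the `links` table and the `inline` function are hypothetical stand-ins, and the exact error message is an assumption.

```ocaml
(* Hypothetical in-memory link graph: each document maps to the documents it links to. *)
let links = [
  "root.md", ["a.md"];
  "a.md", ["b.md"];
  "b.md", ["c.md"];
  "c.md", ["d.md"];
  "d.md", [];
]

(* Visit [path] at [depth]; fail as soon as [depth] exceeds [max_depth].
   On success, return every visited document paired with its depth. *)
let rec inline ~max_depth ~depth path =
  if depth > max_depth then
    Error (Printf.sprintf "maximum depth %d exceeded at %s" max_depth path)
  else
    let children = Option.value (List.assoc_opt path links) ~default:[] in
    List.fold_left
      (fun acc child ->
         match acc with
         | Error _ as e -> e
         | Ok visited ->
           (match inline ~max_depth ~depth:(depth + 1) child with
            | Ok sub -> Ok (visited @ sub)
            | Error _ as e -> e))
      (Ok [ (path, depth) ])
      children
```

With this chain of five documents, `inline ~max_depth:4 ~depth:0 "root.md"` succeeds, while `~max_depth:3` returns an `Error` because `d.md` would sit at depth 4.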
dedupe : bool;
If true, do not inline the same file more than once.
Default: true
Behavior: When enabled, maintains an internal map of previously inlined file paths. If a file is encountered again, instead of inlining it a second time, a placeholder text "Duplicate Reference to: link" is inserted.
Use Case: Essential for preventing infinite loops in documents with circular references or shared sub-documents referenced from multiple places.
Trade-off: While this prevents duplication, it means repeated content must be accessed via anchor links in the final document rather than being present at multiple locations.
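A minimal sketch of the deduplication idea, assuming a set of already-seen paths and the placeholder wording quoted above. The `docs` table and `inline_all` are hypothetical and only model the behavior; they are not the library's internals.

```ocaml
(* Hypothetical documents: path -> (own text, linked paths).
   Note the circular reference from b.md back to index.md. *)
let docs = [
  "index.md", ("Index", ["a.md"; "b.md"]);
  "a.md", ("A", ["shared.md"]);
  "b.md", ("B", ["shared.md"; "index.md"]);
  "shared.md", ("Shared", []);
]

(* Inline everything reachable from [root], emitting a placeholder
   instead of re-inlining a path that was already seen. *)
let inline_all root =
  let seen = Hashtbl.create 16 in
  let rec go path =
    if Hashtbl.mem seen path then
      [ "Duplicate Reference to: " ^ path ]
    else begin
      Hashtbl.add seen path ();
      match List.assoc_opt path docs with
      | None -> []
      | Some (text, links) -> text :: List.concat_map go links
    end
  in
  go root
```

Because a path is recorded before its links are followed, the cycle through `index.md` terminates: the second visit yields a placeholder rather than recursing again.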
strict_commonmark : bool;
If true, enforce strict CommonMark parsing rules.
Default: false
Behavior: Passed directly to the Cmarkit parser. When true, the parser follows the CommonMark specification strictly. When false, some common Markdown extensions may be accepted.
Recommendation: Use true for maximum compatibility and predictable parsing behavior. Refer to the Cmarkit documentation and the CommonMark specification for details.
add_newlines : bool;
If true, add blank lines between inlined content blocks.
Default: true
Behavior: When a file is inlined, this option controls whether a blank line is inserted after the inlined content. This helps visually separate inlined sections in the final output.
Recommendation: Keep enabled for better readability of the generated monolithic document.
force_reconciliation : bool;
If true, require link reconciliation and fail when no header is found to reconcile against. If false, print a warning and proceed without reconciliation.
Default: false
Behavior: When enabled, documents that lack a top-level header will cause the processing to fail with an error during inlining. When disabled, a warning is printed to stderr, and the inlining proceeds without link reconciliation.
Recommendation: Enable this option only if consistent link reconciliation is critical for your use case and you want to enforce that all inlined documents have proper headers.
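The fail-or-warn policy described above can be sketched as follows. This is only an illustration under stated assumptions: `top_level_header` and `reconcile_target` are hypothetical names, the header check is a naive scan for a line starting with `# ` (the library works on the Cmarkit AST instead), and the messages are invented.

```ocaml
(* Find the first ATX top-level header ("# Title") in [text], if any.
   Naive line-based scan, used here only to keep the sketch self-contained. *)
let top_level_header text =
  String.split_on_char '\n' text
  |> List.find_map (fun line ->
       if String.length line > 2 && String.sub line 0 2 = "# " then
         Some (String.trim (String.sub line 2 (String.length line - 2)))
       else None)

(* Decide what to do with an inlined document depending on the flag:
   - header found: reconcile against it;
   - no header and flag set: fail;
   - no header and flag unset: warn on stderr and continue. *)
let reconcile_target ~force_reconciliation ~path text =
  match top_level_header text with
  | Some title -> Ok (Some title)
  | None when force_reconciliation ->
    Error (path ^ ": no top-level header to reconcile links against")
  | None ->
    prerr_endline
      ("Warning: " ^ path ^ " has no top-level header; skipping link reconciliation");
    Ok None
```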
}
val default_config : config
The default configuration with safe, conservative settings.
Default values:
allow_remote = false: Remote fetching disabled for security
max_depth = 10: Reasonable recursion limit
dedupe = true: Prevent infinite loops
strict_commonmark = false: Accept common Markdown extensions
add_newlines = true: Better visual separation
force_reconciliation = false: Do not force link reconciliation
bullet_ish_prefix prefix determines whether a string looks like a list item prefix. This is important for distinguishing links that are part of lists (e.g., TOCs) from links in regular paragraphs.
Returns true if prefix matches one of the following patterns:
*, -, +
1., 1), 1.2., 1.2.3), etc.
Returns false otherwise.
Whitespace: The function automatically trims leading and trailing whitespace from prefix before matching.
Example Usage:
bullet_ish_prefix " * " ;; (* true *)
bullet_ish_prefix "1." ;; (* true *)
bullet_ish_prefix "1.2.3" ;; (* true *)
bullet_ish_prefix "abc" ;; (* false *)
bullet_ish_prefix "" ;; (* false *)
Critical for Inlining: This function determines which links are eligible for inlining. Only links that appear in paragraphs with bullet-ish prefixes (within list items) are considered for recursive inlining.
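To make the matching rules concrete, here is a rough re-implementation that agrees with the documented examples. It is a sketch, not the library's code: `looks_bullet_ish` is a hypothetical name, and treating a bare number such as "1" as a match is an assumption extrapolated from the "1.2.3" example.

```ocaml
let is_digit c = c >= '0' && c <= '9'

(* True when [g] is a non-empty run of ASCII digits. *)
let all_digits g =
  g <> ""
  && (let ok = ref true in
      String.iter (fun c -> if not (is_digit c) then ok := false) g;
      !ok)

(* Hypothetical stand-in for bullet_ish_prefix: trim, then accept a single
   bullet character or dot-separated digit groups with an optional
   trailing '.' or ')'. *)
let looks_bullet_ish prefix =
  match String.trim prefix with
  | "*" | "-" | "+" -> true
  | "" -> false
  | s ->
    let n = String.length s in
    (* Drop one trailing '.' or ')' if present. *)
    let body =
      match s.[n - 1] with
      | '.' | ')' -> String.sub s 0 (n - 1)
      | _ -> s
    in
    List.for_all all_digits (String.split_on_char '.' body)
```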
val monolith_of_file :
  ?config:config ->
  string ->
  (Cmarkit.Doc.t, string) Stdlib.result
monolith_of_file ?config filepath produces a monolithic Markdown document.
Starting from the file at filepath, this function:
#header)
Link Detection: The function only processes links that appear in list items with bullet-like prefixes. Links in regular paragraphs are not processed. See bullet_ish_prefix for supported patterns.
Path Resolution: File paths are resolved relative to the current document being processed. Absolute paths and URLs are supported based on configuration.
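One way to picture this resolution rule, using only the standard Filename module. The function names are hypothetical, `#header` fragments are ignored for brevity, and returning an error for a disallowed remote link is an assumption (the library may skip such links instead).

```ocaml
let starts_with ~prefix s =
  String.length s >= String.length prefix
  && String.sub s 0 (String.length prefix) = prefix

(* A link is treated as remote when it is an HTTP or HTTPS URL. *)
let is_remote target =
  starts_with ~prefix:"http://" target || starts_with ~prefix:"https://" target

(* Resolve [target] as it appears inside the document at [current]:
   - remote URLs pass through only when [allow_remote] is set;
   - relative paths are joined onto the directory of [current];
   - absolute paths are returned unchanged. *)
let resolve ~allow_remote ~current target =
  if is_remote target then
    if allow_remote then Ok target
    else Error ("remote link not allowed: " ^ target)
  else if Filename.is_relative target then
    Ok (Filename.concat (Filename.dirname current) target)
  else Ok target
```

Resolving against the directory of the current document, rather than the process working directory, is what lets a nested file link to its own neighbors with short relative paths.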
Return Value: Returns a Cmarkit AST (Cmarkit.Doc.t) on success, or an error message on failure. To convert the AST to a string, use:
match monolith_of_file "index.md" with
| Ok doc -> Cmarkit_commonmark.of_doc doc
| Error msg -> "Error: " ^ msg
Error Conditions:
config.max_depth)
allow_remote = false)
Performance Considerations:
max_depth) may consume significant stack space
Side Effects:
allow_remote = true)