path: root/blog/content/notes/tech/document-formats.gmi
Diffstat (limited to 'blog/content/notes/tech/document-formats.gmi')
-rw-r--r-- blog/content/notes/tech/document-formats.gmi 97
1 files changed, 97 insertions, 0 deletions
diff --git a/blog/content/notes/tech/document-formats.gmi b/blog/content/notes/tech/document-formats.gmi
new file mode 100644
index 00000000..385c0c0e
--- /dev/null
+++ b/blog/content/notes/tech/document-formats.gmi
@@ -0,0 +1,97 @@
+# Document formats
+
+Most of the time, when I write a document, I want a format with the following properties:
+
+* Fast to write using a plain text editor
+* Easy to parse into an AST
+
+An AST (abstract syntax tree) is a programming-friendly representation of a document. ASTs reduce the effort required to write tools such as a program that validates the links in a document. Ideally, an AST records enough information to trace each document element back to its position in the original document. With this information, a tool such as a spell checker can highlight misspelled words precisely in the original document.
+
+On top of that, there are some features that I only need occasionally:
+
+* Math support
+* Sophisticated code blocks. For example, being able to highlight arbitrary parts of code blocks (not syntax highlighting).
+* Diagram support
+
+## Existing formats
+
+### Markdown
+
+* Easy to write using a plain text editor
+* Has good AST parsers with position information
+* Has math support
+* Does not support sophisticated code blocks
+* Has many extensions that add support for math, diagrams, and more
+* Is very popular and supported everywhere
+* However, there are many variants, each with its own quirks
+* Specifically, Markdown was not designed with parsing in mind, so tools based on different parsers can behave differently
+
+### Djot
+
+=> https://djot.net
+
+It is very similar to Markdown, except:
+
+* It is designed for parsing, so independent parser implementations are highly compatible with each other
+* It is less popular, so there are fewer extensions and less tool support
+
+### AsciiDoc
+
+=> https://asciidoc.org
+
+Compared to Markdown:
+
+* It's more complex to write, but mostly because it's different and more powerful
+* There are attempts to write better parsers, but good parsers with position information are not available yet
+* Supports sophisticated code blocks
+* It has a smaller ecosystem than Markdown, but it includes many good-quality tools, such as Antora
+
+### Typst
+
+=> https://typst.app
+
+Checks all my boxes, except:
+
+* It is designed for parsing and has an AST, but the AST is not easy to access
+* Currently, Typst is strongly oriented toward generating paged documents (e.g. PDF)
+* It includes a full programming language, which is mostly good (very extensible), but it can add unwanted complexity
+
+Typst is very new and is not yet very popular.
+
+=> https://codeberg.org/haydn/typesetter Typesetter is a desktop application that embeds Typst, so no additional setup is needed. However, Typesetter is only available as a Flatpak.
+
+### Verso
+
+=> https://github.com/leanprover/verso
+
+A Markdown-like format closely tied to the Lean programming language.
+
+* Eliminates ambiguous syntax for easier parsing and is stricter (not all text is valid Verso)
+* Has a (Lean) data model
+* Designed for extensibility
+
+### TODO: other formats
+
+=> https://github.com/nota-lang/nota Nota (a document language for the browser)
+=> https://github.com/christianvoigt/argdown Argdown (for argumentation)
+=> https://github.com/podlite/podlite Podlite
+=> https://orgmode.org Org Mode (an Emacs-based tool based on a lightweight markup language)
+=> https://github.com/nvim-neorg Neorg (similar to Org Mode for Neovim)
+=> https://github.com/sile-typesetter/sile Sile (typesetting system)
+
+## Creating your own formats
+
+=> https://github.com/spc476/MOPML Someone created their own lightweight format using Lua and PEGs.
+=> https://tratt.net/laurie/blog/2020/which_parsing_approach.html "Which Parsing Approach?" discusses how to choose a parsing approach.
+
+## About gemtext
+
+=> https://geminiprotocol.net/docs/gemtext-specification.gmi
+
+Gemtext is an extremely minimalistic markup language designed for use with the Gemini protocol (an equally minimalistic protocol similar to HTTP).
+
+The Gemini protocol and gemtext are intentionally limited in power; in my opinion, this is a comment on the web.
+
+This document is written natively in gemtext for use in my own minimalistic publishing system.
+
+I also use it as a statement, although the limitations of gemtext can be significant in technical writing. For example, gemtext has no inline links, no inline verbatim code, and only three levels of headings.