html-is-a-tree
I use a static site generator to build this site, but I keep hitting its limitations.
Introduction
A long while ago, I used clojure for some web development and used
hiccup-style html templates for the first time. For the uninitiated, clojure
is a lisp dialect, and hiccup writes HTML as nested clojure vectors. This makes it possible
to directly represent tree structures. For example, here is an HTML document:
[:html
 [:head
  [:link { :rel "stylesheet" :href "/style.css" }]]
 [:body
  [:article
   [:h1 "Hello world!"]
   [:p "This clojure data structure directly encodes an HTML document."]]]]
One can do more than just write static documents this way, as it becomes very easy to directly inject new structure using code:
(let [posts (load-documents "posts/**/*.md")]
  [:html
   [:head
    [:link { :rel "stylesheet" :href "/style.css" }]]
   [:body
    [:article
     [:h1 "Hello world!"]
     [:h2 "Read my articles"]
     [:ol (map
            (fn [post] [:li (post-summary post)])
            posts)]]]])
Since the whole document itself is just data, it becomes easy to translate from one form to another. Want to iterate every heading in the doc and make a table of contents? Easy. Want to collect every link, check if it has metadata, and make a list of citations at the bottom of your page? Easy.
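As a rough sketch of the table-of-contents idea, here is how it might look in janet (the lisp this post turns to further down). The names heading-levels, collect-headings and toc are made up for illustration, and the sketch assumes headings only contain plain strings; nested markup inside a heading is ignored.

```janet
# Hypothetical helper, not from any library: map heading tags to levels.
(def heading-levels {:h1 1 :h2 2 :h3 3 :h4 4 :h5 5 :h6 6})

(defn collect-headings
  "Walk a hiccup-style tree and return an array of [level text] pairs."
  [node &opt acc]
  (default acc @[])
  (when (and (indexed? node) (not (empty? node)))
    (def level (get heading-levels (first node)))
    (def children (slice node 1))
    (if level
      # A heading: keep its level and the text of its string children.
      (array/push acc [level (string/join (filter string? children))])
      # Anything else: keep looking inside its children.
      (each child children (collect-headings child acc))))
  acc)

(defn toc
  "Build an ordered list of the headings found in `tree`."
  [tree]
  (tuple :ol ;(map (fn [[_ text]] (tuple :li text)) (collect-headings tree))))

(def doc
  [:html
   [:body
    [:article
     [:h1 "Hello world!"]
     [:h2 "Read my articles"]
     [:p "Some text."]]]])

(pp (toc doc))
```

Because the table of contents is itself just another tuple, it can be spliced back into the document before anything is rendered.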
However, I never found a static site generator that fully embraced this. I never looked that hard, especially since I had zola and was satisfied, and I knew it would take a lot of work to migrate.
I also had some big concerns which nagged away at me:
scss - I use scss to write my stylesheets, and didn't want to take on the huge undertaking of translating them to raw css if that turned out to be necessary.
markdown - I write my articles with markdown and translate that to HTML, but I also use a lot of HTML and shortcodes embedded within my markdown documents.
clojure - clojure (or guile scheme) feels like a heavy dependency, and I didn't want to deal with complicated language setup in addition to complicated dependency management for the site generator itself. A single binary that I can easily compile or install from my package manager is very, very hard to beat.
Interestingly, each of these concerns has diminished over time. This has me evaluating my options again.
Just Use CSS
I found an article called You no longer need JavaScript which showed just how far CSS had come. I already make heavy use of CSS variables, which largely eliminates the need for scss variables, but I still rely quite heavily on scss nesting syntax.
However, nesting is now supported in standard CSS!
I really could just write this site in plain CSS if I wanted to now, and I
don't expect I would lose much of anything. I use scss @mixins in a couple of
places, but I think it would be possible to use a simpler solution.
Limitations of Markdown
While text makes up the majority of my documents, I keep finding that I need nice ways to express structural data and freely mix structured data and text.
Markdown doesn't do this well; I can write HTML inside my markdown documents to escape the limitations, but I frequently run into issues.
Just one recent example: If I leave extraneous whitespace in some parts of my document, the markdown parser will read code as a preformatted text block instead of HTML or some other kind of data (due to the 4 leading space rule).
If I put the following inside my markdown document:
The nested p element inside the figcaption has 4 leading spaces and a blank
line in front of it, making it a code block, resulting in the following html:
This is easy to avoid when writing the document directly, but quite difficult to avoid with shortcodes. Sometimes, they have non-trivial template statements inside them. If I fail to use the proper whitespace-removal syntax, the output of a shortcode may trigger a 4-space codeblock.
Shortcodes are not like lisp macros. They do not operate on the AST of the document; they produce a
soup of letters.
Do I need markdown?
The thing is, while I do need a way to write mostly-text documents conveniently, I don't necessarily need to use markdown. What I really want is some way to transform a lightweight document into some sort of tree structure, and then perhaps transform that tree structure into HTML.
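If I have a document like this:
# Hobbies

My current hobbies include:

- 3D art
- live coding
- sewing & embroidery
- blogging

But in the past I have enjoyed pottery, bookbinding, and sculpture.
I am always working on projects and endeavor to write about them
from time to time.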
I would like to transform it to something like:
[:section
 [:h1 "Hobbies"]
 [:p "My current hobbies include:"]
 [:ul
  [:li "3D art"]
  [:li "live coding"]
  [:li "sewing & embroidery"]
  [:li "blogging"]]
 [:p "But in the past I have enjoyed pottery, bookbinding, and sculpture.
I am always working on projects and endeavor to write about them
from time to time."]]
And then process that somehow, perhaps rendering it to HTML, perhaps not. Ideally, the scripting would operate on the level of AST nodes.
A lisp-y scripting language feels like the perfect fit for something like this.
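To make "operating on the level of AST nodes" a bit more concrete, here is a minimal sketch in janet (the language this post ends up with below). Everything in it is hypothetical: transform, slug and add-heading-id are names I made up, and the only rewrite shown is giving an h1 with a single string child an id attribute.

```janet
# Hypothetical helpers for illustration; none of these names come from a library.
(defn transform
  "Rebuild `node` bottom-up, calling `f` on every tuple or array in the tree."
  [f node]
  (if (indexed? node)
    (f (tuple ;(map (fn [child] (transform f child)) node)))
    node))

(defn slug
  "Lower-case `text` and replace spaces with dashes."
  [text]
  (string/replace-all " " "-" (string/ascii-lower text)))

(defn add-heading-id
  "Turn [:h1 \"Text\"] into [:h1 {:id \"text\"} \"Text\"]; return other nodes unchanged."
  [node]
  (def [tag text] node)
  (if (and (= tag :h1) (string? text))
    (tuple tag {:id (slug text)} text)
    node))

(def section
  [:section
   [:h1 "Hobbies"]
   [:p "My current hobbies include:"]])

(pp (transform add-heading-id section))
```

The rewrite function only ever sees whole nodes, never half-rendered text, which is exactly what shortcodes cannot offer.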
And, as it turns out, people have already experimented with writing mostly-text documents with lisp code inside; for example, skribe! It looks something like this:
(p [Everything inside square brackets is text, but one can inject code by using
lisp's ,(code "unquote") form: 1+4=,(+ 1 4)])
This is missing some of the niceties of markdown, but I find the idea of using
something like ,() to inject code which outputs new syntax nodes, represented
as lisp data structures, compelling!
Enter Janet
At this point, the ideas were milling about in my head already and zola's limitations were getting harder to deal with. I found haunt, which seemed to do what I wanted, but I wasn't able to install it so I gave up.
Then I found janet! It was super easy to install, and it takes a lot of
inspiration from clojure, which is good since I've used clojure before. I got
hacking away and was able to convert tuples to HTML in a day!
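# Assumption: the post does not show html/void-tags. This set of HTML void
# elements is filled in here so that the snippet runs on its own.
(def html/void-tags
  {:area true :base true :br true :col true :embed true :hr true :img true
   :input true :link true :meta true :source true :track true :wbr true})

(defn html/void-tag? [tag] (in html/void-tags tag))

# Each element starts on a new line, indented by its nesting depth.
(def- tab " ")
(defn- align [i] (string "\n" (string/repeat tab i)))

(defn html/attribute [key value] (string/format " %s=\"%s\"" key value))

# Concatenate every key/value pair of dict into one attribute string.
(defn html/attributes [dict]
  (reduce
    (fn [i [k v]] (string i (html/attribute k v)))
    ""
    (pairs dict)))

(defn html/void [indent name attr]
  (string (align indent) "<" name (html/attributes attr) " />"))

(defn html/open [indent name attr]
  (string (align indent) "<" name (html/attributes attr) ">"))

(defn html/close [indent name]
  (string (align indent) "</" name ">"))

# Render a tuple of the form [tag attr? & children] recursively.
(defn html/render [indent el]
  (let [element (fn [tag attr children]
                  (cond
                    (and (not (empty? children)) (html/void-tag? tag))
                    (error (string "Void element \":" tag "\" declared with children"))

                    (in html/void-tags tag) (html/void indent tag attr)

                    (string
                      (html/open indent tag attr)
                      ;(map (fn [child] (html/render (+ 1 indent) child)) children)
                      (html/close indent tag))))]
    (match el
      # The second item is treated as attributes only when it is a struct or table.
      ([tag attr & children] (dictionary? attr))
      (element tag attr children)
      [tag & children]
      (element tag {} children)
      [tag]
      (element tag {} []))))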
It's so nice! And it was not too painful to write. This:
(print
  (html/render 0
    '[:html
      [:head [:meta {}]]
      [:body {:class "name"}]]))
Outputs this:
The translator doesn't properly handle text elements or escaping, but I think the proof-of-concept of representing the tree-structure of HTML is a success.
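For the missing pieces, a minimal sketch of what text handling and escaping could look like. This is not part of the renderer above: html/escape and render-inline are hypothetical names, and the sketch deliberately ignores attributes, void tags and indentation so that it stays small.

```janet
# Hypothetical helpers, separate from the renderer above.
(defn html/escape
  "Escape the characters that are special in HTML text."
  [text]
  # "&" has to be replaced first, or the other entities would be escaped twice.
  (->> text
       (string/replace-all "&" "&amp;")
       (string/replace-all "<" "&lt;")
       (string/replace-all ">" "&gt;")
       (string/replace-all "\"" "&quot;")))

(defn render-inline
  "Render a tree on one line: strings become escaped text, tuples become elements.
  Assumes every tuple is [tag & children] with no attribute struct."
  [node]
  (if (string? node)
    (html/escape node)
    (let [tag (first node)
          children (slice node 1)]
      (string "<" tag ">" ;(map render-inline children) "</" tag ">"))))

(print (render-inline [:p "1 < 2 & 3 > 2"]))
# prints: <p>1 &lt; 2 &amp; 3 &gt; 2</p>
```

The same string check could become one more case in html/render, with attribute values passed through html/escape as well.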
Hare-Brained Scheme
At this point, I started having ideas. janet is insanely cool! I decided
to dig into the janet C api via zig. It was super easy to install the
development package through my system package manager and link to it in a
build.zig file.
It took a bit more work to figure out how to actually run janet code from
within zig, but I was able to create a zig function which calls std.log.debug,
and then call it from janet code that was itself run from zig.
The ouroboros was complete, and I realized that I could potentially create
a pretty good static-site generator that takes inspiration from zine and
std.Build, but using janet as the orchestration and templating/translation
language.
The entry point could be a build.janet file, which exposes a builder API that
defines the dependency graph of the site, registering janet functions as the
"build pipeline" to transform files from one representation into another until
they get "installed" to the build directory.
I could write a function that parses md or orgdown into a janet tuple,
splatting code inside ,() forms right into the data.
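For hand-written tuples, janet's quasiquote (~) and unquote (,) already behave this way, so the parser would mostly need to produce the same shape. A small sketch, where posts and page are made-up names and the data is invented for the example:

```janet
# Hypothetical data for illustration only.
(def posts ["First post" "Second post"])

(def page
  ~[:article
    [:h1 "Hello world!"]
    # ,(...) evaluates code and drops the result into the tree.
    [:p "1+4=" ,(+ 1 4)]
    # ,;(...) splices a whole list of generated nodes in place.
    [:ol ,;(map (fn [title] [:li title]) posts)]])

(pp page)
```

Here the unquoted forms produce nodes rather than text, so whatever they return is still part of the tree when later transforms run.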
Since janet also lets one create handlers to use when importing files, I
could maybe re-use the import functionality to register those files in the
dependency graph for rebuild and watch support.
Suddenly things are feeling dangerously exciting!
And that is where I am now. I intend to prototype this in the most janky, single-threaded way possible and see where it goes :)