This page explains
Org-mode comes with out-of-the-box HTML export functionality (alongside other supported output formats). The default export settings generate the following artefacts along with the core content:
Usually, I do not need these additional artefacts.
Furthermore, there are export settings that you might want to control:
I took the following approach when dealing with these issues
init.el
init.el Customisations
I needed to put a few customisations in init.el:
'(org-export-allow-bind-keywords t)
This setting is important, as it enables in-buffer settings beyond the standard in-buffer variables.
'(org-html-extension "htm")
For legacy reasons, I still stick to .htm instead of .html. At least in Emacs 25.1, it could not be set by an in-buffer setting.
'(org-html-text-markup-alist '((bold . "<strong>%s</strong>") (code . "<code class=\"c\">%s</code>") (italic . "<em>%s</em>") (strike-through . "<del>%s</del>") (underline . "<span class=\"underline\">%s</span>") (verbatim . "<code class=\"v\">%s</code>")))
This is my choice to map some core in-line text semantics to HTML.
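As a minimal sketch, the three customisations above could also be written as plain setq forms in init.el, assuming you prefer that to the custom-set-variables block that the quoted '(...) entries suggest. The values are the ones listed above; only the layout is new.

```elisp
;; Sketch for init.el: same values as above, written as setq forms.

;; Allow #+BIND: lines in Org files to set variables during export.
(setq org-export-allow-bind-keywords t)

;; Write exported files with the extension .htm instead of .html.
(setq org-html-extension "htm")

;; Map Org's inline markup to HTML elements.
(setq org-html-text-markup-alist
      '((bold           . "<strong>%s</strong>")
        (code           . "<code class=\"c\">%s</code>")
        (italic         . "<em>%s</em>")
        (strike-through . "<del>%s</del>")
        (underline      . "<span class=\"underline\">%s</span>")
        (verbatim       . "<code class=\"v\">%s</code>")))
```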
Section 13.18.1 of the official org-mode manual gives advice on how to achieve a "minimal HTML" export. Here are the respective lines to put at the top of each org-mode file that is to be exported as HTML:
#+BIND: org-html-head-include-default-style nil
No CSS styles in the HTML, because I add a link to a central CSS style sheet in a later step, using XML/XSLT technology (see trgensit).
#+BIND: org-html-head-include-scripts nil
No JavaScript code. I only use static web sites that do not need JavaScript.
#+BIND: org-html-preamble nil
No preamble. I do this in a separate step using XML/XSLT technology (see trgensit).
#+BIND: org-html-postamble nil
No post-amble. I do this in a separate step using XML/XSLT technology (see trgensit).
#+BIND: org-html-use-infojs nil
This line from the manual is not needed, because this JavaScript code is only switched on in the export file by an #+INFOJS_OPT: line.
Note that, of the above options, setting org-html-head-include-scripts nil is essential, because otherwise the subsequent XSLT transformation will fail. The emitted JavaScript code contains a line that xsltproc complains about (org-mode version 9.4.4):
parser error : EntityRef: expecting ';'
// @license magnet:?xt=urn:btih:e95b018ef3580986a04669f1b5879592219e2a7a&dn=publ
^
Unintentionally, the string &dn is parsed as an "entity", which is missing the terminating ;.
I put some more configuration lines at the top of the file:
#+OPTIONS: num:nil toc:nil ^:{} H:4 tags:nil
#+BIND: org-export-with-creator nil
No creator sentence in the post-amble.
#+BIND: org-html-toplevel-hlevel 1
The #+TITLE meta data becomes an HTML title element. The org-mode exporter also emits a special first h1 heading based on the #+TITLE meta data content. As a consequence, by default all top-level org-mode headings (one "*" star) are output as h2.
I want to remove the extra h1 that stems from the title meta data. I have not found a way to switch it off by org-mode means, hence I need to post-process the HTML.
With this setting, the one-star headings become h1 and I can decide the heading hierarchy on a case-by-case basis.
#+HTML_DOCTYPE: xhtml5
The default is xhtml-strict, which is a bit more verbose than xhtml5.
#+BIND: org-html-mathjax-template ""
This suppresses further JavaScript code, which was introduced in Emacs 26.1 and is related to LaTeX. Note that this has to be an empty string; nil would cause an error.
#+SETUPFILE:
Rather than putting all these #+... lines at the top of an .org file, I have a central file mysetup.org containing these lines. Thus, the .org file just needs to have a first line:
#+SETUPFILE: ~/mysetup.org
This allows for easy central maintenance.
And if you need a special setting for a particular file, you can still put this setting after the #+SETUPFILE... line and it will take precedence.
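If every Org file on a machine should get the same treatment, the settings collected above could alternatively be set globally in init.el. The following is only a sketch of that alternative, not the approach taken on this page: it assumes that a global default is acceptable, and it uses the standard Org variables behind the #+OPTIONS and #+HTML_DOCTYPE keywords. The per-file #+SETUPFILE route stays the more flexible one.

```elisp
;; Sketch for init.el: global equivalents of the #+BIND lines above.
(setq org-html-head-include-default-style nil  ; no inline CSS
      org-html-head-include-scripts nil        ; no inline JavaScript
      org-html-preamble nil
      org-html-postamble nil
      org-export-with-creator nil
      org-html-toplevel-hlevel 1               ; one-star headings become h1
      org-html-doctype "xhtml5"
      org-html-mathjax-template "")            ; must be a string, not nil

;; Equivalents of "#+OPTIONS: num:nil toc:nil ^:{} H:4 tags:nil".
(setq org-export-with-section-numbers nil
      org-export-with-toc nil
      org-export-with-sub-superscripts '{}
      org-export-headline-levels 4
      org-export-with-tags nil)
```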
With these settings, I get quite close to a plain vanilla HTML file fitting my requirements. A few issues remain:
id attributes, which start with org.... They are used to link from the table of contents. As I do not use the TOC and as these ids change, I cannot use them as link anchors myself. Hence I remove those ids.
div elements, even nested ones, where I do not see a purpose. Hence I remove those divs.
The extra h1 element that stems from #+TITLE.
The change from an img element to an object element for linking SVG files. I want to keep img elements.
The attribute xmlns="http://www.w3.org/1999/xhtml" in the output's html element. While this is correct and reflects the wanted xhtml5 format, this namespace declaration breaks the XSLT processing that I do down the line to generate web content. I would need to modify the XSLT style sheets to explicitly cater for this namespace. I do not see a benefit in doing so. Hence I remove the declaration.
To address the above-mentioned issues, I have coded a small elisp package, tr-org-html.el. It contains a function tr-org-html-export-to-html that calls the standard org-mode HTML export function and then performs an XSLT transformation to arrive at a final, "amended" HTML file.
The XSLT transformation is done using the xsltproc executable. xsltproc takes an XSLT style sheet, which I named tr-org-html-trnsfrm.xsl, as its first input and the HTML file that org-mode has generated as its second input, and generates a new, final HTML file.
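The two-step flow described above (standard export, then xsltproc) can be sketched roughly as follows. This is not the code shipped in tr-org-html.el: the names my/xsltproc-args and my/org-html-export-and-transform are hypothetical, and writing to a temporary file before replacing the exported file is an assumption made here for safety. Only org-html-export-to-html, call-process and xsltproc's --output option are taken as given.

```elisp
(require 'ox-html)

(defun my/xsltproc-args (stylesheet input output)
  "Return the xsltproc arguments to transform INPUT with STYLESHEET into OUTPUT."
  (list "--output" (expand-file-name output)
        (expand-file-name stylesheet)
        (expand-file-name input)))

(defun my/org-html-export-and-transform (stylesheet)
  "Export the current Org buffer to HTML, then run xsltproc with STYLESHEET.
The exported file is replaced by the transformed result."
  (interactive "fXSLT style sheet: ")
  (let* ((html (expand-file-name (org-html-export-to-html)))
         ;; Transform into a temporary file first, so that a failed run
         ;; leaves the exported file untouched.
         (tmp (make-temp-file "org-html-" nil ".htm"))
         (status (apply #'call-process "xsltproc" nil "*xsltproc*" nil
                        (my/xsltproc-args stylesheet html tmp))))
    (if (and (integerp status) (zerop status))
        (rename-file tmp html t)
      (delete-file tmp)
      (error "xsltproc failed; see buffer *xsltproc*"))
    html))
```

Called with M-x my/org-html-export-and-transform from an Org buffer, the function prompts for the style sheet and returns the name of the final HTML file.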
I publish the mentioned files here in ./trorghtml.zip under the GNU public license. Please see the file COPYING.txt in the ZIP file. I accept no responsibility and give no warranty for the use of this software or for this write-up.
Addressing the namespace issue required special gymnastics in the XSLT style sheet. I needed to do some online research, as the XSLT 1.0 documentation is silent about this issue. The issue seems to be explicitly addressed in XSLT 2.0.
Extract the files in ./trorghtml.zip and copy the files tr-org-html.el and tr-org-html-trnsfrm.xsl to a directory in Emacs' load-path, e.g. /usr/share/emacs/site-lisp/. I chose (Windows 10): C:\Users\username\OneDrive\myPrograms\emacs-27.2\share\emacs\site-lisp\.
The files tr-org-html.el and tr-org-html-trnsfrm.xsl must be in the same directory.
Your init.el needs to contain the line:
(require 'tr-org-html)
Beyond standard Emacs, you need xsltproc in your path. xsltproc seems to be available for all Linux distributions and also for MS Windows. I trust it is also available for macOS.
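The installation steps can be collected in init.el roughly like this. It is a sketch under assumptions: the site-lisp directory below user-emacs-directory is a hypothetical location (any directory already in load-path makes the add-to-list line unnecessary), and the early check for xsltproc is an optional extra that is not part of the package.

```elisp
;; Hypothetical directory; use the one the two files were copied to.
(add-to-list 'load-path (expand-file-name "site-lisp" user-emacs-directory))

;; Warn early if xsltproc cannot be found on the PATH.
(unless (executable-find "xsltproc")
  (message "xsltproc not found; the tr-org-html export will not work"))

(require 'tr-org-html)
```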
A missing #+TITLE meta data element causes the XSLT transformation to fail, because the exporter then emits the entity &lrm; (LEFT-TO-RIGHT MARK) as the title element's content.
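One way to avoid running into this failure is to check for a title before exporting. The function below is a hypothetical helper, not part of tr-org-html.el, and it makes a simplifying assumption: it only looks for a non-empty #+TITLE line in the buffer itself, so a title supplied through a #+SETUPFILE would not be detected.

```elisp
(defun my/org-buffer-has-title-p ()
  "Return non-nil if the current buffer contains a non-empty #+TITLE line."
  (save-excursion
    (save-restriction
      (widen)
      (goto-char (point-min))
      ;; Org keywords are case-insensitive, so match #+title: as well.
      (let ((case-fold-search t))
        (re-search-forward "^[ \t]*#\\+TITLE:[ \t]*[^ \t\n]" nil t)))))
```

A caller could guard the export with (unless (my/org-buffer-has-title-p) (user-error "Add a #+TITLE line before exporting")).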
As I write this (2023-04), I am using Emacs v27.2 under MS Windows 10, as 27.2 seems to be the last version to support 32-bit Windows 10.
Before that, I had used Emacs v25.1 and had addressed the above-mentioned issues by coding a "derived" HTML org-mode exporter. This had served me well for some years, but the derived exporter no longer worked under Emacs v27.2.