"In het verleden behaalde resultaten bieden geen garanties voor de toekomst"
About this blog

These are the ramblings of Matthijs Kooijman, concerning the software he hacks on, hobbies he has and occasionally his personal life.

Most content on this site is licensed under the WTFPL, version 2 (details).

Questions? Praise? Blame? Feel free to contact me.

My old blog (pre-2006) is also still available.

Sun Mon Tue Wed Thu Fri Sat
Powered by Blosxom &Perl onion
(With plugins: config, extensionless, hide, tagging, Markdown, macros, breadcrumbs, calendar, directorybrowse, entries_index, feedback, flavourdir, include, interpolate_fancy, listplugins, menu, pagetype, preview, seemore, storynum, storytitle, writeback_recent, moreentries)
Valid XHTML 1.0 Strict & CSS
/ Blog / Blog
Using MathJax math expressions in Markdown

For this blog, I wanted to include some nicely-formatted formulas. An easy way to do so, is to use MathJax, a javascript-based math processor where you can write formulas using (among others) the often-used Tex math syntax.

However, I use Markdown to write my blogposts and including formulas directly in the text can be problematic because Markdown might interpret part of my math expressions as markdown and transform them before MathJax has had a chance to look at them. In this post, I present a customized MathJax configuration that solves this problem in a reasonable elegant way.

An obvious solution is to put the match expression in Markdown code blocks (or inline code using backticks), but by default MathJax does not process these. MathJax can be reconfigured to also typeset the contents of <code> and/or <pre> elements, but since actual code will likely contain parts that look like math expressions, this will likely cause your code to be messed up.

This problem was described in more detail by Yihui Xie in a blogpost, along with a solution that preprocesses the DOM to look for <code> tags that start and end with an math expression start and end marker, and if so strip away the <code> tag so that MathJax will process the expression later. Additionally, he translates any expression contained in single dollar signs (which is the traditional Tex way to specify inline math) to an expression wrapped in \( and \), which is the only way to specify inline math in MathJax (single dollars are disabled since they would be too likely to cause false positives).

I considered using his solution, but it explicitly excludes code blocks (which are rendered as a <pre> tag containing a <code> tag in Markdown), and I wanted to use code blocks for centered math expressions (since that looks better without the backticks in my Markdown source). Also, I did not really like that the script modifies the DOM and has a bunch of regexes that hardcode what a math formula looks like.

So I made an alternative implementation that configures MathJax to behave as intended. This is done by overriding the normal automatic typesetting in the pageReady function and instead explicitly typesetting all code tags that contain exactly one math expression. Unlike the solution by Yihui Xie, this:

  • Lets MathJax decide what is and is not a math expression. This means that it will also work for other MathJax input plugins, or with non-standard tex input configuration.
  • Only typesets string-based input types (e.g. TeX but not MathML), since I did not try to figure out how the node-based inputs work.
  • Does not typeset anything except for these selected <code> elements (e.g. no formulas in normal text), because the default typesetting is replaced.
  • Also typesets formulas in <code> elements inside <pre> elements (but this can be easily changed using the parent tag check from Yihui Xie's code).
  • Enables typesetting of single-dollar inline math expressions by changing MathJax config instead of modifying the delimeters in the DOM. This will not produce false positive matches in regular text, since typesetting is only done on selected code tags anyway.
  • Runs from the MathJax pageReady event, so the script does not have to be at the end of the HTML page.

You can find the MathJax configuration for this inline at the end of this post. To use it, just put the script tag in your HTML before the MathJax script tag (or see the MathJax docs for other ways).

If you do need MathML processing, or need to typeset formulas outside of <code> elements too, an alternative approach would be to modify my code to add the mathjax_process class to the <code> elements (and any parent <pre> elements if needed). This class causes an option to be processed, overriding the skipHtmlTags blacklist (see docs for the processHtmlClass option). Then, instead of calling typeset on each <code> tag separately, just call MathJax.typeset() once at the end, to typeset the entire document. You might need to take some additional care around the single-dollar delimeters to prevent false positives again, though.

So, here is the script that I am now using on this blog:

<script type="text/javascript">
MathJax = {
  options: {
    // Remove <code> tags from the blacklist. Even though we pass an
    // explicit list of elements to process, this blacklist is still
    // applied.
    skipHtmlTags: { '[-]': ['code'] },
  tex: {
    // By default, only \( is enabled for inline math, to prevent false
    // positives. Since we already only process code blocks that contain
    // exactly one math expression and nothing else, it is also fine to
    // use the nicer $...$ construct for inline math.
    inlineMath: { '[+]': [['$', '$']] },
  startup: {
    // This is called on page ready and replaces the default MathJax
    // "typeset entire document" code.
    pageReady: function() {
      var codes = document.getElementsByTagName('code');
      var to_typeset = [];
      for (var i = 0; i < codes.length; i++) {
        var code = codes[i];
        // Only allow code elements that just contain text, no subelements
        if (code.childElementCount === 0) {
          var text = code.textContent.trim();
          inputs = MathJax.startup.getInputJax();
          // For each of the configured input processors, see if the
          // text contains a single math expression that encompasses the
          // entire text. If so, typeset it.
          for (var j = 0; j < inputs.length; j++) {
            // Only use string input processors (e.g. tex, as opposed to
            // node processors e.g. mml that are more tricky to use).
            if (inputs[j].processStrings) {
              matches = inputs[j].findMath([text]);
              if (matches.length == 1 && matches[0].start.n == 0 && matches[0].end.n == text.length) {
                // Trim off any trailing newline, which otherwise stays around, adding empty visual space below
                code.textContent = text;
                if (code.parentNode.tagName == "PRE")
      // Code blocks to replace are collected and then typeset in one go, asynchronously in the background

Update 2020-08-05: Script updated to run typesetting only once, and use typesetPromise to run it asynchronously, as suggested by Raymond Zhao in the comments below.

2 comments -:- permalink -:- 17:03
Copyright by Matthijs Kooijman