Selecting a license for typho

Written on 2024-09-13 in 3160 words ✍️.

Introduction

typho is my digital typesetting project. Any software project needs a lot of decisions to get finished. And decisions must be made wisely to lead to success. One decision concerns the choice of license for its source code.

My common approach to difficult topics is iteration. Think about it, draw some conclusions, and let the topic rest for a while. Iterate until you have found answers to all the fundamental questions needed to resolve the topic. When I started implementing typho, I decided to make it an open source project (acc. to the OSI definition). In a later iteration, I chose the MIT license as my default license for everything. Indeed, my current public subprojects use the MIT license. Is it the final decision? This now becomes relevant, because I am about to publish the first actual component of typho and want to avoid relicensing later.

An OSI article sponsored by Sentry triggered my thought process, and in this article I want to share my opinions and findings.

FOSS licenses

FOSS licenses have a long tradition and are associated with political and emotional preferences. The fundamental question of software licensing is how, and in which context, the product of programming may be used. FOSS refers to free open source software and is an umbrella term for free software and open source software. Those terms themselves have detailed definitions, but (as is commonly done) I will stick to OSI’s definition. FOSS allows people to redistribute and modify source code without limits and without discriminating by technology or product. One question is whether the original product [author names] may be used to promote the derived work (no, acc. to the 3rd clause of BSD). Another question is whether derived work must be redistributed under the same license (yes, acc. to the copyleft clause).

Common licenses include:

SPDX provides unique identifiers to identify each license.
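As a minimal sketch of how SPDX identifiers are used, the following maps some license names that appear later in this article to their SPDX short identifiers and builds the comment line commonly placed at the top of a source file. The dictionary and the helper function are hypothetical names chosen for illustration. Only names that map unambiguously are included; a bare "GPL v3", for example, does not say whether "only" or "or later" is meant.

```python
# Illustrative mapping from license names (as written in this article)
# to SPDX short identifiers. Names without a clear SPDX counterpart
# (e.g. a bare "GPL v3") are deliberately left out.
SPDX_IDS = {
    "MIT": "MIT",
    "Apache License Version 2.0": "Apache-2.0",
    "two-clause BSD": "BSD-2-Clause",
    "three-clause BSD": "BSD-3-Clause",
    "GNU GPLv2-or-later": "GPL-2.0-or-later",
    "Mozilla Public License Version 2.0": "MPL-2.0",
    "LaTeX project public license v1.3c": "LPPL-1.3c",
    "CDDL v1.1": "CDDL-1.1",
}


def spdx_header(license_name: str) -> str:
    """Return the SPDX tag line for a license name known to SPDX_IDS."""
    return f"SPDX-License-Identifier: {SPDX_IDS[license_name]}"


print(spdx_header("MIT"))
print(spdx_header("GNU GPLv2-or-later"))
```

Placing such a line in every file lets tools determine the license without parsing the full legal text.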

Commercial interests

But in essence, FOSS stands in contrast to common commercial interests. Companies invest in programming activities to fulfill the needs of customers, and they want a return on that investment. If company A invests money into a product and delivers it, but company B takes this program and is more successful at getting customers to pay for it, company A loses interest in investing in the development of any further product. Hence, company A, which develops the product, should get a return from its users without interference from company B. The question is now: which license is suitable to prevent company B from taking advantage of the situation?

The simplest approach is to prevent company B’s access to the product. In the software development space, we sometimes have the situation that “receiving the delivered product” differs from “using the delivered product”. Specifically, there are two common scenarios:

  1. A webserver offers a service whose inner workings are unknown. It reminds me of ancient stories about solving cubic equations. In competitions, mathematicians proposed various cubic equations and had to find solutions for the involved variables. There was one mathematician to whom one could propose [so-called depressed] equations, and this mathematician would respond with the correct solution. One could take his solution and verify that it is correct. But no one knew how he achieved this result. If the mathematician is the server and the other parties are clients, you have the same scenario. A server can deliver the result without revealing the process by which it was attained.

  2. Furthermore, computers have the ability to compute really fast. Each step takes a few nanoseconds, and thus if one tells me all the billion steps taken to attain a result, I am going to be overwhelmed. The representation of how a result is attained matters. It must be simplified to (at most) several hundred steps to understand the process in a meaningful timeframe. If the user only receives the billion-steps solution, the computer can still run these steps quickly to get the correct result. But only the person in possession of the few-hundred-steps solution is able to communicate it. Incidentally, the few-hundred-steps solution is also the one written by the programmer, and the billion-steps solution is derived from it. In conclusion, a program might not be available in a representation that allows one to understand it, even though one can use it.

In technical terms, the two scenarios are service interfacing and compilation. And those can be used to limit company B’s access to the product.
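The compilation scenario can be illustrated with a small sketch. It uses Python's bytecode as a stand-in for a compiled binary, which is an assumption made for brevity: real native compilation produces a far less readable result. The function `solve` and its formula are hypothetical.

```python
import dis

# The readable representation: what the programmer writes and can explain.
SOURCE = """
def solve(a, b):
    return (a + b) * (a - b)
"""

# Derive the machine-oriented representation from the readable one.
namespace = {}
exec(compile(SOURCE, "<example>", "exec"), namespace)
solve = namespace["solve"]

# The derived form still computes the correct result ...
print(solve(5, 3))

# ... but it is expressed as a longer sequence of low-level steps.
instructions = list(dis.get_instructions(solve))
for instruction in instructions:
    print(instruction.opname, instruction.argrepr)
```

A single line of source expands into several instructions here; for a real program the ratio is much larger, which is why handing out only the derived form limits what the recipient can learn.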

I would like to emphasize that everyone should be mature enough to understand these commercial interests. If you spend your leisure time programming for your private project, it matters little which license you select, because your economic life does not depend on how people use your program. In the past, I wrote a lot of source code and gave it away for free, because I had other sources of income to survive, which is great and fine. However, the question here is how we can make FOSS programs sustainable, that is, how to turn “writing programs” into a source of income. And this question is a big and open one.

Non-FOSS licenses

All the big players in my field use non-FOSS licenses. This concerns Adobe (controlling the PDF standard), Microsoft (DirectWrite), and Apple (Core Text). And licenses in this field are usually meaningless. The common English phrase is “All rights reserved”. Companies usually distribute long legal texts removing most of the rights granted to the user by FOSS licenses, and the phrase itself means that the company retains all those rights, possibly discriminating by user or technology. Among commercial competitors in my field, you have to read long legal texts, make difficult decisions about which license you need, or only use the software in a narrow use case thought of by the creator.

One specific approach, more open than “All rights reserved”, is OpenCore (from 2008), represented by the Commons Clause (since 2018). It says that you may assume the freedoms of FOSS except for selling the core product. You may not take the core product and resell it for your own benefit. This allows you to study the product arbitrarily, but the crucial part for commercial interests is counteracted: taking economic advantage of the work. Practically speaking, this license is not popular, because the definition of “core product” can be tricky, and since it restricts your use of the software, it does not satisfy the OSI definition.

Licenses of competitors

Fundamentally, we now filter digital typesetting products (reusing my list from arewedigitaltypesettingyet.com) by their license. So we only consider FOSS licensed products[1]. This is my list of projects and their licenses:

| digital typesetting project | license |
|---|---|
| LaTeχ | LaTeχ project public license v1.3c |
| Teχ | similar to Public Domain |
| typst | Apache License Version 2.0 |
| SILE | MIT |
| speedata publisher | AGPL v3 |
| sphinx | two-clause BSD |
| χeTeχ | MIT |
| LuaLaTeχ | GPL v2 |
| pandoc | GNU GPLv2-or-later |
| Weasyprint | three-clause BSD |
| GNU Texmacs | GPLv3 |
| PagedJS | MIT |
| MkDocs | two-clause BSD |
| Antora | Mozilla Public License Version 2.0 |
| Patoline | GPL v2 |
| nroff | CDDL v1 (license text) |
| troff | Bell Laboratories and (now) MIT |
| groff | GPL v3 |
| opTeχ | Public Domain |
| PageBot | MIT (with a registered trademark) |
| rstgen / Nim DocUtils | MIT |
| Quarto | MIT |
| Crowbook | LGPL v2.1 |
| pollen | MIT |
| NaturalDocs | AGPL v3 |
| Lout | GPL v3 |
| Apache FOP | Apache v2 |
| quad | MIT |
| SATySFi | LGPL v3 |
| SpanDex | Mozilla Public License v2 |
| python-typesetting | MIT |
| wkhtmltopdf | LGPL v3 |
| yex | GPL v2 |
| imp | three-clause BSD |
| heirloom-doctools | CDDL v1.1 |
| iText | AGPL v3 |
| Vivliostyle | AGPL v3 |

In this list, I count 10 MIT licenses, 4 BSD-based licenses, 2 Apache licenses, and 14 GPL-based licenses (GPLv2 ×4, GPLv3 ×3, AGPL ×4, LGPL ×3). The crucial part, of course, is not frequency but a distribution weighted by project size. The big projects LaTeχ (LaTeχ Project Public License), typst (Apache License v2), SILE (MIT), and speedata publisher (AGPL v3) all use different licenses.
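The tally can be reproduced with a short script. This is a sketch under stated assumptions: the pairs are transcribed from the list above with project names in ASCII, the family labels are this sketch's own grouping, troff is counted under its current MIT license, and Teχ ("similar to Public Domain") is grouped with Public Domain.

```python
from collections import Counter

# (project, license family) pairs transcribed from the list above.
PROJECTS = [
    ("LaTeX", "LPPL"), ("TeX", "Public Domain"), ("typst", "Apache"),
    ("SILE", "MIT"), ("speedata publisher", "AGPL"), ("sphinx", "BSD"),
    ("XeTeX", "MIT"), ("LuaLaTeX", "GPLv2"), ("pandoc", "GPLv2"),
    ("Weasyprint", "BSD"), ("GNU Texmacs", "GPLv3"), ("PagedJS", "MIT"),
    ("MkDocs", "BSD"), ("Antora", "MPL"), ("Patoline", "GPLv2"),
    ("nroff", "CDDL"), ("troff", "MIT"), ("groff", "GPLv3"),
    ("opTeX", "Public Domain"), ("PageBot", "MIT"),
    ("rstgen / Nim DocUtils", "MIT"), ("Quarto", "MIT"),
    ("Crowbook", "LGPL"), ("pollen", "MIT"), ("NaturalDocs", "AGPL"),
    ("Lout", "GPLv3"), ("Apache FOP", "Apache"), ("quad", "MIT"),
    ("SATySFi", "LGPL"), ("SpanDex", "MPL"), ("python-typesetting", "MIT"),
    ("wkhtmltopdf", "LGPL"), ("yex", "GPLv2"), ("imp", "BSD"),
    ("heirloom-doctools", "CDDL"), ("iText", "AGPL"), ("Vivliostyle", "AGPL"),
]

GPL_FAMILIES = ("GPLv2", "GPLv3", "AGPL", "LGPL")

counts = Counter(family for _, family in PROJECTS)
gpl_based = sum(counts[family] for family in GPL_FAMILIES)

for family, count in counts.most_common():
    print(f"{family:14} {count}")
print(f"GPL-based total: {gpl_based} of {len(PROJECTS)} projects")
```

The remaining projects fall under the LPPL, Public Domain, MPL, and CDDL families, which the summary above does not list separately.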

Licenses of dependencies

Another interesting question is the license of dependencies. Since typho is still in an early development stage and not yet published, I only want to list some basic dependencies of the current prototype:

| software project | license |
|---|---|
| HarfBuzz | MIT |
| icu4x | Unicode v3 |
| xml-rs | MIT |
| clap | MIT and Apache 2.0 |
| Lua | MIT |
| mlua | MIT |
| Python | PSF license agreement and zero-clause BSD |
| RustPython | MIT |

On the shoulders of giants

One must always point out that developing software means standing on the shoulders of giants. The software above inspires one to develop better software by learning from existing software, studying its approaches, or explicitly reusing components. So the overall goal should always be “be aware what you depend on” and “give back to the community”. Giving back can mean financial support, but might also mean answering questions on social media, or sharing findings in a meaningful way.

Delayed Open Source Publication

The article mentioned in the introduction talks about Delayed Open Source Publication. It makes me wonder. The only working (independent) business model for digital typesetting seems to be software-as-a-service (SaaS). Besides publication of your source code, you also offer a service simplifying usage of this software. Service? We spoke about service interfacing. Can delayed open source publication (DOSP) work here?

Delayed Open Source Publication refers to the practice of releasing your software under a FOSS license not now, but at a later point in time. A common framework is the Business Source License, which defines a Change Date per license that tells when the software becomes subject to a FOSS license. The default time is “four years past release date”, but this can be overridden, as done by DragonflyDB (after 5 years) and ArcticDB (after 2 years). In general, I think it provides a beautiful framework and I cannot see why any software should not be FOSS after 5 years, but maybe I lack commercial experience.

  • Let’s admit it: 5 years is a terrible timeframe and far from FOSS. If you publish your software after 5 years, I have likely lost all interest. As a technician, I might be motivated to look at a bug in more depth. If I cannot look at this bug in the latest distribution, I cannot be sure at all whether the effort is worth it. Maybe someone already fixed it within the 5-year timeframe? Maybe the broken invariants leading to this bug changed and a potential bugfix has to look different? If I submit a security patch, I want to see the bugfix integrated now and not after 5 years.

  • Consider a timeframe of 6 months. If I offer my service running a codebase and have to publish its source code under FOSS terms after 6 months, is this a useful approach balancing user freedoms and economic interests? Does it suffice to prevent a competitor from duplicating your service and gaining economic advantages? I think it restricts competitors. Six months might be long for a young, immature software project and short for an established, mature service. But I left one question open: Is the source code available during the six-month timeframe? If not, you lose the possibility that technicians study the problem in detail and provide better bug reports.
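To make the timeframes discussed above concrete, the following sketch computes the date on which a release would become FOSS for each delay. The helper `add_months`, the labels, and the release date are hypothetical; an actual license text defines its own Change Date, so this only illustrates the arithmetic.

```python
import calendar
from datetime import date


def add_months(d: date, months: int) -> date:
    """Shift a date by whole months, clamping the day to the month's length."""
    index = d.year * 12 + (d.month - 1) + months
    year, month0 = divmod(index, 12)
    last_day = calendar.monthrange(year, month0 + 1)[1]
    return date(year, month0 + 1, min(d.day, last_day))


# Hypothetical release date; the delays are the ones named in the text.
release = date(2024, 9, 13)
DELAYS_IN_MONTHS = {
    "six months": 6,
    "two years": 24,
    "four years (default)": 48,
    "five years": 60,
}

for label, months in DELAYS_IN_MONTHS.items():
    print(f"{label}: FOSS from {add_months(release, months)}")
```

For a project with frequent releases, each version carries its own date, so the FOSS version always trails the current one by the chosen delay.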

In conclusion, I think all those ideas are interesting, but since they violate the user’s freedoms, they certainly do not follow the OSI definition of FOSS. Anything beyond six months makes me believe the accessible source code likely differs from the code that exhibits the problem at hand. And if the latest source code is not readable immediately, it violates the intuitive notion of “Open Source”.

Sentry sponsored the research behind that article and had to take back its framing as “Open Source”. It is not a company with FOSS products acc. to OSI. It now tries to push a new terminology, Fair Source, including the Functional Source License (since 2023). A derivative is the Fair Core License (since 2024), which better covers self-hosting besides SaaS. The crucial part of its description is the following sentence:

The FCL is a Fair Source license that allows a user to read, use, modify, and redistribute a project, where "use" is any use case that does not compete with the business interests of the author.

Do not compete with my product for 2 years and, besides that, its license is the MIT license? This sounds admittedly open and fair. Similar to Open Core, the question is whether the definition of “competition” is narrow enough to work in practice.

The curious case of ‘Material for MkDocs’

I mentioned MkDocs above as a competitor. “Material for MkDocs” is a particular theme as well as addons for MkDocs. It also has a curious business model, namely a threshold pledge system. A set of features and their costs are defined. If you pay for a subscription, you have access to the latest features, but the legal rights concerning the features are unknown to me. If the monthly funding exceeds a threshold, the associated features are paid out and released to the public. As such from the perspective of the public, there is one product under the MIT license but internally they have a product with more features.

Be aware that this works badly in practice, because you essentially have to maintain two codebases and this additional effort slows your development progress. But since the model itself beautifully incentivizes funding, I still wonder how well it works out.
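The threshold pledge mechanism described above can be sketched as follows. All names, amounts, and features are hypothetical and are not taken from Material for MkDocs; treating a funding level exactly equal to the threshold as sufficient is also an assumption of this sketch.

```python
from dataclasses import dataclass


@dataclass
class FundingGoal:
    threshold: int          # monthly funding required to release the features
    features: list[str]     # features held back until the threshold is met


def public_features(goals: list[FundingGoal], monthly_funding: int) -> list[str]:
    """Return the features released to the public at a given funding level."""
    released = []
    for goal in sorted(goals, key=lambda g: g.threshold):
        # Assumption: reaching the threshold exactly already counts.
        if monthly_funding >= goal.threshold:
            released.extend(goal.features)
    return released


# Hypothetical goals, for illustration only.
goals = [
    FundingGoal(threshold=1000, features=["feature A"]),
    FundingGoal(threshold=2000, features=["feature B", "feature C"]),
]

print(public_features(goals, monthly_funding=1500))
print(public_features(goals, monthly_funding=2500))
```

Everything not returned by `public_features` exists only in the subscribers' codebase, which is the second codebase that has to be maintained.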

Conclusion

I think the MIT license would be a very liberal, trivial, and pragmatic choice for typho. If a model like “MIT license but no competition for 2 years” were acceptable to the community, it would better balance technical and economic interests and finally lead to more sustainable models for FOSS. But at this point, I won’t make a final decision and need better user feedback.


1. I do believe the author of neatroff would not mind considering it as FOSS, but since no license at all is mentioned, it is not considered FOSS.