Rendering the Web with Pictures in Your Terminal: more than you would like to know about HTML, ANSI and the philosophy of communication.
by Ploum on 2022-03-24
I’ve often been baffled by the productivity of Open Source developers. But I may have found the secret. Having something else to do. As soon as you need to do something urgently, something non-computer related, programming open source seems really important.
So, in a new episode of "I should really have done something else with my life", please welcome the "Offpunk got a new HTML rendering" story, a long meditation on reading HTML and starting meaningful, philosophical discussions.
Discovering ANSI Codes
When I started using AV-98 as a Gemini client, I thought it was simple because it simply displayed the Gemtext content on the screen, only replacing URLs with numbers. I didn’t consciously realise that titles had to be coloured. But how can you send colours to a terminal? Isn’t that only white lines on a black background?
The answer is ANSI codes. ANSI codes are special character sequences that, when passed to a terminal, change the "mode" of the terminal. If you send the escape character (written "\e") followed by "[", the terminal understands it as the "Control Sequence Introducer". You can then add different numbers (separated by ";") and end with "m". The different numbers correspond to different modes. So "\e[1m" switches to bold and "\e[1;35m" switches to bold and magenta. You can reset all open modes with "\e[0m".
It is limited to 8 background/foreground colours and a few other modes like bold, faint, italic, underline, …
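The sequences described above can be tried in a few lines of Python. This is only an illustrative sketch: the helper names sgr() and styled() are made up for the example, and Python has no "\e" escape, so the escape character is written "\x1b".

```python
ESC = "\x1b"     # the escape character, written "\e" in the text
CSI = ESC + "["  # Control Sequence Introducer

BOLD = 1
MAGENTA = 35
RESET = 0

def sgr(*codes):
    """Build a sequence such as ESC[1;35m from a list of mode numbers."""
    return CSI + ";".join(str(code) for code in codes) + "m"

def styled(text, *codes):
    """Wrap text between an opening sequence and a reset (hypothetical helper)."""
    return sgr(*codes) + text + sgr(RESET)

print(styled("A bold magenta title", BOLD, MAGENTA))
print("Back to normal text")
```

Running it in a terminal that supports ANSI colours should print the first line in bold magenta and the second one normally.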
It’s all fun until you realise that those 8 colours can be customised using another special code. So, in essence, your terminal can display any Unicode character with any background and foreground colour. (Not all terminals have the same colouring capabilities but, still…)
As I was discovering that through reading AV-98 code, I also wondered how hard it would be to display HTML pages in what was already called Offpunk. If I could display a gemtext page with a blue title, how hard would it be to display some HTML? I would only need to transform HTML tags into ANSI codes. Right?
Why HTML is a Bad Way of Communicating
I hacked a first version displaying links in blue and, yes, it was working! This had been surprisingly easy! But the more I kept trying webpages, the more I realised how bad the situation was.
First of all, HTML is really hard to parse for one simple reason: space and return characters may act completely differently depending on where they are in the page. In fact, except in <PRE>, one could say that you could drop returns. Except that, sometimes, they should be considered as whitespace (if you don’t, some pages will havetheirtextgarbledwithoutspace).
But that’s only when the HTML is good. Most of the time, it isn’t. My initial HTML2ANSI routine started to grow organically to handle all the cases I was finding. It was trial and error: add a space there, reload, see if it looks good, add another strip() there. No architecture, pure random code.
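A minimal sketch of the whitespace problem, assuming text that sits outside any <PRE> block. It is not Offpunk’s code: collapse_whitespace() is a hypothetical helper and it ignores the many edge cases real HTML has around inline elements.

```python
import re

def collapse_whitespace(text):
    """Replace every run of spaces, tabs and returns with a single space."""
    return re.sub(r"\s+", " ", text)

source = "some pages will\nhave\ntheir text"

# Dropping returns glues words together
print(source.replace("\n", ""))
# Treating each run of whitespace as one space keeps the words apart
print(collapse_whitespace(source))
```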
When looking at the source code, I found myself pondering multiple times whether I should display what people obviously intended to say or what the HTML actually said. One very small but obvious example: WordPress websites using Jetpack have some of their pictures duplicated. The reason is that one of the pictures is behind a "lazyloading" attribute while the other is in a "<noscript>" tag. Offpunk should obviously display the "<noscript>" one because it doesn’t do JS, but it also displays the first one because it doesn’t parse the lazyloading CSS attribute. I don’t see how I can correctly avoid displaying the picture twice.
Tools are simply bad and produce awful HTML. If you are not convinced, look at the source of a page generated by the pseudo-minimalist Medium. When trying to parse HTML, it becomes obvious how we lost any meaning. People wrote, tools transformed it into meaningless mumbo jumbo and I was trying to write a tool to recover the original meaning.
But tools are not the only ones to blame…
People are Bad at Writing
When I added support for bold and italic, something strange appeared. I realised the test pages I had been using since the beginning were now less readable. The text alone was good but people somehow feel forced to add lots of fancy stuff on top of it. It’s like they fear not being noticed. Every HTML webpage looks like a child doing stupid stuff to attract his parent’s attention.
Open a book. Open 10 books around you. Try to find how many instances of bold sentences, varying styles and coloured words you have. Except in some very modern technical books, there’s none. You can sometimes find some italics to indicate a citation. It’s quite rare. I am, of course, referring to books that were published before writers started to use MS Word.
As Neal Stephenson said in "In the Beginning... Was the Command Line", bold was at first only used for the title. Then MS Word came without any support for formal title/subtitle styling. The bold button was put prominently and people were expected to make their titles manually. But something else happened: people started to put bold and italics everywhere. Because they could.
When reading through so many webpages, it struck me how all this styling was looking like insecurity. Fear of not being read. Fear of not being cool. Fear of being boring. Fear of losing your audience. And how much better most of the texts were without any styling.
As I was reading "L’Âge des low-tech", by Philippe Bihouix, I came across one of those insecurities. The author put the word "NEVER" in all caps, probably hoping to make a point. It looks like the author felt insecure about his argument, felt that it was weak and tried to compensate. The point was something like "3D printing is not a solution because true constructions like houses can NEVER be 3D printed". Which is obviously false (or at least it should be explained why current 3D-printed buildings are not truly 3D-printed). Without the all caps, it would have been an error in a book, something understandable. With the all caps, the author lost a lot of credibility for many pages.
Any Mistake That Can Be Made…
So, in essence, people are using crappy tools that produce shitty code and they are using them as badly as they can. We can safely say that any error that can be made will be made.
The same is true for Gemini. Fortunately, Gemtext is so simple that there’s no tool, reducing the error surface. Errors can only be made on a line-by-line basis. Like that case when someone was speaking about a hashtag and the hashtag appeared at the beginning of the line, turning the line into a title.
But there’s one true error still possible with gemtext: preformatted text.
I’ve seen them all: people missing one preformatted line and thus inverting formatted/preformatted, people not closing an opened preformatted block or, the worst, some gemlogs switching between formatted/preformatted for each paragraph. The only explanation I found was that it was looking cool in their test client. What if their client was switching the background colour for preformatted? Then their posts would render as a nice patchwork of colours.
This is a huge problem as Offpunk doesn’t wrap preformatted texts, assuming they are some sort of ASCII art. Those paragraphs were thus displayed as a long non-wrapped line.
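To see why these mistakes are so easy to make, here is a rough sketch of line-by-line gemtext classification. It assumes the usual gemtext rules (a line starting with three backticks toggles preformatted mode, a line starting with "#" is a title) and handles nothing else. The function name classify_gemtext() is invented for the example.

```python
def classify_gemtext(lines):
    """Yield (kind, line) pairs where kind is 'pre', 'title' or 'text'."""
    preformatted = False
    for line in lines:
        if line.startswith("```"):
            # Toggle lines switch the mode and are not displayed themselves
            preformatted = not preformatted
            continue
        if preformatted:
            yield ("pre", line)
        elif line.startswith("#"):
            yield ("title", line)
        else:
            yield ("text", line)

# A hashtag at the start of a line becomes a title
print(list(classify_gemtext(["#offpunk is fun"])))

# A forgotten closing toggle swallows everything that follows
print(list(classify_gemtext(["```", "ascii art", "a long paragraph"])))
```

With a single missing toggle line, every remaining paragraph ends up classified as preformatted, which is exactly the inversion described above.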
If a mistake can be made, it will be made.
It seems obvious when you are reading texts but, once a line is longer than your screen, that line should be wrapped. The line should continue on the next line. Wrapping is far from trivial if you don’t want to wrap in the middle of words. You need to count characters then, once a threshold has been reached, cut at the nearest word separation.
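For plain text without any ANSI code, Python’s standard textwrap module already does this counting and cutting. A short illustration, with an arbitrary width of 40 columns:

```python
import textwrap

line = ("It seems obvious when you are reading texts but, once a line "
        "is longer than your screen, that line should be wrapped.")

# Cut at word separations so that no piece exceeds 40 characters
wrapped = textwrap.wrap(line, width=40)
for piece in wrapped:
    print(piece)
```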
But remember what I told you about ANSI codes and colours. When you see a coloured line on a terminal, there are some invisible characters. Many of them. So the wrapping is confused and may wrap too soon.
If you have a Gemini title you want to be bold blue, the ANSI sequences to open and close everything will amount to 12 characters. Your wrap will be really short.
Not a huge problem until you start dealing with HTML lines where you can have 4 or 5 links per line. You may start to wrap before displaying any character at all!
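The gap between what the terminal shows and what a naive wrapper counts can be measured directly. The sketch below assumes only colour/style sequences of the form ESC [ numbers m are present. The regular expression and the visible_length() helper are illustrative, not taken from Offpunk or ansiwrap.

```python
import re

# Matches only "ESC [ numbers ; numbers m" sequences (an assumption of this sketch)
SGR_RE = re.compile(r"\x1b\[[0-9;]*m")

def visible_length(text):
    """Length of the text as displayed, ignoring ANSI style sequences."""
    return len(SGR_RE.sub("", text))

title = "\x1b[1;34m# A Gemini title\x1b[0m"

print(len(title))             # what a naive wrapper counts
print(visible_length(title))  # what actually takes space on screen
```

A wrapper that uses len() instead of visible_length() will cut this title earlier than necessary, and the gap grows with every link or style added to the line.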
ANSI modes should carry over to the next line but I observed many oddities and, from what I’ve seen, it is good practice to close all ANSI codes before the end of a line. That makes the wrap even harder: you need to identify all open ANSI sequences, close them and reopen them on the next line.
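The close-and-reopen step can be sketched as follows, on lines that have already been wrapped. This is a simplification and not how python-ansiwrap is implemented: it assumes that only a full reset ("\x1b[0m") ever closes a mode, and close_and_reopen() is a name invented for the example.

```python
import re

SGR_RE = re.compile(r"\x1b\[([0-9;]*)m")
RESET = "\x1b[0m"

def close_and_reopen(lines):
    """Make each line self-contained: reopen pending modes, reset at the end."""
    open_codes = []  # sequences still active when a line starts
    result = []
    for line in lines:
        prefix = "".join(open_codes)
        for match in SGR_RE.finditer(line):
            if match.group(1) in ("", "0"):
                open_codes = []  # a reset closes everything
            else:
                open_codes.append(match.group(0))
        suffix = RESET if open_codes else ""
        result.append(prefix + line + suffix)
    return result

wrapped = ["\x1b[1mbold starts", "and ends\x1b[0m here"]
for line in close_and_reopen(wrapped):
    print(repr(line))
```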
That’s exactly the problem that python-ansiwrap is solving. And they did quite a good job with it.
As I was reading webpages with Offpunk, I realised once that I was missing some really important information. I loaded the page in Firefox and, yes, a whole paragraph, a citation, was missing. What happened? I had to read the source code to realise that the citation was not a text but a picture.
The easy solution to avoid such problems was to render pictures as links. If you wanted to see a picture, you would simply follow the link and the picture would be displayed in your picture viewer (feh, in my case).
But, quickly, I started to open all the pictures, curious to see what they were. Most pictures on the web are useless. They serve to get a sense of "feeling". Like bold and styling, they only serve to attract attention to the content. I’m as guilty as anyone. For years, I tried to find a catchy picture for every article on my blog. This was reinforced by the fact that a link without a picture has no chance to thrive on Twitter or Facebook. It even became some sort of artistic ritual to pair a finished blog post with a totally unrelated picture.
When I discovered the console, more than 20 years ago, I started to play with libcaca, which transforms pictures into ASCII art. I did a little experiment but, at the small size of a website picture, a libcaca rendering does not help at all. It is fun but there’s no way you could recognise anything.
At that point, the universe told me about Chafa. As an aside, I was removing unused packages from my computer when I found one called neofetch. I did an "apt-cache show neofetch" to see what it was and was intrigued by a dependency called "chafa".
Remember that your terminal can display any Unicode character with a custom foreground and background colour? Well, chafa uses that to transform any picture into a configurable block of coloured text. What is really interesting is that most pictures are recognisable. I also found "timg", which is doing exactly the same as chafa albeit slower.
I thought it would be fun to have chafa rendering pictures in Offpunk and tried it. It was just a proof of concept. But I was quickly convinced. It was working so well that I could easily tell if a picture was an important one I should open or a decorative one I could skip. In a word, it was "useful". One more tool to extract information that people are deliberately burying while trying to promote it.
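As a rough idea of what such a proof of concept can look like, the sketch below calls the chafa command-line tool from Python and captures its output. It is an assumption-laden illustration, not Offpunk’s code: the helper names and the default 40x20 size are invented, and the only chafa option used is "--size".

```python
import shutil
import subprocess

def chafa_command(path, width=40, height=20):
    """Build the command line asking chafa for a width x height block of text."""
    return ["chafa", f"--size={width}x{height}", path]

def render_picture(path, width=40, height=20):
    """Return the picture as ANSI text, or None when chafa is not installed."""
    if shutil.which("chafa") is None:
        return None
    result = subprocess.run(chafa_command(path, width, height),
                            capture_output=True, text=True, check=True)
    return result.stdout

print(chafa_command("picture.jpg"))
```

Returning None when chafa is missing lets the caller fall back to displaying the picture as a simple link.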
With version 1.8, Chafa added support for a kind of "kitty-image-protocol", something I don’t understand but which allows the Kitty terminal to display pixel-perfect pictures. Timg already had support for that. Pixel-perfect pictures cannot be integrated into text but this meant that I could display pictures directly in Offpunk. And, yes, it works when you access the picture directly.
For best results, use the Kitty terminal and Chafa 1.10. If you have an older Chafa, also install Timg to have the best of both worlds.
Rewriting All From Scratch
All this hacked-together plate of spaghetti was working quite well but was unmaintainable. The HTML rendering code was pure random guessing. Ansiwrap sometimes gave very strange results and performance was horrible. A 40-character-wide line produced by chafa could contain up to 1000 characters, all of which had to be parsed by ansiwrap.
When reading the Ansiwrap source code, I realised that one hard task was to first identify ANSI codes in a text. Something I could avoid as I was doing it myself.
So, in essence, I was making a text full of ANSI codes then wrapping it with ansiwrap. Could I do it the other way?
I started a new renderer that would save the position of every future ANSI code without inserting them. Then, I would wrap the text, line by line, inserting ANSI codes only after wrapping while being smart enough to close them before the end of the line and reopen them on the next one.
Chafa pictures would be inserted without being wrapped at all.
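The "wrap first, colour later" idea can be sketched in a few dozen lines. This is a simplified illustration under stated assumptions, not the actual Offpunk renderer: styles are stored as hypothetical (start, end, code) spans over the plain text, spans never overlap, pictures are ignored, and a word longer than the width is left on its own line.

```python
RESET = "\x1b[0m"

def wrap_with_offsets(text, width):
    """Greedy word wrap returning (start, end) offsets into text for each line."""
    lines = []
    line_start = None
    line_end = 0
    pos = 0
    for word in text.split(" "):
        start, end = pos, pos + len(word)
        pos = end + 1  # skip the separating space
        if not word:
            continue
        if line_start is None:
            line_start = start
        elif end - line_start > width:
            lines.append((line_start, line_end))
            line_start = start
        line_end = end
    if line_start is not None:
        lines.append((line_start, line_end))
    return lines

def render(text, spans, width):
    """Wrap the plain text, then insert codes so each line closes what it opens."""
    output = []
    for line_start, line_end in wrap_with_offsets(text, width):
        inserts = []  # (position, order, string); closes sort before opens
        for start, end, code in spans:
            s, e = max(start, line_start), min(end, line_end)
            if s < e:  # the span covers part of this line
                inserts.append((s, 1, code))
                inserts.append((e, 0, RESET))
        pieces = []
        cursor = line_start
        for position, _, string in sorted(inserts):
            pieces.append(text[cursor:position])
            pieces.append(string)
            cursor = position
        pieces.append(text[cursor:line_end])
        output.append("".join(pieces))
    return output

text = "read the offpunk manual today"
spans = [(9, 23, "\x1b[34m")]  # "offpunk manual" in blue
for line in render(text, spans, 16):
    print(repr(line))
```

Because the wrapping only ever sees plain text, the invisible characters no longer influence where lines are cut, and there is no need to detect ANSI codes after the fact.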
This was not an easy task and I even added a BETA switch to keep the old rendering engine while I was working on the new one. But, quickly, I realised it was doing a better job. Instead of transforming HTML to ANSI, I was first trying to understand HTML then building a meaningful representation. For example, newparagraph() and newline() are two different functions because they have a different meaning. As an aside, the first rendering engine was adding so many blank lines that I needed to remove every block of more than three blank lines before displaying the result.
So, starting with 1.2, Offpunk has a whole new rendering engine, also used for Gemini, Gopher and RSS. Please test your favourite capsules and websites and report any problems.
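To illustrate why the distinction matters, here is a toy text builder. Only the names newline() and newparagraph() come from Offpunk; the class, its other methods and the exact behaviour are assumptions made for the example.

```python
class TextBuilder:
    """Hypothetical builder keeping a list of output lines."""

    def __init__(self):
        self.lines = [""]

    def add_text(self, text):
        self.lines[-1] += text

    def newline(self):
        """End the current line, even if that produces an empty one."""
        self.lines.append("")

    def newparagraph(self):
        """Leave exactly one blank line before the next text, never more."""
        # Drop trailing empty lines so repeated calls do not pile up
        while self.lines and self.lines[-1] == "":
            self.lines.pop()
        if self.lines:
            self.lines.append("")  # the blank separator line
        self.lines.append("")      # the line that receives the next text

    def get_text(self):
        return "\n".join(self.lines)

builder = TextBuilder()
builder.add_text("First paragraph")
builder.newparagraph()
builder.newparagraph()  # a second call changes nothing
builder.add_text("Second paragraph")
builder.newline()
builder.add_text("same paragraph, next line")
print(builder.get_text())
```

Since a paragraph break is idempotent here, badly nested HTML that opens several block elements in a row cannot produce a pile of blank lines that needs cleaning afterwards.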
Using the renderer without Offpunk
An HTML2ANSI renderer seems very useful and I’m wondering if I should release it as a standalone tool. You can already test it quite easily with the following Python script, placed in the same folder as Offpunk.
#!/usr/bin/env python3
from offpunk import HtmlRenderer

f = open("index.html")
content = f.read()
f.close()
rend = HtmlRenderer(content,"www.test.com")
# display the result in less
rend.display()
# or put it in a variable
body = rend.get_body()
print(body)
Making the Web Readable
I’m proud to say that my first objectives have been met: removing any dependency on python-ansiwrap, improving performance with pictures and making the code more maintainable.
Ansiwrap aside, Offpunk uses two libraries to parse the web. The first is the venerable BeautifulSoup 4 (python-bs4), which does an outstanding job of parsing the HTML. Parsing is the act of transforming a text into a usable computer representation. Rendering is the opposite: transforming the computer representation into something a user can read/watch. The better the parsing, the easier the rendering.
The other is python-readability, which removes unwanted cruft from an HTML page. Python-readability transforms HTML into HTML but with fewer things. Most of the time, it works really well, allowing Offpunk to display only the article that I want to read.
Sometimes, it removes too much. In that case, Offpunk allows bypassing readability with the command "view full" (or "v full"). Starting with 1.2, you can bookmark a page and its mode, so you automatically switch to "full" for some pages. This is done with a very naughty hack (adding "##offpunk_mode=full" at the end of the URL and crossing fingers that no real-life URL contains this string).
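The hack boils down to appending and splitting a marker string. A small sketch, where only the "##offpunk_mode=" marker comes from Offpunk. The function names and the default mode name "readable" are assumptions made for the example.

```python
MODE_MARKER = "##offpunk_mode="

def split_mode(url, default="readable"):
    """Return (clean_url, mode) for a possibly annotated bookmark URL."""
    if MODE_MARKER in url:
        clean_url, mode = url.rsplit(MODE_MARKER, 1)
        return clean_url, mode
    return url, default

def add_mode(url, mode):
    """Annotate a URL with a mode, replacing any previous annotation."""
    clean_url, _ = split_mode(url)
    return clean_url + MODE_MARKER + mode

bookmark = add_mode("https://example.org/article", "full")
print(bookmark)
print(split_mode(bookmark))
```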
Without python-readability, I could not have added web support to Offpunk. It would have been discouraging. But now that I have an improved HTML renderer, I’m starting to wonder if I could make python-readability an optional dependency. Would it be useful for anyone not to install python-readability?
There’s one good reason not to allow running Offpunk without python-readability. With readability enabled, most webpages have between 0 and 20 or 30 links. When running "offpunk --sync", this means making that many requests to prefetch contents and pictures.
But in "full" mode, most webpages have between 100 and 500 links. Most of them completely useless, of course. Using Offpunk to surf the web without readability might be a nightmare.
Starting a Discussion
I created my first website in 1998 and started my still active blog in 2004. For all those years, I was publishing to attract attention, readership. I was looking for some "success". I got some. I published books, was invited to give talks, was offered jobs and was even twice recognised by strangers because of my blog. But it was never enough. It is still not enough. I don’t feel particularly successful.
For the last 20 years, I’ve tried to become a successful writer by doing anything that could attract attention to my writing. In the last years, I’ve been slowly evolving toward writing more, publishing less and doing marketing even less.
By being forced to parse all the greasy cruft that others put around their writing, I was forced to realise how I did (and might still do) the same. How the appearance is detrimental to the content. Anyone suggesting that gemtext should support colours should be forced to write an ANSI parser first. Anyone suggesting that Gemtext formatting is not enough to express correctly what they want to express should open an old book. Most people are trying to work around the fact that writing is hard. Really hard. They hope to share emotions and non-verbal cues through emojis, colours, bold text. The problem is that, unlike the written language, those expressions are not explicit. You don’t convey what you want to convey, you make it less understandable. If you can’t write without colours and emojis, it is probably because you don’t really know what you want to say. That’s why writing is so hard. It is hard but it pays off. When writing with only words, you not only communicate better with others, you also communicate with your present and future self.
« Ce qui se conçoit bien s’énonce clairement et les mots pour le dire viennent aisément. » (Boileau)
I’m still astonished by the emptiness of most successful webpages once displayed in Offpunk. Offpunk made the front page of Hacker News, which led me to break my disconnection temporarily because I was curious about other links with more votes. They were shallow, empty and rarely had more than 200 words. A search about Offpunk on Google that same day revealed that the top pages were mostly computer remixes of the Offpunk README. The web is full of automatic content trying to mimic what has been successful. There were even two YouTube videos in the first Google results, both of which are computer readings of the README while the website is displayed. The real Offpunk website was only on the second page, demonstrating how the web is buried in shitty auto-generated content and how bad the leading search engine is.
I did not receive more emails that day than when I post on this gemlog.
That success everybody is running after is fake, short-lived. Before it is even over, you must already run for the next shot.
With Offpunk, I still can read people I like on the web. I follow their RSS feeds. It makes me feel like I’m still on Gemini or Gopher. It’s funny how, each time I find something deep and meaningful to read on the web, it renders well in Offpunk. It feels a bit like meaning and technical presentation are somehow connected.
I also like to write on Gemini because I know that it can’t be successful. I can write long stuff like this one. It will be read by people that find it interesting. Not many. Maybe none. But if you made it this far, then it certainly had an impact on you. I’ve shared my thoughts with you. You will, consciously or not, process them. You may even one day produce something that will have been impacted by what you’ve read today. I may read you later without knowing that.
This whole post has been influenced heavily by a lot of people in the Gemini space. They made me think. You probably made me think. I’ve evolved thanks to what I’ve read. Sometimes, when reading you, I wanted to reply. But there was no short and easy way to reply. So my reply either got lost, which means it was not important, or got carried with my thinking and crystallised into this post or the following one.
We have started a long, slow and meaningful conversation. The kind of conversation upon which every human breakthrough was conceived. Thank you for participating in it.
This is just the beginning of our conversation.