Felix' Ramblings
 
>> I Hate: The Web

2022.12.20
Procrastinating by Creating This Website

As a professional procrastinator, I always have a list of side projects lying around. One of these items is a website: I always wanted to create a website which serves as my own cheat sheet, because my memory is utter garbage and some of these notes might be helpful to other people as well. Given that I just got some ancient computer up and running which serves as a git-server/NAS I thought: Why not finally host the website on it as well? And well, once one site is set up, I might as well create a second, less-useful site for all of my ramblings which literally no one asked for.

Ok, let's talk websites. I kind of despise how the modern web turned out [0]. To make this brief: I want a static website with some personal styling to make it pretty. Static website generators are a thing. I could have just picked any existing solution, spent a day (or more realistically a week) on configuring it to my liking and then forgotten about it. But, speaking from experience, it usually goes somewhat like this:

  1. Search for various generator-thingies, pick one out
  2. Try to use tutorials which are obsolete
  3. Configuring the thing to my liking straight up does not work
  4. I spend way too much time working around that issue shittily
  5. I hate everything and everyone involved
  6. The result is "meh" and I spent way too much time on it

Here comes the infamous thought: "Well, it can't be that difficult". And that kicked off my ~7 sessions of creating my own static website generator.

The very first step was to create the html/css that I want by hand. Now that I knew what to generate, I created some mock-up input in markdown, simply because I somewhat knew this format and thought it would be good enough for my type of stuff. After I sketched out some folder structure for posts, the coding part started. I ended up with the following structure:

That... doesn't sound particularly efficient, does it? Welp, when doing these projects I learned to do the simplest things first. All this re-walking of folders and trees made the memory allocation and content generation pretty simple.

Full disclosure: After some parsing issues I decided to make things easier for myself by modifying the markdown syntax I use for my sites just a bit. That way I didn't need to implement proper backtracking; instead I just need to do some lookahead. I'll probably still be ironing out bugs from time to time, but the result is usable enough for me already (you are looking at the website after all).
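To make the lookahead idea a bit more concrete, here is a minimal sketch in C. This is not the generator's actual code: the function name convert_emphasis and the "*text*" syntax are assumptions made up for illustration. The point is only that the parser peeks ahead for the closing marker before committing to any output, so it never has to undo anything it already wrote.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from the real generator): converts "*text*"
 * spans in `in` to "<em>text</em>" and copies everything else verbatim.
 * Writes at most cap-1 bytes plus a terminating NUL into `out` and
 * returns the number of bytes written (output is truncated if too long). */
size_t convert_emphasis(const char *in, char *out, size_t cap)
{
    assert(in != NULL && out != NULL && cap > 0);
    size_t n = 0;

    while (*in != '\0') {
        if (*in == '*') {
            /* Lookahead: search for the closing marker BEFORE emitting
             * anything, so there is never output to take back. */
            const char *close = strchr(in + 1, '*');
            if (close != NULL && close > in + 1) {
                size_t len = (size_t)(close - (in + 1));
                if (n + 4 + len + 5 >= cap)
                    break; /* not enough room left: stop early */
                memcpy(out + n, "<em>", 4);
                n += 4;
                memcpy(out + n, in + 1, len);
                n += len;
                memcpy(out + n, "</em>", 5);
                n += 5;
                in = close + 1;
                continue;
            }
            /* No usable closing marker: fall through and treat the
             * '*' as an ordinary character. */
        }
        if (n + 1 >= cap)
            break;
        out[n++] = *in++;
    }
    out[n] = '\0';
    return n;
}
```

Calling convert_emphasis("a *b* c", buf, sizeof buf) fills buf with "a <em>b</em> c", while a lone marker as in "2 * 3" is copied through unchanged.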

Benchmark time! I copied the input of my test post, which is supposedly almost 13 Kilobytes big, 1000 times and ran my generator. 1000 posts is really not a lot, but realistically, it's more posts than I'll ever have.


...

Found file:
Path:         .//1000/input.txt
Website Path: 1000
Title: Test Entry but this time with a very long title. How will it handle it?
Date:  2022.12.07


Converting all found files...
Parse and converting of "about" page...

Writing overview files...

________________________________________________________
Executed in  266.68 millis    fish           external
   usr time  137.05 millis  245.00 micros  136.81 millis
   sys time   84.82 millis   38.00 micros   84.79 millis
...

<300ms? The power of really dumb C code combined with modern computers keeps surprising me. Which makes the whole "why the fuck is this program so slow?" even more frustrating. The best part: This is single-core performance because, again, I did the simplest thing possible. This should be trivially parallelizable (one task -> one thread), but that can wait given the current performance.
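For the curious, "one task -> one thread" could look roughly like the sketch below. It assumes POSIX threads, and the names post_job, convert_post and convert_all are made up; the "conversion" is a stand-in that only counts input bytes, since the real conversion code is not shown here. The jobs share no state, which is what would make this trivial.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical per-post job: one input, one result, nothing shared. */
struct post_job {
    const char *input;  /* text of one post */
    size_t bytes_seen;  /* result slot, written only by this job's thread */
};

/* Stand-in for the real conversion: just counts the input bytes. */
static void *convert_post(void *arg)
{
    struct post_job *job = arg;
    size_t n = 0;
    while (job->input[n] != '\0')
        n++;
    job->bytes_seen = n;
    return NULL;
}

/* Starts one thread per job and waits for all of them.
 * Returns 0 on success, -1 if a thread could not be created or joined. */
int convert_all(struct post_job *jobs, pthread_t *threads, size_t count)
{
    assert(jobs != NULL && threads != NULL);
    size_t started = 0;
    int status = 0;

    for (; started < count; started++) {
        if (pthread_create(&threads[started], NULL,
                           convert_post, &jobs[started]) != 0) {
            status = -1;
            break;
        }
    }
    /* Join only the threads that were actually started. */
    for (size_t i = 0; i < started; i++) {
        if (pthread_join(threads[i], NULL) != 0)
            status = -1;
    }
    return status;
}
```

With 1000 posts a fixed-size pool of worker threads would probably be the saner choice than 1000 threads, but that is exactly the kind of thing that can wait.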

Ok, one last circle-jerk: How much code is it? I know this comparison is unfair, but let's take a look at two other static website generators first: Jekyll and Hugo. Obviously these two projects can do way, way, way more than my little program - but let me have this moment >:(


> cloc jekyll/
     751 text files.
     713 unique files.
      75 files ignored.

github.com/AlDanial/cloc v 1.94  T=0.25 s (2843.5 files/s, 249147.5 lines/s)
--------------------------------------------------------------------------------
Language                      files          blank        comment           code
--------------------------------------------------------------------------------
Markdown                        306           4925              0          19488
Ruby                            183           3615           2878          17167
Cucumber                         28            353             12           4522
YAML                             44            278            124           2203
SCSS                             18            433            235           2126
JavaScript                        5            115              6           1073
HTML                             74             63              9            966
Text                             11             32              0            888
Bourne Again Shell               13             47             42            222
JSON                              4              6              0            199
ERB                              13             56              0            168
CSS                               1             15             11             50
SVG                               3              0              0             32
XML                               1              3              0             29
Dockerfile                        1              7             22             26
CoffeeScript                      3              2              0             15
CSV                               1              0              0              3
PHP                               1              1              0              3
TOML                              1              0              0              2
XHTML                             1              0              0              1
Rmd                               1              0              1              0
--------------------------------------------------------------------------------
SUM:                            713           9951           3340          49183
--------------------------------------------------------------------------------
That's a lot of markdown. I suppose this is for documentation? I can't be bothered to find out to be honest. Apparently "Cucumber" is a programming language, so TIL I guess. Counting Ruby and Cucumber, it's 21k LoC. That sounds pretty reasonable, what about Hugo?

> cloc hugo/
    1646 text files.
    1596 unique files.
     394 files ignored.

github.com/AlDanial/cloc v 1.94  T=2.69 s (594.3 files/s, 89638.0 lines/s)
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
Go                             747          25623          19050         119192
Markdown                       551          11904              0          34177
HTML                           128            252             45          11847
JSON                            11              0              0           6903
CSS                             29            529           1011           5436
SVG                             62              3              7           1640
TOML                            19            274             58           1439
YAML                            11             45             14            537
JavaScript                      15             33             58            192
XML                              7              2              0            147
CSV                              1              1              0            129
Bourne Shell                     8             31             12             51
Dockerfile                       1             14             10             21
Text                             4              1              0             12
SCSS                             1              1              0              6
Sass                             1              1              0              5
-------------------------------------------------------------------------------
SUM:                          1596          38714          20265         181734
-------------------------------------------------------------------------------
120k LoC of Go? Nani the fuck?

You know what? I don't even want to know. Let's move on to my project:


> cloc asswg/
      29 text files.
      23 unique files.
      20 files ignored.

github.com/AlDanial/cloc v 1.94  T=0.02 s (963.2 files/s, 129576.1 lines/s)
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
C                                5            307            112           1915
CSS                              1             53             29            225
C/C++ Header                     4             32              3            170
Text                             4             33              0            164
HTML                             6              0              0             24
Bourne Shell                     3              5              2             20
-------------------------------------------------------------------------------
SUM:                            23            430            146           2518
-------------------------------------------------------------------------------
That's not even 2,000 lines of C code. And it does what I want. Cool. While it would be interesting to compare the runtimes, I, for reasons stated above, don't want to learn these other two projects. Too bad!

Additionally, let's not forget: Programming these things from scratch makes you learn a ton. This time around it was parsing stuff and why you usually tokenize your input first. I never understood why I should spend time writing a tokenizer when I could just work on the input directly. Apparently you can do that, it's just a bit more annoying. I learned this the hard way, but I wouldn't want it any other way.
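For reference, "tokenize first" can be as small as the following sketch. The token kinds and the next_token function are hypothetical and not taken from my generator; the idea is just that the parser afterwards deals with a stream of labelled chunks instead of poking at raw characters.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical token kinds for a tiny markdown-like inline syntax. */
enum tok_kind { TOK_TEXT, TOK_STAR, TOK_UNDERSCORE, TOK_END };

struct token {
    enum tok_kind kind;
    const char *start; /* points into the input, nothing is copied */
    size_t len;        /* length of the token in bytes */
};

/* Reads one token starting at *cursor and advances the cursor past it.
 * At the end of the input it returns a zero-length TOK_END token. */
struct token next_token(const char **cursor)
{
    assert(cursor != NULL && *cursor != NULL);
    const char *p = *cursor;
    struct token t = { TOK_END, p, 0 };

    if (*p == '\0')
        return t;

    if (*p == '*' || *p == '_') {
        /* Marker characters always form a single-character token. */
        t.kind = (*p == '*') ? TOK_STAR : TOK_UNDERSCORE;
        t.len = 1;
    } else {
        /* Everything up to the next marker (or the end) is plain text. */
        t.kind = TOK_TEXT;
        while (p[t.len] != '\0' && p[t.len] != '*' && p[t.len] != '_')
            t.len++;
    }
    *cursor = p + t.len;
    return t;
}
```

Calling next_token in a loop on "a*b" yields TEXT("a"), STAR, TEXT("b") and finally END, so the parsing code only needs to switch on the token kind.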

And now I have my own static website generator: A piece of software that I actually use. That has to come with some bragging rights. If something doesn't work - that's completely my fault and not some weird unspecified behaviour [1]. And for the things which do work, I can feel pretty proud. So I'll call this procrastination project a success.


[0]: The list of ramblings to be done really does fill itself.

[1]: I hope the irony of this sentence in conjunction with the fact that this project is written in C is not lost.


 