<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/">
    <channel>
        <title>Storyden Blog</title>
        <link>https://www.storyden.org/blog</link>
        <description>Storyden is a platform for building communities. A modern take on oldschool bulletin board forums. Designed to be the community platform for the next era of internet culture.</description>
        <lastBuildDate>Mon, 11 May 2026 13:14:15 GMT</lastBuildDate>
        <docs>https://validator.w3.org/feed/docs/rss2.html</docs>
        <generator>https://github.com/jpmonette/feed</generator>
        <language>en</language>
        <image>
            <title>Storyden Blog</title>
            <url>https://www.storyden.org/banner.png</url>
            <link>https://www.storyden.org/blog</link>
        </image>
        <copyright>Barnaby Keene</copyright>
        <item>
            <title><![CDATA[A forum and wiki with API access and an MCP server!?]]></title>
            <link>https://www.storyden.org/blog/forum-wiki-with-api-and-mcp-server</link>
            <guid isPermaLink="false">/blog/forum-wiki-with-api-and-mcp-server</guid>
            <pubDate>Sat, 28 Jun 2025 00:00:00 GMT</pubDate>
            <description><![CDATA[What a crazy concept! But it's not that crazy, it's reality. Storyden now supports API access tokens and, the most exciting part, MCP is coming!]]></description>
            <content:encoded><![CDATA[<blockquote>
<p>TL;DR: Access Keys for easier API integration, fully integrated and secure MCP server on the way soon!</p>
</blockquote>
<p>It took me a while to explore the latest trend. I usually sit back a bit when something new makes the rounds on X, Hacker News and all the other usual places.</p>
<p>To be fair, this may have been my downfall. I attempted to build a GPT-3 powered SaaS with a friend way back in 2020; ultimately we lost interest and I colossally failed to predict the explosion of GPTs in the coming years.</p>
<p>That being said, <a href="https://github.com/Southclaws/storyden/tree/model-context-protocol">there was an attempt</a> a few months ago to experiment with MCP. Back then, there were very few actual tools to work with MCP so I&#39;ve revisited it this week and I&#39;m quite excited!</p>
<h2>Guess we&#39;re doin AI now?</h2>
<p>Not quite. Storyden always aims to support the most minimal production-ready deployment possible. You can run it right now on your server with zero external dependencies. No OpenAI API keys, no PostgreSQL, no Redis. Everything is baked in, sane defaults, get going and enhance as you wish:</p>
<pre><code>docker run -p 8000:8000 ghcr.io/southclaws/storyden
</code></pre>
<p>If you&#39;re not interested in AI at all, no worries! It&#39;ll never grace your Storyden deployment unless you ask it to. (and provide an OpenAI API key of course...)</p>
<h2>The <code>Bearer</code> of Good News</h2>
<p>In short: Storyden finally gets <code>Authorization: Bearer</code> support! This has been on the list since almost the beginning. Until now, the only way to talk to the API was with a <code>Cookie</code> header (which was recently reworked to be a much more secure stateful token, rather than a JWT-esque stateless token).</p>
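<p>As a rough illustration of what this looks like from a client, here is a minimal Go sketch that sends an access key in the <code>Authorization</code> header. The endpoint path, the token value and the stand-in test server are all hypothetical placeholders, not Storyden&#39;s real API surface; check the API documentation for the actual routes.</p>
<pre><code class="language-go">package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// newBearerRequest builds a GET request carrying the access key as a
// standard "Authorization: Bearer" header.
func newBearerRequest(url, token string) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return nil, err
	}
	req.Header.Set("Authorization", "Bearer "+token)
	return req, nil
}

func main() {
	// Stand-in server (NOT Storyden): accepts one hard-coded, made-up token.
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.Header.Get("Authorization") != "Bearer example-key" {
			w.WriteHeader(http.StatusUnauthorized)
			return
		}
		fmt.Fprint(w, "ok")
	}))
	defer srv.Close()

	// "/api/example" is a hypothetical path used only for this demo.
	req, err := newBearerRequest(srv.URL+"/api/example", "example-key")
	if err != nil {
		panic(err)
	}
	res, err := http.DefaultClient.Do(req)
	if err != nil {
		panic(err)
	}
	defer res.Body.Close()

	body, _ := io.ReadAll(res.Body)
	fmt.Println(res.StatusCode, string(body))
}
</code></pre>
<p>Because the key is just a header, the same request works from any HTTP client or automation tool that lets you set headers.</p>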
<h3>Access Keys</h3>
<p>These work like most other SaaS apps. You can create a &quot;Personal access key&quot; which gives you a token. It can be revoked, expire and all that good stuff.</p>
<p><img src="/blog/access-keys.png" alt="Access keys UI"></p>
<p>Not every member of a Storyden instance may do this. Administrators must first issue a role <a href="/docs/introduction/members/permissions">with the permission <code>USE_PERSONAL_ACCESS_KEYS</code></a> before a member can issue keys for their account.</p>
<p>Access Keys are essentially just another authentication method for an account. This means they inherit all the roles and permissions of the owner.</p>
<p>Which may have you asking, how do I implement a principle of least privilege?</p>
<h3>Bot Accounts</h3>
<p>Since permissions are simply bound to an account, and access keys simply provide access on behalf of an account, then creating scoped access is as simple as just creating another account!</p>
<p>&quot;Accounts&quot; in Storyden are lightweight, not bound to an identity such as an email or phone number. It was a conscious decision to use the word &quot;Account&quot; and not &quot;User&quot; for this reason.</p>
<p>In a future version, members with sufficient permissions will be able to create bot accounts with access keys. This feature will power new agentic workflows such as:</p>
<ul>
<li>Automated content moderators</li>
<li>Shared link indexing from community platforms such as Discord</li>
<li>Organising bots for tidying up wiki pages</li>
</ul>
<p>And pretty much anything you can think of by composing together MCP servers with your favourite agentic framework like Pydantic AI or workflow engine such as n8n.</p>
<h2>Go forth and build!</h2>
<p>This marks an exciting near-future for Storyden. Access keys provide a secure and easy way to build integrations.</p>
<p>Bot accounts prepare the platform for an agentic future.</p>
<p>What&#39;s next?</p>
<p>I&#39;m not sure but I&#39;ve heard it involves WebAssembly sandboxed plugins... 👀</p>
]]></content:encoded>
            <author>Barnaby Keene</author>
        </item>
        <item>
            <title><![CDATA[Effective content chunking for vector embedding large text documents]]></title>
            <link>https://www.storyden.org/blog/effective-content-chunking-for-llm-vector-embedding</link>
            <guid isPermaLink="false">/blog/effective-content-chunking-for-llm-vector-embedding</guid>
            <pubDate>Sat, 17 May 2025 15:24:00 GMT</pubDate>
            <description><![CDATA[Why Storyden chunks content, and how it turns rich text into useful structured data for AI-driven features like search and summarization.]]></description>
            <content:encoded><![CDATA[<p>Every piece of written content inside Storyden, be it a post, a page, a submission or a reply, flows through a single system: the <code>Content</code> type. It’s not just a blob of HTML. It’s a full-featured data structure that’s aware of links, references, summaries and chunks. And all of this is essential for ergonomic APIs as well as helping language models work well with the data.</p>
<p>Whether you’re building a recommendation engine, semantic search, or adding context-aware responses, you’ll need to treat raw content as more than just a string. This post explores how Storyden approaches this challenge, and why it does things the way it does.</p>
<h2>Text Is Not Enough</h2>
<p>A lot of content systems store HTML in a database. WordPress does this, and so does Storyden. HTML is portable, standardised and very well understood. However, on its journey from fingers to file, Storyden does a bunch of processing to better understand the actual content structure to do some useful stuff with it.</p>
<p>This &quot;useful stuff&quot; facilitates features like:</p>
<ul>
<li>Sanitise, because you can&#39;t trust the client, XSS is still a thing!</li>
<li>Run semantic search (though I&#39;m still skeptical of how useful this is...)</li>
<li>Summarize content for previews, cards, <code>&lt;title&gt;</code>, opengraph, etc.</li>
<li>Detect both external and internal links</li>
<li>Generate embeddings for an LLM to use in Retrieval-Augmented Generation</li>
</ul>
<h3>Sanitisation: Safety first</h3>
<p>Before anything else, raw input is passed through a sanitiser. This strips out unsafe tags or attributes, but still allows the Storyden-specific URI scheme, <code>sdr:</code>, which is used for internal reference links between pages, posts, members and more (you can read more about that <a href="https://www.storyden.org/docs/introduction/content/references">here!</a>)</p>
<p>This approach lets members write, <code>POST</code> or paste rich content (even from suspicious sources) without worrying about <code>&lt;script&gt;</code> tags, malicious inline styles, <code>onclick</code>, etc.</p>
<h3>Structure over noise</h3>
<p>After sanitising, the content is parsed into a structured tree via Go&#39;s <code>html</code> package. This isn&#39;t used for rendering or changing the content, but it <em>is</em> useful for extracting things like:</p>
<ul>
<li>External links: external URLs so that Storyden can index them in the <a href="/docs/introduction/links">Link Aggregator</a></li>
<li>Internal links, or <a href="/docs/introduction/content/references">&quot;references&quot;</a>: <code>sdr:</code> URIs (e.g. <code>sdr://thread/xyz123</code>)</li>
<li>Media: image sources (and maybe videos one day? 👀)</li>
<li>Plaintext: the raw text content, useful for LLM-based summarisation (another thing I’m skeptical about in terms of usefulness, tbh)</li>
<li>Short Summary: a preview-friendly summary, capped at ~128 characters</li>
<li>Chunks: pieces of text with loosely defined boundaries</li>
</ul>
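<p>To make the &quot;Short Summary&quot; item above a bit more concrete, here is one way such a cap could be applied without cutting a word in half. This is a sketch under my own assumptions, not Storyden&#39;s actual implementation; the function name <code>shortSummary</code> and the word-boundary rule are made up for illustration.</p>
<pre><code class="language-go">package main

import (
	"fmt"
	"strings"
	"unicode/utf8"
)

// shortSummary returns as many whole words of text as fit within max
// characters (counted as runes, so multi-byte characters are not split).
func shortSummary(text string, max int) string {
	out := ""
	for _, w := range strings.Fields(text) {
		candidate := w
		if out != "" {
			candidate = out + " " + w
		}
		if utf8.RuneCountInString(candidate) > max {
			break
		}
		out = candidate
	}
	if out == "" {
		// Empty input, or a first word longer than max: fall back to a
		// hard cut on a rune boundary.
		r := []rune(strings.TrimSpace(text))
		if len(r) > max {
			r = r[:max]
		}
		return string(r)
	}
	return out
}

func main() {
	// A small cap is used here so the effect is visible; the post mentions
	// roughly 128 characters.
	fmt.Println(shortSummary("Storyden is a platform for building communities.", 20))
}
</code></pre>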
<h2>Chunking</h2>
<p>or: why you clicked this post in the first place probably.</p>
<p>One of the most important things the <code>Content</code> type does is chunking: breaking large blocks of text into smaller, semantically coherent units.</p>
<p><img src="/blog/chunking.png" alt="A diagram of a large HTML document getting split into small plain-text chunks"></p>
<h3>Why do this?</h3>
<p>Firstly, large language models operate within context limits and when you want to run inference using a piece of content, it&#39;s not always desirable to put the <em>entire</em> piece of content into the context window. Sometimes this is useful, like when I finish this post I might paste the entire thing into a GPT to proof-read it. But for other use-cases it&#39;s not going to yield the best results.</p>
<p>Secondly, and probably more importantly, the coordinates you get from vector embeddings become less and less localised the larger the text is. This isn&#39;t <em>always</em> the case technically: if you write 10 paragraphs about how cats enjoy lying in the sun, it will probably localise fairly well to a specific region in vector space. But people don&#39;t write like that. Forum users and directory curators don&#39;t write like that. Human writing bounces around, starting in one place and ending somewhere else.</p>
<p>So because semantic meaning matters, chunking needs to be aware of a few requirements:</p>
<ul>
<li>Chunk boundaries cannot be mid-sentence</li>
<li>Chunks must be small enough to represent a fairly self-contained <em>unit of meaning</em>.</li>
<li>However, leniency must be allowed for longer spans of text, because it&#39;s humans behind the keyboard and humans are creative!</li>
</ul>
<h3>How does it work?</h3>
<p>Storyden uses a hybrid approach to chunking: first it breaks the document down into high-level blocks, then it goes through each block to split it further.</p>
<p>First, it walks the HTML tree for paragraph-style elements like <code>&lt;p&gt;</code>, <code>&lt;h2&gt;</code>, <code>&lt;blockquote&gt;</code>, etc. to get a list of root level blocks, think paragraphs, headings, code blocks, quotes, etc.</p>
<pre><code class="language-go">func (c Content) Split() []string {
	r := []html.Node{}

	// first, walk the tree for the top-most block-content nodes.
	var walk func(n *html.Node)
	walk = func(n *html.Node) {
		if n.Type == html.ElementNode {
			if /* omitted for brevity: n.DataAtom is a top-level block element */ {
				r = append(r, *n)
				return // return, as we don&#39;t want to recurse into the tree.
			}
		} else if n.Type == html.TextNode {
			r = append(r, *n)
		}

		for c := n.FirstChild; c != nil; c = c.NextSibling {
			walk(c)
		}
	}
	walk(c.html)

	chunks := chunksFromNodes(r, roughMaxSentenceSize)

	return chunks
}
</code></pre>
<p>Then, <code>chunksFromNodes</code> recursively splits them again down to a rough sentence-size boundary (<code>roughMaxSentenceSize</code> which is 350 characters, based on research of the English language*). Boundaries are currently English/Latin-based only*, using basic terminal-punctuation (periods, exclamation, question, etc.)</p>
<blockquote>
<p>* Yes, this is all English or European-Latin only currently; see the conclusion for discussion and further research.</p>
</blockquote>

<pre><code class="language-go">func chunksFromNodes(ns []html.Node, max int) []string {
	chunks := []string{}

	for _, n := range ns {
		t := textfromnode(&amp;n)
		if len(t) &gt; max {
			chunks = append(chunks, splitearly(t, max)...)
		} else {
			chunks = append(chunks, t)
		}
	}

	return chunks
}
</code></pre>
<p>The core paragraph-level splitting is done in <code>splitearly</code>, which is applied when the first-pass chunk is too long (true in most cases.)</p>
<p>Leniency is applied by doubling the boundary (700 characters) and walking down until a punctuation boundary is found. This allows for larger chunks if someone wrote a very long paragraph.</p>
<p>In a worst case scenario (no boundaries were found, maybe you&#39;re discussing very long chemical names?) the last found space is used, or failing that, the upper boundary at position 700.</p>
<pre><code class="language-go">
func splitearly(in string, max int) []string {
	var chunks []string
	var split func(s string)
	split = func(s string) {
		if len(s) &lt;= max {
			chunks = append(chunks, strings.TrimSpace(s))
			return
		}

		upper := min(len(s), max) - 1
		if upper == -1 {
			// reached end of input stream
			return
		}

		lower := upper / 2
		boundary := upper
		fallback := -1
	outer:
		for ; boundary &gt; lower; boundary-- {
			c := s[boundary]
			switch c {
			// very rudimentary sentence boundaries (latin only at the moment)
			case &#39;.&#39;, &#39;;&#39;, &#39;!&#39;, &#39;?&#39;:
				break outer
			// worst case: no boundaries found, use the closest space
			case &#39; &#39;:
				if fallback == -1 {
					fallback = boundary
				}
			}
		}

		if boundary &lt;= lower {
			if fallback &gt; -1 {
				// worst case: no sent boundaries, split at fallback position.
				boundary = fallback
			} else {
				// worst case: no fallback either (the input string was a solid
				// block of text with no spaces or sentence boundaries.)
				boundary = upper
			}
		}

		left := strings.TrimSpace(s[:boundary])
		right := strings.TrimSpace(s[boundary+1:])
		chunks = append(chunks, left)

		if len(right) &gt; 0 {
			split(right)
		}
	}
	split(in)

	return chunks
}
</code></pre>
<p>The result is a list of pretty well-formed, meaningful text chunks which are now close to perfect for vector embedding. For example, this very post that you&#39;re reading, when run through the chunking algorithm, yields these first 5 chunks:</p>
<pre><code>Every piece of written content inside Storyden, be it a post, a page, a submission or a reply, flows through a single system: the Content type. It’s not just a blob of HTML. It’s a full-featured data structure that’s aware of links, references, summaries and chunks
</code></pre>
<pre><code>And all of this is essential for ergonomic APIs as well as helping language models work well with the data.
</code></pre>
<pre><code>Whether you’re building a recommendation engine, semantic search, or adding context-aware responses, you’ll need to treat raw content as more than just a string. This post explores how Storyden approaches this challenge, and why it does things the way it does.
</code></pre>
<p>(Note, this heading is probably not a useful extraction of semantic value now that I think about it... to resolve this, I&#39;d remove headings from that initial root element gathering step at the start.)</p>
<pre><code>Text Is Not Enough
</code></pre>
<pre><code>A lot of content systems store HTML in a database. WordPress does this, and so does Storyden. HTML is portable, standardised and very well understood. However, on its journey from fingers to file, Storyden does a bunch of processing to better understand the actual content structure to do some useful stuff with it.
</code></pre>
<p>This means when you ask a question or use other LLM-powered features on Storyden, it can search the coordinates of each paragraph of each post or page, rather than the much more vague and &quot;averaged&quot; embedding of entire documents.</p>
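<p>To illustrate what &quot;searching the coordinates&quot; means in practice, here is a toy sketch of a nearest-chunk lookup using cosine similarity. The three-dimensional vectors are invented for the example (real embeddings come from a model and are far larger), and the <code>chunk</code>, <code>cosine</code> and <code>nearest</code> names are hypothetical; this is not how Storyden&#39;s Semdex is implemented.</p>
<pre><code class="language-go">package main

import (
	"fmt"
	"math"
)

// chunk pairs a piece of text with its embedding vector. The vectors used
// below are made up; real embeddings have hundreds of dimensions.
type chunk struct {
	text   string
	vector []float64
}

// cosine returns the cosine similarity of two equal-length vectors.
func cosine(a, b []float64) float64 {
	var dot, na, nb float64
	for i := range a {
		dot += a[i] * b[i]
		na += a[i] * a[i]
		nb += b[i] * b[i]
	}
	if na == 0 || nb == 0 {
		return 0
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

// nearest returns the chunk whose vector is most similar to the query.
func nearest(query []float64, chunks []chunk) chunk {
	best, bestScore := chunk{}, math.Inf(-1)
	for _, c := range chunks {
		if s := cosine(query, c.vector); s > bestScore {
			best, bestScore = c, s
		}
	}
	return best
}

func main() {
	chunks := []chunk{
		{"Cats enjoy lying in the sun.", []float64{0.9, 0.1, 0.0}},
		{"HTML is portable and standardised.", []float64{0.0, 0.2, 0.9}},
	}
	query := []float64{0.8, 0.2, 0.1} // pretend embedding of a question about pets
	fmt.Println(nearest(query, chunks).text)
}
</code></pre>
<p>With one vector per chunk rather than one per document, the best match points at a specific paragraph instead of a whole post.</p>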
<h2>Designed for RAG, but useful elsewhere</h2>
<p>A lot of modern RAG systems use chunking, with various different algorithms (though, at the time of building this in 2023/2024 there were not a lot of resources available on the topic discussing different approaches, especially for rich HTML trees.)</p>
<p>And while chunking is primarily done for semantic search (almost useless) and Retrieval-Augmented Generation (boring), it has knock-on benefits across Storyden:</p>
<ul>
<li>Summary descriptions: for OpenGraph cards and <code>&lt;title&gt;</code> tags.</li>
<li>Recommendations: Embedding at a more granular level allows more sophisticated recommendation algorithms.</li>
<li>Filtering: when building context for a prompt, you can more easily discard irrelevant chunks using metadata.</li>
</ul>
<h2>What about internationalisation?</h2>
<p>In short, it&#39;s hard. I&#39;m an NLP nerd and I wrote my thesis on it while working for a company doing lots of NLP analysis of Ministry of Defence documents. It was hard back then with just English, and we used tools like <a href="https://spacy.io/">SpaCy</a>. Language models do make some things easier but there still exists the fundamental problem of data pre-processing, which is key to training models and sometimes even necessary when using models like GPTs.</p>
<p>Much of NLP at that time was very procedural, using dictionaries and lookup tables of word types, stopwords, stemming, sentence-splitting, etc. I&#39;m not sure how the industry has changed now but at that time, it was <em>very</em> manual in terms of procedural code running over text. There aren&#39;t many tricks you can use with language, especially English. Languages are messy, a product of ever evolving cultures with new words, grammatical structures, cases, slang and other elements popping up all the time. What I&#39;ve done here <em>may</em> work with <em>some</em> European languages but it definitely will not work as well with Persian, Arabic, Korean, Urdu, etc.</p>
<p>The challenge isn&#39;t just in the boundary markers, sentence size and characters. It can go deeper, for example some languages don’t use spaces to separate words at all, even the concepts of “paragraph” and “sentence” aren’t universal. And then there are languages like German, somewhat fusional/agglutinated, where a single sentence can contain what feels like an entire essay thanks to compound nouns and nested clauses. Or fully agglutinated languages like Turkish.</p>
<p>A solution that&#39;s multi-language would probably need to be a lot more declarative and less procedural.</p>
<h2>Tasty chocolate chunks</h2>
<p>This whole system might seem like a lot of complexity but language models are no different to classic artificial intelligence or NLP: your success depends on the quality of the input data. Chunking in such a way that&#39;s somewhat semantically aware of the structure (not hard-cutting mid sentence, etc) yields better results in the (very informal and unscientific) benchmarks I&#39;ve run.</p>
<p>It also turns out splitting HTML is quite complex due to the different element types, leniency of HTML itself, and also just because the Go <code>html.Node</code> type is hella awkward to work with (but very powerful!)</p>
<p>If you’ve got a forum, directory, wiki, or anything that revolves around lots of human-written content, and you want to add actual intelligence on top of it, this approach will get you far.</p>
<p>You can try this out right now:</p>
<pre><code>docker run -p 8000:8000 ghcr.io/southclaws/storyden
</code></pre>
<p>Or <a href="https://www.storyden.org/docs/introduction#quickstart">check the getting started documentation.</a> Note: in order to enable LLM features (they are aggressively opt-in, as it&#39;s not for everyone) you must enable the Semdex (semantic index) by:</p>
<ul>
<li>providing a vector database - for quick testing you can use Storyden&#39;s embedded vector database, Chromem (<a href="https://www.storyden.org/docs/operation/configuration#local-semdex">read more</a>)</li>
<li>providing a language model - for now, OpenAI is the only supported provider (<a href="https://www.storyden.org/docs/operation/configuration#local-semdex">read more</a>)</li>
</ul>
<p>If you&#39;re interested in checking out how it works, you can read the <a href="https://github.com/Southclaws/storyden/blob/main/app/resources/datagraph/content.go">code</a> and <a href="https://github.com/Southclaws/storyden/blob/main/app/resources/datagraph/content_test.go">tests</a> on GitHub.</p>
<p>I hope this article was helpful, spread the word if you enjoyed it!</p>
]]></content:encoded>
            <author>Barnaby Keene</author>
        </item>
        <item>
            <title><![CDATA[Node properties and the EAV pattern]]></title>
            <link>https://www.storyden.org/blog/node-properties-eav-pattern</link>
            <guid isPermaLink="false">/blog/node-properties-eav-pattern</guid>
            <pubDate>Sat, 10 May 2025 11:25:01 GMT</pubDate>
            <description><![CDATA[A technical deep dive on Storyden's library node properties.]]></description>
            <content:encoded><![CDATA[<p>Evolving past the <a href="/blog/power-of-community-knowledgebase">forum roots</a> and building a wiki style knowledge base feature (which was actually apparently part of the plan since the start, which I <a href="https://github.com/Southclaws/storyden/issues/1#issuecomment-1300032121">recently discovered</a>) led to the <a href="/docs/introduction/library">Library</a> materialising as a core, almost flagship, feature of Storyden.</p>
<p>MediaWiki has this concept of &quot;info boxes&quot; which display basic attributes about the topic. They are present on almost every medium to large Wikipedia page and often link out to broader category pages such as locations, years, genres, styles, people, companies, etc.</p>
<p><img src="/blog/infobox.png" alt="A MediaWiki infobox"></p>
<p>These are essentially relations, in a big graph. However, the way MediaWiki implements them is just another piece of content on the page. If I click &quot;dreampunk&quot; in the above infobox for British-American vaporwave duo, ２８１４ (<a href="https://open.spotify.com/playlist/5oYeYuy5ehbf2nadYFKFM1?si=f730d6be83a54e3d">to which</a> much of the Storyden code was written) I land on the Wikipedia page for the dreampunk music genre, and on that page is a backlink to ２８１４ somewhere in the content.</p>
<p><img src="/blog/rel-oneway.png" alt="forward links"></p>
<p>Where this breaks down a bit is when I click the &quot;ambient&quot; link and land on the Wikipedia page for the ambient genre. There is no backlink to ２８１４. This is because those relationships are defined at the hypermedia layer. If I wanted to build a graph analysing the relationships between ２８１４ and their associated genres, similar acts, related people, etc. I would need to parse the content itself as the underlying relationships are not expressed in any other way.</p>
<p>Finding all the artists under &quot;ambient&quot; would not be possible solely from the &quot;ambient&quot; Wikipedia page; I would need to essentially scan every single Wikipedia page that exists and filter for those that have &quot;ambient&quot; in their &quot;Genres&quot; infobox.</p>
<p><img src="/blog/rel-backlink.png" alt="backlinks"></p>
<p>Those of you who know relational database query planners would identify this as a &quot;full table scan&quot; as opposed to an index scan.</p>
<p>It&#39;s worth noting that this observation is not a <em>problem</em> for Wikipedia. Its purpose is not to perform analytical processes on the knowledge graph but to provide a free and open source of crowd-sourced and fact-checked information, one of the most important endeavours of our modern society.</p>
<p>For Storyden, we wanted to avoid having relationships buried in free text and instead make them first-class data citizens.</p>
<h2>Entity-Attribute-Value</h2>
<p>(not to be confused with Entity-Component System)</p>
<p>Storyden&#39;s goal is to make a community&#39;s collective knowledge organised, searchable and discoverable. Whether that&#39;s through discussion, curation or collection.</p>
<p>(the precursor to this was an indie fashion directory called Threadbase I started to build in 2018, but that&#39;s a story for another day!)</p>
<p>This made a relation graph an attractive concept to build in, something that did not exist in most &quot;wiki&quot; platforms. A big inspiration was Notion&#39;s &quot;database&quot; feature, where pages in the tree can exist within a structured table where attributes of the page itself become columns in that table. Essentially creating a very user-friendly relational database.</p>
<p><img src="/blog/node-properties.png" alt="An example of a page&#39;s properties table in Storyden"></p>
<p>So how do you implement a relational database inside a relational database?</p>
<p>There are two ways to do this:</p>
<ol>
<li><p>Actually just surface the relational database itself as an API.</p>
<p>This approach means that when you add a property to a page the application runs an <code>alter table add column</code> command against the database. Your database table structure <em>is</em> the user&#39;s interface into the properties and relations within the content itself.</p>
</li>
<li><p>Implement an entity-attribute-value pattern</p>
<p>The approach Storyden takes, where an additional table stores property names and values which are then related to the actual pages.</p>
</li>
</ol>
<p>Notion seems to use a hybrid of both approaches where SQLite acts as a relational cache with a &quot;real&quot; schema, then the cloud persistence implements some flavour of EAV. When you load a Notion page, the property queries run on the fast SQLite instance after the bulk of the data is loaded from the cloud store.</p>
<p>Storyden is much simpler and just has one database: SQLite or PostgreSQL, whichever floats your boat. And on-the-fly schema modifications sound complex and could make migrations a nightmare. So I opted for EAV.</p>
<p><img src="/blog/node-eav.png" alt="the basic structure of Storyden&#39;s EAV schema"></p>
<p>Now, while the EAV pattern offers flexibility without needing to keep tabs on the underlying database schema, it comes with tradeoffs. Both of these databases are heavily optimized for relational queries over fixed-column schemas, where indexes, statistics, and query planners can make efficient decisions based on known, static table structures.</p>
<p>With EAV, the key-value nature of the data model complicates what would normally be a simple column filter or join into multi-table joins and lookups.</p>
<p>For example, to sort a set of nodes that represent companies by their <code>founded_year</code>, the database can’t use a direct index scan on a <code>founded_year</code> column. It needs to join the <code>properties</code> table to find the correct key and then filter or sort on the resulting rows. This makes it difficult for the query planner to optimize because the database cannot prebuild indexes across what are effectively row-based dynamic fields.</p>
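<p>The same shape shows up if you model EAV rows in memory. The sketch below is purely illustrative: the <code>property</code> struct is a simplified stand-in (it stores the field name inline rather than a field ID), and <code>sortByField</code> is a made-up helper, not Storyden code. It shows the extra lookup pass that a plain column would not need.</p>
<pre><code class="language-go">package main

import (
	"fmt"
	"sort"
	"strconv"
)

// property is one EAV row: which node, which field, and a text value.
// Simplified for illustration; it is not Storyden's actual table layout.
type property struct {
	nodeID string
	field  string
	value  string
}

// sortByField orders node IDs by the integer value of one field. Note the
// extra pass over the rows to pick out that field first: this is the "join"
// a plain column would not need. Nodes lacking the field (or holding a
// non-numeric value) are left out. Assumes one value per node per field.
func sortByField(rows []property, field string) []string {
	values := map[string]int{}
	ids := []string{}
	for _, p := range rows {
		if p.field != field {
			continue
		}
		n, err := strconv.Atoi(p.value)
		if err != nil {
			continue
		}
		values[p.nodeID] = n
		ids = append(ids, p.nodeID)
	}
	// Ascending order: i sorts before j when j's value is greater.
	sort.Slice(ids, func(i, j int) bool { return values[ids[j]] > values[ids[i]] })
	return ids
}

func main() {
	rows := []property{
		{"acme", "founded_year", "1999"},
		{"acme", "country", "UK"},
		{"globex", "founded_year", "1987"},
		{"initech", "country", "US"},
	}
	fmt.Println(sortByField(rows, "founded_year")) // prints [globex acme]
}
</code></pre>
<p>Every value is text, so it has to be parsed before it can be compared, which is the same data type problem the API currently faces.</p>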
<p>As with any technical decision, there are compromises. I chose EAV because it was (somewhat) easier to implement <em>for now</em>. I am but a sole developer and this product is not a money-maker, I don&#39;t have a team of people smarter than me (that would be great!) so I&#39;ve chosen a dumb solution. If I had chosen a dynamic schema approach, it would have increased the testing surface area massively, and a bug in that kind of system carries a higher risk. I chose low risk at the cost of slightly reduced performance.</p>
<h2>Properties today</h2>
<p>As of this post, the API is almost fully implemented, it only lacks data type implementations (which is a challenge in and of itself, given every cell is just a <code>text</code> type) so if you&#39;re a user of the API only, you can take advantage of this now!</p>
<p>The Storyden frontend currently exposes properties as a basic table on Library Pages. Table views, filtering and other features are on the near-term roadmap so keep an eye out!</p>
<h2>What&#39;s next?</h2>
<p>Now that this feature has landed in the API side, I have big plans for the knowledgebase side of Storyden&#39;s product offering. This unlocks:</p>
<ul>
<li>Database tables, like Notion but social!</li>
<li>Big directories that are easy to navigate, filter and search</li>
<li>Pre-filtered views of database nodes, referenced in other pages</li>
</ul>
<p>Who is this for? Some ideas themed on early adopter feedback sessions:</p>
<ul>
<li>Video game communities who want to keep track of item stats in a structured way</li>
<li>Curators who want to maintain a community contributed directory of resources</li>
<li>Gear nerds who want to catalogue their favourite tools, devices, etc.</li>
</ul>
<h2>Technical overview</h2>
<p>If you came here for the details, here&#39;s how Node Properties are implemented.</p>
<p>Library Pages are a tree structure, so internally they are called &quot;Library Nodes&quot;. Being a tree structure, this means each node may have many children. When properties come into play, this means all children of a given node <em>must</em> share the same set of possible properties. Properties are organised into &quot;property schemas&quot; to ensure this fact.</p>
<p>In data modelling terms, this means that for every group of nodes with an identical <code>parent_node_id</code>, the <code>property_schema_id</code> must also be identical. So all the nodes with some parent <code>A</code> must also use the same schema <code>X</code>.</p>
<p>The schema itself may have fields. Fields are defined once to save space and make changing field names or types easy. This means that a node has at most one schema and that schema has many fields.</p>
<p>Property values are stored separately from the underlying property schema, because each node in a set will have many property values, each value maps to a schema field. This means that a node has zero or many property values but always zero or one schema.</p>
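<p>The sibling rule can be expressed as a small check. The following is a sketch with hypothetical Go types, not the real Storyden model: empty strings stand in for <code>NULL</code>, and I have assumed root nodes (no parent) are exempt from the check, which the post does not state explicitly.</p>
<pre><code class="language-go">package main

import "fmt"

// node mirrors the two columns discussed above. Empty strings stand in for
// NULL (a root node has no parent; a node may have no schema). These Go
// types are illustrative only.
type node struct {
	id       string
	parentID string
	schemaID string
}

// checkSiblingSchemas reports the first node whose schema differs from a
// sibling's, i.e. a violation of "same parent implies same schema".
func checkSiblingSchemas(nodes []node) error {
	schemaByParent := map[string]string{}
	for _, n := range nodes {
		if n.parentID == "" {
			continue // assumption: root nodes are not compared with each other
		}
		seen, ok := schemaByParent[n.parentID]
		if !ok {
			schemaByParent[n.parentID] = n.schemaID
			continue
		}
		if seen != n.schemaID {
			return fmt.Errorf("node %q uses schema %q but its siblings use %q", n.id, n.schemaID, seen)
		}
	}
	return nil
}

func main() {
	nodes := []node{
		{id: "sword", parentID: "items", schemaID: "X"},
		{id: "shield", parentID: "items", schemaID: "X"},
		{id: "potion", parentID: "items", schemaID: "Y"}, // breaks the rule
	}
	fmt.Println(checkSiblingSchemas(nodes))
}
</code></pre>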
<pre><code>erDiagram
    NODES {
        TEXT id PK
        TEXT parent_node_id FK
        TEXT property_schema_id FK
        TEXT sort
    }
    PROPERTY_SCHEMAS {
        TEXT id PK
    }
    PROPERTY_SCHEMA_FIELDS {
        TEXT id PK
        TEXT name
        TEXT type
        TEXT sort
        TEXT schema_id FK
    }
    PROPERTIES {
        TEXT id PK
        DATETIME created_at
        TEXT value
        TEXT node_id FK
        TEXT field_id FK
    }

    NODES ||--o{ PROPERTIES : has
    PROPERTY_SCHEMAS ||--o{ PROPERTY_SCHEMA_FIELDS : defines
    NODES }o--|| PROPERTY_SCHEMAS : uses
    PROPERTIES }o--|| PROPERTY_SCHEMA_FIELDS : for
</code></pre>
<p>Other than the schema itself, properties are quite loose, a set of children may hold a subset of property values. For example, given 3 nodes under &quot;Items&quot; only one or two of those nodes may have a property value for &quot;Weight&quot;.</p>
<h3>Some query use-cases</h3>
<blockquote>
<p>These excerpts from the Storyden source code are dated 10th of May 2025. In the event these change, check out the <a href="https://github.com/Southclaws/storyden">latest source code</a>.</p>
</blockquote>

<p>An API consumer will care about a few different perspectives of this data structure. For example, you may want to get a single node without its children and see its &quot;child node schema&quot;. This would be the schema that all child nodes of that node share. This &quot;child node schema&quot; is not actually <em>stored</em> on the parent node itself because root nodes have no parent and this would restrict the ability for root level nodes to have properties. For this reason, the schema itself is referenced by the nodes directly, allowing root level nodes to hold a schema and properties.</p>
<h4>Querying property schemas of nodes</h4>
<p>To solve this use-case, the siblings and the parent are queried to gather all fields. So if the parent node also has a schema, you get both in one query:</p>
<pre><code class="language-sql">with
  sibling_properties as (
    select
      ps.id         schema_id,
      min(psf.id)   field_id,
      min(psf.name) name,
      min(psf.type) type,
      min(psf.sort) sort,
      &#39;sibling&#39; as source
    from
      nodes n
      left join nodes sn on sn.parent_node_id = n.parent_node_id
      inner join property_schemas ps on ps.id = sn.property_schema_id
      or ps.id = n.property_schema_id
      inner join property_schema_fields psf on psf.schema_id = ps.id
    where
      n.id = $1
    group by ps.id, psf.id
  ),
  child_properties as (
    select
      ps.id         schema_id,
      min(psf.id)   field_id,
      min(psf.name) name,
      min(psf.type) type,
      min(psf.sort) sort,
      &#39;child&#39; as source
    from
      nodes n
      inner join nodes cn on cn.parent_node_id = n.id
      inner join property_schemas ps on ps.id = cn.property_schema_id
      inner join property_schema_fields psf on psf.schema_id = ps.id
    where
      n.id = $1
    group by ps.id, psf.id
  )
select
  *
from
  sibling_properties
union all
select
  *
from
  child_properties
order by source desc, sort asc
</code></pre>
<p>Another use-case is pulling one node that&#39;s a child of a parent node, where you want to see this node&#39;s schema and its values. This one is easy: the schema is already stored on the node itself, so it&#39;s just a quick join against the properties of that node, which is a direct relationship. However, because the properties themselves only store the values, the schema and schema fields are still required in the query. Values are related to schema fields by the field ID.</p>
<p>Some useful observations for schemas and values:</p>
<ul>
<li>Schemas don&#39;t change often, usage patterns often see rare but relatively large bursts of mutations (when a user is setting up a new page or editing columns) followed by no changes for a while. This allows caching to come into play and this is easy to implement over the top of the current node repository as properties are queried separately, not joined against the node itself.</li>
<li>While schema data involves some pretty gnarly queries, the sizes of actual schemas are in the 10s of rows.</li>
<li>The actual bottleneck is in the node queries themselves, where parent nodes may contain hundreds or thousands of children so more work on optimisation will focus on these read paths rather than property value/schema write paths.</li>
</ul>
<h4>Sorting nodes by their EAV values</h4>
<p>One of the more complex and performance sensitive areas is querying all children of a node and operating on the property values to perform filtering or sorting. This is currently implemented as a separate query to pull the properties in a table result and joined in-application to the nodes being queried. The sorting itself is still performed in the database when pulling the property values. The ordered result is then used to sort the list of nodes in-application while mapping the results to the nodes.</p>
<pre><code class="language-go">
const querySortedByPropertyValue_sqlite = `
select
  n.id id
from
  nodes n
  left join properties p on n.id = p.node_id
  inner join property_schema_fields f on p.field_id = f.id and f.name = $1
where
  n.id in (%s)
order by
  case f.type
    when &#39;text&#39;      then p.value
    when &#39;number&#39;    then cast(p.value as real)
    when &#39;timestamp&#39; then cast(p.value as datetime)
    when &#39;boolean&#39;   then cast(p.value as integer)
    else p.value

  end %s

limit  %d
offset %d
`

const querySortedByPropertyValue_postgres = `
select
  n.id id
from
  nodes n
  left join properties p on n.id = p.node_id
  inner join property_schema_fields f on p.field_id = f.id and f.name = $1
where
  n.id in (%s)
order by
  case f.type when &#39;text&#39;      then p.value                            end %s,
  case f.type when &#39;number&#39;    then cast(p.value as numeric)           end %s,
  case f.type when &#39;timestamp&#39; then cast(p.value as timestamp)         end %s,
  case f.type when &#39;boolean&#39;   then cast(p.value as boolean)           end %s,
  p.value %s
limit  %d
offset %d
`
</code></pre>
<p>In classic SQL-doesn&#39;t-have-a-standard-that-anyone-cares-about fashion, we need two separate queries here, one for PostgreSQL/CockroachDB and another for SQLite.</p>
<p>Another irritating fact is you can&#39;t parameterise just anything in a query, only certain types of syntax, so certain parts of this query must be (dangerously) injected using string formatting primitives before being passed to the database&#39;s own argument mapping. So there&#39;s a mix of <code>$</code> positional arguments and <code>%</code> format specifiers. It&#39;ll probably stay this way for the next 50 years so don&#39;t hold your breath for improvements...</p>
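<p>One way to keep that mix reasonably safe is sketched below. This is an illustration under assumptions, not the actual Storyden code: <code>buildSortQuery</code> and the cut-down <code>queryTemplate</code> are hypothetical. The idea is that only generated placeholders and fixed keywords ever go through the <code>%</code> verbs, while every user-supplied value travels as a <code>$</code> argument.</p>
<pre><code class="language-go">package main

import (
	"fmt"
	"strings"
)

// queryTemplate is a cut-down stand-in for the queries above: $1 is bound by
// the database driver, while the %s and %d verbs are filled by fmt.Sprintf.
const queryTemplate = `select n.id from nodes n
  left join properties p on n.id = p.node_id
  inner join property_schema_fields f on p.field_id = f.id and f.name = $1
where n.id in (%s)
order by p.value %s
limit %d offset %d`

// buildSortQuery is a hypothetical helper. Node IDs become $2, $3, ... so the
// values never touch the query string, and the sort direction is picked from
// a fixed pair of keywords instead of being copied from user input.
func buildSortQuery(field string, ids []string, descending bool, limit, offset int) (string, []any) {
	args := []any{field}
	placeholders := make([]string, 0, len(ids))
	for _, id := range ids {
		args = append(args, id)
		placeholders = append(placeholders, fmt.Sprintf("$%d", len(args)))
	}

	direction := "asc"
	if descending {
		direction = "desc"
	}

	query := fmt.Sprintf(queryTemplate, strings.Join(placeholders, ", "), direction, limit, offset)
	return query, args
}

func main() {
	query, args := buildSortQuery("Weight", []string{"node_a", "node_b"}, true, 50, 0)
	fmt.Println(query)
	fmt.Println(args...)
}
</code></pre>
<p>A caller would pass the returned query and arguments to the database driver together. An empty ID list would produce an invalid <code>in ()</code> clause, so that case is best handled before building the query at all.</p>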
<p>Okay, ranting aside, this query gets us a list of node IDs sorted by the given property value, based on its declared data type. The case-switch in the <code>order by</code> clause allows us to lexicographically sort text while correctly sorting other types such as numbers, timestamps and booleans.</p>
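<p>Applying that ordered ID list to the nodes in-application could look something like the sketch below. The <code>Node</code> type and <code>orderNodes</code> function are hypothetical and much simpler than the real thing. Placing nodes that the sort query did not return at the end is an assumption made for this example, not a statement about how Storyden handles them.</p>
<pre><code class="language-go">package main

import "fmt"

// Node is a hypothetical, stripped-down node record for this example.
type Node struct {
	ID   string
	Name string
}

// orderNodes returns the nodes arranged to match sortedIDs, the ID list
// produced by the sort query. IDs without a matching node are skipped, and
// nodes missing from sortedIDs (for example, nodes with no value for the
// property) are appended at the end in their original order.
func orderNodes(nodes []Node, sortedIDs []string) []Node {
	byID := make(map[string]Node, len(nodes))
	for _, n := range nodes {
		byID[n.ID] = n
	}

	ordered := make([]Node, 0, len(nodes))
	placed := make(map[string]bool, len(nodes))
	for _, id := range sortedIDs {
		n, ok := byID[id]
		if !ok || placed[id] {
			continue
		}
		ordered = append(ordered, n)
		placed[id] = true
	}

	for _, n := range nodes {
		if !placed[n.ID] {
			ordered = append(ordered, n)
			placed[n.ID] = true
		}
	}
	return ordered
}

func main() {
	nodes := []Node{{"a", "Sword"}, {"b", "Shield"}, {"c", "Potion"}}
	// Pretend the database returned c before a, and nothing for b.
	for _, n := range orderNodes(nodes, []string{"c", "a"}) {
		fmt.Println(n.ID, n.Name)
	}
}
</code></pre>
<p>The map lookup keeps this linear in the number of nodes, which matters when a parent holds thousands of children.</p>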
<h4>Pulling the whole tree while filtering nodes with many children</h4>
<p>This use-case of &quot;database nodes&quot;, which hold many children and look somewhat like a relational database on the surface, introduces another problem. Storyden&#39;s sidebar will give you the whole tree, and if one of those nodes has 1,000 children because it&#39;s being used as a &quot;database page&quot; with a bunch of properties rendered as a table, that&#39;s a problem.</p>
<p>So, to solve this, nodes have a column called <code>hide_child_tree</code> which, when true, will omit the <em>children</em> of that node (not the node itself) from the tree. This means the node that contains 1,000 children will still be visible in the sidebar, but its children will not and React will not try to render thousands of DOM nodes.</p>
<p>This is achieved by checking <code>hide_child_tree</code> in the recursive part of the tree-traversal CTE. This tells the query engine to stop recursing once that clause evaluates to false: the row itself is still emitted, but none of its children are visited, and the query moves on to the next sibling to continue walking the tree.</p>
<pre><code class="language-sql">with recursive children (parent, id, sort, depth) as (
    select
        parent_node_id,
        id,
        sort,
        0
    from
        nodes
    where %s
union
    select
        d.parent,
        s.id,
        s.sort,
        d.depth + 1
    from
        children d
        join nodes parent_node on parent_node.id = d.id
        join nodes s on d.id = s.parent_node_id
    where
        parent_node.hide_child_tree = false
)
select
    distinct n.id       node_id,
    n.account_id        node_account_id,
    n.visibility        node_visibility,
    n.sort              node_sort_key,
    depth
from
    children
    inner join nodes n on n.id = children.id
    inner join accounts a on a.id = n.account_id

-- optional where clause
%s

order by
    depth, node_sort_key
</code></pre>
<p>Again, &lt;SQL rant /&gt;, <code>%s</code> is there at the bottom to dynamically inject more clauses to the query. No injection here as the rest of the code that constructs this query inserts <code>$</code> positional arguments and passes the user-supplied fields into the database&#39;s arguments not the raw string query. The code is a mess, please don&#39;t look for it.</p>
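<p>For anyone who finds recursive CTEs hard to picture, the pruning behaviour can be mimicked in memory. The sketch below is only an analogy for what the database does: <code>Node</code> and <code>visibleTree</code> are hypothetical names, and it assumes the walk starts from root nodes, whereas the real query has its starting condition injected.</p>
<pre><code class="language-go">package main

import "fmt"

// Node is a hypothetical in-memory stand-in for a row in the nodes table.
type Node struct {
	ID            string
	ParentID      string // empty for root nodes
	HideChildTree bool
}

// visibleTree mimics the recursive CTE: every reachable node is emitted, but
// the walk only descends into a node when its HideChildTree flag is false.
func visibleTree(nodes []Node) []string {
	children := make(map[string][]Node)
	for _, n := range nodes {
		children[n.ParentID] = append(children[n.ParentID], n)
	}

	var out []string
	queue := append([]Node(nil), children[""]...) // start from the roots
	for len(queue) != 0 {
		current := queue[0]
		queue = queue[1:]
		out = append(out, current.ID)
		if !current.HideChildTree {
			queue = append(queue, children[current.ID]...)
		}
	}
	return out
}

func main() {
	nodes := []Node{
		{ID: "guides"},
		{ID: "items", HideChildTree: true},
		{ID: "intro", ParentID: "guides"},
		{ID: "sword", ParentID: "items"},
		{ID: "shield", ParentID: "items"},
	}
	fmt.Println(visibleTree(nodes)) // prints [guides items intro]
}
</code></pre>
<p>The &quot;items&quot; node still shows up, but its children never enter the queue, which is the same effect the <code>where</code> clause has on the recursive step.</p>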
<h2>Conclusion</h2>
<p>It works. It&#39;s not perfect, and it is begging for improvements but it works. And it has fairly deep end-to-end test coverage so that&#39;s a win!</p>
<p>If you&#39;d like to contribute to this mess, please check out the project&#39;s GitHub page! <a href="https://github.com/Southclaws/storyden">https://github.com/Southclaws/storyden</a></p>
]]></content:encoded>
            <author>Barnaby Keene</author>
        </item>
        <item>
            <title><![CDATA[Gearing up for take-off]]></title>
            <link>https://www.storyden.org/blog/gearing-up-for-take-off</link>
            <guid isPermaLink="false">/blog/gearing-up-for-take-off</guid>
            <pubDate>Wed, 16 Apr 2025 00:00:00 GMT</pubDate>
            <description><![CDATA[Settling on a direction and full (ish) steam ahead!]]></description>
            <content:encoded><![CDATA[<blockquote>
<p>TL;DR: New site! New docs! Bit of a brand refresh and some reflections on direction. What 2025 holds for the launch of stable production-ready-to-use.</p>
</blockquote>
<p>The <a href="/blog/exploring-language-models">last entry</a> was a sort of &quot;how we got here&quot;. The conclusion was that the idea of a community knowledge-base is at the core of Storyden.</p>
<p>That doesn&#39;t remove the forum aspect, it plays on it. Internet forums of the 2000s, which I grew up on, solved a problem for that era. This is a new era with new problems so rehashing phpBB or SMF won’t cut it.</p>
<p>Most of my circles live on Discord, but good luck finding that great link someone shared last month.</p>
<Callout>
  **Why should you care?** You ever dig through Discord to find that one link
  someone shared? Want a collaborative, evolving directory for your group? Miss
  old-school forums but not their dated UX? That’s what Storyden is for.
</Callout>

<p>Since launching the project, I&#39;ve received some great feedback via <a href="https://airtable.com/shrLY0jDp9CuXPB2X">this form</a>. One response stuck out to me and really resonated with the north star that has guided Storyden&#39;s development over the last couple of years:</p>
<blockquote>
<p>Discord is too real time -- Discourse is too detached from people&#39;s daily work. So we need a way to create HN + Discord</p>
</blockquote>
<p>Which really nails the origins and reasons I started the project. See any of the other articles on this site to dive deeper.</p>
<h2>Try it out right here, right now!</h2>
<p>Visit our community Storyden instance: <a href="https://makeroom.club/">https://makeroom.club/</a></p>
<p>Or <a href="/docs/introduction">get started here</a> and run one yourself:</p>
<pre><code class="language-sh">docker run -p 8000:8000 ghcr.io/southclaws/storyden
</code></pre>
<p><img src="/docs/introduction/local/browser_home.png" alt="Storyden web UI"></p>
<hr>
<p>So what next? Keep building out features ad infinitum? No.</p>
<p>This site is brand new, along with some touch-ups to the brand identity and a <em>whole</em> lot more documentation.</p>
<h2>New Site</h2>
<p>The new homepage for Storyden runs on <a href="https://fumadocs.vercel.app/">Fumadocs</a> which I was extremely happy to discover on an X thread. It&#39;s a documentation site builder for Next.js which replaces Nextra in all the ways I wanted it to.</p>
<p><img src="/blog/docs.png" alt="The new documentation site"></p>
<p>Nextra was great, but it suffered from a few weird issues I couldn&#39;t figure out, plus it was stuck on the old &quot;Pages&quot; mode of Next.js, making certain newer functionality of the framework more difficult to use.</p>
<p>Fumadocs runs on the latest Next.js version with the &quot;App&quot; directory system. As a Next.js nerd this pleases me somewhat! It also means I can more easily use layouts, opengraph images, custom page designs, etc. Great work Fuma Nama, no notes.</p>
<h2>Brand Refresh</h2>
<p>My partner said I should stop obsessing over design details and just ship the product. Which, yes, she has a point. But a bit of burnout, getting stuck on some complex features (node properties was a nightmare) and attempted fundamental changes to the frontend (RxDB is really complicated) was cured by a weekend of obsessing over typefaces, colour palettes, styleboards, kerning, more colours and other good design stuff I just really enjoy.</p>
<p>So yes, was a rebrand necessary before the product is even &quot;released&quot;? No. Was it necessary for my mental health? Certainly.</p>
<p>The new mark uses this beautiful typeface I found by the <a href="https://www.identity-letters.com/">Identity Letters</a> foundry, designed by <a href="https://www.linkedin.com/in/moritz-kleinsorge-62487b183/">Moritz Kleinsorge</a>. It&#39;s a perfect balance of smart-casual playfulness that fits Storyden so well.</p>
<p><img src="/brand/fullmark_colour_horizontal_campfire.png" alt="The Storyden wordmark">
<img src="/brand/fullmark_colour_horizontal_moonlit.png" alt="The Storyden wordmark">
<img src="/brand/fullmark_colour_horizontal_newspaper.png" alt="The Storyden wordmark"></p>
<p>The rest of the type is set with:</p>
<ul>
<li>Work Sans by Wei Huang for general interface body text</li>
<li>Hedvig Serif by Kanon Foundry for longform body text</li>
<li>Intel One Mono by Frere-Jones Type for code examples</li>
</ul>
<p><img src="/blog/typography.png" alt="Typography styleboard"></p>
<h2>What&#39;s ready?</h2>
<p>So, you can deploy Storyden <a href="/docs/introduction/vps">right now</a>. To production. And have lots of fun.</p>
<p>All the fundamentals are documented now, peruse the new pages and begin imagining how useful it could be to your community.</p>
<p>You can:</p>
<ul>
<li>Run a discussion forum</li>
<li>Let your members sign up with just a username, an email or go full OAuth2 with Discord, Google and GitHub (more on the way!)</li>
<li>Run a directory</li>
<li>Let your members contribute to the library and grow your collective information repository, whether it&#39;s links to cool sites, items in a video game organised by stats, your favourite indie clothing brands, whatever floats your boat!</li>
<li>Enable AI features for asking, semantic search and automated organisation</li>
</ul>
<p>You can&#39;t: (yet)</p>
<ul>
<li>Connect to Discord/WhatsApp/Slack: for me, these would close the gap between where my folks hang out and where we want to store all the great links they share. There are a few features I want to get right here so this bit is top of mind for the next iteration.</li>
<li>Organise nodes by property in database views: I am shamelessly copying Notion here but in the spirit of solving my own problems, I want to go deep down the directory route and make Storyden the easiest place to build well-organised directories. The foundation is there (node properties) but the UI needs love and the APIs need smarter filtering, grouping, etc.</li>
<li>Extend the platform: this is a huge one, and what I believe makes any software product long-lived. Go beyond the boundaries I&#39;ve created and really turn your Storyden into whatever you want. Spoiler: it involves extremely simple WebAssembly stdin/stdout RPC.</li>
</ul>
<h2>What&#39;s next?</h2>
<p>On the product side: those three things right above ☝️</p>
<p>On the marketing side: figure out how to really gain proper traction, find the people who share the problems that I&#39;m solving for myself and sell the dream!</p>
<p>Hosted: for the non-technical folks, a simple hosting platform. Harder than it seems but fun to build, and surprisingly cheap enough to offer just a free tier for the time being (full transparency: business administration is time consuming and not something I really want to deal with right now). Reach out if you want to be in the first cohort!</p>
<p>Open source: there have been a lot of interesting problems solved in the dark depths of the Storyden codebase, one goal is to pull out some of those bits and libraryise them for easier use in the open source community.</p>
<Callout>
  Curious? [Star it on GitHub](https://github.com/Southclaws/storyden), spin it
  up, or drop me feedback - I read everything.
</Callout>

<p>Anyway, I&#39;ve got a friend&#39;s wedding to attend now so, until next time 👋</p>
<p>Barney</p>
]]></content:encoded>
            <author>Barnaby Keene</author>
        </item>
        <item>
            <title><![CDATA[Exploring Language Models]]></title>
            <link>https://www.storyden.org/blog/exploring-language-models</link>
            <guid isPermaLink="false">/blog/exploring-language-models</guid>
            <pubDate>Sat, 21 Sep 2024 00:00:00 GMT</pubDate>
            <description><![CDATA[AI buzzwords, social primitives and directories: A brief update on the development of Storyden from the past year.
]]></description>
            <content:encoded><![CDATA[<p>I (<a href="https://barney.is">Barney</a>) started this project for a few reasons. Partly to scratch an itch I had for building a product end-to-end, do the design, the branding, the marketing and development; releasing the whole process as an open-source endeavour. Another reason was I was frustrated with the state of forum software on the market while searching for platforms for communities I&#39;m involved in.</p>
<p>These goals have evolved over 2024 as I&#39;ve narrowed the focus of the project and found a direction where I&#39;m happy with the long term viability and market fit. I&#39;ve also had the opportunity to dogfood the product in a few production scenarios, which has led some areas to advance a lot faster than others, more on that later.</p>
<h2>Let&#39;s get the buzzword out of the way first</h2>
<p>Unless you&#39;ve been living under a rock, you know generative transformers and language models are the thing right now. As with most of my peers, I&#39;m not of the opinion this stuff is going to replace software engineers, copywriters, artists or any other profession. There&#39;s a lot of power behind GPTs which isn&#39;t just generating mediocre blog posts (as I&#39;m writing this, Copilot is giving me the most mid suggestions and mundane sentences).</p>
<p>Storyden, so far, does not have any form of content generation aside from a single summarisation prompt, which honestly isn&#39;t very good. It&#39;ll probably stay like that as my target audience and existing early users simply aren&#39;t interested. I&#39;ve had the opportunity to chat to a lot of people in various industries, primarily people who hold writing as a core part of their strategy (whether it&#39;s thought leadership, community building, newsletters, content marketing, etc.) this year and a common sentiment is that their writing must come from them, it&#39;s very personal and that personal touch matters.</p>
<p>Where I think language models can shine is surprisingly not so much in the spotlight: recommendation algorithms and (very) fuzzy search.</p>
<h3>The Semdex</h3>
<p>In classic fashion, I invented a term in the codebase, semantic + index = semdex. Silly, I know. The Semdex is essentially a big graph database, but there are no edges connecting nodes. Instead, it uses vector embeddings, much like RAG search. I have some strong opinions on search, but more on that in the next section.</p>
<p>Throwing everything into a vector database and not worrying too much about building complex edge relationships has been quite a nice mental model for me. It&#39;s also very fast, even with a few thousand items (though, further benchmarking with larger datasets is needed.)</p>
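<p>The core idea of &quot;no edges, just vectors&quot; fits in a few lines. The sketch below is not how the Semdex is implemented: the <code>Item</code> type, the <code>cosine</code> and <code>nearest</code> functions and the tiny three-dimensional vectors are all made up for illustration. Real embeddings come from a model and have hundreds of dimensions, and a vector database would do this search far more efficiently than a full sort.</p>
<pre><code class="language-go">package main

import (
	"cmp"
	"fmt"
	"math"
	"slices"
)

// Item is a hypothetical indexed record: an ID plus its embedding vector.
type Item struct {
	ID        string
	Embedding []float64
}

// cosine returns the cosine similarity of two equal-length vectors,
// or 0 when either vector has zero magnitude.
func cosine(a, b []float64) float64 {
	var dot, normA, normB float64
	for i := range a {
		dot += a[i] * b[i]
		normA += a[i] * a[i]
		normB += b[i] * b[i]
	}
	if normA == 0 || normB == 0 {
		return 0
	}
	return dot / (math.Sqrt(normA) * math.Sqrt(normB))
}

// nearest ranks items by similarity to the query vector, most similar first,
// and returns at most k IDs.
func nearest(query []float64, items []Item, k int) []string {
	ranked := slices.Clone(items)
	slices.SortStableFunc(ranked, func(x, y Item) int {
		return cmp.Compare(cosine(query, y.Embedding), cosine(query, x.Embedding))
	})
	ids := make([]string, 0, k)
	for _, it := range ranked {
		if len(ids) == k {
			break
		}
		ids = append(ids, it.ID)
	}
	return ids
}

func main() {
	items := []Item{
		{ID: "thread-about-go", Embedding: []float64{0.9, 0.1, 0.0}},
		{ID: "page-about-cats", Embedding: []float64{0.0, 0.2, 0.9}},
		{ID: "thread-about-rust", Embedding: []float64{0.8, 0.3, 0.1}},
	}
	fmt.Println(nearest([]float64{1, 0, 0}, items, 2))
}
</code></pre>
<p>Relatedness falls out of the geometry, so nothing has to be linked by hand. That is the appeal of skipping explicit edges.</p>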
<p>What this unlocks is an almost democratisation of recommendation algorithms. One weekend I dumped a ton of vertical videos from TikTok into the Semdex, wired up watch time, likes and comments and as I scrolled through the feed, it gradually became more and more tailored to my &quot;interests&quot;. I would have never imagined I could build a &quot;recommendation algorithm&quot; when I studied machine learning in university, it all went over my head and seemed so much like a black box with an awfully slow test-train feedback loop. I&#39;ve started to think of LLMs as a sort of higher level abstraction on top of ML fundamentals (well, it sort of is I guess, but I&#39;m no expert!)</p>
<p>Of course you wouldn&#39;t really want to launch an endless scroll vertical video app built on a single Go server, there&#39;s a ton more to consider like not feeding previously viewed videos, content moderation, performance, etc. TikTok itself put out a <a href="https://arxiv.org/pdf/2209.07663">great paper</a> which is obviously a lot more sophisticated but it does share some DNA.</p>
<p>And this had me thinking a lot about social platforms in general, particularly over-recommendation. Why recommend <em>only</em> things you like? What about recommending things you may not like or agree with? These are all things you can do if you own the data and the algorithm. And yes, pluggable algorithms are something I&#39;ve toyed with.</p>
<p>Which is really what Storyden is all about, get your community off Reddit and stop feeding them free training data. Self host your data, your embedding engine, own it all.</p>
<h2>Forum?</h2>
<p>So the landing page says &quot;A forum for the modern world&quot; and so far I&#39;ve just talked about LLM nonsense like every other VC backed tech product over the last year (Storyden is not VC backed, for the record...), so am I straying from the original path?</p>
<p>Well, a little. I&#39;ve gone back and forth on the forum idea quite a bit, but I keep seeing people on various platforms say things along the lines of &quot;what happened to all the old internet forums, why is everything just reddit now?&quot; so I think there&#39;s still a place for a product like this.</p>
<p>What I&#39;m trying to balance is innovating a <em>little</em> on the timeless idea while keeping the DNA of what made the forums I grew up on so fun.</p>
<p>A common indie hacker trope is that you&#39;re just building to solve your own problems and somewhere along the way, it becomes a viable business endeavour. But a pitfall is you&#39;re building for your echo chamber, your age group, your demographic. <a href="https://www.businessinsider.com/jim-rohn-youre-the-average-of-the-five-people-you-spend-the-most-time-with-2012-7">You are the average of the 5 people you associate with most</a> and all that. I&#39;ve framed this project as something for the future, so I can only borrow so much from the past before it alienates the next era of internet users. If a WordPress of the 2030s is to exist, what attributes should it borrow from the social media platforms most of the youth are using today? On top of that, what are the most <em>useful</em> features that are actually valuable to people, and what&#39;s just marketing fluff? (<em>something something AI powered blockchain NFTs</em>)</p>
<h2>Social Primitives</h2>
<p>The last 15 years have seen a small assortment of ideas that permeate almost all &quot;social&quot; products. The likes, the shares, the threads, etc. Most of this stuff is not technically challenging at all, you can clone Twitter in a weekend and if you get past the cold-start problem by hitting the right niche with the right messaging, you can build a fairly active and successful space, <a href="https://posts.cv/">posts.cv</a> did this and it&#39;s a wonderful breath of fresh air compared to what Twitter has evolved into. Substack is another example that carved out a niche that became a fairly large successful platform rivalling WordPress and similar.</p>
<p>So with this, I believe it&#39;s important to pick the primitive building blocks that make sense because familiarity matters. Then innovate on top of that with &quot;nice UX&quot; (whatever that means today) and a light sprinkling of features that lead to <a href="https://blog.joinodin.com/p/popsicle-moments">popsicle moments</a>.</p>
<p>It&#39;s not social at all, but I think <a href="https://linear.app/">Linear</a> is a fantastic example of a beautiful product in a boring space that manages to be a joy to use. Outside of the impressive technical implementation (CRDTs and such) the experience of using Linear adds a layer of delight on top of what&#39;s normally quite a boring process (product and project management.)</p>
<p>These are my north stars for Storyden, a product that&#39;s a joy to use, that&#39;s familiar enough to not alienate, but innovative enough to be useful for the next era of internet culture.</p>
<h2>Information directories and curation</h2>
<p>I use TikTok, every day, it&#39;s my digital junk food and in moderation I think it&#39;s fine. <a href="https://www.nytimes.com/2022/09/16/technology/gen-z-tiktok-search-engine.html">TikTok has become a search engine</a>, and what started as a dancing video app eventually competing with Google, the monopoly that totally isn&#39;t (😉) a monopoly was not on my 2024 bingo card.</p>
<p>Most of the value I derive from this app is rooted in curation. Primarily clothes and music. I&#39;ve discovered artists I never would have with Spotify&#39;s obsession with playing the same songs (which I think is that over-recommendation problem I mentioned earlier) and independent fashion brands that I would have never found on Instagram. Some creators I follow do both music and fashion all on the same account.</p>
<p>The problem is while TikTok is becoming a macro-level search engine for the next generation, it&#39;s actually not great at the micro level. If I want to find a video <a href="https://www.tiktok.com/@zagua999">ZAGUA</a> made about a particularly interesting brand 6 months ago, that&#39;s just not happening.</p>
<p>A lot of creators get around this by setting up a Notion database, like <a href="https://quickthoughts.notion.site/">this research archive</a> by <a href="https://www.tiktok.com/@lthlnkso">QuickThoughts</a> where he&#39;s painstakingly listed every uploaded video along with detailed research notes. And if I want to drop a comment or chat with other fans, I have to go back to TikTok, find the video then leave a comment which will be lost to the void because anything older than a day is essentially nonexistent on the platform.</p>
<p>So, a common indie hacker trope is that you&#39;re just building to solve your own problems? I suppose so. This area of Storyden is more on the experimental muse side of things. Figuring out the marketing messaging for this feature is a little tough, &quot;social Notion&quot;? &quot;WordPress with spreadsheets&quot;? It&#39;s also a feature that I haven&#39;t actually validated, as far as I know, nobody is asking for this. But I&#39;m building it anyway because I <em>feel</em> like there&#39;s a success trajectory somewhere along the way. It&#39;s also just fun, for some reason.</p>
<h2>The Backend x Frontend race</h2>
<p>Due to how I&#39;m using Storyden with a few existing &quot;customers&quot; (I use that term lightly, revenue is zero and probably will be for a long time) I&#39;m often building bespoke frontends rather than using the reference implementation in the open source repository. The benefit here is the API-driven design is really proving itself, as I can just pull the OpenAPI specification, generate a client and build out a fairly nice React frontend in a weekend. The downside here is that the backend is <em>way</em> ahead of what the reference frontend exposes. Everything in this post is a feature implemented, tested and (somewhat) documented in the backend Golang codebase. The frontend, and thus the public &quot;demo&quot; at <a href="https://makeroom.club">makeroom.club</a> is a little behind.</p>
<p>There are a few early features in development that will most likely land before the end of the year. And if you&#39;re building a bespoke frontend implementation on top of the Storyden platform API, you can already start using them.</p>
<h2>Events</h2>
<p>A few early users have mentioned event organisation in-product, a sort of Luma alternative that&#39;s built in with all the normal things you&#39;d expect from an events platform, like invites, RSVPs, ical integration. This one is still early and build started in the summer.</p>
<h2>Roles</h2>
<p>For an embarrassingly long time, there was a single boolean flag called <code>admin</code>. That was the role-based-access-control mechanism. Last month, I finally got around to replacing this with a proper role system with granular permissions. I&#39;ve modelled roles after the Discord permission system, which I think is a good balance between flexibility and simplicity. Roles can hold permissions, or they can just be aesthetic and give a bit of colour to your username.</p>
<h2>Authentication</h2>
<p>Emails landed as an optional way to authenticate. If you&#39;ve read any of the other blog posts about core values, you know that I consider email addresses to be optional. Privacy conscious hosts can omit the need for an email address for sign-up and opt for just dealing with handles. But for those who want to use Storyden as a more traditional forum, emails are now an option.</p>
<p>There&#39;s also an unfinished branch called <code>saml</code>... #enterprise</p>
<p>Web3 is something I put on the home page back when I designed that over 2 years ago. That claim, I am sad to say, still has not come to fruition. Mostly because I know absolutely nothing about Web3 technology. Fake it &#39;til you make it, right? I should probably remove that claim for now...</p>
<h2>The future</h2>
<p>After doing almost zero marketing (apart from these sporadic blog posts) the GitHub repository still has almost 100 stars, so that&#39;s neat. Not huge numbers but it&#39;s something of a signal that there&#39;s <em>some</em> interest. I will eventually do the usual Hacker News, Reddit, Product Hunt rounds, but in classic perfectionism mindset I am hesitant to try until a few more forum basics (which I keep putting off) are implemented.</p>
<p>As for business plans, there&#39;s none yet, Storyden will always remain open source and I am very aware of the recent license shenanigans with Redis, ElasticSearch and friends. This is mostly blocked by the frontend issue mentioned above, movement there will unlock usability of the more powerful features and make the product a more compelling offering. Also, multi-tenant hosted instances are hard. If you want to chat, offer advice or help in any way, shoot me an email <code>barney</code> at symbol <code>hey dot com</code>.</p>
<p>And that&#39;s it, 2024 in a nutshell so far. I&#39;m cautiously excited about the next chapter of Storyden, and I hope you are too.</p>
]]></content:encoded>
            <author>Barnaby Keene</author>
        </item>
        <item>
            <title><![CDATA[The power of a community-driven knowledgebase]]></title>
            <link>https://www.storyden.org/blog/power-of-community-knowledgebase</link>
            <guid isPermaLink="false">/blog/power-of-community-knowledgebase</guid>
            <pubDate>Fri, 02 Feb 2024 00:00:00 GMT</pubDate>
            <description><![CDATA[The power of a community knowledgebase can mean the difference between a mediocre group chat and a valuable long-lasting public resource. But why?]]></description>
            <content:encoded><![CDATA[<p>I&#39;m writing this from a Eurostar train on my way to Brussels to attend <a href="https://fosdem.org/">FOSDEM 2024</a> along with some good friends I&#39;ve known for years thanks to the internet.</p>
<p><img src="/blog/train_400x398.png" alt="The view out of the window"></p>
<blockquote>
<p>The view out of the window on this bleak winter morning in the French countryside.</p>
</blockquote>
<p>Those good friends I know because of a multiplayer game we all played years ago. There were forums for asking questions and showing off what you&#39;d made and a wiki for documenting the scripting APIs and other things like models, textures, skins, vehicles, etc.</p>
<p>I love Discord, it&#39;s what we use nowadays for this community and a few others, it&#39;s what we used to organise the first FOSDEM trip, our trip to Poland, Iceland, Switzerland and many other lands. Projects have been born there, jobs have been found, advice given and memories made. But it&#39;s not the be-all-end-all of online community experience.</p>
<h2>Asynchronous still has its place</h2>
<p>Discord, Slack, WhatsApp and others fit into a category of communication tools known as &quot;synchronous&quot;. This means that, most of the time, you need to be online at the same time as the other members to communicate. There&#39;s message history but it&#39;s rare you can or even want to keep up with every single message.</p>
<p>One of the side effects of this is that discussions, shared links and advice can get lost. Even with fairly modern search tools it&#39;s not always easy to find that one link someone shared a few months ago that you know you need now. What makes this worse for the data side is that conversations are often interspersed and there can easily be two or three discussions happening at once. People may find this easy to follow, especially if they&#39;ve grown up with IRC, MSN groups or WhatsApp chats. But for a computer system it&#39;s very difficult to figure out what&#39;s related to what.</p>
<pre><code>Amir: @Southclaws did you see my PR?
Southclaws: @Michael have you tried AsyncAPI yet?
Amir: LMAO
Amir: &lt;twitter.com/...&gt;
Michael: not yet, but it looks nice
Southclaws: @Amir yeah I replied on gh
Marcel: @Amir wanna game?
Southclaws: lmao nice
Southclaws: yeah I&#39;m thinking of async api for storyden websockets
</code></pre>
<blockquote>
<p>try and string this tiny snippet from Makeroom&#39;s Discord chat together into independent threads...</p>
</blockquote>
<p>Searching this word-by-word may be easy but topic based semantic search becomes difficult. While a relatively new technology, it does a better job with fewer, larger pieces of text. Chats are difficult to index and difficult to search because you&#39;re often not quite sure what you&#39;re looking for. It&#39;s all a very fuzzy problem space but traditional fuzzy-match-based-search may not work as well.</p>
<p>Now, it&#39;s worth noting the old forums weren&#39;t much better. In fact, they were much worse because your search queries were often exact matches rather than somewhat fuzzy. I can search for &quot;repos&quot; and Discord is smart enough to turn that shorthand plural into a singular token (&quot;repo&quot;) and show me results for both &quot;repo&quot; and &quot;repos.&quot;</p>
<p><img src="/blog/discord-search.png" alt="Searching for repos in Discord"></p>
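<p>To make that concrete, here&#39;s a deliberately naive sketch of plural-insensitive matching. I have no idea how Discord actually implements this; <code>normalise</code> and <code>matches</code> are made-up names and stripping a trailing &quot;s&quot; is a crude stand-in for a real stemmer.</p>
<pre><code class="language-go">package main

import (
	"fmt"
	"strings"
)

// normalise is a crude stand-in for a real stemmer: it lowercases a word
// and strips one trailing "s" so plural and singular share a token.
func normalise(word string) string {
	return strings.TrimSuffix(strings.ToLower(word), "s")
}

// matches reports whether any word in the message shares a token with the query.
func matches(query, message string) bool {
	want := normalise(query)
	for _, word := range strings.Fields(message) {
		if normalise(word) == want {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(matches("repos", "check out my repo"))  // true
	fmt.Println(matches("repos", "did you see my PR?")) // false
}
</code></pre>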
<p>But I think what makes most of the difference is how content is written and organised. In a synchronous platform, members will write a high volume of very small messages, some of which are single words or even single characters. In an asynchronous platform such as GitHub issues, Discourse and traditional forums, members will (most of the time) write a lower volume of longer messages.</p>
<p>The interfaces of both of these are designed around this idea. Single line text box where media is treated as an &quot;attachment&quot; and formatting is minimal vs. larger text area, maybe with WYSIWYG formatting and richer content such as images and videos interspersed with text.</p>
<p>And it shows, Discourse has found a solid market in the customer support space. Discord added &quot;Forum channels&quot; which are effectively somewhere between chat and forums. GitHub issues are used for bug tracking but they also added &quot;Discussions&quot; which is just an oldschool forum. The need for this slightly longer form and lower volume format is clear.</p>
<h2>Search is hard</h2>
<p>I&#39;m not all-in on semantic search either. I do find it aggravating when I&#39;m trying to find what I <em>know</em> to be an exact word match in some website search and it&#39;s trying to Be Clever™ by using some kind of LLM based semantic search. This often doesn&#39;t help.</p>
<p>Semantic has its place but you <em>need</em> to give users options. Not having options really ruins user experiences and it gives off this aura of &quot;We know better&quot; pretentiousness. It&#39;s one of the huge issues I personally have with Apple products!</p>
<p>So when <em>is</em> semantic search useful? When you&#39;re not quite sure what you&#39;re looking for but you have a rough idea. Language models are amazing in this problem space and it&#39;s one reason I think the latest wave of LLM hype is well earned compared to the days of natural language processing I spent way too many caffeine induced nights trying to get right in uni!</p>
<p>It also gets better the larger the dataset is! So...</p>
<h2>Knowledge aggregation is a team sport</h2>
<p>In a previous article about <a href="/blog/what-are-social-bookmarks-link-aggregators">link aggregators and social bookmarking</a> we discussed how useful and natural collecting a big pile of URLs is to most folks on the internet. And multiplying that by a group of people can have awesome effects.</p>
<p>A combination of chat apps, link aggregating and building a directory of resources around which a community interest grows. Clubs, mentorship, support, teams, gaming can all benefit from this approach.</p>
<p>Storyden&#39;s <a href="/blog/the-architecture-of-modern-forum-software">architecture</a> is centered around organising information. It&#39;s like a little baby version of <a href="https://www.google.com/intl/en_uk/search/howsearchworks/our-approach/">&quot;organising the world&#39;s information and making it universally accessible and useful&quot;</a>, you could say it&#39;s... <a href="https://www.youtube.com/watch?v=u1xrNaTO1bI">your own personal Google!</a> How about, &quot;organising your community&#39;s information and making it searchable and useful&quot;? If it wasn&#39;t such an obvious rip, I&#39;d use it on the landing page!</p>
<p>So if any of that resonates with you, why not give <a href="/docs">Storyden</a> a try? I&#39;d love to hear your feedback and input as we develop entirely in the open! Community style.</p>
<p>Thanks for reading! Maybe see you next <a href="https://fosdem.org/2025">FOSDEM</a> 👀? (yes, at the time of writing this is a 404 but I&#39;m sure it&#39;ll appear at some point in the year! my SEO will not like that but I&#39;ll 100% forget to come back and add a link...)</p>
]]></content:encoded>
            <author>Barnaby Keene</author>
        </item>
        <item>
            <title><![CDATA[What are Webauthn and Passkeys]]></title>
            <link>https://www.storyden.org/blog/what-is-webauthn-passkeys</link>
            <guid isPermaLink="false">/blog/what-is-webauthn-passkeys</guid>
            <pubDate>Sun, 28 Jan 2024 00:00:00 GMT</pubDate>
            <description><![CDATA[Storyden supports Passkeys as a fundamental authentication method. This allows users to register and sign in with just their device biometrics. This article discusses the pros and cons of Passkeys and how Storyden makes use of the technology to provide a secure and privacy friendly experience.]]></description>
            <content:encoded><![CDATA[<p>It&#39;s a little confusing because &quot;WebAuthn&quot; is the technical name, &quot;Authn&quot; is a shortening of &quot;Authentication&quot; because nobody wants to type long words any more (even though spicy autocomplete can do it for us now...?) and &quot;Passkey&quot; is the more &quot;branded&quot; and general term when displaying to users and customers. The terms technically aren&#39;t interchangeable, but for the sake of the user&#39;s experience and simplicity, they can be considered the same thing.</p>
<p>I won&#39;t go into the details of what it actually is. You can learn about that in <a href="https://css-tricks.com/passkeys-what-the-heck-and-why">this fantastic blog post by Neal Fennimore</a>. There are also some more technical articles that go into the details such as <a href="https://www.w3.org/TR/webauthn-2/">the W3 spec</a> and a <a href="https://web.dev/articles/webauthn-discoverable-credentials">web.dev article</a>.</p>
<h2>How and why Storyden uses Passkeys</h2>
<p>Storyden supports Passkeys as a fundamental authentication method. This means that you can use a Passkey to register and sign in to a Storyden site. This is a great way to keep your account secure and to make sure that you don&#39;t have to remember a password for every site you use.</p>
<h3>Email + Password is not the default any more</h3>
<p>Storyden&#39;s authentication model does not simply have an &quot;email&quot; and &quot;password&quot; column in its data model. Storyden is a platform for the next era of the web, so it would be foolish to fall into old ways, especially for authentication. Passwords can be enabled, but they are not assumed to be the default. You can operate a Storyden instance and completely disable passwords if you want.</p>
<p>Similarly, email is not considered a default either. The world is changing and people are much more conscious of privacy, security and technical monopolies on data. You may not want to sprinkle your email address all over the net. While there have been efforts by Apple, Mozilla, and others to create &#39;fake&#39; email addresses that forward to your real one, if you don&#39;t need someone&#39;s email, then why even require it? This is why Storyden operates on a username-first model, email is optional for transactional or newsletter purposes, but it&#39;s not a default.</p>
<p>As an operator of a Storyden community, you have the choice to use email + password as the login method for your members, but you also have the choice to allow full anonymity with a username + passkey combination.</p>
<h2>The benefits of Passkeys</h2>
<p>Passkeys are great in certain circumstances. They&#39;re not a silver bullet, but with growing support on devices, browsers and operating systems, they&#39;re becoming more and more viable as a default authentication method.</p>
<h3>Privacy</h3>
<p>The privacy implications are great, for users and administrators alike. By eliminating the need for passwords, it mitigates the risks associated with traditional authentication methods, such as phishing, brute force attacks, and password reuse. Which isn&#39;t just great for users, but also for operators and administrators because it reduces the responsibilities and risks associated with storing and managing passwords.</p>
<p>Passkeys utilize public key cryptography to provide a secure and (almost) seamless way for users to access their accounts. This not only safeguards sensitive user data but also reduces the likelihood of unauthorized access to accounts - especially admin accounts!</p>
<h3>User-Friendly Experience</h3>
<p>In addition to bolstering security, WebAuthn offers a user-friendly experience - assuming you&#39;re on a compatible device. With no passwords to remember, users can enjoy a hassle-free login process. This not only streamlines access to Storyden but also eliminates the need for password resets and the associated frustrations.</p>
<p>It&#39;s not all perfect though, I&#39;ll get into why that is in the next section.</p>
<h2>The downsides of Passkeys</h2>
<p>Passkeys are great, but they&#39;re not perfect. There are some downsides to using them, and it&#39;s important to be aware of them before you decide to use them as a default authentication method.</p>
<h3>Dealing with multiple devices</h3>
<p>All services on which I&#39;ve recently enabled a Passkey have treated them as a form of 2FA rather than your primary login method. This isn&#39;t just because of support, but also difficulties with being locked out of your account. The spec does not define a way to recover accounts or synchronise keys. This is left up to vendors such as Apple, Microsoft and Google.</p>
<p>So, if you&#39;re <del>locked in to</del> enjoying the Apple ecosystem on all your devices, you&#39;re pretty much covered as iCloud&#39;s keychain will synchronise an end to end encrypted backup of your keys between your devices. This means I can sign up to a Storyden instance on my Macbook, then log in to that same account on my iPhone without having to do anything. It&#39;s pretty great.</p>
<p>Outside of the (walled) garden of eden though, it gets a bit tricky. Windows Hello supports keys, but how device sync works isn&#39;t clear. If you have an Android or iPhone, you&#39;ll need to do some extra work to get that set up. Not everyone will know that&#39;s even a requirement, which introduces a risk of having your key only on one device without knowing how to transfer it. If you lose or factory-reset that device, you&#39;re kinda screwed.</p>
<h3>Password managers to the rescue?</h3>
<p>1Password and other password managers offer a pretty nice experience for this, but it&#39;s far from widely adopted, and if you&#39;re already using a password manager, the benefits aren&#39;t immediately clear as a user.</p>
<p>I&#39;m a tech nerd so I&#39;m clued in on this stuff but that&#39;s a minority, not many folks I know outside of tech use password managers, so relying on Passkeys alone could risk them being locked out.</p>
<h3>Domain changes require careful planning</h3>
<p>Passkeys are tightly coupled to the domain name of the service. Partly to piggyback on the security of HTTPS and related domain verification, but the downside of this is that changing domains will not be so simple.</p>
<p>As an administrator of a platform, you may want to change domains (or need to) and currently, this process is not well defined. You need to do a bunch of work to prepare users for this change, and then those users also need to action that change on their devices. It&#39;s no secret that people rarely read product updates so communicating this is a challenge.</p>
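<p>To illustrate why a domain change hurts, here&#39;s a simplified sketch of the scoping rule: a credential is registered against a relying party ID and is only offered on that host or its subdomains. The <code>rpIDAllowed</code> helper and the example domains are made up, and real browsers also consult the public suffix list, which this ignores.</p>
<pre><code class="language-go">package main

import (
	"fmt"
	"strings"
)

// rpIDAllowed reports whether a credential scoped to rpID may be used on a
// page served from host. Simplified: the host must equal the relying party
// ID or be a subdomain of it.
func rpIDAllowed(rpID, host string) bool {
	return host == rpID || strings.HasSuffix(host, "."+rpID)
}

func main() {
	fmt.Println(rpIDAllowed("community.example", "community.example"))       // true
	fmt.Println(rpIDAllowed("community.example", "forum.community.example")) // true
	fmt.Println(rpIDAllowed("community.example", "new-domain.example"))      // false
}
</code></pre>
<p>That last line is the painful one: after a move to a new domain, every existing key fails this check and members have to register again.</p>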
<h2>How Storyden treats Passkeys</h2>
<p>Passkeys are great, but not a silver bullet. Storyden also doesn&#39;t want to enforce a password because ultimately, that&#39;s up to operators to decide. So, Storyden treats each authentication method equally and recommends that users set up at least two methods of authentication. These can be anything:</p>
<ul>
<li>a Passkey + a Web3 wallet</li>
<li>a password + a Google login</li>
<li>a Phone number and a Passkey</li>
</ul>
<h2>In conclusion</h2>
<p>Operators can enable whichever authentication methods they want, depending on their community&#39;s values and goals. If you want to go fully decentralised, you can use Passkeys and a Web3 wallet. If you don&#39;t mind using centralised services, an OAuth2 provider such as Google and Passkeys are fine, and if you want to support traditional email + password, you can do that too.</p>
<p><a href="/docs">Get started</a> with Storyden today and build a secure and privacy friendly community ready for the next era of the social web.</p>
]]></content:encoded>
            <author>Barnaby Keene</author>
        </item>
        <item>
            <title><![CDATA[The Architecture of Modern Forum Software]]></title>
            <link>https://www.storyden.org/blog/the-architecture-of-modern-forum-software</link>
            <guid isPermaLink="false">/blog/the-architecture-of-modern-forum-software</guid>
            <pubDate>Thu, 05 Oct 2023 00:00:00 GMT</pubDate>
            <description><![CDATA[How Storyden is architected for the modern web while making no compromises on compatibility, accessibility and speed for the next era of internet culture.]]></description>
            <content:encoded><![CDATA[<p>Storyden, that&#39;s the modern forum software I&#39;m referring to. Even though it&#39;s <a href="./what-are-social-bookmarks-link-aggregators">more</a> than just a forum! But anyway, let&#39;s get into the innards!</p>
<Callout>
  Some of what's discussed here is subject to change based on the decisions of
  contributors, user needs or other circumstances. Generally, what I value is
  the rationale behind the tools of choice rather than the tools themselves.
  Software is also known to expire and sometimes deprecated components need to
  be swapped out for security or usability concerns. This post should be updated
  if that does happen, but it's always worth checking the repository for the
  gory details.
</Callout>

<h2>Starting in the middle</h2>
<p>At the core, Storyden&#39;s behaviour is defined as an <a href="https://www.openapis.org/">OpenAPI</a> specification. This specification is hand-written, optimised for readability because it&#39;s <a href="https://github.com/Southclaws/storyden/blob/main/api/openapi.yaml">intended to be read</a>.</p>
<details>
  <summary>
    I'm very proud of the silly ASCII headers optimised for editor minimaps!
  </summary>
  ![silly fun ascii banners!](/blog/ascii-banners.png) [vscode extension for generating
  these here](https://marketplace.visualstudio.com/items?itemName=BitBelt.converttoasciiart)
  I use the font "Colossal" and I'm very certain these derive from a [very old website](https://patorjk.com/software/taag)
  I found as a kid by [Patrick Gillespie](https://patorjk.com/blog/about/)
</details>

<p>I opted to write the specification and generate the code because I really value static, declarative documents from which the boring bits can be mass-produced. I don&#39;t really enjoy writing <code>func(w http.ResponseWriter, r *http.Request)</code> functions by hand, dealing with decoding the JSON and encoding the responses and errors. OpenAPI allows me to work with functions that look like this:</p>
<pre><code class="language-go">AccountUpdate(ctx context.Context, request openapi.AccountUpdateRequestObject) (openapi.AccountUpdateResponseObject, error) {
  // access `request.Name`
  // respond with an `Account` struct
  // handle errors by returning (nil, err)
}
</code></pre>
<p>It also allows me to generate client code for both Golang (for end-to-end tests, my favourite flavour!) and TypeScript. The frontend for calling the above example looks like this:</p>
<pre><code class="language-ts">const updatedAccount = await accountUpdate({ name: &quot;Southclaws&quot; });
// updatedAccount: { name: &quot;Southclaws&quot;, ... }
</code></pre>
<p>Which eliminates a metric ton of work for me and other contributors!</p>
<p>But it&#39;s more than just a time saver, it&#39;s documentation and, most importantly, a <em>contract!</em> A contractual interface that&#39;s agreed upon by developers before diving into implementation details behind the interface.</p>
<p>Now I don&#39;t take the spec part <em>that</em> seriously, it&#39;s a useful layer to write some details about things that may not be obvious just from the operation name and parameters. But there&#39;s no formal MAY, SHOULD, MUST lingo in there, it&#39;s just a useful source of truth from which everything else is built.</p>
<h3>Speaking of APIs...</h3>
<p>Also, this makes Storyden API-driven. You can bin the stock frontend and build your own if you want! You can also build other services in whatever language you want that call these APIs to automate certain tasks where WebAssembly plugins and integrations aren&#39;t quite enough.</p>
<h3>Content-type driven handlers</h3>
<p>Another neat thing you can do with OpenAPI is define HTML form friendly handlers. Most JSON APIs accept <code>application/json</code> from a <code>fetch</code> request. But if we want to support JS-less frontends that want to use HTML forms in all of their natural beauty, it&#39;s as simple as:</p>
<pre><code class="language-yaml">AccountUpdate:
  content:
    application/json:
      schema: { $ref: &quot;#/components/schemas/AccountMutableProps&quot; }
    application/x-www-form-urlencoded:
      schema: { $ref: &quot;#/components/schemas/AccountMutableProps&quot; }
</code></pre>
<p>This <a href="https://spec.openapis.org/oas/latest.html#request-body-object"><code>requestBody</code></a> schema permits two content types that use the exact same underlying schema. This re-use means certain endpoints <a href="#subset-of-endpoints-support-forms">✳︎</a> can trivially be set up to support non-JS clients as long as their HTML form field IDs match the fields documented in the schema.</p>
<p>HTML forms are important because some folks disable JavaScript for good reasons: privacy concerns, bandwidth constraints and device processing power.</p>
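<p>As a rough idea of what that means on the server, here&#39;s a hand-written sketch (not the generated code) that decodes the same struct from either content type. <code>decodeAccountUpdate</code>, <code>newRequest</code> and the single <code>name</code> field are assumptions for illustration.</p>
<pre><code class="language-go">package main

import (
	"encoding/json"
	"fmt"
	"mime"
	"net/http"
	"net/http/httptest"
	"strings"
)

// AccountMutableProps mirrors the shared schema; only one field is shown.
type AccountMutableProps struct {
	Name string `json:"name"`
}

// decodeAccountUpdate picks a decoder based on the request's Content-Type.
func decodeAccountUpdate(r *http.Request) (*AccountMutableProps, error) {
	mediaType, _, err := mime.ParseMediaType(r.Header.Get("Content-Type"))
	if err != nil {
		return nil, err
	}
	props := new(AccountMutableProps)
	switch mediaType {
	case "application/json":
		err = json.NewDecoder(r.Body).Decode(props)
	case "application/x-www-form-urlencoded":
		// The form field name must match the field name in the schema.
		if err = r.ParseForm(); err == nil {
			props.Name = r.PostForm.Get("name")
		}
	default:
		err = fmt.Errorf("unsupported content type %q", mediaType)
	}
	if err != nil {
		return nil, err
	}
	return props, nil
}

// newRequest builds an in-memory PATCH request for the demo.
func newRequest(contentType, body string) *http.Request {
	r := httptest.NewRequest(http.MethodPatch, "/accounts", strings.NewReader(body))
	r.Header.Set("Content-Type", contentType)
	return r
}

func main() {
	fromJSON, _ := decodeAccountUpdate(newRequest("application/json", `{"name":"Southclaws"}`))
	fromForm, _ := decodeAccountUpdate(newRequest("application/x-www-form-urlencoded", "name=Southclaws"))
	fmt.Println(fromJSON.Name == fromForm.Name) // true
}
</code></pre>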
<Callout id="subset-of-endpoints-support-forms" emoji="*️⃣">
  Not every endpoint is currently worth supporting HTML forms because only a
  subset of basic functionality is implemented in the default Storyden frontend.
  In theory, it would be possible to support all functionality in some way but,
  currently (at the time of writing, 2023) I am but a sole developer and I must
  prioritise! Most of the time, interactive menus/modals/drawers/etc can be
  substituted for full standalone pages that implement a basic `<form>` with
  the same fields.
</Callout>

<h2>The Backend</h2>
<p>I&#39;ve already teased some Go code so if you&#39;ve not checked out <a href="https://github.com/Southclaws/storyden">the repository</a>, by now you can probably guess the language of choice.</p>
<p>I chose Go because I like Go, and the things I like about Go fit quite nicely into <a href="/blog/building-running-administrating-modern-forum-software#innovating-on-a-timeless-idea">Storyden&#39;s goals</a> particularly how you can compile it almost anywhere for almost anywhere else to a single binary. No deep trees of dynamic source files to package up, no version managers to get confused about, plus it&#39;s got a decent type system!</p>
<p>I won&#39;t go into more detail about the codebase itself, it&#39;s a pretty standard idiomatic Go codebase with a few opinionated bits like initialisation-time dependency injection, that&#39;s a topic for another post.</p>
<h3>The data model</h3>
<p>This section won&#39;t go into every single table but a brief overview of the most important parts.</p>
<h4>Posts</h4>
<p>The most interesting and important part of the model is the <code>posts</code> table. It&#39;s organised as a directed acyclic graph where each post has two parent relationships:</p>
<ul>
<li>root post:<ul>
<li>if the post is a reply within a thread then this is the first post in that thread</li>
<li>if the post is the start of a thread, this is empty<ul>
<li>you can find all threads by simply querying for posts with no root</li>
</ul>
</li>
</ul>
</li>
<li>reply-to:<ul>
<li>you can also reply to specific posts within a thread, independent of the root post</li>
<li>the reply-tree is similar in principle to that of Reddit, Hacker News or Lobste.rs</li>
</ul>
</li>
</ul>
<p>There are a few benefits to this approach, a lot of older forums would model &quot;Threads&quot; and &quot;Replies&quot; as two separate tables, but this often leads to some duplication of common fields as well as making certain operations a bit more awkward such as merging two threads, moving posts between threads or promoting posts to top-level threads.</p>
<p>There are some downsides though, Threads and Posts are not identical so there are a few fields that are only used for one but not the other (such as <code>slug</code> and <code>title</code>.)</p>
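<p>A minimal in-memory sketch of the idea, assuming a trimmed-down <code>Post</code> struct with made-up field names (the real schema has many more columns and lives in the database, not a slice):</p>
<pre><code class="language-go">package main

import "fmt"

// Post is a hypothetical, trimmed-down row from the posts table.
type Post struct {
	ID      string
	RootID  string // empty when the post starts a thread
	ReplyTo string // empty when the post is not a direct reply
	Title   string // only meaningful for thread-starting posts
}

// threads returns every post with no root, i.e. the start of each thread.
func threads(posts []Post) []Post {
	var out []Post
	for _, p := range posts {
		if p.RootID == "" {
			out = append(out, p)
		}
	}
	return out
}

// promote turns a reply into a top-level thread by clearing its parents.
// A real implementation would also re-root any replies beneath it.
func promote(p Post, title string) Post {
	p.RootID, p.ReplyTo, p.Title = "", "", title
	return p
}

func main() {
	posts := []Post{
		{ID: "1", Title: "Hello world"},
		{ID: "2", RootID: "1"},
		{ID: "3", RootID: "1", ReplyTo: "2"},
	}
	fmt.Println(len(threads(posts))) // 1
	posts[2] = promote(posts[2], "Split from Hello world")
	fmt.Println(len(threads(posts))) // 2
}
</code></pre>
<p>Because threads and replies share one shape, promoting a reply is a field update rather than a copy between two tables.</p>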
<h4>Authentication</h4>
<p>If you look at the &quot;Account&quot; schema, you may notice the lack of two fields that are usually a standard in any database schema with a &quot;user&quot; model:</p>
<ul>
<li>email</li>
<li>password</li>
</ul>
<p>This omission is intentional. While <a href="/blog/building-running-administrating-modern-forum-software">ideating</a> Storyden, one of the values I chose was that Storyden is a platform for the next era of internet culture (or something like that...) and the two things I&#39;m not entirely certain will be guaranteed in 20 years time are emails and passwords.</p>
<p>Okay, the emails one is a stretch, but passwords <a href="/blog/what-is-webauthn-passkeys#email--password-is-not-the-default-any-more">I strongly believe should be optional</a>.</p>
<p>And so, this fact is true right down to the data model. Instead of encoding these concepts as fundamentals on the account table, they exist as &quot;Authentication methods&quot; which use a separate table on a one-to-many basis against accounts.</p>
<p>This makes it trivial to facilitate a <a href="/blog/what-is-webauthn-passkeys#how-storyden-treats-passkeys">choice</a> of authentication methods for each account and allows individual communities to customise exactly how they want to allow users to register and log in.</p>
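<p>Roughly, the shape looks like the sketch below. The struct fields and the service names are illustrative assumptions, not the actual Ent schema.</p>
<pre><code class="language-go">package main

import "fmt"

// Account deliberately carries no email or password fields.
type Account struct {
	ID     string
	Handle string
}

// Authentication is one row per login method, many rows per account.
// The field names and service values here are illustrative only.
type Authentication struct {
	AccountID  string
	Service    string // e.g. "password", "webauthn", "oauth_google"
	Identifier string // e.g. an email address or a credential ID
	Token      string // e.g. a password hash or a public key
}

// methodsFor lists the services an account can sign in with.
func methodsFor(accountID string, rows []Authentication) []string {
	var services []string
	for _, row := range rows {
		if row.AccountID == accountID {
			services = append(services, row.Service)
		}
	}
	return services
}

func main() {
	account := Account{ID: "a1", Handle: "southclaws"}
	rows := []Authentication{
		{AccountID: "a1", Service: "webauthn"},
		{AccountID: "a1", Service: "password"},
		{AccountID: "a2", Service: "oauth_google"},
	}
	fmt.Println(account.Handle, methodsFor(account.ID, rows)) // southclaws [webauthn password]
}
</code></pre>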
<h3>Database tools: SQL and Ent</h3>
<p>I have a <a href="https://southcla.ws/sql">complicated</a> relationship (ha!) with relational databases, but it&#39;s a very necessary evil for such a project. While I tend to avoid pasting raw SQL into string-literals in favour of code-generated type-safe interfaces, there&#39;s a healthy balance of both.</p>
<p><a href="https://entgo.io">Ent</a> does most of the CRUD legwork, raw SQL does anything that requires a recursive CTE or an optimised join. That&#39;s really all there is to it. The schema has no migration strategy at the time of writing, but this will likely become a necessity as the product matures. I&#39;ll likely choose <a href="https://atlasgo.io">Atlas</a> for that task but suggestions are always welcome!</p>
<p>The main reason I chose Ent was the code generation part, the vast majority of boring queries are CRUD and don&#39;t really require too much complication or custom code. Ent also generates the structs too and provides a fairly neat way to traverse the graph of relations.</p>
<h3>API: OpenAPI generated code</h3>
<p>The OpenAPI specification mentioned above is turned into Go code using a library called <a href="https://github.com/deepmap/oapi-codegen">oapi-codegen</a> which does a decent job of generating all the schemas and interface. All developers need to do is satisfy the interface.</p>
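<p>Here&#39;s a small sketch of what &quot;satisfying the interface&quot; means in practice. The request, response and interface types below are hand-written stand-ins for generated code, so the real names and fields will differ.</p>
<pre><code class="language-go">package main

import (
	"context"
	"fmt"
)

// These three types stand in for generated code; real names and fields differ.
type AccountUpdateRequest struct{ Name string }
type AccountUpdateResponse struct{ Name string }

type ServerInterface interface {
	AccountUpdate(ctx context.Context, request AccountUpdateRequest) (AccountUpdateResponse, error)
}

// Handlers is the hand-written part: only business logic lives here.
type Handlers struct{}

func (Handlers) AccountUpdate(ctx context.Context, request AccountUpdateRequest) (AccountUpdateResponse, error) {
	if request.Name == "" {
		return AccountUpdateResponse{}, fmt.Errorf("name must not be empty")
	}
	return AccountUpdateResponse{Name: request.Name}, nil
}

// Compile-time check: the build fails if Handlers drifts from the contract.
var _ ServerInterface = Handlers{}

func main() {
	var api ServerInterface = Handlers{}
	res, err := api.AccountUpdate(context.Background(), AccountUpdateRequest{Name: "Southclaws"})
	fmt.Println(res.Name, err)
}
</code></pre>
<p>If an operation is added to the spec and the code is regenerated, the compiler complains until a matching method exists, which is the contract doing its job.</p>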
<h2>The Frontend</h2>
<p>My frontend tool of choice hasn&#39;t really changed since I started doing frontend work professionally. Storyden uses Next.js because I like React but also like server-side rendering and shipping HTML.</p>
<p>Next.js has had an admittedly rocky 2023 since the &quot;App directory&quot; chaos and there are lots of new frameworks on the block trying to dethrone it, but I&#39;ve never really been one to hop between frameworks (terrible frontend dev, aren&#39;t I?)</p>
<h3>User interface</h3>
<p>My weapon of choice for styling is <a href="https://panda-css.com">Panda</a>, which by no surprise is a code-generation tool. Panda allows you to specify a design system as a (fairly) declarative document and generate all the code and CSS statically. This means the frontend doesn&#39;t need to run JavaScript to style things.</p>
<p>Which may sound odd but... look, the frontend world has had a rough decade okay!</p>
<p>So we&#39;re shipping static HTML and static CSS like <a href="https://thebestmotherfucking.website">the Good Old Days</a>. Great! But how do you <em>make it pop</em> when everything is static?</p>
<p>Well, I lied, it&#39;s not all static, it&#39;s <a href="https://www.gov.uk/service-manual/technology/using-progressive-enhancement">✨ Progressively Enhanced 💫</a> <em>(very few things make me proud to be British, but the government design system is just amazing)</em> which means static HTML and CSS gets sent to the browser to render everything fast, then bits of JavaScript join the party a little later to jazz it up a bit.</p>
<p>What this means in reality is we can have all the bells and whistles of what you&#39;d expect from a modern web <strong>application</strong> while still retaining the qualities of what makes a great web <strong>site</strong>.</p>
<h4>Ark UI</h4>
<p>For the actual components, I&#39;ve chosen <a href="https://ark-ui.com">Ark UI</a> which is a neat little headless component library providing all the standard widgets you might expect on a user interface. It pairs quite nicely with Panda CSS and together these two tools power the entire layout and interface elements of Storyden.</p>
<Callout>
  Chakra, Panda and Ark are all from [the same amazing
  team](https://github.com/chakra-ui)!
</Callout>

<h4>The road from Chakra to Panda</h4>
<p>A short side note, Storyden (and most of my products) started life with <a href="https://chakra-ui.com">Chakra UI</a>, which is an amazing library by the very talented Segun Adebayo. For various reasons I chose to move away from Chakra UI after Segun published <a href="https://www.adebayosegun.com/blog/the-future-of-chakra-ui#zero-runtime-css-in-js-panda">this post</a> and I discovered that Panda is a better fit for the project.</p>
<p>To learn more about why Storyden moved from Chakra UI, <a href="https://twitter.com/Southclaws/status/1742274927133151614">I wrote a short thread about that</a>. And if you&#39;re interested in the technical details of <em>how</em> to migrate from Chakra UI to Panda CSS, <a href="https://southcla.ws/how-to-migrate-from-chakra-ui-to-panda-css">I also wrote a guide</a>!</p>
<h3>SWR</h3>
<p>The underlying request state for the React code is managed by <a href="https://swr.vercel.app">SWR</a>, a neat little library from the Vercel team which I fondly remember the release of. It does a few handy things that facilitate instantaneous reactivity to interactions that result in mutations and data access.</p>
<p>I won&#39;t go into the details but it&#39;s a fantastic tool for building web applications that feel like local apps.</p>
<h3>How OpenAPI is used</h3>
<p>For the client code generation, I chose a tool called <a href="https://orval.dev">Orval</a> which generates code which utilises <a href="#SWR">SWR</a> as well as all the TypeScript types that match the OpenAPI schemas and the Go structs on the other end.</p>
<h4>Data retrieval</h4>
<p>Getting data (via GET requests) is done via hooks that look like this:</p>
<pre><code class="language-ts">export const useAccountGet = &lt;
  TError =
    | UnauthorisedResponse
    | NotFoundResponse
    | InternalServerErrorResponse,
&gt;(options?: {
  swr?: SWRConfiguration&lt;Awaited&lt;ReturnType&lt;typeof accountGet&gt;&gt;, TError&gt; &amp; {
    swrKey?: Key;
    enabled?: boolean;
  };
}) =&gt; {
  const { swr: swrOptions } = options ?? {};

  const isEnabled = swrOptions?.enabled !== false;
  const swrKey =
    swrOptions?.swrKey ?? (() =&gt; (isEnabled ? getAccountGetKey() : null));
  const swrFn = () =&gt; accountGet();

  const query = useSwr&lt;Awaited&lt;ReturnType&lt;typeof swrFn&gt;&gt;, TError&gt;(
    swrKey,
    swrFn,
    swrOptions
  );

  return {
    swrKey,
    ...query,
  };
};
</code></pre>
<p>Which roughly just wrap a <code>useSwr</code> hook call and sprinkle in some type annotations.</p>
<p>Note that there&#39;s no actual schema validation happening here with a tool such as <a href="https://zod.dev">Zod</a> because the assumption is that the backend is conforming to the OpenAPI specification too. Given that Storyden is in control of both sides of this in the monorepo, it&#39;s a compromise I&#39;m willing to make.</p>
<h4>Data mutation</h4>
<p>Mutations to data such as create, update and delete (POST, PUT, PATCH and DELETE) are done via functions that look like this:</p>
<pre><code class="language-ts">/**
 * Update the information for the currently authenticated account.
 */
export const accountUpdate = (accountUpdateBody: AccountUpdateBody) =&gt; {
  return fetcher&lt;AccountUpdateOKResponse&gt;({
    url: `/accounts`,
    method: &quot;patch&quot;,
    headers: { &quot;Content-Type&quot;: &quot;application/json&quot; },
    data: accountUpdateBody,
  });
};
</code></pre>
<p>Which can be easily called in event handlers such as button clicks or form submissions, etc. The <code>fetcher</code> is a client written by hand which handles a few extra details such as CORS, cookies and errors.</p>
<h4>Server Side Rendering</h4>
<p>SSR and RSC are a hot topic right now, but I won&#39;t go into why. I&#39;m bullish on it and I find the mental model productive (though the reality is a little rough around the edges.)</p>
<p>For a full rundown, I highly recommend <a href="https://www.joshwcomeau.com/react/server-components/">this post by Josh Comeau</a>!</p>
<p>Storyden&#39;s view of this is that <em>any</em> content consumption screen <strong>must</strong> be server side rendered. That is any feed of posts and the posts themselves, as well as other stuff like the knowledgebase and people&#39;s profiles.</p>
<p>In the code, all pages start life as <code>async</code> function components:</p>
<pre><code class="language-tsx">export async function FeedScreen(props: Props) {
  const data = await server&lt;ThreadListOKResponse&gt;({
    url: `/threads`,
    params: {
      categories: [props.category],
    } as ThreadListParams,
  });

  return &lt;Client category={props.category} threads={data.threads} /&gt;;
}
</code></pre>
<p>This performs the initial API call with any query parameters passed in from the Next.js page load. It then passes the result to a component called <code>Client</code> which is in another file.</p>
<Callout>
  One thing that's important about Next.js is that there are <strong>two</strong> trees it
  cares about: the component tree and the module tree. How these trees are
  structured has ramifications for how server side components work.
</Callout>

<p><code>Client</code>, which is defined in a sibling module, looks something like this:</p>
<pre><code class="language-tsx">&quot;use client&quot;;

export function Client(props: { category: string; threads: ThreadList }) {
  const { data, error } = useThreadList(
    {
      categories: [props.category],
    },
    {
      swr: {
        fallbackData: props.threads &amp;&amp; { threads: props.threads },
      },
    }
  );

  if (!data) return &lt;Unready {...error} /&gt;;

  return &lt;MixedPostList posts={data?.threads} /&gt;;
}
</code></pre>
<p>As outlined earlier, these generated hooks such as <code>useThreadList</code> are thin wrappers around <code>useSwr</code> so there are a few important things happening here:</p>
<ul>
<li>the first argument contains the query parameters for the actual API endpoint, these are often the same as the parameters in the browser&#39;s address bar.</li>
<li>the <code>swr</code> option in the second argument means the hook will immediately return the provided data while revalidating in the background. This is called <a href="https://swr.vercel.app/docs/prefetching.en-US#pre-fill-data">Pre-fill data</a> and it allows this client component to be rendered server-side using the data fetched in the server-only component above but continue to provide the benefits of SWR when it renders on the client.</li>
<li>because we&#39;re using <code>fallbackData</code>, the <code>data</code> part of the return value is <em>always</em> present but TypeScript forces us to check due to the discriminated union return type.</li>
<li><code>MixedPostList</code> renders immediately with the data we have on the server<ul>
<li>once the browser renders this, it&#39;ll render again after <code>useSwr</code> has re-fetched</li>
</ul>
</li>
</ul>
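<p>To make that render sequence concrete, here is a small framework-free model of it. This is not how SWR is implemented; <code>createResource</code> and the <code>Thread</code> shape are hypothetical and only mimic the &quot;render with fallback data, then render again after the re-fetch&quot; behaviour described above.</p>

```typescript
// Hypothetical shapes, for illustration only.
type Thread = { id: string; title: string };
type ThreadList = { threads: Thread[] };

// Models one hook instance: seeded with fallback data, replaced on revalidation.
function createResource<T>(fallbackData: T, onRender: (data: T) => void) {
  let data = fallbackData;
  let isValidating = true; // a background re-fetch is pending from the start

  onRender(data); // first render: the server-fetched data, no loading state

  return {
    getData: () => data,
    isValidating: () => isValidating,
    // Called when the background re-fetch completes.
    revalidated(fresh: T) {
      data = fresh;
      isValidating = false;
      onRender(data); // second render: fresh data from the API
    },
  };
}

// Usage: record what each render would have shown.
const rendered: string[][] = [];
const resource = createResource<ThreadList>(
  { threads: [{ id: "1", title: "Hello" }] },
  (list) => rendered.push(list.threads.map((t) => t.title))
);
resource.revalidated({
  threads: [
    { id: "1", title: "Hello" },
    { id: "2", title: "New thread" },
  ],
});
```

<p>The first entry in <code>rendered</code> corresponds to the server render, and the second to the client render after revalidation.</p>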
<p>Most screens in Storyden follow this pattern, with some extra bits that make certain things easier such as mutations and pagination. But it&#39;s pretty much the same idea throughout.</p>
<h2>Conclusion</h2>
<p>Ultimately, my goal is to make Storyden secure, modern and easy to contribute to. There&#39;s not much more to say on this, but feedback is always welcome so if you have opinions or thoughts on the direction any of this should move in, <a href="https://github.com/Southclaws/storyden/issues">open an issue</a>!</p>
]]></content:encoded>
            <author>Barnaby Keene</author>
        </item>
        <item>
            <title><![CDATA[What Are Social Bookmarks Link Aggregators]]></title>
            <link>https://www.storyden.org/blog/what-are-social-bookmarks-link-aggregators</link>
            <guid isPermaLink="false">/blog/what-are-social-bookmarks-link-aggregators</guid>
            <pubDate>Mon, 14 Aug 2023 00:00:00 GMT</pubDate>
            <description><![CDATA[Link aggregation is a great way to build a knowledgebase through social bookmarking. Find out how Storyden can be the Library of Alexandria for your community in this article.]]></description>
            <content:encoded><![CDATA[<p>We&#39;ve all probably used bookmarks on the web; it&#39;s the most basic way to save a link. And there&#39;s a whole host of tools out there that make it easier, like <a href="https://getpocket.com">Pocket</a>, <a href="https://www.instapaper.com/">Instapaper</a> and <a href="https://hq.getmatter.com">Matter</a>.</p>
<p>Bookmarks are great for collecting things you&#39;ve read or want to read or are currently reading. And the ecosystem of apps built around this idea offers really great tools for organising those bookmarks.</p>
<p>Similarly, we send links all the time. Send them to friends, email them to coworkers, post in a Slack or Discord for others to enjoy. Sharing links is one of the most basic behaviours of &quot;The Social Web&quot; and ever since we started to utilise the internet for more than just static documents, &quot;Share&quot; has been a part of the lexicon.</p>
<h2>Enter: Link Aggregators</h2>
<p>Link aggregators are platforms that compile and organize hyperlinks to online content. They have become a significant part of the internet landscape. You may have opened this article from one!</p>
<ul>
<li><p><strong>Early 2000s: Emergence of Link Aggregators:</strong> Sites like Slashdot, Digg and Fark emerged, where members posted links to news and other pages and the community voted and commented on them.</p>
</li>
<li><p><strong>Mid-2000s: Rise of Reddit and Digg:</strong> Reddit, founded in 2005, and Digg, launched in 2004, became two of the most influential link aggregators. They introduced more sophisticated systems for voting and commenting, making user engagement a crucial part of content curation.</p>
</li>
<li><p><strong>Late 2000s to Early 2010s: Evolution and Competition:</strong> The landscape started to change with a new era. Digg, for example, underwent several controversial redesigns. Reddit continued to grow, becoming the more dominant platform, partly due to its community-focused approach and diverse range of user-generated subreddits.</p>
</li>
<li><p><strong>Mid-2010s Onwards: Diversification and Specialisation:</strong> Newer platforms like Voat or Product Hunt emerged for more specialised niches. Both political polarisation and the need for more purpose-built sites led to new platforms with more focused audiences.</p>
</li>
<li><p><strong>2020s Onwards: Alive and Kicking:</strong> Link aggregators are very much alive and serving their communities well. But many people are also comfortable inside Discord, Slack, WhatsApp, etc., and these environments can make it difficult to surface useful resources via search tools.</p>
</li>
</ul>
<p>Throughout their history, the evolution of link aggregators reflects a desire for aggregating niche-specific material, essentially creating little Google search indexes specifically for one topic.</p>
<p><img src="/blog/reddit-users.png" alt="Reddit monthly active users from 2013 to 2022"></p>
<blockquote>
<p>Source: <a href="https://www.businessofapps.com/data/reddit-statistics">https://www.businessofapps.com/data/reddit-statistics</a></p>
</blockquote>
<h2>Knowledgebases and Social Bookmarks</h2>
<p><img src="/blog/bookmarks-folder.png" alt="Bookmarks folder in Chrome"></p>
<p>While centralised social networks like Reddit are great for simple communities sharing news stories and keeping lists of relevant resources, they can often be outgrown by more specific needs of the community itself.</p>
<p>Quite a few examples of this have resulted in completely bespoke sites being built from scratch, with <a href="https://www.producthunt.com">Product Hunt</a> being a good one.</p>
<p>Knowledgebases have also played a big part in internet culture. &quot;Wiki&quot; style sites often spring up alongside fandoms and games to catalogue characters, episodes, levels, loot items, etc. and the most popular player in this space is probably <a href="https://fandom.com">Wikia, which later rebranded to Fandom</a>.</p>
<p><img src="/blog/fandom.png" alt="The dated mediawiki syntax editor"></p>
<p>Then there&#39;s these new-age personal knowledge management tools like <a href="https://notion.so">Notion</a> which provide a minimal-looking but incredibly powerful set of composable tools to build quite complex notebooks, databases and even websites. The downside of platforms like Notion is that they are focused on internal teams rather than public communities.</p>
<h2>Putting it all together</h2>
<p>Storyden (and here&#39;s the sell!) aims to combine these into a platform where any community, no matter how big or how niche, can collect, organise and discuss resources. Whether you&#39;re mainly on Discord, Slack or WhatsApp, Storyden provides a source of truth to collect, tag and catalogue links and other content, then makes it all searchable for members while also contributing to SEO!</p>
<p>There&#39;s also some AI sprinkled in there but that&#39;s for another post 👀</p>
]]></content:encoded>
            <author>Barnaby Keene</author>
        </item>
        <item>
            <title><![CDATA[Building, Running, and Administrating Modern Forum Software]]></title>
            <link>https://www.storyden.org/blog/building-running-administrating-modern-forum-software</link>
            <guid isPermaLink="false">/blog/building-running-administrating-modern-forum-software</guid>
            <pubDate>Fri, 03 Feb 2023 00:00:00 GMT</pubDate>
            <description><![CDATA[Why build a forum in 2023? What role do internet forums play today and what's the future of internet communities in an age of Reddit, WhatsApp and Mastodon?]]></description>
            <content:encoded><![CDATA[<p>With a convergence on large platforms like Reddit and Discord, the main question I&#39;ve been asked when discussing Storyden is: why bother?</p>
<p>Most of the time, that question occurs on Discord, and is asked by a friend who, like me, grew up on internet forums.</p>
<p>And while I&#39;m a lover of Discord (it&#39;s my community platform of choice) I&#39;m still bullish on <em>independent</em> and niche-specific bulletin board style sites having a place in the current and next eras of internet culture.</p>
<p>So here it is, Storyden is forum software for the modern age. Inspired by traditional boards like SMF and Discourse but with a focus on the needs and desires of a modern web.</p>
<p>If that piques your interest, please read on...</p>
<h2>State of the Art</h2>
<p>You can&#39;t talk about forums without mentioning Discourse, the current dominant player in the market. While a great piece of software, its main use-case in the wild appears to be customer support forums. That&#39;s a great niche and probably quite a profitable B2B market.</p>
<p>Then the list tails off from there: there&#39;s NodeBB, MyBB, phpBB, Invision Pro Boards, Vanilla and Forem, boasting a nice mix of JavaScript, PHP and Ruby for the languages of choice.</p>
<p>I&#39;ve tried all of these, either to evaluate or use in production for a community of my own. They&#39;re all great in their own way and have skilled teams of contributors behind them.</p>
<h2>Time for something fresh?</h2>
<p>However, some common themes arose:</p>
<ul>
<li>complicated to deploy</li>
<li>difficult to containerise</li>
<li>ageing technologies/languages</li>
<li>lacking accessibility features for those with visual impairments</li>
<li>resource-intensive</li>
</ul>
<p>Which aren&#39;t, on their own, issues at all, and I&#39;m definitely not one to shun software just because it&#39;s &quot;old&quot;. I&#39;ve often said I&#39;d be happy if some apps I use never received another new feature and I&#39;m perfectly happy with declaring products &quot;finished&quot; (constantly moving forward can be exhausting!)</p>
<p>But, at the same time, there&#39;s a lot of cruft in older software, dated terminology, unused features cluttering up settings pages and generally bits of technical debt that make adopting new ideas more difficult.</p>
<p>Some great friends (who I of course knew from online forums!) decided there was a big enough gap to build something interesting in the open source world.</p>
<p>And after lots of discussion, some hacking around and lots of coffee, <a href="https://github.com/Southclaws/storyden/issues/1">issue #1</a> was created with a rough plan of action.</p>
<h2>Innovating on a timeless idea</h2>
<p>I, and many others, would argue that the world doesn&#39;t need <em>yet another</em> xyzBB. Another threads-posts-comments feed sorted by most recent reply. Another place to sign up, forget your password and never use because you&#39;re more active in Discord anyway.</p>
<Callout type="info" emoji="ℹ️">
  I would also be very happy to be proven wrong about this biased assumption!
  <a href="https://airtable.com/shrLY0jDp9CuXPB2X">Feedback is massively appreciated.</a>
</Callout>

<p>Most folks I know who are active in &quot;communities&quot; are on either Discord, Slack or WhatsApp. And there&#39;s rarely a reason for those people to close one of those apps and open up another just to share a link or ask a question. &quot;You can&#39;t replace email&quot; is the adage that comes to mind.</p>
<p>While there&#39;s definitely a market for using bulletin boards which Storyden aims to fill with its own unique take, there must be more to it than just threads and posts.</p>
<p>Storyden&#39;s north star is facilitating the aggregation of communal knowledge. That may look like a bulletin board to some and a Slack bot to another, but what&#39;s most important is that whatever the solution is, it must be:</p>
<ul>
<li>Extremely easy to deploy anywhere</li>
<li>Built for the next decade(s) of technology</li>
<li>Accessible to all, regardless of device specs or ability</li>
<li>Cheap to run, energy-efficient and fast</li>
<li>Infinitely extensible with plugins or bring-your-own-frontend</li>
</ul>
<h3>The ingredients</h3>
<p>In no particular order, caffeinated and opinionated, this is my shortlist of how we get there:</p>
<ul>
<li>Single static binary: Golang wins hands down here</li>
<li>Container by default: it&#39;s how I and everyone I know deploy nowadays</li>
<li>No complicated dependencies: the bare minimum production setup runs on SQLite</li>
<li>Progressively enhanced: the basics should work whether you&#39;re connecting from a city, a forest or a ship</li>
<li>Accessible to anyone: doesn&#39;t just mean the odd aria-role but ensuring things work for everyone</li>
<li>API driven: extensions, automations and building your own frontend are easier this way</li>
<li>Next.js for the default frontend: it&#39;s popular, well supported and I know it well!</li>
</ul>
<h2>Summary</h2>
<p>In summary, I&#39;m aiming to build something that&#39;s small, simple but scales well. It&#39;s a big task but I&#39;m confident it&#39;ll be a rewarding journey! I hope that being a fully open source project not only encourages contributions but also provides a quality codebase for others to learn from.</p>
<p>More posts to come detailing the inner workings and some decisions behind the project!</p>
]]></content:encoded>
            <author>Barnaby Keene</author>
        </item>
    </channel>
</rss>