Content Organization for Hugo

Having used Hugo for some years now, and having created a few dozen entries along the way, I have discovered a few “best practices” for organizing the raw (Markdown) content.

Consider a Directory for Each Post

Typically, each post (or page) on the published site corresponds to a single Markdown file in the content directory. However, it is also possible to create a directory for each post: in this case, the Markdown input goes into a file called index.md in that directory. (The Hugo documentation refers to such an arrangement as a page “bundle”.) This arrangement is advantageous if there is supplemental material for the post, such as images, listings, or other files, because it keeps related items together.

It is somewhat natural now to create directories only “when needed” (that is, when there is supplementary material for a post), and to leave posts without such material as plain Markdown files. But this results in a somewhat heterogeneous directory layout. It also requires a little bit of rework (including renaming of files) when a post acquires additional files later on. For example, I have found it quite common to have an idea for a figure, or thumbnail, only while writing the post, not before (when I create the file itself).

To avoid this, it is worth considering creating a directory for each post by default. If there is no supplementary material, the directory contains only the raw input in index.md, but if any other files turn up, now or later, they already have a natural place to go.

Obviously, this is not necessary when it is certain that individual posts will never have supplementary files. But otherwise, always creating a directory for each post may be convenient.
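For posts that started out as plain Markdown files, the rework described above is mechanical: the file moves into a directory of the same name and becomes index.md. The following is a minimal sketch of that step in Python; the function name to_bundle is made up for illustration, and the sketch assumes that no directory with the post's name exists yet.

```python
from pathlib import Path


def to_bundle(post: Path) -> Path:
    """Move <name>.md to <name>/index.md and return the new path.

    Hypothetical helper, not part of Hugo.
    """
    if post.suffix != ".md" or post.name in ("index.md", "_index.md"):
        raise ValueError(f"not a standalone post: {post}")
    bundle_dir = post.with_suffix("")  # content/foo.md -> content/foo
    bundle_dir.mkdir()                 # raises FileExistsError if already present
    target = bundle_dir / "index.md"
    post.rename(target)
    return target
```

Calling to_bundle(Path("content/blog/my-post.md")) would leave the text in content/blog/my-post/index.md; the path used here is only an example.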

Unpublished Directories for Backup Material

Having a directory for each post provides a logical destination not only for additional files that are to be published as part of the post, but also for “backup material” that should not be published. For instance, I occasionally have scripts or data files that I use to create an image, but that are not part of the blog post itself.

For such files, I have found it useful to create a subdirectory, called hugo-ignore, in each post’s directory. In config.toml, I then include a line:

ignoreFiles = [ "hugo-ignore" ]

that prevents this directory, and its contents, from being published when Hugo is run to create the public site.

I have found it convenient to have such an unpublished, ignored directory for each post — not only for backup material, but also as a general “working directory”, for notes, script outputs, and other work-in-progress.

Unsolved: Unstructured Parent Directory

Both of the previous items help to organize the content for each post, but they do not help to organize the posts themselves. For example, all posts (or their per-post directories) for the “Blog” section of this site go into a single directory. Hugo then creates yearly URLs (“permalinks”) for them, based on their publishing date and information in each post’s frontmatter. But within the actual source directory, all posts live in the same, unstructured namespace. That can make it difficult to find a specific post. It also (occasionally) leads to naming conflicts.

In retrospect, I might have preferred to organize the input files into yearly subdirectories, rather than relying on Hugo’s URL rewriting functionality. Or maybe not. Having to search through a bunch of yearly subdirectories can also quickly become tedious.

The optimal way to organize the input material for a growing number of blog posts is yet to be determined.

Tool to Parse Frontmatter?

It might be a good idea to have a tool that searches the frontmatter of Hugo input files, and can report on files based on tag or publication date. Such a tool should understand the input directory layout (including the possibility of per-post directories!), as well as the frontmatter format. This would not be difficult.

(It would be even better if Hugo published such metadata in an accessible format, but apparently this practice is not common among Jamstack tools.)

Use Archetypes as Style Guides

Hugo provides for “archetypes”: file snippets that Hugo uses to initialize new posts through the hugo new command. Although archetypes are usually fairly bare, providing just a skeleton outline of the desired file, I have found it useful to include a great deal of information, most of which I remove before finalizing a post for publication.

For example, I include a list of all “categories” (or tags) that I commonly use, to make sure I don’t introduce spurious new ones through misspellings. I also include information on conventions and style that I might otherwise forget. Rather than keeping this information in a separate file (which I would have to find and check), I put it in the archetype, so that it is already part of the post as I am authoring it. When the post is finalized, and I have checked it for consistency, I remove this auxiliary information prior to publication.