Too Many Config Files

I don’t have a great deal of experience with Clojure, but I have done a couple of little projects in it recently, and I was struck by how easy Leiningen was to use. It’s beautifully simple compared to pretty much every other build tool I’ve seen.

A lot of repos have a huge number of config files. It used to be mostly XML, but these days it’s usually JSON or YAML. It’s quite ridiculous really. I’ve seen some very small JavaScript projects, tiny NPM packages that are really simple programs, and yet they will have at least a dozen of these JSON or JS config files. All sorts of files telling various versions of various bits of tooling and linters and all sorts of crumbs that people have on their machines how to work. It’s an eyesore and it’s polluting your repo.

Why are you providing configurations for such a large number of tools?

  • Most people will not have all of these tools installed.
  • It makes it harder to see what’s going on.

The files uploaded to the repo should be pretty trim. You should be uploading as few as possible because you don’t want to artificially increase the cognitive load required to understand your repo. It becomes hard to see what you need and what you don’t. It becomes hard to see what’s really a dependency and what’s really just a preference of the author.

What I like to see is a simple file hierarchy, then at the top level: a readme.md, a .gitignore, maybe a makefile, and a build tool specific config file. And then beyond that, not much.

If you give me a nix config, a stack file, a dockerfile, a cabal file, and several JSONs and YAMLs, the first thing I have to do is figure out whether you are giving me a choice of tools, or whether I actually need all of them. I would hope that the readme would clear this up, but alas it often doesn’t. If you really need shed-loads of config files, maybe some of them could be moved into a ./config directory?

Often it’s a really, really simple program: a tiny NPM package or a simple command line tool. The program does what it’s meant to do and might actually be quite nicely built, well written, and concise. So why does it have so many config files?

The Other Thing

Almost as bad as having too many top-level files is an unnecessarily deep file structure. Java of course encourages you to create ludicrously nested directory hierarchies. I am not completely against Java’s approach here. Forcing new folders for each module does prevent everything being dumped in one big folder (unless of course everything is in one big module). The root problem is the Java penchant for module names that are composed of many segments (e.g. com.example.data.map.hashmap.int.traversal). Haskell also uses nested folders in the same way, but the namespaces tend to have far fewer segments, usually just two or three. The problem is a Java cultural one: the idea that a program isn’t real ENTERPRISE^TM Development if there aren’t at least fourteen segments in the namespace.

I want a directory structure that makes it clear where the application’s main source code and entry point are, where the library functions are, and where the tests are. I do not want to see a really deep nested tree of folders. I do not want to see too many files at the top level.

There should only actually be a deep nested tree if there are a lot of files. And there should only actually be a lot of files if it really is a big application, that’s actually doing lots of complicated stuff. Is it complicated, or are you making it complicated?
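If you want a rough number to go with that question, the two things complained about here (top-level clutter and nesting depth) are easy to measure. The following is only a minimal sketch in Python; the function name repo_shape is made up for illustration, and what counts as "too many" or "too deep" is left entirely to your own judgement.

```python
import os


def repo_shape(root):
    """Return (top_level_file_count, max_directory_depth) for the tree at root.

    Hypothetical helper for illustration only. Depth 0 is the root itself,
    depth 1 is a folder directly inside it, and so on.
    """
    root = os.path.abspath(root)
    top_level_files = 0
    max_depth = 0
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune .git so version-control internals do not count as structure.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        rel = os.path.relpath(dirpath, root)
        depth = 0 if rel == "." else rel.count(os.sep) + 1
        if depth == 0:
            # Only files sitting directly in the repo root count as top-level.
            top_level_files = len(filenames)
        max_depth = max(max_depth, depth)
    return top_level_files, max_depth
```

Calling repo_shape(".") from the root of a checkout gives the pair of numbers for that repo. A small program that reports a dozen top-level files or a depth of seven or eight is probably worth a second look.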

Overall

The key point here is: please don’t artificially increase the cognitive load required to understand the repo for your otherwise simple program.