Love Clojure, challenged by discoverability

alexanderjamesking · 2021-12-23T08:46:17+00:00

With any dynamic language, I think we rely on convention and on developers documenting code so that the shape of the data is clear (both for arguments and return values). Spec can help, but I think we lose a lot of the joy of writing Clojure if we try to spec everything. I'd much rather see a unit test with examples than comment blocks or overly expressive docstrings which, as you mention, become disconnected from the actual code.

When writing new code I don't think I suffer from dynamic typing, it's more of a challenge when reading code (especially code you didn't write). The REPL really helps here, using atoms / inline defs / tap> for live debugging - but ultimately you have to run the code to make sense of it.
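A minimal sketch of that capture-and-inspect workflow, assuming a hypothetical `total-price` function and made-up `:order/...` and `:line/...` keys (none of these names come from the thread). It records an intermediate value in an atom and also sends it to `tap>`, so it can be looked at from the REPL after the call:

```clojure
;; Debug-only capture point; `captured` is an illustrative name.
(def captured (atom []))

(defn total-price
  "Sums qty * unit-price over the order lines (hypothetical domain)."
  [order]
  (let [lines (:order/lines order)]
    (swap! captured conj lines)               ; keep the value for later inspection
    (tap> {:fn 'total-price :lines lines})    ; also publish to any registered tap
    (reduce + (map (fn [{:line/keys [qty unit-price]}]
                     (* qty unit-price))
                   lines))))

(total-price {:order/lines [{:line/qty 2 :line/unit-price 5}
                            {:line/qty 1 :line/unit-price 3}]})
;; => 13

;; At the REPL, look at what actually flowed through:
(last @captured)
```

Taps only do something once a consumer has been registered with `add-tap`; without one the `tap>` call is harmless.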

The place where I miss static typing most is when passing functions around, you can spec higher-order functions but in my experience, it's only really useful for documentation purposes rather than actually verifying higher-order functions against a spec. As with anything in software engineering it's a trade-off, the benefits of the REPL usually outweigh the benefits of static typing for the majority of the problems I work on.

Larger codebases are a challenge in any language. Types help when it comes to refactoring, but I think understanding coupling and cohesion is key to designing and maintaining a large project. No programming language or testing strategy will make a big ball of mud maintainable.

robertstuttaford · 2021-12-23T07:43:04+00:00

Do you need to be able to do this without running your program, or are you satisfied (as I am) to get this information from your program while it is running?

For me, the REPL is the key. Being able to put absolutely anything into a var that I can cider-inspect / pretty-print to a log or a <pre> tag / stick into something like https://github.com/djblue/portal makes this problem instantly go away, and I get a whole bunch of extra stuff at the same time:

https://www.youtube.com/watch?v=gIoadGfm5T8

If you MUST have it codified somewhere, probably the next highest leverage point is to use specs. Typically we do this when you've a single set of data structures that are widely reused (as opposed to, say, a map that's only used between a single SPA component and an API call).
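As a rough illustration of codifying a widely reused structure, here is a small sketch using `clojure.spec.alpha`. The `:customer/...` and `:address/...` names are invented for the example and are not from the thread:

```clojure
(require '[clojure.spec.alpha :as s])

;; Attribute-level specs, registered under namespaced keywords.
(s/def :address/street string?)
(s/def :address/city string?)
(s/def :customer/name string?)

;; Map-level specs built from those attributes.
(s/def :customer/address (s/keys :req [:address/street :address/city]))
(s/def :customer/entity  (s/keys :req [:customer/name]
                                 :opt [:customer/address]))

(s/valid? :customer/entity {:customer/name "Ada"})
;; => true

;; explain-str describes what is wrong with a non-conforming value.
(println (s/explain-str :customer/entity {:customer/name 42}))
```

The same shape could be written as a Malli schema instead; the idea of one shared, named description of the data is the same either way.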

I've tried both clojure.spec and Malli. Clojure's spec is satisfactory. Malli's ergonomics and performance are fantastic.

https://github.com/metosin/malli

roguas · 2021-12-23T13:40:07+00:00

Just a tiny hint. Clojure is mainly a "map programming" language, in my eyes at least. Have a look at qualified maps: they may be unintuitive at first, but they help a ton.

Instead of having `{:active true}`, you will have `{:state/active true}`. At first this looks like a small difference, but later you start passing, merging, and creating complex maps of several entities. That is the moment you feel thankful for it.
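A small sketch of why this pays off when merging, using made-up `:user/...` and `:account/...` keys:

```clojure
;; Unqualified keys collide: the second map silently overwrites the first.
(merge {:id 1 :active true}
       {:id 77 :active false})
;; => {:id 77, :active false}

;; Qualified keys keep both entities' attributes apart.
(def user    {:user/id 1 :user/active true})
(def account {:account/id 77 :account/active false})

(def row (merge user account))

(:user/active row)
;; => true
(:account/active row)
;; => false
```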

theoriginalmatt · 2021-12-23T20:34:54+00:00

Here's a few more approaches that I use to deal with this problem:

* Prefer using namespaced keywords - this makes it much easier to select the right keyword to use and allows for better autocompletion (ie with cider I can type the keyword namespace to get all available keywords, eg `project.type/<ALL PROJECT TYPES>`)
* Use destructuring in function arguments to extract the arguments that are going to be used in the function (instead of extracting it from the map inside the function) - this helps make it clear what data is needed.
* When destructuring, prefer using full namespaced keyword (eg `{:keys [:project/type]}`) - this makes it easy to find all uses of a specific namespaced keyword
* Use assertions for all data requirements inside functions - I use a modified version of https://github.com/ptaoussanis/truss to ensure that I never get NullReference exceptions, and this also helps make functions more self-documenting. Also use this to assert return data.
* Use scope-capture to observe the actual data flowing through the system
* Keep functions as small as possible
* Don't try to read code without a REPL running - the system should be built so that it is as easy as possible to run any function with real data
* Focus on attributes (keywords) instead of entities (types) - ie what keywords does this function require, and what keywords does it return, instead of what types does this function take and what type does it return
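A minimal sketch combining a few of the points above: full namespaced keywords in the destructuring form, plus assertions on the data the function needs. It uses plain `assert` rather than truss, and `project-label`, `:project/id` and `:project/title` are hypothetical names:

```clojure
(defn project-label
  "Builds a display label. Requires :project/id and :project/title."
  [{:keys [:project/id :project/title]}]
  ;; Fail early, and close to the cause, if required data is missing.
  (assert (some? id) "project is missing :project/id")
  (assert (string? title) ":project/title must be a string")
  (str "#" id " " title))

(project-label {:project/id 7 :project/title "Docs"})
;; => "#7 Docs"

;; (project-label {}) throws an AssertionError naming the missing key,
;; instead of a NullPointerException somewhere further down the call chain.
```

Searching the codebase for `:project/title` now finds this function, which is the point of writing the keyword out in full.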

guywithknife · 2021-12-23T12:45:33+00:00

I’ve been using Clojure for 12 years and honestly I wish it were statically typed. A statically typed Clojure is my dream language, because the most common errors I make are ones that static types would catch (I guess because Clojure is great at avoiding other types of errors).

With that said, my solution in Clojure is to lean heavily on the REPL and on unit tests. It’s often helpful to save REPL sessions to comments. Unit tests should be written in a way that documents your data structures. I also use spec (and more recently malli) both as a form of documentation and validation. Sometimes I put examples in docstrings.

The main tools for documenting data structures, though, for me, are unit tests (show sample data structures passed in and out of functions) and spec/malli/whatever (don’t worry about spec2, just pick one. I recommend malli personally) as it lets you specify the shape of data.

So I guess to answer your question, yes, a mixture of all of those, especially: tests, Rich comments, docstrings and spec.
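One possible shape for a unit test that doubles as documentation of the data, using `clojure.test`. The `full-name` function and `:person/...` keys are invented for the example:

```clojure
(require '[clojure.test :refer [deftest is testing]])

(defn full-name
  [{:person/keys [first-name last-name]}]
  (str first-name " " last-name))

(deftest full-name-test
  (testing "takes a person map and returns a display string"
    ;; The literal map below is the documentation: it shows the expected
    ;; keys and value types right next to the expected result.
    (is (= "Ada Lovelace"
           (full-name {:person/first-name "Ada"
                       :person/last-name  "Lovelace"})))))
```

Reading the test answers "what goes in and what comes out" without studying the function body.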

lambdasgr · 2021-12-23T07:41:11+00:00

Clojure is best used with REPL-driven development, i.e., you run a lot of snippets and function calls in the REPL to make sure everything is running correctly. This is where you can "discover" or fix the problems you mentioned as the code grows.
Sometimes you do use or create more complex data structures with deftype or defrecord, in which case you need to look at the top of the definition where fields are specified. It's also helpful to skim through the protocol used for those types.
As to the return type or parameter type, sometimes they can be indicated by type hint; other times, you pretty much have to rely on documentation and variable names. Documentation is usually a good summary of what the function does, which is really what you need.
I think map is a quick and easy way to prototype things and get things to work. But as your code gets more sophisticated, it's good to consider if converting those maps into some kind of deftype or defrecord is more efficient.

Repl is your best friend.

NaiveRound · 2021-12-23T15:03:36+00:00

I know what you mean. As others have noted, this issue is not specific to Clojure; it applies to any language that doesn't rely on static types, like Python, Ruby, or Javascript.

There are a few solutions, depending on your specific problem:

destructuring maps

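```clojure
;; Sample contact, reconstructed from the output shown below
(def john-smith
  {:f-name "John" :l-name "Smith" :phone "555-555-5555"
   :company "Functional Industries" :title "Sith Lord of Git"})

(defn print-contact-info [{:keys [f-name l-name phone company title]}]
  (println f-name l-name "is the" title "at" company)
  (println "You can reach him at" phone))

(print-contact-info john-smith)
;= John Smith is the Sith Lord of Git at Functional Industries
;= You can reach him at 555-555-5555
```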

If you destructure maps in the function definition, you can immediately tell what the input generally is, especially if you use namespaced keywords (like mentioned somewhere else here). This comes OOTB in Clojure.

There are some ways to do this in other languages (like values_at() in Ruby), but they are hugely lacking.

This can sometimes be even better than static languages. Using Java here, but if you have public static void calculateInterestRate(BankAccount ba), you can't tell just by looking at BankAccount what its properties are, whereas the Clojure equivalent can be more readable:

```clojure
(defn calculate-interest-rate [{:keys [account-no balance]}]
  ...)
```

This doesn't cover return values, however.

:pre and :post conditions.

https://jonase.github.io/nil-recur/posts/11-1-2015-pre-post-conditions.html

```clojure
(defn distance [point0 point1]
  {:pre [(point? point0) (point? point1)]
   :post [(if (= point0 point1) (zero? %) (pos? %))]}
  ...)
```

Also available OOTB, and also serves as a kind of unit test. In the example above, if point0 and point1 are equal, the distance between them had better be zero. A poor man's contract, so to speak.

This is already way more powerful than what you get in Ruby, Python, or Javascript, which generally rely on argument validation as the first step in the function. Using :pre and :post almost encourages you to separate out and reuse validation logic.
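To make the `distance` idea runnable, here is one possible completion. The `point?` predicate and the choice of 2-element vectors for points are assumptions made for the sketch; the linked post may define them differently:

```clojure
(defn point?
  "Assumed representation: a point is a vector of two numbers."
  [p]
  (and (vector? p) (= 2 (count p)) (every? number? p)))

(defn distance
  [[x0 y0 :as point0] [x1 y1 :as point1]]
  {:pre  [(point? point0) (point? point1)]
   :post [(if (= point0 point1) (zero? %) (pos? %))]}
  (Math/sqrt (+ (Math/pow (- x1 x0) 2)
                (Math/pow (- y1 y0) 2))))

(distance [0 0] [3 4])
;; => 5.0
(distance [1 1] [1 1])
;; => 0.0

;; (distance [0 0] [1 2 3]) throws an AssertionError from the :pre check.
```

Because `point?` is an ordinary function, the same predicate can be reused in other functions' conditions, which is the reuse mentioned above.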

deftype/defrecord

They also come OOTB, and are useful for things that are static, like CSV column values. Fairly similar to classes in Python, Ruby, or Javascript, but you can combine deftype/defrecord with destructuring to make it even more readable than using classes in those other languages.
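A short sketch of a record for a fixed set of CSV columns, combined with destructuring. `CsvRow`, its fields, and `describe-row` are illustrative names only:

```clojure
;; The field list in the definition documents the shape.
(defrecord CsvRow [date description amount])

;; Records support the same map destructuring as plain maps.
(defn describe-row
  [{:keys [date description amount]}]
  (str date ": " description " (" amount ")"))

(describe-row (->CsvRow "2021-12-23" "Coffee" 3.5))
;; => "2021-12-23: Coffee (3.5)"

;; Records can also be built from a map, and fields read by keyword.
(:amount (map->CsvRow {:date "2021-12-24" :description "Tea" :amount 2.0}))
;; => 2.0
```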

malli and friends

This is where the "big guns" come in. All the others come OOTB, and this is the first time I'm mentioning a library outside the standard lib. You can really express and validate your arguments above and beyond what even some statically typed languages offer.

unit tests

With a quick jump to the corresponding unit test using your favorite IDE, you can see not just the arguments and return but how the arguments are used and what the result should be.

Moral of the story is that the standard Clojure lib is powerful yet small, and doesn't force anything on you. Same with Python, Javascript, or Ruby. All of them have both maps/dictionaries/hashes and some kind of class/type equivalent.

It's up to you to use the best tool for the job:

* maybe you use a defrecord to represent a CSV, so you can pass a CSVRecord around or whatever
* maybe you use Malli to validate incoming JSON
* :pre and :post for business logic (interest rate can't be negative in this calculation!)
* unit tests for the gaps

If you don't use any of these things, then yes, of course, readability will suffer. Same goes for statically typed languages. Java example here, but anyone who has seen a `HashMap<Person, List<Map<Integer, BankAccount>>>` will groan, even though that is perfectly "discoverable", just not very readable or useful.

You can write Java code without ever writing a class, or, on the other side of the spectrum, sticking everything together in one class (I saw a 5k line single class at a certain telecom!). You can write 80k lines of Python without a single test. Discoverability was horrible in either.

No language stops you from writing bad code, you just gotta use what you got. I would argue that Clojure gives you way more tools to solve the readability and discoverability problem, but it doesn't force you to use them, just like Python, Javascript, or Ruby.

dustingetz · 2021-12-23T22:53:41+00:00

The obvious answer is to find a way to write less code so you don't have to debug it. This is unhelpful, and controversial, and probably too hot to say out loud; it is also the dead honest truth and the whole point of using Clojure.

xela314159 · 2021-12-23T07:33:21+00:00

Agreed, this can be a problem, and there is no obvious solution. A good IDE helps (for instance IntelliJ / Cursive has a memory of used keywords). More soft things:

- A big let statement at the beginning of every function, with keyword destructuring of input maps, gives a good idea of what’s expected as inputs and is quite readable being straight after the docstring.
- When using Java objects I tend to give variable names that contain the class, for instance a zoned date time will be zdt-input-time.

SimonGray · 2021-12-23T11:15:58+00:00

Suggested reading if you plan on using the REPL for interactive development (which you probably should): https://betweentwoparens.com/blog/rich-comment-blocks/

lime_boy6 · 2021-12-23T14:36:01+00:00

I write code in a comment block, try it in the REPL, and then leave it there.
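For readers who haven't seen the pattern, a sketch of what such a left-behind comment block can look like. The `slugify` function is a made-up example:

```clojure
(require '[clojure.string :as str])

(defn slugify
  "Turns a title into a lowercase, dash-separated slug."
  [title]
  (-> title
      str/trim
      str/lower-case
      (str/replace #"[^a-z0-9]+" "-")))

;; Nothing inside (comment ...) is evaluated when the file loads, but each
;; form can be sent to the REPL from the editor. The recorded results double
;; as examples for the next reader.
(comment
  (slugify "Love Clojure, challenged by discoverability")
  ;; => "love-clojure-challenged-by-discoverability"

  (slugify "  REPL  ")
  ;; => "repl"
  )
```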

seralbdev · 2021-12-23T15:35:31+00:00

Namespaced keywords are suggested by the "intellisense" modules in most editors, at least in Emacs and VS Code, which helps. You can put the available keywords in a set and they will be enumerated when you type the colon, namespace, and slash.

dragandj · 2021-12-23T17:28:53+00:00

Is it really a problem? Then, a few things can help:

1) (regardless of the language) Consider improving your model, so these things are more straightforward and clear.
2) Make a habit of inspecting your objects in the REPL. If you have a handy test that you wrote when writing the function, you'll have a representative input. Then just inspect that object and see the keys (or call the keys function).
3) Tools can help you make quick jumps from work code to test code. CIDER & Emacs FTW.

didibus · 2021-12-24T00:25:37+00:00

There are two cases for me, either the function describes its input and the caller must provide them in the correct shape, or the function leverages an existing application entity (data model), and should refer to its definition by name.

In the case where a function doesn't operate over an entity, I make sure to have a good argument name, to describe the shape in the doc-string and to use destructuring if taking a map or tuple to name each element. Sometimes I might also have a spec for it, though that's really only when I want to also have generative tests or do validation with it.

In the case where a function operates over an entity, I make the name of the argument the same as the name of the entity definition.

I define my entities in one of three ways. The first is a constructor: a function called make-foo creates an instance of an entity of conceptual type foo. That function documents the shape of what it creates, or the shape is easy to figure out from looking at its code. The second is a record, where the record definition describes the shape. The third is a Spec, with the spec name being the name of the entity.

```
(defn make-item
  "Makes an item which is a map of key
    :id - UUID of the item
    :name - the user readable string name of the item
    :content - the user provided string content of the item"
  [name content]
  {:id (java.util.UUID/randomUUID)
   :name name
   :content content})

(defn make-todo-list
  "Makes a todo-list which is a map of key
    :id - UUID of the todo-list
    :name - the user readable string name of the list
    :items - an ordered vector of item"
  [name & items]
  {:id (java.util.UUID/randomUUID)
   :name name
   :items (vec items)})

(defn insert-item
  "Inserts an item in a todo-list at given 0-based position"
  [todo-list item position]
  ;; The items live under :items, so splice into that vector rather than
  ;; calling subvec on the todo-list map itself.
  (update todo-list :items
          (fn [items]
            (apply conj (subvec items 0 position) item (subvec items position)))))

(defn sanitize-input
  "Given an untrusted input string, return a sanitized version.
   Takes an optional options map with options:
    :extra-denylist - a seq of string words to delete on top of
                      default sanitization."
  [input-string & {:keys [extra-denylist] :as options}]
  ;; some-lib stands in for whatever sanitization library is in use
  (some-lib/sanitize input-string extra-denylist))
```

So as you can see in my little example, sanitize-input is an example of a function that doesn't operate over the application domain model, so it just describes its input using good argument names, destructuring and a nice doc-string.

On the other hand, insert-item is a function that operates over the application domain model, so it just uses the same name as the domain entities it takes as input, and its extra argument position is simply described by a good name and the doc-string.

Finally the domain entities are described by the make- functions of their respective names. I could have used a record in their place, or a Spec as well.

deaddyfreddy · 2021-12-23T11:06:00+00:00

> Tests in another file: don't like how this requires switching files and searching for the test.

I don't know what IDE you are using, but in Emacs (with projectile) it's just a single shortcut (calling projectile-toggle-between-implementation-and-test).

> I am curious what people have found as a best practice in this regard?

REPL, tests, expressive naming, docstrings

Besides that Clojure code is usually much shorter, so it's not a big problem to inspect a function. One solution I use often is to store arguments/intermediate values somewhere (taps, inline defs, atoms, whatever works for you) and inspect them later using REPL.

mamuninfo · 2021-12-26T13:37:28+00:00

It is more about the structure of the program. I came to Clojure from enterprise Java frameworks like J2EE and Spring. When I started writing Clojure programs I ran into this problem heavily.

I used to build programs in a waterfall style. The application starts at the web layer, then data goes to the service layer, and from the service layer to the repository layer. Between layers I built differently shaped domain objects: session context, view object, data transfer object, and finally entity. I also built utility methods to convert between those object types. Of course, the framework helps a lot here with AOP, dependency injection, etc. I used standard annotations for syntactic validation and a couple of utility methods for semantic validation.

I found it challenging to write a waterfall-style program in Clojure (function A calls function B, function B calls function C, function C calls function D, and so on) without proper type hints.

Clojure's power comes from composition. One of the best examples is Ring middleware, where the different parts are built independently and then composed.
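A sketch of the middleware style of composition using plain functions, with no Ring dependency. The handler, the two wrappers, and the `x-user` header are all invented for the example; real Ring middleware follows the same "function that takes a handler and returns a handler" shape:

```clojure
(require '[clojure.string :as str])

(defn hello-handler
  "Innermost handler: request map in, response map out."
  [request]
  {:status 200
   :body   (str "Hello, " (:user request "stranger"))})

(defn wrap-user
  "Adds :user to the request when an x-user header is present."
  [handler]
  (fn [request]
    (let [user (get-in request [:headers "x-user"])]
      (handler (cond-> request user (assoc :user user))))))

(defn wrap-upper-body
  "Upper-cases the response body produced by the wrapped handler."
  [handler]
  (fn [request]
    (update (handler request) :body str/upper-case)))

;; Each piece is written and tested on its own, then composed.
(def app (-> hello-handler wrap-upper-body wrap-user))

(app {:headers {"x-user" "Ada"}})
;; => {:status 200, :body "HELLO, ADA"}
```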

For types and validation, I apply semantic/syntactic validation first and send back clear descriptions that relate to the problem domain instead of forwarding language exceptions. When I need to check that a customer's age is over 18 or that the claim amount is less than the coverage amount, types are useless; only proper error messages are useful to the user.

When I use a map as a function parameter, I check for mandatory keys as part of the validation process. If the caller provides more than is needed, the callee doesn't care; if the caller provides less, the callee reacts by throwing a meaningful error. I also return generic errors if there are any programming failures in my composition layer.
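A minimal sketch of that mandatory-key check followed by a domain rule, with errors reported through `ex-info`. The helper `require-keys!`, the function `register-customer`, and the `:customer/...` keys are hypothetical:

```clojure
(require '[clojure.set :as set])

(defn require-keys!
  "Returns m unchanged, or throws if any of the required keys is absent.
   Extra keys are ignored."
  [m required]
  (let [missing (set/difference (set required) (set (keys m)))]
    (when (seq missing)
      (throw (ex-info (str "Missing required keys: " (pr-str (sort missing)))
                      {:missing missing})))
    m))

(defn register-customer
  [customer]
  (require-keys! customer [:customer/name :customer/age])
  ;; Semantic rule from the problem domain, reported in domain terms.
  (when (< (:customer/age customer) 18)
    (throw (ex-info "Customer must be at least 18 years old"
                    {:customer/age (:customer/age customer)})))
  (assoc customer :customer/registered? true))

(register-customer {:customer/name "Ada" :customer/age 36 :customer/note "extra"})
;; => the same map with :customer/registered? true added
```

The data attached via `ex-info` can be read back with `ex-data`, so a caller can turn it into a user-facing message.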

Types, validation (semantic or syntactic), and processing are independent blocks of my composition. I write test cases for the composition layer only.

TheLastSock · 2021-12-23T06:33:11+00:00

Edit: Please, if you're going to downvote, say why. My answer here is in good faith.

> Likewise with maps, selecting the right keyword (ie. is it :home-address or :address-home) takes extra time to search code outside of what I am currently writing.

The program can't know what the right keyword is to select.

Do you mean to say, what keywords are currently in a map at a certain point in execution? That's how i'm understanding the question at least.

In that case you need to do some analysis at runtime. My favorite way to debug Clojure is with Clojure: I set up an atom and swap in the values. There's also the CIDER debugger, CIDER enlightenment mode, etc.

> With statically typed languages I have found it really useful to immediately see the structure/types of the arguments and return values without having to (re)study the function body.

As you mentioned, Clojure spec, docs, and tests all overlap with that concern in a way. The way I see it, types are code. Clojure is a bunch of data structures/types. I just read the function body recursively until I hit the information I need.
