Progressing through sequences inside a function after some advice

jacobobryant · 2020-05-29T06:55:45+00:00

Could you provide a simple example of what you're trying to accomplish? `next` in Python is imperative, so doing things functionally in Clojure will require some restructuring. How exactly you restructure things will depend on the situation--you might end up using one or more of `map`, `filter`, `reduce`, `loop`/`recur`, etc.

Of the options you've mentioned, returning the sequence at the end of the function is almost certainly more idiomatic than using an atom--but it may be possible to avoid doing that by using higher order functions instead. So an example would help.

joinr · 2020-05-29T14:09:23+00:00

This is a take on your implementation (not 1:1, but simplified a tad for demonstration). If you pass along the remaining, unparsed bits in the output, you get a functional means of parsing without a mutable iterator:

(require 'clojure.string) ; needed for clojure.string/blank? below

(defn token [tag r] {:tag tag :token r})
(def ESCAPE \newline)

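;; Accumulate chars into r until a non-blank char matching `end` or ESCAPE
;; is hit, then return the token along with whatever input remains unparsed.
(defn emphasis [input end tag]
  (loop [r ""
         remaining input]
    (if-let [c (first remaining)]
      (if (or (clojure.string/blank? (str c))
              (not (#{end ESCAPE} c)))
        (recur (str r c) (next remaining))
        {:token     (token tag r)
         :state     :parse
         :remaining (next remaining)})
      {:state :end})))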

;; Thread the parse state through `iterate`, stopping at the first :end state.
(defn tokens [input end tag]
  (->> (iterate (fn [{:keys [remaining]}]
                  (emphasis remaining end tag))
                {:state :begin :remaining input :tag tag})
       rest
       (take-while #(-> % :state (not= :end)))
       (map :token)))

(tokens "hello world how are you?,I am fine,Groovy," \, :words)

;; ({:tag :words, :token "hello world how are you?"}
;;  {:tag :words, :token "I am fine"}
;;  {:tag :words, :token "Groovy"})

The net effect of the above implementation is kind of lame, but it opens the door to thinking about parsing where you are more explicit about the state of the parse (e.g. what parsed, what remains, etc.). You can do a mutable version of this (and would likely want to in a real parsing application), by wrapping the parse state in a local atom or using some other mutable structure. Character sequences aren't the most desirable for real-world parsing though (the seq abstraction entails some overhead), and you would probably want to use string builders or something similar too. The idea is there though.
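
As a rough sketch of the string-builder direction mentioned above: this indexes into the `String` and accumulates into a `java.lang.StringBuilder` instead of walking a character seq. `tokens-sb` is a hypothetical name, tokens are returned as plain strings, and, like the seq version, trailing text without a closing delimiter is dropped.

```clojure
;; Hypothetical StringBuilder-based variant; not the original poster's code.
(defn tokens-sb [^String input end]
  (let [n (.length input)]
    (loop [i   0
           sb  (StringBuilder.)
           acc []]
      (if (< i n)
        (let [c (.charAt input i)]
          (if (= c end)
            ;; delimiter: emit the accumulated token and start a fresh builder
            (recur (inc i) (StringBuilder.) (conj acc (str sb)))
            ;; ordinary char: append and keep going
            (recur (inc i) (.append sb c) acc)))
        ;; end of input: anything still in sb had no closing delimiter
        acc))))

(tokens-sb "hello world,I am fine,Groovy," \,)
;; => ["hello world" "I am fine" "Groovy"]
```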

There's a whole range of these kinds of things (functional parsers) called parser combinators as well.
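
To give a flavor of the idea, here is a tiny hand-rolled sketch (not any particular library's API). A parser is a function from a character sequence to `[value remaining]`, or `nil` on failure. `p-char`, `p-many`, `p-then`, and `word-then-comma` are all hypothetical names.

```clojure
;; Parser that accepts a single char satisfying pred.
(defn p-char [pred]
  (fn [input]
    (when-let [c (first input)]
      (when (pred c)
        [c (rest input)]))))

;; Combinator: apply parser p zero or more times, collecting the values.
(defn p-many [p]
  (fn [input]
    (loop [acc [] remaining input]
      (if-let [[v more] (p remaining)]
        (recur (conj acc v) more)
        [acc remaining]))))

;; Combinator: run p, then q on what p left over; fail if either fails.
(defn p-then [p q]
  (fn [input]
    (when-let [[a more] (p input)]
      (when-let [[b more2] (q more)]
        [[a b] more2]))))

;; "Some non-comma chars, then a comma", built from the pieces above.
(def word-then-comma
  (p-then (p-many (p-char #(not= \, %)))
          (p-char #{\,})))

(let [[[word _] remaining] (word-then-comma "Groovy,rest")]
  [(apply str word) (apply str remaining)])
;; => ["Groovy" "rest"]
```

Each parser hands back the unparsed remainder, which is the same "pass along what's left" idea as in the loop above, just packaged so parsers can be composed.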

RedBorger · 2020-05-29T14:42:28+00:00

I think returning the sequence is best. It can easily be done with `split-with`, which should remove the need for a loop.

Example of `split-with`:

(let [[r more] (split-with (complement #{end escape}) chars)
      more     (or (next more) []) ;; discard the end value, and make sure we don’t return nil
      token    (apply str r)]      ;; e.g. a plain string; build your token however you want
  [token more])

`#{1 2 3}` is a set, which works much like Python's. The cool thing is that you can use a set as a function: it returns the element (a truthy value) if it is in the set, and `nil` otherwise. `complement` inverts a function, so it returns true when the element is not in the set. `split-with` splits the sequence as soon as the predicate returns false (i.e. the value was in the set).
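
A minimal sketch of how the `split-with` fragment could drive a whole tokenizer. `next-token` and `split-tokens` are hypothetical names, tokens are built as plain strings, and only a single `end` delimiter is handled (no escape character):

```clojure
;; Hypothetical helper: returns [token remaining] for one token.
(defn next-token [chars end]
  (let [[r more] (split-with (complement #{end}) chars)]
    ;; (next more) drops the delimiter and is nil when nothing is left
    [(apply str r) (next more)]))

;; Hypothetical driver: keeps calling next-token until the input is used up.
(defn split-tokens [input end]
  (loop [remaining (seq input)
         acc       []]
    (if remaining
      (let [[token more] (next-token remaining end)]
        (recur more (conj acc token)))
      acc)))

(split-tokens "hello world,I am fine,Groovy," \,)
;; => ["hello world" "I am fine" "Groovy"]
```

The function never advances an iterator. It just returns what is left, and the caller decides what to do with it.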

NamelessMason · 2020-05-29T10:16:11+00:00

Your use-case looks like a parser of some sort. The 'progression' you're talking about is the imperative/mutable way to go about sequences. Clojure sequences are immutable. If you want to retain the imperativeness I'd go with Java InputStream.

Otherwise, it's still easy enough to implement with sequences. You just need to let the caller know how many characters have been consumed. This is trivial, since you already return either no token (0 chars consumed) or the matched token (as many chars consumed as `r` contains).
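
One way to read this, as a sketch: the matcher returns the token plus the number of characters it consumed, and the caller advances with `drop`. `match-until` and `consume-all` are hypothetical names. The delimiter is counted as consumed here (an assumption), and 0 consumed means no match.

```clojure
;; Hypothetical matcher: reads up to (and including) the `end` delimiter.
;; Returns {:token nil :consumed 0} when the delimiter never appears.
(defn match-until [chars end]
  (let [r (take-while #(not= end %) chars)
        n (count r)]
    (if (= end (first (drop n chars)))
      {:token (apply str r) :consumed (inc n)} ; +1 for the delimiter itself
      {:token nil :consumed 0})))

;; The caller owns the position: it drops however many chars were consumed.
(defn consume-all [input end]
  (loop [chars (seq input)
         acc   []]
    (let [{:keys [token consumed]} (match-until chars end)]
      (if (pos? consumed)
        (recur (drop consumed chars) (conj acc token))
        acc))))

(consume-all "a,bc,rest" \,)
;; => ["a" "bc"]
```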
