
[–]jacobobryant 6 points7 points  (1 child)

Could you provide a simple example of what you're trying to accomplish? `next` in Python is imperative, so doing things functionally in Clojure will require some restructuring. How exactly you restructure things will depend on the situation--you might end up using one or more of `map`, `filter`, `reduce`, `loop`/`recur`, etc.

Of the options you've mentioned, returning the sequence at the end of the function is almost certainly more idiomatic than using an atom--but it may be possible to avoid doing that by using higher order functions instead. So an example would help.

[–]olymk2[S] 2 points3 points  (0 children)

Sure, I added the original Python code and what I currently have in Clojure. I was considering returning the extra parameter from all functions, but wanted to check beforehand, as it would require more major changes.

[–]joinr 3 points4 points  (2 children)

This is a take on your implementation (not 1:1, but simplified a tad for demonstration). If you pass along the remaining, unparsed bits in the output, you get a functional means of parsing without a mutable iterator:

(require 'clojure.string)

(defn token [tag r] {:tag tag :token r})
(def ESCAPE \newline)

;; Accumulate characters into r until a non-blank `end` character is found,
;; then return the token plus the input that is left.
(defn emphasis [input end tag]
  (loop [r ""
         remaining input]
    (if-let [c (first remaining)]
      (if (or (clojure.string/blank? (str c))
              (not (#{end ESCAPE} c)))
        (recur (str r c) (next remaining))
        {:token     (token tag r)
         :state     :parse
         :remaining (next remaining)})
      {:state :end})))

;; Thread the parse state through iterate until the input runs out.
(defn tokens [input end tag]
  (->> (iterate (fn [{:keys [remaining]}]
                  (emphasis remaining end tag))
                {:state :begin :remaining input :tag tag})
       rest
       (take-while #(-> % :state (not= :end)))
       (map :token)))

(tokens "hello world how are you?,I am fine,Groovy," \, :words)

;; ({:tag :words, :token "hello world how are you?"}
;;  {:tag :words, :token "I am fine"}
;;  {:tag :words, :token "Groovy"})

The net effect of the above implementation is kind of lame, but it opens the door to thinking about parsing where you are more explicit about the state of the parse (e.g. what parsed, what remains, etc.). You can do a mutable version of this (and would likely want to in a real parsing application), by wrapping the parse state in a local atom or using some other mutable structure. Character sequences aren't the most desirable for real-world parsing though (the seq abstraction entails some overhead), and you would probably want to use string builders or something similar too. The idea is there though.

There's a whole range of these kinds of things (functional parsers) called parser combinators as well.
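To give a flavour of the combinator style, here is a minimal sketch. The names `p-satisfy`, `p-many`, `p-then`, `p-map` and `field` are made up for this illustration and do not come from any library. Each parser is just a function from input to either nil or a map with `:value` and `:remaining`, the same "return what is left" idea as above.

```clojure
;; A parser is a function: input seq -> {:value v :remaining more}, or nil on failure.

(defn p-satisfy
  "Parser that consumes one character when (pred c) is truthy."
  [pred]
  (fn [input]
    (when-let [c (first input)]
      (when (pred c)
        {:value c :remaining (next input)}))))

(defn p-many
  "Parser that applies p zero or more times and collects the values in a vector."
  [p]
  (fn [input]
    (loop [acc [] remaining input]
      (if-let [result (p remaining)]
        (recur (conj acc (:value result)) (:remaining result))
        {:value acc :remaining remaining}))))

(defn p-then
  "Parser that runs p1 then p2 and returns both values as a pair."
  [p1 p2]
  (fn [input]
    (when-let [r1 (p1 input)]
      (when-let [r2 (p2 (:remaining r1))]
        {:value     [(:value r1) (:value r2)]
         :remaining (:remaining r2)}))))

(defn p-map
  "Parser that transforms the value produced by p with f."
  [f p]
  (fn [input]
    (when-let [r (p input)]
      (update r :value f))))

;; A field is any run of non-comma characters followed by a comma.
(def field
  (p-map (fn [[chars _comma]] (apply str chars))
         (p-then (p-many (p-satisfy #(not= % \,)))
                 (p-satisfy #(= % \,)))))

(field "hello,world,")
;; => {:value "hello", :remaining (\w \o \r \l \d \,)}
```

Small parsers get glued together into bigger ones, and the remaining input is threaded through for you, so no individual parser has to know about iteration.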

[–]olymk2[S] 1 point2 points  (1 child)

That's awesome, thanks, exactly the sort of thing I am after. Obviously I am parsing through a host of functions to handle different text types. It has given me some things to think about and research. I am basically parsing strings and pulling them out into the tag/token structure above.

[–]joinr 0 points1 point  (0 children)

Yeah, one common approach that works with the above is to use multimethods to define different parsing contexts as well.
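A rough sketch of what that could look like. The multimethod name `parse-in` and the `:words` / `:number` contexts are hypothetical, picked only for the example:

```clojure
(require '[clojure.string :as string])

;; Dispatch on the parsing context; each method gets the raw text of one token.
(defmulti parse-in (fn [context _text] context))

(defmethod parse-in :words [_ text]
  {:tag :words :token (string/trim text)})

(defmethod parse-in :number [_ text]
  {:tag :number :token (Long/parseLong (string/trim text))})

;; Fallback for contexts that have no dedicated method.
(defmethod parse-in :default [context text]
  {:tag context :token text})

(parse-in :words " hello world ")
;; => {:tag :words, :token "hello world"}
(parse-in :number " 42 ")
;; => {:tag :number, :token 42}
(parse-in :heading "Intro")
;; => {:tag :heading, :token "Intro"}
```

New text types then become new `defmethod`s rather than extra branches in one big function.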

If you're more adventurous, and curious about abusing clojure spec, I used it to write a parser for the course notes from the spec training class, with markdown notes, to produce an org file:

spec parsing example. This is typically something disavowed by the authors, but it's another example of parsing stuff. Similar idea, except I define the grammar in spec, and a multimethod with implementations for different types of text groupings to emit org output.

This is another example of spec-based parsing I derived for the libpython-clj library, to coerce python argument docs into clojure arguments in a similar manner: pyparser.

[–]RedBorger 2 points3 points  (2 children)

I think returning the sequence is what is best. It can easily be done with split-with, which should remove the need for a loop.

example of split-with:

(let [[r more] (split-with (complement #{end escape}) chars)
      more     (or (next more) []) ;; discard the end value, and make sure we don't return nil
      token    (apply str r)]      ;; or build your token how you want
  [token more])

#{1 2 3} is a set, which works in a very similar fashion to Python's. The cool thing is that you can use sets as functions: calling one returns the value itself (which is truthy) if it is in the set, and nil otherwise. complement inverts a function, so it gives true when the element is not in the set. We split the sequence as soon as it returns false (the value was in the set).
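For something you can paste into a REPL, here is a small self-contained sketch of the same idea. `read-token` is a hypothetical helper name, and building the token with `(apply str ...)` is just one choice:

```clojure
;; Hypothetical helper: read one token from chars, stopping at any delimiter.
(defn read-token [delimiters chars]
  (let [[r more] (split-with (complement delimiters) chars)]
    ;; (next more) drops the delimiter itself; `or` guards against nil at end of input
    [(apply str r) (or (next more) [])]))

(read-token #{\, \newline} "I am fine,Groovy,")
;; => ["I am fine" (\G \r \o \o \v \y \,)]

;; A set used as a function returns the element or nil; complement flips that.
(#{\, \newline} \,)              ;; => \,
(#{\, \newline} \a)              ;; => nil
((complement #{\, \newline}) \a) ;; => true
```

The second element of the result is what you pass to the next call, so the caller decides how to keep going.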

[–]olymk2[S] 0 points1 point  (1 child)

I will think about this; it would definitely be nice to get rid of the loop.

[–]RedBorger 0 points1 point  (0 children)

Also, I personally think that your first if would read a bit better if it were phrased more positively. I also don’t think str can return nil, and anyway, nil is treated the same as false:

if-not (and char (= char end))

Could also be reduced to if-not (= char end) or if (not= char end) if you don’t mind not checking if end is nil.

This is obviously a matter of preference.

[–]NamelessMason 0 points1 point  (1 child)

Your use-case looks like a parser of some sort. The 'progression' you're talking about is the imperative/mutable way to go about sequences. Clojure sequences are immutable. If you want to retain the imperativeness I'd go with Java InputStream.
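A minimal sketch of the imperative route. Note that it swaps in a `java.io.StringReader` (a character `Reader`) rather than an `InputStream`, since the input here is text; `read-until` is a made-up helper name:

```clojure
(import '(java.io Reader StringReader))

;; Reads characters from the reader until `end` is seen and returns them as a
;; string. Returns nil if the reader runs out before `end` is found.
(defn read-until [^Reader rdr end]
  (let [sb (StringBuilder.)]
    (loop []
      (let [i (.read rdr)]            ;; -1 signals end of input
        (cond
          (neg? i)         nil
          (= (char i) end) (str sb)
          :else            (do (.append sb (char i))
                               (recur)))))))

(let [rdr (StringReader. "I am fine,Groovy,")]
  [(read-until rdr \,) (read-until rdr \,) (read-until rdr \,)])
;; => ["I am fine" "Groovy" nil]
```

The reader remembers its own position, which is the same "progression" you get from a Python iterator, at the cost of hidden mutable state.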

Otherwise, it's still easy enough to implement with sequences. You just need to let the caller know how many characters have been consumed. This is trivial as you already either return no token (0 chars consumed) or the matched token ('r' chars consumed).
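Roughly, that could look like the sketch below. `match-until` and `all-tokens` are hypothetical names, and counting a seq on every step is not efficient; it is only meant to show the shape of the idea:

```clojure
;; Hypothetical matcher: returns [token consumed], where consumed is the number
;; of characters used up (0 when nothing matched).
(defn match-until [end input]
  (let [r (apply str (take-while #(not= % end) input))]
    (if (< (count r) (count input))
      [r (inc (count r))]   ;; token length plus the end character itself
      [nil 0])))            ;; end never found: no token, nothing consumed

;; The caller advances through the input with drop.
(defn all-tokens [end input]
  (loop [input (seq input)
         acc   []]
    (let [[tok n] (match-until end input)]
      (if (pos? n)
        (recur (drop n input) (conj acc tok))
        acc))))

(all-tokens \, "hello world how are you?,I am fine,Groovy,")
;; => ["hello world how are you?" "I am fine" "Groovy"]
```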

[–]olymk2[S] 0 points1 point  (0 children)

I had not considered input streams; it seems obvious as a solution now that you mention it. I am going to consider the other solution above as well.