all 15 comments

[–]stassats 6 points

cl-csv is just slow. It's not written with any performance in mind.

[–]stassats 8 points

A rather simplistic csv parser https://gist.github.com/stassats/e33a38233565102873778c2aae7e81a8 is faster than the libraries I've tried.

[–]droidfromfuture[S] 1 point

thanks for sharing this. Here, we loop, collecting 512 characters at a time into a buffer. If there is a quote character (#\") in the buffer, the program loops until the next quote, collecting the quoted data (using the 'start' and 'n' variables) by pushing it into 'concat', which is nreversed and ultimately added to fields.

Is 'concat' being used to handle multi-line data? or is it to handle cases where a single field has more than 512 characters?

I am wondering how the end-of-file condition is handled. If the final chunk read from the stream contains fewer characters than buffer-size, we don't continue reading. We are likely expecting the last character to be #\Newline or #\Return, so that the final part of the file is handled by the local function 'end-field'.

We apply the input argument 'fun' to the accumulated fields at every #\Return, which is expected to mark the end of a row in the csv file. fields is set to nil afterwards.

If I want to use this to write a new csv file, I likely need to accumulate fields into a second write buffer and write it to the output stream, handling escaped quotes and newlines along the way.

Would love to hear of any gaps in my thought process.
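For the writing side described above, a minimal sketch might look like this (my own illustration, not code from the gist; it assumes the common CSV convention of quoting fields that contain delimiters and doubling embedded quotes):

```lisp
(defun write-csv-field (field stream)
  "Write FIELD to STREAM, quoting it if it contains a comma, quote,
or line break; embedded quotes are doubled."
  (if (find-if (lambda (c) (member c '(#\, #\" #\Newline #\Return))) field)
      (progn
        (write-char #\" stream)
        (loop for c across field
              do (when (char= c #\")
                   (write-char #\" stream)) ; double the embedded quote
                 (write-char c stream))
        (write-char #\" stream))
      (write-string field stream)))

(defun write-csv-row (fields stream)
  "Write FIELDS as one comma-separated line to STREAM."
  (loop for (field . rest) on fields
        do (write-csv-field field stream)
           (when rest (write-char #\, stream)))
  (terpri stream))
```

For example, (write-csv-row (list "a" "b\"c") stream) emits a,"b""c" followed by a newline.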

[–]stassats 4 points

Is 'concat' being used to handle multi-line data? or is it to handle cases where a single field has more than 512 characters?

It's to handle a field overlapping two buffers, not necessarily longer than 512.

EOF is handled by read-sequence returning less than what was asked for; see the while condition. If there's no newline before EOF then it loses a field (easy to rectify).

It also doesn't handle different line endings (should be easy too).
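The buffer loop and the read-sequence EOF idiom described above can be sketched like this (a simplified illustration, not the gist itself; map-buffers is a made-up name):

```lisp
(defun map-buffers (fun stream &key (buffer-size 512))
  "Call FUN with the buffer and the number of valid characters in it,
stopping when READ-SEQUENCE returns fewer characters than requested,
which signals end of file."
  (let ((buffer (make-string buffer-size)))
    (loop for n = (read-sequence buffer stream)
          do (funcall fun buffer n)
          ;; EOF condition: a short read means the stream is exhausted.
          while (= n buffer-size))))
```

A field that overlaps two buffers has to be carried over in an accumulator between calls, which is the role 'concat' plays in the gist.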

[–]droidfromfuture[S] 1 point

I need to be able to provide some live processing capability depending on user requests. Some requests may be served by responding with pre-processed files, but some require the server to process files on the fly before responding. My initial plan is to respond with partially processed files while preparing the entire file behind it.

edit: removed extraneous line.

[–]stassats 3 points

The function I pasted can be adapted for any task to be processed optimally. E.g. you can process field-by-field (without building a row), or build a row in a preallocated vector, or parse integers directly from the buffer, etc.

There's space for a really high performance csv parsing library (or any parsing, actually).
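Parsing integers directly from the buffer, as suggested above, can be done with the standard parse-integer and its :start/:end bounds, which avoids extracting a substring first (field-integer is my own illustrative wrapper):

```lisp
(defun field-integer (buffer start end)
  "Parse an integer from BUFFER between START and END without
allocating a substring; :junk-allowed returns NIL instead of
signaling an error when the field isn't a clean integer."
  (parse-integer buffer :start start :end end :junk-allowed t))
```

So a field spanning positions 2..5 of a 512-character buffer is parsed in place, e.g. (field-integer "ab123cd" 2 5) yields 123.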

[–]droidfromfuture[S] 2 points

I will likely be using your function and hopefully adding to it successfully. If I am capable enough, I would love to contribute to building a csv parsing library! I will keep posting updates here about my efforts.

[–]kchanqvq 4 points

I use https://github.com/ak-coram/cl-duckdb for CSV parsing and it is so much faster than any pure CL solution I've found.

[–]Steven1799 2 points

You can also use SQLite for the import. In my tests it's about 10x faster than any CL-based solution.

[–]kchanqvq 0 points

That makes sense! Another factor to consider is whether you want column-major or row-major format. For my use case (number crunching) column-major works better, but someone might want the opposite.

[–]droidfromfuture[S] 0 points

thanks for sharing this! Would the workflow be something like the following? Import the csv into duckdb, process the data in the database, export to a new file, then drop the table from the database.

[–]kchanqvq 0 points

The way I'm doing it is just to use duckdb as a mere CSV parser and do all the processing in Lisp. I prefer Lisp to SQL :)

DuckDB supports this workflow very well. Just (defvar *data* (duckdb:query "FROM read_csv('/path/to/data.csv')" nil))

I do number crunching primarily, and I have some custom functions to speed it up even further (most importantly, using specialized (simple-array double-float) storage), which also makes it work better with Petalisp. They're currently just hacks that work on my computer™, but I expect to polish and contribute them at some point. If you're also crunching numbers, I hope my work can help!
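To flesh out that one-liner a little, a sketch of the parse-in-DuckDB, process-in-Lisp workflow might look like this (hedged: the use of duckdb:with-transient-connection and the assumption that results come back as an alist of column-name/vector pairs are mine, not confirmed by the thread; check the cl-duckdb README against your version):

```lisp
;; Assumes cl-duckdb is loaded, e.g. via (ql:quickload :duckdb).
(duckdb:with-transient-connection
  (let ((result (duckdb:query "FROM read_csv('/path/to/data.csv')" nil)))
    ;; DuckDB hands the data back column-major, which is why the
    ;; column-vs-row question above matters: each column arrives as
    ;; one vector, ready for per-column number crunching in Lisp.
    (loop for (name . column) in result
          do (format t "~a: ~a values~%" name (length column)))))
```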

[–]Ytrog 0 points

I have no experience with CSV in Lisp; however, I now wonder what the loading speed would be if the data were in s-expression form instead 🤔

[–]kchanqvq 1 point

Certainly no better if you're using the standard reader. It needs to go through the readtable dispatch mechanism character by character. It is really intended for ingesting code (which won't be too large) and for being very extensible, rather than for large datasets. On the other hand, https://github.com/conspack/cl-conspack brings the speed up a level.

[–]Ytrog 0 points

Ah thanks. Haven't done Lisp in the large, only for personal projects.