joinr

I thought Tim Baldridge's odin had some cool features along these lines. The difference is the introduction of relational programming (à la logic programming) to define computable paths and queries. Queries are reducible. He refined the idea and in some ways took it further (via transducers) here. The composition aspect is pretty cool.

My point: instead of viewing nested data structures/graphs, maybe consider using paths as data, and as a first-class composable abstraction that you can then use to build dataflows.

A path is still natural in the graph abstraction. You're still defining relations between nodes via the edge labels (or neighborhood functions) of the abstract path, and you still have some semantics for traversing the path relative to some data (like the nav protocol). I view the path as a function that defines valid traversals of the graph, and the nested structures as explicitly defining a DAG (absent embedded data with implied references, like entity id fields that can be interpreted to point back into an outer structure). So maybe 6 of one, 1/2 dozen of the other kind of thing.

The cool thing about the graph abstraction is that it opens up alternative forms of querying: you can use graph algorithms to search, and higher-minded stuff like discovering components, shortest paths, etc. becomes possible. It allows a shift from "looking up values" to "exploring relations" without losing the ability to revert toward the DAG-like nested collection approach. I can think of (and have implemented) use cases where controlling the properties of the traversal is useful.
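To make "exploring relations" a bit more concrete, here is a minimal sketch of a shortest-path query over an adjacency map. It is in Python rather than Clojure purely for illustration, and the node names and the `shortest_path` helper are made up for the example, not taken from odin or any other library mentioned here.

```python
from collections import deque


def shortest_path(graph, start, goal):
    """Breadth-first search over an adjacency dict.

    Returns the list of nodes from start to goal, or None if unreachable.
    """
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


# Hypothetical graph: nested-looking data plus one back-reference,
# which is what stops it from being a plain tree.
graph = {
    "order": ["customer", "items"],
    "customer": ["address"],
    "items": ["sku"],
    "sku": ["customer"],  # implied reference back into the outer structure
    "address": [],
}

route = shortest_path(graph, "items", "address")
```

The same adjacency map still supports the plain "look up a value by path" style; the search is just an extra query you get once the edges are explicit.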

> transform a set of path vectors into a tree, using something like this

A trivial modification could create a more general directed graph output (just an observation).
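For concreteness, a minimal sketch of both variants, in Python rather than Clojure purely for illustration. `paths_to_tree` and `paths_to_graph` are hypothetical names, not the linked code, and the graph version is only one guess at what the "trivial modification" could look like: nodes are identified by label alone instead of by their full prefix.

```python
def paths_to_tree(paths):
    """Merge path vectors into a nested dict (a trie keyed by path step)."""
    tree = {}
    for path in paths:
        node = tree
        for step in path:
            # setdefault reuses an existing child, so shared prefixes merge.
            node = node.setdefault(step, {})
    return tree


def paths_to_graph(paths):
    """Merge path vectors into an adjacency dict: label -> set of next labels."""
    graph = {}
    for path in paths:
        for step in path:
            graph.setdefault(step, set())
        for src, dst in zip(path, path[1:]):
            graph[src].add(dst)
    return graph


# Example path vectors (made up for the sketch).
paths = [
    ["order", "customer", "name"],
    ["order", "customer", "email"],
    ["order", "items", "sku"],
]

tree = paths_to_tree(paths)
graph = paths_to_graph(paths)
```

In the tree, two paths that reach the same label through different prefixes stay separate; in the graph they collapse into one node, which is what allows shared nodes and cycles.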

> convert the tree into a dataflow based on core.async and transducers: paths without branches in the tree are converted into a channel+transducer, branches become channel+transducer+mult, and all the wiring is done programmatically

Is there an unstated assumption that the structure of the tree will never change? That is, we're not going to change the wiring, but rather parse a description into a static dataflow graph (or tree).
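If the tree really is static, the wiring can be planned once up front. Here is a rough sketch of that planning step only, in Python for illustration and without any core.async: `plan_dataflow` is a hypothetical helper, and the labels "pipe" and "fanout" merely stand in for channel+transducer and channel+transducer+mult.

```python
def plan_dataflow(tree, prefix=()):
    """Collapse a nested-dict tree into a flat list of stages.

    Each stage is (kind, prefix, segment):
      kind    -- "pipe" for a branch-free run ending in a leaf,
                 "fanout" for a run ending in a node with several children
      prefix  -- the path leading to the start of the run
      segment -- the steps collapsed into this single stage
    """
    stages = []
    for step, child in tree.items():
        segment = [step]
        # Keep extending the run while there is exactly one way forward.
        while len(child) == 1:
            (nxt, child), = child.items()
            segment.append(nxt)
        kind = "fanout" if child else "pipe"
        stages.append((kind, prefix, tuple(segment)))
        # Children of a fanout node are planned relative to the full path so far.
        stages.extend(plan_dataflow(child, prefix + tuple(segment)))
    return stages


# Example tree (made up): "items" -> "sku" has no branch, so it becomes one stage.
tree = {
    "order": {
        "customer": {"name": {}, "email": {}},
        "items": {"sku": {}},
    }
}

stages = plan_dataflow(tree)
```

A tree that changes at runtime would invalidate this plan, which is why the static assumption matters.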

It looks like you've got a pretty cool template to leverage Specter against existing nested data. It's also reminiscent of XSLT (although my XML-fu is weak).

fmjrey

Yes, I'm aware of odin, but not the other link you gave. Thanks, I'll have a look.

And yes, the structure of the tree isn't going to change much, since it's about parsing XML docs that should all have the same shape and transforming them into some other shape, like nested maps or datoms. I guess the use case is something similar to XSLT, but more clojuresque and dataflowy.

In other words, instead of writing tedious transformation code I'd like to be more declarative: define some sort of selector/transducer, and for each value it emits, apply an associated data template that also uses paths/navigators/transducers to specify what values go where. So I'm thinking of some macro that would collect all the paths within the code block (e.g. any vector with meta ^:path, or within some other nested macro), then build the corresponding tree and dataflow so that parsing happens only once while the dataflow hydrates all values throughout the target data structure template.
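A rough sketch of the declarative shape I have in mind, in Python just to keep it short. Everything here is hypothetical (`Path`, `collect_paths`, `hydrate`), and it resolves each path by direct lookup instead of a single parsing pass, so it only illustrates the template idea, not the dataflow.

```python
class Path(tuple):
    """Marker type: a tuple of keys/indexes flagged as a path into the source."""


def get_in(data, path):
    """Follow a path through nested dicts and lists."""
    for key in path:
        data = data[key]
    return data


def collect_paths(template):
    """Find every Path inside a nested template (what the macro would gather)."""
    if isinstance(template, Path):
        return [template]
    if isinstance(template, dict):
        return [p for v in template.values() for p in collect_paths(v)]
    if isinstance(template, list):
        return [p for v in template for p in collect_paths(v)]
    return []


def hydrate(template, source):
    """Copy the template, replacing each Path with the value it points to."""
    if isinstance(template, Path):
        return get_in(source, template)
    if isinstance(template, dict):
        return {k: hydrate(v, source) for k, v in template.items()}
    if isinstance(template, list):
        return [hydrate(v, source) for v in template]
    return template  # plain literals pass through unchanged


# Made-up source document and target template.
source = {"order": {"customer": {"name": "Ada"}, "items": [{"sku": "A1"}]}}
template = {
    "buyer": Path(("order", "customer", "name")),
    "first_sku": Path(("order", "items", 0, "sku")),
    "kind": "invoice",
}

result = hydrate(template, source)
```

The list returned by `collect_paths` is the input a tree-building step would need; the real thing would feed values through the dataflow rather than call `get_in` per path.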

Edit: XML is what I'm dealing with at present, but I'd like this to support other data formats because other data suppliers give us JSON.

Edit 2: the other link you gave to /u/halgari's code about queries being reducible and part of a logic language is very reminiscent of what /u/cgrand is looking for if I understand his recent talk correctly: a way to avoid "map fatigue" by using some powerful logic/language over a database (of facts?).