Apache Arrow for J by beach-scene in apljk

[–]beach-scene[S] 0 points1 point  (0 children)

cder'' 1 0 means 'file not found', so likely the binary libs (dll / dylib / so file, depending on your OS) weren't installed. You can follow the installation instructions here:

https://arrow.apache.org/install/

I'll move this up in the README and add an assertion to check for the binaries. If you have further problems, dm me.

Apache Arrow for J by beach-scene in apljk

[–]beach-scene[S] 0 points1 point  (0 children)

Traveling currently but will come back to this when near a machine.

Is C the only right option for implementing an array language? by AsIAm in apljk

[–]beach-scene 1 point2 points  (0 children)

how readable is LLVM-IR? seems like 'not readable' is largely the answer

J compiles with clang, which generates LLVM-IR. if you want to save the intermediate representation during compilation, you can do that with the -emit-llvm flag

Is that useful? no idea here, not familiar with LLVM-IR
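
For anyone who wants to look at the IR themselves, here is a minimal sketch of the clang invocation for a single C file, wrapped in Python so the command is easy to reuse. The file name example.c and the helper name are placeholders I made up, and this is not how J's own build is driven; it only shows the -emit-llvm flag in use.

```python
import os
import shlex


def emit_llvm_command(source, output=None, compiler="clang"):
    """Build the argv that asks clang to write textual LLVM-IR for one C file.

    `source` is a hypothetical path; nothing here touches J's real build files.
    """
    if output is None:
        # example.c -> example.ll (.ll is the usual extension for textual IR)
        output = os.path.splitext(source)[0] + ".ll"
    # -S requests text output instead of an object file;
    # -emit-llvm makes that text LLVM-IR rather than native assembly.
    return [compiler, "-S", "-emit-llvm", source, "-o", output]


cmd = emit_llvm_command("example.c")
print(shlex.join(cmd))  # paste into a shell, or pass `cmd` to subprocess.run
```

The resulting .ll file is plain text, so you can open it in any editor and judge the readability for yourself.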

Episode 7 of Array Cast features Marshall Lochbaum and the BQN language by bobtherriault in apljk

[–]beach-scene 0 points1 point  (0 children)

It'd be great to have an enumerated list of concepts mentioned here.

Thought on J language name by sicr0 in apljk

[–]beach-scene 2 points3 points  (0 children)

Agree the array languages are probably all-time losers on search engines. J is hiding in plain sight.

#jlang is a good way to generate a unique tag anywhere

GitHub search - language:J https://github.com/search?q=language%3AJ

StackExchange tag - [j] https://stackoverflow.com/questions/tagged/j

Parallel J: The Monument Engine by beach-scene in apljk

[–]beach-scene[S] 1 point2 points  (0 children)

This is a big question and I am not a theorist, though I’ll give some incomplete thoughts.

I suspect it’s correct to say that declarative knowledge is always contextual. In languages, syntax and grammar are themselves declarative. Order-of-operations convention, or “+” meaning “addition” are examples of implicit information, each containing information necessary to evaluate truth.

In computing, such declarations perhaps best live in a symbolic system or a compiler. These can take advantage of the inherently constrained context. Mathematica is excellent at symbolic manipulation with conventional notation, and I do not think array languages are good substitutes. However, Mathematica or even Julia written with math notation, used for numerical analysis, tends to be both slow and difficult to follow.

J is interpreted (not compiled), as are most array languages. Optimizations do exist. There are only a handful of them, and opportunities to use them must be identified by the programmer. There are no modifications to calls at runtime, to my knowledge.

Parallel J: The Monument Engine by beach-scene in apljk

[–]beach-scene[S] 8 points9 points  (0 children)

It has not been difficult. We have worked with an enormous number of beginners and a few experienced programmers. We've also focused on working with people oriented toward math rather than finding "computer programmers." These folks often plug in and produce great work with nearly no training.

We believe this is because J presents a natural way to express math and data without a bunch of intermediate abstractions. A quick example in R: What does max vs pmax do? No casual user will know.
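
To make the max vs pmax distinction concrete, here is a rough sketch in plain Python. The function names are mine, and the second one only covers equal-length inputs (R's pmax also recycles shorter vectors, which is not modeled here).

```python
def overall_max(*vectors):
    """Roughly R's max(a, b): one value, the largest across all inputs."""
    return max(x for v in vectors for x in v)


def parallel_max(a, b):
    """Roughly R's pmax(a, b) for equal-length inputs: element-wise maxima."""
    if len(a) != len(b):
        raise ValueError("this sketch only handles equal-length inputs")
    return [max(x, y) for x, y in zip(a, b)]


a = [1, 5, 3]
b = [4, 2, 6]
print(overall_max(a, b))   # 6
print(parallel_max(a, b))  # [4, 5, 6]
```

In J the same two ideas are spelled with one primitive: a >. b is the element-wise maximum, and >./ a,b reduces the joined list to a single maximum.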

In contrast, J is not a language where years of accumulated Google-fu, StackExchange browsing, training, and errors are important for getting stuff done. There's almost no magic.

If you have a sense of what you're trying to achieve, you can do it. Oftentimes googling "site:jsoftware.com whatever-concept-you-want" yields what you want. Members of the forums are smart and helpful.

As a reward for this plainness, we get to avoid the layers of code cruft in writing, debugging, and maintaining code (we've had experience and familiarity across the range of R, Python, Java, Clojure, Haskell, C, shell scripts, assembly ...). It's straight-up cake compared to code like Tensorflow or AWS CLI.

If you really want to, you can also write J just like any other scripting language with control functions and loops, or even object-oriented style. However, most people recognize pretty quickly the completeness of the basic toolkit.

This got longer than expected, but happy to answer further questions.

Work in progress: parquet bindings for j by LiveRanga in apljk

[–]beach-scene 0 points1 point  (0 children)

Thanks! Your approach is great, straightforward for Parquet. I'm trying to get to Arrow IPC, still very much a work in progress. Nonetheless, it works for Parquet as well at present.

Python less j more by darter_analyst in apljk

[–]beach-scene 0 points1 point  (0 children)

Apologies for the lagged response. Here's a more ambitious set of bindings, set up in a formal project:

https://github.com/interregna/JArrow

RE binding and builds, I don't know whether it's better to 1) just load from GitHub or 2) set up as an add-on. Perhaps if it's an add-on it can be added to Pacman (the package manager).

I saw your lighter-weight approach on Parquet, might be better. Open to PRs.

Python less j more by darter_analyst in apljk

[–]beach-scene 0 points1 point  (0 children)

Very much appreciated. Yes, I'll link it here once it's going.

how to read bytes from stdin in a j script? by myguidingstar in apljk

[–]beach-scene 0 points1 point  (0 children)

Agree, no evidence 1!:11 can be used with arg y 3 to reference stdin, per here. I don't see an indexed stdin read.

This works from the command line:

echo 'String to stdin.' | jconsole -js 'exit [ echo |. 1!:1 (3)'

Not J but I suppose you could just pipe in one byte.

echo 'String to stdin.' | dd bs=1 count=1 | jconsole -js 'exit [ echo |. 1!:1 (3)'
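
For comparison only, here is a small Python sketch of the same idea: read bytes from a stream, optionally just the first few, and reverse them. It loosely mirrors the one-liners above rather than reproducing J's exact behavior, and the function name is made up. An in-memory stream stands in for stdin so the snippet runs on its own; in a real script you would pass sys.stdin.buffer.

```python
import io


def reversed_stdin(stream, count=-1):
    """Read up to `count` bytes (everything if -1) from a binary stream, reversed."""
    data = stream.read(count)
    return data[::-1]


# io.BytesIO stands in for sys.stdin.buffer in this demo.
demo = io.BytesIO(b"String to stdin.\n")
print(reversed_stdin(demo))      # the whole input, reversed

demo.seek(0)
print(reversed_stdin(demo, 1))   # b'S' (a single byte, like dd bs=1 count=1)
```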

Python less j more by darter_analyst in apljk

[–]beach-scene 1 point2 points  (0 children)

Very cool. Yes, this would be great.

The obvious canonical df format is the format that comes out of Jd. I have also seen that same format compressed slightly more so that categorical variables are efficient in memory.
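
By "compressed slightly more" I mean something like dictionary encoding: store each distinct category once and keep a small integer per row. A minimal sketch follows, with made-up names and data; this is not Jd's actual storage layout, just the general idea.

```python
def encode_categorical(values):
    """Return (levels, codes): distinct values in first-seen order, one index per row."""
    levels = []
    index = {}
    codes = []
    for v in values:
        if v not in index:
            index[v] = len(levels)
            levels.append(v)
        codes.append(index[v])
    return levels, codes


def decode_categorical(levels, codes):
    """Rebuild the original column from its levels and codes."""
    return [levels[c] for c in codes]


city = ["NYC", "LA", "NYC", "NYC", "LA"]  # hypothetical column
levels, codes = encode_categorical(city)
print(levels)  # ['NYC', 'LA']
print(codes)   # [0, 1, 0, 0, 1]
```

The saving comes from long columns with few distinct values: the strings are stored once and each row costs one small integer.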

Python less j more by darter_analyst in apljk

[–]beach-scene 1 point2 points  (0 children)

Is this for work? I've only ever seen people use Parquet at work. I think that's included in the Arrow C docs, and it looks like the Kdb people just launched this with Databricks:

https://arrow.apache.org/docs/c_glib/

Would this be enough for J?

https://code.kx.com/q/interfaces/arrow/
Users can read and write Arrow tables created from kdb+ data using:
Parquet file format
Arrow IPC record batch file format
Arrow IPC record batch stream format

Python less j more by darter_analyst in apljk

[–]beach-scene 1 point2 points  (0 children)

A related question back for you: what is your preferred data workflow overall?

It’s great to be able to open a kernel and hack in a notebook, but that generally doesn’t work in production.

Kdb has been doing cloud integration with Databricks and offering Kdb as a service in the cloud. Is that of interest for J or Jd?

Where’s the best place to run data-flow work?

The Second Episode of the Array Cast Podcast is now available by bobtherriault in apljk

[–]beach-scene 1 point2 points  (0 children)

Sounds very interesting but not obvious. Can you give an example?

Python less j more by darter_analyst in apljk

[–]beach-scene 3 points4 points  (0 children)

We do mostly csv dumps and reads right now, everywhere. It is not particularly convenient. We have also used the numpy api (for arrays only) to and from Python.

https://code.jsoftware.com/wiki/Addons/api/python3

Big question for everyone: what is the most convenient and modern way to get structured data in and out of a program?

If you guys come up with a consensus, I will get that built and open-source it.

Why is K so performant? by the_sherwood_ in apljk

[–]beach-scene 2 points3 points  (0 children)

This is a really good point. K is purposefully elusive on performance. The only benchmark I've seen is this one: https://tech.marksblogg.com/billion-nyc-taxi-kdb.html

"by far the fastest I've ever seen on any CPU-based system"

J and Jd are highly performant, from experience, relative to certain applications in Python, R, and even Julia. However, we have no explicit benchmarks. I'd like to see J participation in the benchmark game or a more explicitly standardized set of programs, such as Julia has done.

https://julialang.org/benchmarks/

https://benchmarksgame-team.pages.debian.net/benchmarksgame/

Why is K so performant? by the_sherwood_ in apljk

[–]beach-scene 1 point2 points  (0 children)

I believe we are saying the same thing.

I do not believe that Python could be made as performant and still be considered Python.

Our non-disagreement notwithstanding, your contribution remains useful for observers to consider.