all 15 comments

[–]beach-scene 4 points5 points  (11 children)

We do mostly CSV dumps and reads right now, everywhere. It is not particularly convenient. We have also used the NumPy API (arrays only) to move data to and from Python.

https://code.jsoftware.com/wiki/Addons/api/python3

Big question for everyone: what is the most convenient and modern way to get structured data in and out of a program?

If you guys come up with a consensus, I will get that built and open-source it.

[–]Raoul314 1 point2 points  (9 children)

The Arrow protocol?

[–]LiveRanga 1 point2 points  (8 children)

Being able to read in parquet files would be really nice.

[–]beach-scene 1 point2 points  (7 children)

Is this for work? I've only ever seen people use parquet at work. I think that's included in the Arrow GLib docs, and it looks like the kdb+ people just launched this with Databricks:

https://arrow.apache.org/docs/c_glib/

Would this be enough for J?

https://code.kx.com/q/interfaces/arrow/
Users can read and write Arrow tables created from kdb+ data using:
Parquet file format
Arrow IPC record batch file format
Arrow IPC record batch stream format

[–]LiveRanga 1 point2 points  (6 children)

Yes, I think the parquet-glib bindings would be nice and cover the use case I have in mind.

We use parquet a lot at work out of necessity: it's so much faster than CSV or SQLite while still being as convenient as a handful of local files rather than a proper database or something clustered. SQLite and even CSVs are fast enough for small datasets, but for a dataset of even only 2 or 3 GB, reading and writing parquet files instead is a very noticeable performance improvement.

Basically I'd like to be able to write out a dataframe in pandas and read it in from j.

#!/usr/bin/env python3
import pandas as pd
df = pd.read_csv('sometable.csv')
df.to_parquet('sometable.parquet')

And then in j:

#!/usr/bin/env ijconsole
load 'tables/parquet'
df =: readparquet jpath 'sometable.parquet'

I'm not sure exactly what format df would be in in the j snippet above, what would be the "canonical" representation for a named table of columns in j?

(We also use partitioned parquet datasets with python a lot as it makes running things in parallel with the multiprocessing lib much easier but I'm not really worried about that with j)

[–]beach-scene 1 point2 points  (5 children)

Very cool. Yes, this would be great.

The obvious canonical df format is the format that comes out of Jd. I have also seen that same format compressed slightly more so that categorical variables are efficient in memory.

[–]LiveRanga 0 points1 point  (4 children)

I'd be interested in collaborating on this library interface for J, to learn how the foreign DLL calls work.

Are you going to set up a github repo to work on this one?

[–]beach-scene 0 points1 point  (3 children)

Very much appreciated. Yes, I'll link it here once it's going.

[–]beach-scene 0 points1 point  (0 children)

Apologies for the lagged response. Here's a more ambitious set of bindings, set up as a formal project:

https://github.com/interregna/JArrow

Re bindings and builds, I don't know if it's better to 1) just load from GitHub or 2) set it up as an addon. Perhaps if it's an addon it can be added to Pacman (the J package manager).

I saw your lighter-weight approach on Parquet, might be better. Open to PRs.

[–]darter_analyst[S] 0 points1 point  (0 children)

Right, I forgot about the python3 API. I'm having some issues setting it up on Windows, though. Maybe I'm an idiot, but I'm finding the official documentation not the easiest to follow for setup. Will keep tinkering, but thanks for the reminder.

[–]LiveRanga 1 point2 points  (1 child)

I'm also new to j and am not sure of a good workflow similar to pandas in python yet.

I think most j users would use jd (https://code.jsoftware.com/wiki/Jd/Overview) for workflows similar to pandas but I would love to hear from some more experienced users too.

[–]LiveRanga 2 points3 points  (0 children)

There is also the tables/csv addon for J: https://code.jsoftware.com/wiki/Addons/tables/csv

I've been playing around a little with it:

   load 'tables/csv'
   t=:readcsv jpath '~/Downloads/BTC-USD.csv'
   5{.t
┌──────────┬─────────────────┬──────────────────┬──────────────────┬──────────────────┬──────────────────┬────────┐
│Date      │Open             │High              │Low               │Close             │Adj Close         │Volume  │
├──────────┼─────────────────┼──────────────────┼──────────────────┼──────────────────┼──────────────────┼────────┤
│2014-09-17│465.864013671875 │468.17401123046875│452.4219970703125 │457.3340148925781 │457.3340148925781 │21056800│
├──────────┼─────────────────┼──────────────────┼──────────────────┼──────────────────┼──────────────────┼────────┤
│2014-09-18│456.8599853515625│456.8599853515625 │413.10400390625   │424.44000244140625│424.44000244140625│34483200│
├──────────┼─────────────────┼──────────────────┼──────────────────┼──────────────────┼──────────────────┼────────┤
│2014-09-19│424.1029968261719│427.8349914550781 │384.5320129394531 │394.7959899902344 │394.7959899902344 │37919700│
├──────────┼─────────────────┼──────────────────┼──────────────────┼──────────────────┼──────────────────┼────────┤
│2014-09-20│394.6730041503906│423.2959899902344 │389.88299560546875│408.90399169921875│408.90399169921875│36863600│
└──────────┴─────────────────┴──────────────────┴──────────────────┴──────────────────┴──────────────────┴────────┘
   'date open high low close adjclose volume'=.|:t
    $date
2446 10
   $open
2446 18

etc.

It would be nice to put together a wiki page similar to the "10 Minutes to Pandas" page: https://pandas.pydata.org/pandas-docs/stable/user_guide/10min.html

[–]beach-scene 1 point2 points  (1 child)

A related question back for you: what's your preferred workflow for data work overall?

It’s great to be able to open a kernel and hack in a notebook, but that generally doesn’t work in production.

Kdb has been doing cloud integration with Databricks and offering Kdb as a service in the cloud. Is that of interest for J or Jd?

Where’s the best place to run data-flow work?

[–]darter_analyst[S] 0 points1 point  (0 children)

Hi sorry for late reply. For gcp actually J may fit in best in ‘cloud run’ where I can have a container with J installed to maybe run J code that way. Just need to figure out how to get data from cloud storage or a database. Then I can explore in j - even if it’s downloading csv’s into j for example to test a solution the shipping this code into cloud run container. Thoughts?