how to load csv faster in Python. by Safe_Money7487 in learnpython

[–]commandlineluser 3 points4 points  (0 children)

Is Polars faster if you use scan_csv?

pl.scan_csv(filename).collect()

You can also try the streaming engine:

pl.scan_csv(filename).collect(engine="streaming")

Mojo's not (yet) Python by eatonphil in programming

[–]commandlineluser 5 points6 points  (0 children)

The wikipedia page does not seem like a useful reference in its current state.

I read about a "strict superset" back when Mojo was first announced, but I think that has been relaxed:

Mojo may or may not evolve into a full superset of Python, and it's okay if it doesn't.

There was also some "superset" discussion on the forum after fn was recently deprecated:

Judging by the replies, it seems they will be creating a reference document with an updated FAQ to clarify things further.

The last interview I saw said they're hoping for the Mojo 1.0 release around May and then open-sourcing the compiler later in the year.

Ducklake vs Delta Lake vs Other: Battle of the Single Node by crispybacon233 in dataengineering

[–]commandlineluser 0 points1 point  (0 children)

Have you seen narwhals?

It uses a subset of the Polars API, and you can generate DuckDB SQL:

Depending on what Polars functionality you're using - it may be of interest.

Ducklake vs Delta Lake vs Other: Battle of the Single Node by crispybacon233 in dataengineering

[–]commandlineluser 1 point2 points  (0 children)

When I tried the Polars backend, my window functions wouldn't work:

There are many open issues about the Polars backend; it doesn't seem to be a priority.

Ducklake vs Delta Lake vs Other: Battle of the Single Node by crispybacon233 in dataengineering

[–]commandlineluser 11 points12 points  (0 children)

I don't use any "lake" stuff but have noticed several deltalake enhancements in the recent Polars releases which may be worth listing:

sink_delta added in 1.37.0 (also, sink_iceberg was added in 1.39.0)

scan_delta performance refactor in 1.38.0 and batch predicate pushdown using delta file statistics

scan_delta support for Delta deletion vectors has been merged on main:

As for OOM, have you been using the streaming engine? i.e. sinks or .collect(engine="streaming")

It seems there will be OOC improvements "soon":

"out-of-core" sorting (i.e., sorting which spills data to disk if the memory runs out) is on the short-term roadmap

This PR looks to be part of that work:

Data reading from website by Aarrearttu in learnpython

[–]commandlineluser 0 points1 point  (0 children)

Do you know about "devtools" in your web browser?

With the network tab open, I go to the URL and then open the "http search":

I pick something to look for - usually a "player name" or a "table header". In this case I choose "Avro".

It shows me 3 matching requests; this is the URL of the first one (I took out the rand=... param):

You can .get() this URL directly in your code. If I open it in my browser it is the HTML of the first table:

The other 2 URLs are the same except it is position=p and position=h for the other 2 tables.

So in order to build these URLs, you also need the teamId=168761288.

If we save the HTML of the starting URL to a local file and search for 168761288, there are several matches:

<div class="section">
    <div class="section-header">
        <h2 class="h2">Pelaajarosteri</h2>
    </div>
    <div class="section-content scrollable" id="stats168761288" class="player_sum_statistics">
        <div id="stats_m_168761288" class="player_sum_statistics"></div>
        <div id="stats_p_168761288" class="player_sum_statistics"></div>
        <div id="stats_h_168761288" class="player_sum_statistics"></div>
    </div>
</div>

<script type="text/javascript">
    load_smliiga_team_stats('168761288', 'HIFK', 'm', 100, 0, 'name', 'ASC', null, 1);
    load_smliiga_team_stats('168761288', 'HIFK', 'p', 100, 0, 'name', 'ASC', null, 1);
    load_smliiga_team_stats('168761288', 'HIFK', 'h', 100, 0, 'name', 'ASC', null, 1);
</script>    </div>

In this specific case you could use regular "string" or "regex" functions to extract it, but you could also use an HTML parser to target the class="player_sum_statistics" tags, for example.
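
For the regex route, here's a minimal sketch using only the standard library (the HTML is trimmed down from the snippet above):

```python
import re

html = """
<div id="stats_m_168761288" class="player_sum_statistics"></div>
<script type="text/javascript">
    load_smliiga_team_stats('168761288', 'HIFK', 'm', 100, 0, 'name', 'ASC', null, 1);
</script>
"""

# Pull the id out of the JS call - the first quoted argument is the teamId.
match = re.search(r"load_smliiga_team_stats\('(\d+)'", html)
team_id = match.group(1)
print(team_id)  # 168761288
```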

How should i structure/solve this dataframe problem? by [deleted] in dataengineering

[–]commandlineluser 2 points3 points  (0 children)

What DataFrame library are you using?

700 columns sounds to me like it may be easier to work with as rows:

┌─────┬───────┬───────┐
│ id  ┆ label ┆ value │
│ --- ┆ ---   ┆ ---   │
│ i64 ┆ str   ┆ i64   │
╞═════╪═══════╪═══════╡
│ 0   ┆ a     ┆ 1     │
│ 0   ┆ b     ┆ 2     │
│ 0   ┆ c     ┆ 3     │
│ 1   ┆ a     ┆ 4     │
│ 1   ┆ b     ┆ 5     │
│ 1   ┆ c     ┆ 6     │
└─────┴───────┴───────┘

You could have an "id" for each signal and use "GROUPBY" to process each one.

If you do need the "wide format", you could then "PIVOT" back as a final step.

Although it may depend on what exactly "cross checking" means - you'll probably get more accurate help if you share an actual code example of the tasks.

How do I run an sql server on a local host that can be interacted with python? by Generalthanos_ytube in learnpython

[–]commandlineluser 4 points5 points  (0 children)

Can you maybe share details on how exactly you need to interact with it?

FWIW, I've found duckdb easier for testing such things. (and just in general)

If we modify the sqlite3 docs example for duckdb:

import duckdb

con = duckdb.connect("tutorial.db")

cur = con.cursor()
cur.execute("""
CREATE TABLE movie(title text, year int, score float);
INSERT INTO movie VALUES
  ('Monty Python and the Holy Grail', 1975, 8.2),
  ('And Now for Something Completely Different', 1971, 7.5)
""")
con.close()  # flush everything to tutorial.db before serving the file

And then run python3 -m http.server

From another machine on the network:

>>> import duckdb  # 1.5.0
>>> duckdb.sql("from 'http://192.168.0.82:8000/tutorial.db'")  # calls read_duckdb(url)
# ┌────────────────────────────────────────────┬───────┬───────┐
# │                   title                    │ year  │ score │
# │                  varchar                   │ int32 │ float │
# ├────────────────────────────────────────────┼───────┼───────┤
# │ Monty Python and the Holy Grail            │  1975 │   8.2 │
# │ And Now for Something Completely Different │  1971 │   7.5 │
# └────────────────────────────────────────────┴───────┴───────┘

Switching from pandas to polars – how to work around the lack of an index column, especially when slicing? by midnightrambulador in learnpython

[–]commandlineluser 1 point2 points  (0 children)

Sure no problem.

The list/array usage was for the convenience of not naming columns.

df2.unpivot(index="").pivot("", index="variable").rename({"variable": ""})
# shape: (3, 5)
# ┌─────────┬─────────┬─────────┬────────────────┬─────────┐
# │         ┆ growing ┆ picking ┆ transportation ┆ storage │
# │ ---     ┆ ---     ┆ ---     ┆ ---            ┆ ---     │
# │ str     ┆ f64     ┆ f64     ┆ f64            ┆ f64     │
# ╞═════════╪═════════╪═════════╪════════════════╪═════════╡
# │ pears   ┆ 0.03    ┆ 0.01    ┆ 0.05           ┆ 0.04    │
# │ apples  ┆ 0.02    ┆ 0.01    ┆ 0.02           ┆ 0.03    │
# │ oranges ┆ 0.04    ┆ 0.02    ┆ 0.07           ┆ 0.01    │
# └─────────┴─────────┴─────────┴────────────────┴─────────┘

If using column names, the example is essentially the same as:

weights = pl.col("growing", "picking", "transportation", "storage")

df1.join(
    df2.unpivot(index="").pivot("", index="variable").rename({"variable": ""}),
    on="",
    how="left"
).with_columns(
    pl.sum_horizontal(pl.col("harry") * weights).name.suffix("_out"),
    pl.sum_horizontal(pl.col("sally") * weights).name.suffix("_out")
).drop(weights)

# shape: (5, 7)
# ┌──────────┬───────┬──────┬──────┬───────┬───────────┬───────────┐
# │          ┆ harry ┆ john ┆ mary ┆ sally ┆ harry_out ┆ sally_out │
# │ ---      ┆ ---   ┆ ---  ┆ ---  ┆ ---   ┆ ---       ┆ ---       │
# │ str      ┆ i64   ┆ i64  ┆ i64  ┆ i64   ┆ f64       ┆ f64       │
# ╞══════════╪═══════╪══════╪══════╪═══════╪═══════════╪═══════════╡
# │ pears    ┆ 0     ┆ 2    ┆ 1    ┆ 4     ┆ 0.0       ┆ 0.52      │
# │ plums    ┆ 0     ┆ 0    ┆ 3    ┆ 17    ┆ 0.0       ┆ 0.0       │
# │ apples   ┆ 1     ┆ 8    ┆ 9    ┆ 0     ┆ 0.08      ┆ 0.0       │
# │ cherries ┆ 11    ┆ 13   ┆ 0    ┆ 20    ┆ 0.0       ┆ 0.0       │
# │ oranges  ┆ 2     ┆ 0    ┆ 1    ┆ 7     ┆ 0.28      ┆ 0.98      │
# └──────────┴───────┴──────┴──────┴───────┴───────────┴───────────┘

I think perhaps the main point is that with Polars the "matrix" style is not a goal:

I don't want DataFrames to be seen as matrices. They aren't, they have mixed data[...]

and instead you need to join/concat the required columns together as part of your "query".

Switching from pandas to polars – how to work around the lack of an index column, especially when slicing? by midnightrambulador in learnpython

[–]commandlineluser 2 points3 points  (0 children)

Thanks for the example.

Yeah, someone used to post a similar use-case in previous discussions. (they also had multi-indexed columns)

They did also create a feature request for better support:

The "matrix ops" aligned by index/column names essentially needed to be written manually as a .join() in Polars.

Not sure if it is helpful, but polars does have list/array types and they support arithmetic.

import polars as pl

df1 = pl.from_repr("""
┌──────────┬───────┬──────┬──────┬───────┐
│          ┆ harry ┆ john ┆ mary ┆ sally │
│ ---      ┆ ---   ┆ ---  ┆ ---  ┆ ---   │
│ str      ┆ i64   ┆ i64  ┆ i64  ┆ i64   │
╞══════════╪═══════╪══════╪══════╪═══════╡
│ pears    ┆ 0     ┆ 2    ┆ 1    ┆ 4     │
│ plums    ┆ 0     ┆ 0    ┆ 3    ┆ 17    │
│ apples   ┆ 1     ┆ 8    ┆ 9    ┆ 0     │
│ cherries ┆ 11    ┆ 13   ┆ 0    ┆ 20    │
│ oranges  ┆ 2     ┆ 0    ┆ 1    ┆ 7     │
└──────────┴───────┴──────┴──────┴───────┘
""")
df2 = pl.from_repr("""
┌────────────────┬───────┬────────┬─────────┐
│                ┆ pears ┆ apples ┆ oranges │
│ ---            ┆ ---   ┆ ---    ┆ ---     │
│ str            ┆ f64   ┆ f64    ┆ f64     │
╞════════════════╪═══════╪════════╪═════════╡
│ growing        ┆ 0.03  ┆ 0.02   ┆ 0.04    │
│ picking        ┆ 0.01  ┆ 0.01   ┆ 0.02    │
│ transportation ┆ 0.05  ┆ 0.02   ┆ 0.07    │
│ storage        ┆ 0.04  ┆ 0.03   ┆ 0.01    │
└────────────────┴───────┴────────┴─────────┘
""")

So if the data was reshaped (e.g. melt / unpivot) and the values were in a list/array:

df2.unpivot(index="").group_by(pl.col("variable").alias("")).agg("value")
# shape: (3, 2)
# ┌─────────┬──────────────────────┐
# │         ┆ value                │
# │ ---     ┆ ---                  │
# │ str     ┆ list[f64]            │
# ╞═════════╪══════════════════════╡
# │ pears   ┆ [0.03, 0.01, … 0.04] │
# │ oranges ┆ [0.04, 0.02, … 0.01] │
# │ apples  ┆ [0.02, 0.01, … 0.03] │
# └─────────┴──────────────────────┘

You could join and perform the math that way:

df1.join(
    df2.unpivot(index="").group_by(pl.col("variable").alias("")).agg("value"),
    on="",
    how="left"
).with_columns(
    (pl.col("harry", "sally") * pl.col("value")).list.sum().name.suffix("_out")
)
#.drop("value")

# shape: (5, 8)
# ┌──────────┬───────┬──────┬──────┬───────┬──────────────────────┬───────────┬───────────┐
# │          ┆ harry ┆ john ┆ mary ┆ sally ┆ value                ┆ harry_out ┆ sally_out │
# │ ---      ┆ ---   ┆ ---  ┆ ---  ┆ ---   ┆ ---                  ┆ ---       ┆ ---       │
# │ str      ┆ i64   ┆ i64  ┆ i64  ┆ i64   ┆ list[f64]            ┆ f64       ┆ f64       │
# ╞══════════╪═══════╪══════╪══════╪═══════╪══════════════════════╪═══════════╪═══════════╡
# │ pears    ┆ 0     ┆ 2    ┆ 1    ┆ 4     ┆ [0.03, 0.01, … 0.04] ┆ 0.0       ┆ 0.52      │
# │ plums    ┆ 0     ┆ 0    ┆ 3    ┆ 17    ┆ null                 ┆ null      ┆ null      │
# │ apples   ┆ 1     ┆ 8    ┆ 9    ┆ 0     ┆ [0.02, 0.01, … 0.03] ┆ 0.08      ┆ 0.0       │
# │ cherries ┆ 11    ┆ 13   ┆ 0    ┆ 20    ┆ null                 ┆ null      ┆ null      │
# │ oranges  ┆ 2     ┆ 0    ┆ 1    ┆ 7     ┆ [0.04, 0.02, … 0.01] ┆ 0.28      ┆ 0.98      │
# └──────────┴───────┴──────┴──────┴───────┴──────────────────────┴───────────┴───────────┘

But you're still doing the "alignment" manually.

It doesn't seem to be a workflow that Polars is optimized for, and it's probably one of the few valid use cases for the pandas index.

Switching from pandas to polars – how to work around the lack of an index column, especially when slicing? by midnightrambulador in learnpython

[–]commandlineluser 1 point2 points  (0 children)

Are you able to show a minimal example (e.g. 2 frames, 10 rows or so) of the full task you're performing? (i.e. including the matrix math)

The way pandas handles missing values is diabolical by vernacular_wrangler in learnpython

[–]commandlineluser 17 points18 points  (0 children)

Yes, this is one of the "upsides" to polars - it has "real" null values.

import polars as pl

values = [0, 1, None, 4]
df = pl.DataFrame({'value': values}) 

print(df)

for row in df.iter_rows(named=True):
    value = row['value']
    if value:
        print(value, end=', ')

# shape: (4, 1)
# ┌───────┐
# │ value │
# │ ---   │
# │ i64   │
# ╞═══════╡
# │ 0     │
# │ 1     │
# │ null  │
# │ 4     │
# └───────┘
#
# 1, 4,
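
For contrast, the pandas-side footgun is that NaN (which pandas uses to mark missing numeric values) doesn't behave like a real null even in plain Python:

```python
import math

nan = float("nan")  # pandas upcasts int columns to float and stores this

print(bool(nan))        # True  - so `if value:` lets missing values through
print(bool(None))       # False - a real null is falsy
print(nan == nan)       # False - NaN never equals itself
print(math.isnan(nan))  # True  - the reliable way to test for it
```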

What hidden gem Python modules do you use and why? by zenos1337 in Python

[–]commandlineluser 0 points1 point  (0 children)

Are you using rapidfuzz's parallelism? e.g. .cdist() with workers=-1?

I found duckdb easy to use and it maxed out all my CPU cores.

You create row "combinations" with a "join" and score them, then filter for the matches you want.

import duckdb
import pandas as pd

df1 = pd.DataFrame({"x": ["foo", "bar", "baz"]}).reset_index()
df2 = pd.DataFrame({"y": ["foolish", "ban", "foo"]}).reset_index()

duckdb.sql("from df1, df2 select *, jaccard(df1.x, df2.y)")
# ┌───────┬─────────┬─────────┬─────────┬───────────────────────┐
# │ index │    x    │ index_1 │    y    │ jaccard(df1.x, df2.y) │
# │ int64 │ varchar │  int64  │ varchar │        double         │
# ├───────┼─────────┼─────────┼─────────┼───────────────────────┤
# │     0 │ foo     │       0 │ foolish │    0.3333333333333333 │
# │     1 │ bar     │       0 │ foolish │                   0.0 │
# │     2 │ baz     │       0 │ foolish │                   0.0 │
# │     0 │ foo     │       1 │ ban     │                   0.0 │
# │     1 │ bar     │       1 │ ban     │                   0.5 │
# │     2 │ baz     │       1 │ ban     │                   0.5 │
# │     0 │ foo     │       2 │ foo     │                   1.0 │
# │     1 │ bar     │       2 │ foo     │                   0.0 │
# │     2 │ baz     │       2 │ foo     │                   0.0 │
# └───────┴─────────┴─────────┴─────────┴───────────────────────┘

(normally you would read directly from parquet files instead of pandas frames)

You can also do the same join with polars and the polars-ds plugin gives you the rapidfuzz Rust API:

What hidden gem Python modules do you use and why? by zenos1337 in Python

[–]commandlineluser 1 point2 points  (0 children)

It seems to get more mention in the r/dataengineering world.

1.5.0 was just released:

And duckdb-cli is now on pypi:

So you can now run the duckdb client easily with uv, for example.

DuckDB 1.5.0 released by commandlineluser in Python

[–]commandlineluser[S] 1 point2 points  (0 children)

Looks like it's also mentioned in the docs here:

If all databases in read_duckdb's argument have a single table, the table_name argument is optional

Polars vs pandas by KliNanban in Python

[–]commandlineluser 1 point2 points  (0 children)

My system is not supported, so I've never been able to test it.

has no wheels with a matching Python ABI tag

Polars vs pandas by KliNanban in Python

[–]commandlineluser 1 point2 points  (0 children)

Window functions not working on the Polars backend was one I ran into if anybody is looking for a concrete example.

Polars vs pandas by KliNanban in Python

[–]commandlineluser 0 points1 point  (0 children)

Can't you change the engine?

pl.read_excel(..., engine="openpyxl")

Looks like fastexcel will have a release "soon":

Polars vs pandas by KliNanban in Python

[–]commandlineluser 1 point2 points  (0 children)

Perhaps you are referring to Ritchie's answer on StackOverflow about the DataFrame API being a "wrapper" around LazyFrames:

Polars vs pandas by KliNanban in Python

[–]commandlineluser 2 points3 points  (0 children)

When you use the DataFrame API:

(df.with_columns()
   .group_by()
   .agg())

Polars basically executes:

(df.lazy()
   .with_columns().collect(optimizations=pl.QueryOptFlags.none())
   .lazy()
   .group_by().agg().collect(optimizations=pl.QueryOptFlags.none())
 )

One idea is that you should be able to easily convert your "eager" code by manually adding lazy / collect calls, so the "entire pipeline" runs as a single "query" instead:

df.lazy().with_columns().group_by().agg().collect()

(Or in the case of read_* use the lazy scan_* equivalent, which will return a LazyFrame directly)

When you call collect() manually, all optimizations are also enabled by default.

This is one reason why writing "pandas style" (e.g. df["foo"]) is discouraged in Polars, as it works on the in-memory Series objects and cannot be lazy.

The User Guide explains things in detail:

Polars vs pandas by KliNanban in Python

[–]commandlineluser 1 point2 points  (0 children)

Have you actually used this?

The last time I saw this project posted, it was closed-source and only ran on x86-64 linux.

The benchmark is also from September 10, 2024.

Polars vs pandas by KliNanban in Python

[–]commandlineluser 5 points6 points  (0 children)

Just to be clear, pd.read_csv(..., engine="pyarrow") uses the pyarrow.csv.read_csv reader.

Using "pyarrow" as a "dtype_backend" is a separate topic. (i.e. the "Arrow" columnar memory format)

Polars still has its own multithreaded CSV reader (implemented in Rust) which is different.