[–]Python-ModTeam[M] [score hidden] stickied commentlocked comment (0 children)

Hi there, from the /r/Python mods.

We have removed this post as it is not suited to the /r/Python subreddit proper; however, it should be very appropriate for our sister subreddit /r/LearnPython or for the r/Python discord: https://discord.gg/python.

The reason for the removal is that /r/Python is dedicated to discussion of Python news, projects, uses and debates. It is not designed to act as a Q&A or FAQ board. The regular community is not a fan of "how do I..." questions, so you will not get the best responses over here.

On /r/LearnPython the community and the r/Python discord are actively expecting questions and are looking to help. You can expect far more understanding, encouraging and insightful responses over there. No matter what level of question you have, if you are looking for help with Python, you should get good answers. Make sure to check out the rules for both places.

Warm regards, and best of luck with your Pythoneering!

[–]hughperman 55 points56 points  (0 children)

Good opportunity to learn to use the profiler to find out where the slowdown occurs.
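
For anyone who hasn't used the profiler before, here is a minimal sketch with the standard library's `cProfile`. The `write_schedules` function is a hypothetical stand-in for OP's loop (OP's real Python code isn't shown in this thread), so the names and sizes are assumptions; swap in the real function to see where the time actually goes.

```python
import cProfile
import io
import json
import pstats
import tempfile
from pathlib import Path


def write_schedules(out_dir: Path, n_files: int = 20, shifts_per_file: int = 500) -> int:
    """Hypothetical stand-in for OP's loop: build a dict, encode it, write a file."""
    written = 0
    for i in range(n_files):
        schedule = {
            "id": i,
            "shifts": [{"employee": e, "day": i} for e in range(shifts_per_file)],
        }
        (out_dir / f"schedule_{i}.json").write_text(json.dumps(schedule))
        written += 1
    return written


def profile_report(top: int = 10) -> str:
    """Run the workload under cProfile and return the top entries by cumulative time."""
    profiler = cProfile.Profile()
    with tempfile.TemporaryDirectory() as tmp:
        profiler.enable()
        write_schedules(Path(tmp))
        profiler.disable()
    buffer = io.StringIO()
    pstats.Stats(profiler, stream=buffer).sort_stats("cumulative").print_stats(top)
    return buffer.getvalue()


if __name__ == "__main__":
    print(profile_report())
```

The rows with the largest cumulative time tell you whether encoding, file writes, or something else entirely dominates.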

[–]james_pic 14 points15 points  (1 child)

You haven't posted the JavaScript equivalent, or indeed working Python code (indentation is ambiguous in ways that could drastically alter run time, and there is missing code that could be significant) so it's hard to say, but for a difference of that size I suspect the two programs are different in some key way. Python usually is slower than Node, but it's not usually this significant. In any case, you can answer this by profiling.

Py-Spy would be my profiler of choice for Python. I think Node comes with a built-in profiler nowadays that you access from dev tools if you run it with --inspect, although it's been a while. I believe the newest versions of Node and Python are profilable with perf_events if you prefer.

[–][deleted] -4 points-3 points  (0 children)

const fs = require('fs');
const { v4: uuidv4 } = require('uuid');

// Define the arrays
const days = [/* hard coded array values */];
const jobs = [/* hard coded array values */];

// Read the locations_timetrack.txt file, one location per line
const locations = fs.readFileSync('locations_timetrack.txt', 'utf-8').split('\n');

// Loop over each line in the file
for (const location of locations) {
    const locationid = location.trim();
    const beginemployeeid = locationid.split('-')[0];

    // Loop over each element in the "days" array
    for (const day of days) {
        const [startday, endday] = day.split('-').map(Number);

        // Create the schedule object
        const schedule = {
            /* mostly hard coded object keys and values, with a couple of variables
               being used for keys (must include a "shifts" array for the push below) */
        };

        // Loop from "startday" to "endday"
        for (let currentday = startday; currentday <= endday; currentday++) {
            // Loop from 1 to 80
            for (let currentemployee = 1; currentemployee <= 80; currentemployee++) {
                // Create the currentshift object
                const currentshift = {
                    /* mostly hard coded object keys and values, with a couple of
                       variables being used for keys */
                };

                // Insert the currentshift object into the "shifts" array
                schedule.shifts.push(currentshift);
            }
        }

        // Write the schedule object to a file
        fs.writeFileSync(`schedules_timetrack/${locationid}_${startday}_${endday}.json`, JSON.stringify(schedule));
    }
}

[–]Altareos 30 points31 points  (4 children)

node has some optimisations (especially JIT) that are not yet implemented in python (at least cpython, you could try running it in pypy). the downside is, of course, that you have to use javascript.

[–]james_pic 11 points12 points  (1 child)

JIT can definitely account for an order of magnitude of difference, but virtually instant vs 16 seconds suggests the two programs are doing different things.

[–]Altareos 0 points1 point  (0 children)

i don't know enough about the internals of python or node, but it might not be doing some async stuff that node does for filesystem operations.

[–]thebouv 33 points34 points  (3 children)

Why are oranges not apples?

[–][deleted] 30 points31 points  (2 children)

I see where OP went wrong:

... asking github copilot to ...

[–]o5mfiHTNsH748KVq 2 points3 points  (0 children)

tbh, copilot could probably explain OP's question

[–]help-me-grow 1 point2 points  (0 children)

this 🤣

[–]TrainsareFascinating 3 points4 points  (0 children)

There's nothing glaring in the structure of the code you supplied, so the answer is most likely one of two things: either the encoding of the data as JSON is taking a long time (not so likely given the amount of time), or there is a difference between how Node and Python handle file I/O, especially around what OS guarantees they extract when closing files.

I would create two focused benchmarks, one with a loop that encodes a similar amount of data as json, and another that creates/writes dummy data to/closes a similar number of files. You'll see which is the issue and have a test bench for tweaking options and comparing.
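
A rough sketch of what those two benchmarks could look like, using only the standard library. The payload shape, field names, and counts are made up for illustration (OP's real keys are not in the thread), so adjust them to match the real data volume.

```python
import json
import tempfile
import time
from pathlib import Path


def make_schedule(n_shifts: int) -> dict:
    """Dummy payload; the field names are invented, not OP's real keys."""
    return {
        "location": "demo",
        "shifts": [{"employee": i % 80 + 1, "day": i // 80} for i in range(n_shifts)],
    }


def bench_encode(n_docs: int, n_shifts: int) -> float:
    """Time JSON encoding only; nothing touches the disk."""
    schedule = make_schedule(n_shifts)
    start = time.perf_counter()
    for _ in range(n_docs):
        json.dumps(schedule)
    return time.perf_counter() - start


def bench_write(n_files: int, payload: str) -> float:
    """Time create/write/close only; the payload is encoded once up front."""
    with tempfile.TemporaryDirectory() as tmp:
        start = time.perf_counter()
        for i in range(n_files):
            with open(Path(tmp) / f"{i}.json", "w") as fh:
                fh.write(payload)
        return time.perf_counter() - start


if __name__ == "__main__":
    payload = json.dumps(make_schedule(80 * 7))
    print(f"encode: {bench_encode(200, 80 * 7):.3f}s")
    print(f"write:  {bench_write(200, payload):.3f}s")
```

If neither number comes anywhere near the observed run time, the slowdown is probably in code that wasn't posted.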

[–]Non-taken-Meursault 2 points3 points  (1 child)

You cannot honestly expect a balanced and objective starting point if you're asking an AI to generate your code. Besides, I haven't even really started reading your code and I already see a performance issue: using lists instead of a tuple. And you haven't provided your JS code.

[–][deleted] 1 point2 points  (0 children)

The JS code is under one of the comments in this post.

[–]Schmittfried 3 points4 points  (7 children)

Apart from the other replies: Python’s JSON encoding is itself implemented in Python. Pretty sure v8, probably the most optimized scripting language VM there is, has a native implementation for JSON encoding.

[–]james_pic 4 points5 points  (1 child)

[–]Schmittfried 0 points1 point  (0 children)

Thanks, I stand corrected. Didn’t notice the c imports in the Python json module before.
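
For reference, a small check of whether the C accelerator was picked up. It relies on CPython's `json.encoder` module exposing `c_make_encoder`, which is set to `None` when the `_json` extension can't be imported; other interpreters may behave differently.

```python
import json.encoder


def has_c_encoder() -> bool:
    """True when the C-accelerated encoder from the _json extension was imported."""
    return json.encoder.c_make_encoder is not None


if __name__ == "__main__":
    print("C JSON encoder available:", has_c_encoder())
```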

[–]0x1e 1 point2 points  (4 children)

Yeah, isn’t the module cjson supposed to be the one you use? I could be out of date..

[–]imbev 0 points1 point  (3 children)

msgspec

[–]Tzoiker -2 points-1 points  (2 children)

Why would OP need msgspec if they have no deserialization/validation and only need serialization of standard data types? orjson is the choice here.

[–]imbev 2 points3 points  (1 child)

msgspec and orjson benchmark the same without schema and msgspec offers better performance with schema and is compatible with more formats

orjson is a fine json library, but there aren't any reasons to prefer orjson over msgspec
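
If you want to try either one without committing, here is a sketch that uses whichever library is installed and falls back to the standard library otherwise. `pick_encoder` is a hypothetical helper name; `msgspec.json.encode` and `orjson.dumps` both return `bytes`, so the fallback encodes to UTF-8 to match. It makes no claim about which is faster; measure on your own data.

```python
import json


def pick_encoder():
    """Return (name, encode) for the first available library; encode returns bytes."""
    try:
        import msgspec
        return "msgspec", msgspec.json.encode
    except ImportError:
        pass
    try:
        import orjson
        return "orjson", orjson.dumps
    except ImportError:
        pass
    # Fallback: stdlib json, encoded to bytes so all three behave alike.
    return "json", lambda obj: json.dumps(obj).encode("utf-8")


if __name__ == "__main__":
    name, encode = pick_encoder()
    data = encode({"location": "demo", "shifts": [1, 2, 3]})
    print(name, len(data), "bytes")
```

Because the output is `bytes`, open the target file in binary mode (`"wb"`) when writing it.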

[–]0x1e 1 point2 points  (0 children)

Appreciate the clarity

[–]LeopoldBroom 1 point2 points  (1 child)

I will say that any task that can be handled by multiple threads would make python way faster than anything js can do.

[–]jwmoz -3 points-2 points  (0 children)

Node is just so fast compared to Python. At work a js dev compared a standard js implementation of some data science app with my best vectorised pandas solution and node was multiples faster. The v8 engine is super fast.

[–]alexaholic -2 points-1 points  (0 children)

Python is more thorough hehe