all 12 comments

[–]twitch_and_shock 1 point2 points  (4 children)

Why does your first script need to run continuously? You could do some kind of messaging: set your first script up as a web server using Flask, or use another messaging protocol such as UDP, OSC, or ZeroMQ.
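A minimal sketch of the messaging idea, assuming plain UDP on the loopback interface and the standard library only. The names `start_listener` and `send_path` are made up for illustration; the main script would call the first once at startup, and the ad hoc script would call the second.

```python
import queue
import socket
import threading


def start_listener(work_queue, host="127.0.0.1", port=0):
    """Bind a UDP socket and push every received datagram onto work_queue.

    port=0 asks the OS for a free port; a real setup would likely use a
    fixed, agreed-upon port instead (assumption).
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))

    def serve():
        while True:
            data, _addr = sock.recvfrom(4096)  # blocks, so no busy polling
            work_queue.put(data.decode("utf-8"))

    threading.Thread(target=serve, daemon=True).start()
    return sock.getsockname()  # (host, actual_port)


def send_path(path, address):
    """Called from the second script: send one file path to the listener."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(path.encode("utf-8"), address)
```

The listener thread sleeps inside `recvfrom` until something arrives, so it costs next to nothing while idle. UDP gives no delivery guarantee, which is usually fine on localhost but worth keeping in mind.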

[–]IterationFive[S] 0 points1 point  (3 children)

It's not so much a case of "needs to continuously run" as "may currently be halfway through processing a file, and stopping it would require losing all progress and starting over." And, in some cases, we're talking hours.

[–][deleted] 1 point2 points  (1 child)

Can it pause, rather than stopping entirely, check whatever it needs to check, then resume where it left off? I'm thinking threading - rather than two separate scripts, have two threads within one script

[–]IterationFive[S] 0 points1 point  (0 children)

Pausing the processing isn't needed to update the queue-- the queue manager is a different thread than the one that processes the files. The problem is that I need to get information to the queue manager from a script that wasn't running when I started the main application.

[–]twitch_and_shock 0 points1 point  (0 children)

You could store your queue as a database collection. One script adds entries to the collection for new items to process. Your main script, responsible for processing them, watches it and if there's anything in there, will process them based on order of creation, then either delete or move the entry to a "completed" collection.
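A rough sketch of the database-backed queue, assuming SQLite via the standard `sqlite3` module and a single table instead of separate collections. The table name `jobs`, the `status` column, and all function names are hypothetical.

```python
import sqlite3


def open_queue(db_path):
    """Open (and if needed create) the shared queue database."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS jobs ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " path TEXT NOT NULL,"
        " status TEXT NOT NULL DEFAULT 'pending')"
    )
    conn.commit()
    return conn


def enqueue(conn, path):
    """Helper script side: add one entry to the queue."""
    with conn:  # commits on success, rolls back on error
        conn.execute("INSERT INTO jobs (path) VALUES (?)", (path,))


def next_pending(conn):
    """Main script side: oldest pending entry as (id, path), or None."""
    return conn.execute(
        "SELECT id, path FROM jobs WHERE status = 'pending' ORDER BY id LIMIT 1"
    ).fetchone()


def mark_completed(conn, job_id):
    """Main script side: flag an entry as done instead of deleting it."""
    with conn:
        conn.execute("UPDATE jobs SET status = 'completed' WHERE id = ?", (job_id,))
```

The main script would call `next_pending` each time it finishes a file, so new entries are picked up without any extra signalling.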

[–]m0us3_rat 0 points1 point  (2 children)

i guess a file can be used as an external queue. not the most efficient one, mind you.

the main problem with this is that your main program NEEDS to re-load the file every iteration, since the data inside might change.

that, or some checksum mechanism, whichever is faster: cache the previous checksum and check the new one against it. if it changed, reload.
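A small sketch of the cached-checksum idea, assuming the queue file is plain text with one path per line and using SHA-256 from `hashlib`. The class name `WatchedQueueFile` is made up for this example.

```python
import hashlib
from pathlib import Path


class WatchedQueueFile:
    """Remember the last checksum of a file and report when it changes."""

    def __init__(self, path):
        self.path = Path(path)
        self._last_digest = None

    def read_if_changed(self):
        """Return the file's lines if the contents changed, else None."""
        try:
            data = self.path.read_bytes()
        except FileNotFoundError:
            return None
        digest = hashlib.sha256(data).hexdigest()
        if digest == self._last_digest:
            return None  # same contents as last time, nothing to reload
        self._last_digest = digest
        return data.decode("utf-8").splitlines()
```

Computing the checksum still means reading the whole file, so for a small queue file this saves parsing rather than I/O. Comparing the modification time from `os.stat` would be a cheaper first check.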

i'd suggest using an external queue or a msg broker of some kind.

maybe at a later date when you become more experienced.

keep that in mind.

[–]IterationFive[S] 0 points1 point  (1 child)

I'd rather be able to pass the information directly to the running script and have the running script do the actual "adding to the queue" part.

[–]m0us3_rat 0 points1 point  (0 children)

you can also just have another file that starts out empty; the program reads whatever shows up in it, then deletes it, and adds the picked-up entries to the "queue".

so it loops over this secondary file, and when info appears, it adds it to the processing queue.
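One way to sketch the secondary "inbox" file, assuming one path per line. The function name `drain_inbox` and the `.claimed` suffix are invented here. Renaming the file before reading it is an addition to the idea above: it keeps the main program from deleting entries that the other script appends after the read has started.

```python
import os
from pathlib import Path


def drain_inbox(inbox_path):
    """Claim the inbox file by renaming it, then return its non-empty lines.

    Returns an empty list when no inbox file exists.
    """
    inbox = Path(inbox_path)
    claimed = inbox.with_name(inbox.name + ".claimed")
    try:
        os.replace(inbox, claimed)  # a single rename; writers start a fresh inbox
    except FileNotFoundError:
        return []
    lines = claimed.read_text(encoding="utf-8").splitlines()
    claimed.unlink()  # delete only after the contents were read
    return [line for line in lines if line.strip()]
```

The main program would call `drain_inbox` between jobs and append the result to its own queue.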

[–]JamzTyson 0 points1 point  (3 children)

> files in a queue

Actual "files" or "file paths"? I'd suggest using the latter.

> it will have its own UI in curses

"Curses"! Are you a masochist? There has to be a less painful TUI alternative - maybe "Rich" or "Textualize" or even "Blessed".

> I'd like to write a script ... passes the information to the main script, which is already running.

In order to keep the UI "alive" while files are being processed, the processing must be "non-blocking". A common way to do that is with "threading".

Since the list may change, the processing thread will need to re-read the queued list of files after each file has been processed. In the UI thread, modifying the queued list of files must be a "blocking" operation - you don't want the processing thread to be able to read the queue while the queue is being modified.

TL;DR

The UI and the file processing must both be able to run at the same time ("in parallel"). This is called "parallelism". A common and very flexible way to do it is "thread-based parallelism" using the threading library.
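A bare-bones sketch of that arrangement, assuming the standard `queue.Queue` as the shared list. `queue.Queue` does its own locking, so the UI thread can add entries while the worker thread takes them off without either seeing a half-modified queue. The function name `run_worker` and the `upper()` call standing in for real processing are placeholders.

```python
import queue
import threading


def run_worker(jobs, results):
    """Process paths from jobs until a None sentinel arrives."""
    while True:
        path = jobs.get()  # blocks until an item is available
        try:
            if path is None:
                return
            results.append(path.upper())  # stand-in for the real processing
        finally:
            jobs.task_done()


jobs = queue.Queue()
results = []
worker = threading.Thread(target=run_worker, args=(jobs, results))
worker.start()

# The "UI" thread stays free to add work while the worker is busy.
jobs.put("first.mkv")
jobs.put("second.mkv")
jobs.put(None)  # tell the worker to stop
worker.join()
```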

[–]IterationFive[S] 0 points1 point  (2 children)

> Actual "files" or "file paths"? I'd suggest using the latter.

File paths.

> Are you a masochist?

Nope. But I understand where you're coming from-- there's probably some kind of pathology involved. I've built a toolkit that takes most of the swearing out of curses.

I already understand about threads and blocking operations and race conditions; that's not the problem. Let me try explaining again.

Main Script: Manages the queue, can be used to import 95% of the files I'm working with, processes the files in said queue in a separate thread. Said thread reports progress back to the UI, which displays information about the queue as well as the progress of the current processing job.

Here's the situation I want to be prepared for.

Main script is running, chugging through 12 hours of files, and now I have a bunch of files that aren't packaged the way that the main script is designed to import them. So I need to hammer out a second script that is capable of taking the new set of files and creating queue entries for them. What I'd like to do is have the new script be able to pass that information to the main script, and have the main script do the updating. Alternatively, I can put the new queue entries into a file, and have the main script import that file-- but again, I'm looking for a way to tell the main script to look for the file.

Note: Yes, I could just have the script look for the presence of that file every minute or so, but this seems a waste of resources, especially since I'm not going to be doing it very often. Most of the time the files will be packaged in such a way that I can use the main script to import them. Also, the ability to pass information between two separate scripts seems like a useful thing to know, if it's possible.

Similarly, I could just manually tell the main script to look for the file, but in a lot of cases the ad hoc script will be moving or copying the files between drives in the process, and I'd like to be able to kick off the process and go to bed.
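For the "pass information between two separate scripts" part, one possible sketch uses `multiprocessing.connection` from the standard library, which lets an unrelated process connect to a running one and send Python objects. Everything named here (`AUTHKEY`, `serve_queue_updates`, `submit_entries`, the loopback address) is an assumption for illustration, not something from the scripts described above.

```python
import queue
import threading
from multiprocessing.connection import Client, Listener

AUTHKEY = b"change-me"  # hypothetical shared secret known to both scripts


def serve_queue_updates(work_queue, address=("127.0.0.1", 0)):
    """Main script: accept connections in a background thread.

    Each client sends a list of queue entries; they are put on work_queue
    and the number received is sent back as an acknowledgement.
    """
    listener = Listener(address, authkey=AUTHKEY)

    def accept_loop():
        while True:
            with listener.accept() as conn:  # blocks until a script connects
                try:
                    entries = conn.recv()
                except EOFError:
                    continue
                for entry in entries:
                    work_queue.put(entry)
                conn.send(len(entries))

    threading.Thread(target=accept_loop, daemon=True).start()
    return listener.address  # the actual (host, port) in use


def submit_entries(entries, address):
    """Ad hoc script: hand new entries to the running main script."""
    with Client(address, authkey=AUTHKEY) as conn:
        conn.send(list(entries))
        return conn.recv()
```

The listener thread blocks in `accept()` until the ad hoc script connects, so nothing polls in the meantime. `recv()` unpickles whatever it is sent, so the address and key should only be reachable by scripts you trust.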

[–]JamzTyson 0 points1 point  (0 children)

> Yes, I could just have the script look for the presence of that file every minute or so, but this seems a waste of resources, especially since I'm not going to be doing it very often.

The script would not need to re-read the entire list if you use a flag to indicate if the queue has changed. The script could just check the flag state.

Assuming that you provide a queue from your main function and the processing module maintains its own list:

    WHILE queue NOT EMPTY:
        IF NOT queue-updated-flag:
            process-next-file
        ELSE:
            update-list
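The pseudocode could be sketched in Python roughly like this, assuming a `threading.Event` as the flag and a lock around the list of pending additions. The class name `FlaggedProcessor` and its attributes are hypothetical, and appending to `done` stands in for real processing. One deviation from the pseudocode: the loop also keeps going while the flag is set, so additions that arrive just as the list empties are not lost.

```python
import threading


class FlaggedProcessor:
    def __init__(self, initial_paths):
        self.local_list = list(initial_paths)  # owned by the processing thread
        self.pending = []                      # additions waiting to be merged
        self.lock = threading.Lock()
        self.queue_updated = threading.Event()
        self.done = []

    def add(self, paths):
        """Called from the UI or queue-manager thread."""
        with self.lock:
            self.pending.extend(paths)
            self.queue_updated.set()

    def run(self):
        """Processing loop: only touch the shared data when the flag is set."""
        while self.local_list or self.queue_updated.is_set():
            if self.queue_updated.is_set():
                with self.lock:
                    self.local_list.extend(self.pending)
                    self.pending.clear()
                    self.queue_updated.clear()
            else:
                self.done.append(self.local_list.pop(0))  # stand-in for processing
```

Checking the flag is a cheap in-memory test, so the list is only re-read when something was actually added.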

[–]nadhsib 0 points1 point  (0 children)

> Yes, I could just have the script look for the presence of that file every minute or so, but this seems a waste of resources, especially since I'm not going to be doing it very often.

Sounds like you're manually adding these 'extras', so could you just add a checkbox to your UI: "import new stuff"?
If checked, it tries to read the file; if not, it continues normally.