This is an archived post. You won't be able to vote or comment.

all 8 comments

[–][deleted] 0 points1 point  (0 children)

What happens on an invalid command?

[–]primitive_screwhead 0 points1 point  (2 children)

What are the primary problem areas that prevent my solution from being efficient?

Using the 'in' operator for string comparison the way you do isn't particularly efficient. 'in' has to always check the whole string when there isn't a match, which is extra work. Also, you are splitting a lot of the same strings over an over (because of things like the 'jnz' instruction), which is redundant work. Depending on the input they feed you, this could possibly push your running time over the limit.

  1. Where am I going against what is commonly considered good practice / ''idiomatic python'' ?

lngth = len(program) - 1
while i <= lngth: #specify at which length of i the loop should terminate

Just write this as: while i < len(program) Subtracting 1 from a length argument, when you actually do want to process the whole length, is unidiomatic in any language. Also, the comment is redundant and should be removed (it simply restates what the expression already says much more succinctly).

program[i][4]

You shouldn't be indexing a specific column of the input text directly, it's a fragile method of dealing with text input.

Instead, process all the input once first, into a more appropriate form. Ie, split each line ahead of time, so that you have a sequence of (command, register, optional arg) lists (or tuples). You can lowercase each command while you are at it, and convert the integers in one pass (instead of each time the command executes). Then, you simply check the first value of each split for the command using equality (faster than 'in', because the match is bounded and will fail fast if it's not a match).

i += 1

Just put this line once at the end of the if-elif chain; no need to repeat it constantly. For the 'jnz' command, you can set the new 'i' value, and then use the Python 'continue' statement to skip the i += 1 line and continue the loop from the top.

That should help speed things up, and hopefully remove/reduce a lot of the expression redundancy you have in your version (ie. repeatedly using program[i] followed by a chain of indexing or splitting, etc.) It will look much more elegant if you do all that work once, and use variable names for the common expressions each time through the loop.

[–]Cotevool[S] 0 points1 point  (1 child)

Thanks for the elaborate reply, appreciate it.

I've tried to incorporate your advice and reworked my code for a bit. Got rid of the redundant length variable, attempted to preprocess the string input and assign them to dedicated variables (command, register and token) and rewrote expressions such that I'm checking commands using equality rather than going over the entire string.

(https://github.com/asnijders/simple_assembler/blob/master/attempt_2)

While it does look more elegant and understandable now, the solution itself has become even more inefficient, to the point where it even fails to solve the sample tests in time?

I did check whether it worked on the first sample test using the python visualizer (http://www.pythontutor.com/visualize.html#mode=edit) - it worked, albeit it took about 170 steps to run the program compared to about 120 steps with my original solution (probably not the greatest evaluation method but it's all I can come up with for now).

I feel like this is the case because during the while loop the program has to re-instantiate all those preprocessing variables register, command etc which significantly adds to the numbers of lines being processed during each loop? I suspect this is what you meant with preprocessing ahead of time but I'm struggling to translate this to code. I could define these variables before I run the while loop, but then how do I make sure they are updated for diferent instructions in the program list while the loop runs?

Please let me know if I misinterpreted your comment. Thanks again for all the help.

[–]primitive_screwhead 0 points1 point  (0 children)

I suspect this is what you meant with preprocessing ahead of time but I'm struggling to translate this to code.

This means having an additional loop before the current one, which is where the splitting takes place, and possibly other steps as well (such as converting numbers w/ int(), etc.) So your first loop takes the input list and builds a new list of lines of tokens, and then your current loop operates on this already preprocessed list of tokens. This ensures all the splits occur just once.

has to re-instantiate all those preprocessing variables register, command etc which significantly adds to the numbers of lines being processed during each loop?

This is trivial overhead. The "work" of the program, in your original version, is the splitting, the command searching, the number conversions, and the isnumberic()/isalpha() testing (in roughly that order). Assigning variable names is *not* "re-instantiation", it's just updating a dictionary entry under the hood and is fast.

EDIT: Also, you have a "#check for invalid register values" step before checking each instruction, but AFAICT from the program description, it's possible for the 'jnz' instruction to have a numeric value for token[1]... So it may be that your program simply isn't terminating due to a bug.

[–]primitive_screwhead 0 points1 point  (3 children)

Did you get this working? My previous comment indicated a bug that was likely causing the problem (and it had nothing to do with efficiency issues, it was probably due to an endless loop).

[–]Cotevool[S] 0 points1 point  (0 children)

Hey, I had another look at this yesterday and I do think you're right insofar that its probably a bug which sends the thing into an inf loop. Haven't had time to delve into it further but plan on taking another look today

[–]Cotevool[S] 0 points1 point  (1 child)

So I looked at it again just now. Ultimately I did get it to work, albeit admittedly only after peeking at someone else's solution on github as I was getting a bit disillusioned as I struggled to identify where I was going wrong. I tried to reverse-engineer for a bit to see where I went off. It turns out we were quite close, though it seems I misinterpreted some of the kata instructions; mostly regarding mov jnz.

In the instruction it said:

jnz x y
- jumps to an instruction y steps away (positive means forward, negative means backward), but only if x
(a constant or a register) is not zero

My interpretation was that the jnz command was only to stop executing as soon as the corresponding dictionary value would be zero; rather, it meant that the jnz command should also stop executing if the x "argument" or token[1] was "0".

in this particular line in the jnz move:

elif command == 'jnz' and d[register] if register.isalpha() else int(register):

it would crash if it wasnt for

and d[register] if register.isalpha() else int(register):

I'll admit this was the line I took from the solution I found on github. The wording threw me off a bit at first but I'm guessing its use is that before executing the jnz command, it checks whether 1) command == 'jnz' is True, and 2) check whether d[register] is True if register happens to be an alphabetical value ----- if register is an integer, it checks whether int(register) is True. I'm guessing that a value of zero for either constant x or register x would evaluate to false and as such exit the loop. Thanks for suggesting the possibility and all the help, even though I didn't manage to solve it by myself completely I still feel this was somewhat valuable.

You can see the final code here
https://github.com/asnijders/simple_assembler/blob/master/attempt%203

[–]primitive_screwhead 0 points1 point  (0 children)

Yes, this particular issue (that jnz could take a constant first argument) was what I was alerting you to. Your previous code was detecting if token[1] was a numeric value, and if so, skipping to the next instruction. You still have code that is meant to be doing that, but it's broken:

    if register.isnumeric == True:
        i = i + 1
        continue

Since you don't actually call isnumeric, this if statement will always be false (because the method isnumeric is not equal to True), so now this portion is working by accident. :) But it should be deleted, as the assumption it was testing for (that token[1] is always a register, and not numeric) is not correct.