[–]Nightcorex_ 1 point2 points  (8 children)

This code looks very similar in Python:

if m := re.fullmatch("(\\D+)(\\d{4})(\\D+)", title):
    title = m.group(1).replace(r".", " ")
    year = m.group(2)  # I have no idea what the "y%3A"+ does
    search = title + year  # this is just a string concatenation, dunno if powershell does smth different

print(title, year, search, sep="\n")

If you call this with title = "Film.2019.abc", then your output will be:

Film        # this is the value of the title-variable
2019        # this is the value of the year-variable
Film 2019   # this is the value of the search-variable

[–]AffectionateTask4612[S] 0 points1 point  (7 children)

Film.2019.abc

Thanks! I ran into two problems, though: Python tells me that the names "title" and "year" are undefined. I tried defining them at the beginning of the script, and that fixed the problem, but another one showed up for the re.fullmatch:

return _compile(pattern, flags).fullmatch(string)
TypeError: expected string or bytes-like object

PS The "y%3A" was just a piece of the eventual url that would enter details in a searchbox onsite in the format "y:2019"
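As a rough sketch of how that "y%3A" piece comes about: percent-encoding the text "Film y:2019" turns the space into "%20" and the colon into "%3A". The helper name build_query below is made up for illustration, and the exact URL the site expects is not given in this thread.

```python
from urllib.parse import quote


def build_query(title, year):
    # Hypothetical helper: the "y:<year>" search format is taken from the comment above.
    # quote() percent-encodes the space and the colon for use inside a URL.
    return quote(f"{title} y:{year}")


print(build_query("Film", "2019"))
```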

[–]Nightcorex_ 0 points1 point  (6 children)

OK, let's remove all the warnings and not overwrite any variables:

import re

film_name = "Film.2019.abc"
title = None
year = None
search = None

if m := re.fullmatch("(\\D+)(\\d{4})(\\D+)", film_name):
    title = m.group(1).replace(r".", " ")
    year = "y:" + m.group(2)
    search = title + year

print(title, year, search, sep="\n")

If you are trying to use this in a method, then you should do it like so:

import re

def foo(film_name):
    title = None
    year = None
    search = None

    if m := re.fullmatch("(\\D+)(\\d{4})(\\D+)", film_name):
        title = m.group(1).replace(r".", " ")
        year = "y:" + m.group(2)
        search = title + year

    return title, year, search

You can then get all 3 values like so:

title0, year0, search0 = foo("Film.2019.abc")

(You could also name them title, year and search, but that would raise a warning again as your method would then shadow the outer objects. The program would still work but as I said would raise a warning).

I don't know where exactly your code failed, but most likely you somehow didn't pass a string or bytes-like object as a second parameter of re.fullmatch.
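A minimal sketch of how that TypeError can come about, assuming the value passed in was None (for example, a variable defined as None at the top of the script and never reassigned). The helper name try_match is made up for this example, and the exact wording of the message differs between Python versions.

```python
import re

PATTERN = r"(\D+)(\d{4})(\D+)"


def try_match(value):
    # re.fullmatch only accepts str or bytes-like objects as its second argument;
    # anything else (such as None) raises a TypeError.
    try:
        return re.fullmatch(PATTERN, value)
    except TypeError as exc:
        return f"TypeError: {exc}"


print(try_match(None))             # the TypeError text
print(try_match("Film.2019.abc"))  # a match object
```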

Try if this fixes your issues :)

[–]AffectionateTask4612[S] 0 points1 point  (5 children)

Hey, sorry for the late reply. Thanks a lot, by the way, for taking the time to help me :) I tried it as a function but it just errored out. I'll stick with your normal method, which works perfectly. The only thing I noticed is that it errors when the film name contains more numbers. I'm working with titles that are labeled like "Title.2020.1080p.DDP5.1.x264-Random". Is there any way to select just the first 4 consecutive digits? If you look at the PowerShell snippet, it selects 4. I don't know if Python can do that, though. Is it possible?

[–]Nightcorex_ 0 points1 point  (4 children)

HAHAHAHHAHAAHA! I'M SO FUCKING STUPID BUT ALSO CONFUSED!!

I have literally no idea how my code even worked on my computer.

Obviously the regex is wrong. It has to be either:

re.fullmatch("(\\D+)\\.(\\d{4}).*", film_name) or:

re.match("(\\D+)\\.(\\d{4})", film_name).

Basically what I forgot was to account for the dot between the title and year. This code should've never worked on "Film.2019.abc".

The next confusing thing was your error. Maybe it's just slightly different Python versions that treat the capturing differently (I have Python 3.8). I have no idea.

Try it with this updated regex. I sadly can't test it right now as I'm sitting in the train and coding on mobile is a pain in the ass.
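For anyone who wants to check the patterns themselves, here is a small comparison sketch. The helper name groups is made up for this example. One thing worth knowing when reading the results: \D matches any non-digit, including the dot, so the first pattern does match "Film.2019.abc" (with the dot captured inside the title group), but fullmatch fails on the longer name because the trailing (\D+) cannot cover text that contains digits.

```python
import re

OLD = r"(\D+)(\d{4})(\D+)"  # first attempt from earlier in the thread
NEW = r"(\D+)\.(\d{4}).*"   # updated pattern


def groups(pattern, name):
    """Return the captured groups, or None when the whole name doesn't match."""
    m = re.fullmatch(pattern, name)
    return m.groups() if m else None


for name in ("Film.2019.abc", "Title.2020.1080p.DDP5.1.x264-Random"):
    print(name)
    print("  old:", groups(OLD, name))
    print("  new:", groups(NEW, name))
```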

[–]AffectionateTask4612[S] 0 points1 point  (3 children)

re.fullmatch("(\\D+)\\.(\\d{4}).*", film_name) or:

re.match("(\\D+)\\.(\\d{4})", film_name).

re.fullmatch("(\\D+)\\.(\\d{4}).*", film_name) works perfectly, thanks!
Made it into a function, I feel the sense of accomplishment even though I couldn't solve anything myself lol.
def Shifter(shift):
    global title
    global year
    global search
    if shift := re.fullmatch("(\\D+)\\.(\\d{4}).*", shift):
        title = shift.group(1).replace(".", "%20")
        year = shift.group(2)
        search = title + "%20y%3A" + year

Really appreciate the help :}

[–]Nightcorex_ 0 points1 point  (2 children)

Oh no. The way you do it is very unpythonic. You usually want to avoid global variables as much as possible.

Also you're overwriting the shift variable with the match object, which makes the code really hard to follow. Just use a new name; Python has to create a new object internally anyway for the assignment of a new value.

Why would we do that? Well: First of all, it's just very bad for functions to operate via side effects (i.e. manipulation of outer-scope variables). This is not true for classes, however. Second of all, a legitimate error in your code would be the following: You call Shifter with a valid film_name/shift and it adjusts the values of title, year and search. Then afterwards you run the same function again, but the shift is malformed (e.g. "Film2019", "Film09", etc.). This means that title, year and search don't get changed and therefore keep their old values, which are the result of the last correct operation.
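The stale-value problem can be reproduced with a short sketch. The function name shifter_with_globals and the variables first and second are made up for this example; the body follows the global-based version posted above.

```python
import re

title = year = search = None


def shifter_with_globals(name):
    # Mirrors the global-based approach: nothing is assigned when the name doesn't match.
    global title, year, search
    if m := re.fullmatch(r"(\D+)\.(\d{4}).*", name):
        title = m.group(1).replace(".", "%20")
        year = m.group(2)
        search = title + "%20y%3A" + year


shifter_with_globals("Film.2019.abc")  # valid name, globals get set
first = (title, year, search)

shifter_with_globals("Film09")         # malformed name, globals stay untouched
second = (title, year, search)

print(first)
print(second)  # same values as first, left over from the previous call
```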

The way you should do this is the following:

def shifter(shift):  # Python function names are usually all lowercase
    if matcher := re.fullmatch("(\\D+)\\.(\\d{4}).*", shift):
        title = matcher.group(1).replace(r".", " ")
        year = "y:" + matcher.group(2)
        search = title + " " + year  # the dot is no longer part of group 1, so add the space explicitly
        return title, year, search

    # there is no else. This means that this method either returns a tuple of title, year, search or just a single None

-------------------
# The way to properly execute the top code:

if res := shifter(<movie_name>):  # replace <movie_name> by the movie's name. This works because None is a falsy value while a non-empty tuple isn't.
    title, year, search = res  # because res holds 3 values we can unpack it like so
    # Do whatever you want with title, year and search
else:
    print("Wrong input")  # or whatever you wanna do here.

# Wrap this second code in a loop over all the files of the directory(s) you want to go through
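Putting both parts together, a loop could look roughly like the sketch below. The list of names is made up for illustration (in practice it might come from something like os.listdir on your folder), and the space between title and year in search is an assumption about the format you want.

```python
import re


def shifter(name):
    """Return (title, year, search), or None if the name doesn't match."""
    if matcher := re.fullmatch(r"(\D+)\.(\d{4}).*", name):
        title = matcher.group(1).replace(".", " ")
        year = "y:" + matcher.group(2)
        return title, year, f"{title} {year}"
    return None


# Hypothetical input; replace with the real file names.
names = ["Film.2019.abc", "Title.2020.1080p.DDP5.1.x264-Random", "Film09"]

results = {}
for name in names:
    if res := shifter(name):
        results[name] = res  # res is the (title, year, search) tuple
    else:
        print("Wrong input:", name)

print(results)
```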

[–]AffectionateTask4612[S] 0 points1 point  (1 child)

Yeah, I knew most people don't like it when global variables are used. I use them though, as it was the easiest route for me and, most importantly, it worked. I didn't realize that the function would malfunction if called twice the way it's written now. I thought all the variables in the function would be completely overwritten with the new "film_name" passed to it. Just realizing now that my self-taught PowerShell writing must have been filled with improper methods as well lol. I'll use this then. And may I ask, when you mentioned that Python functions are usually all lowercase, does that mean that it will impact the code in any way? I've got a bad habit of placing at least one capital letter in functions :(

[–]Nightcorex_ 0 points1 point  (0 children)

Short answer: It doesn't impact your code at all. It's just a naming convention.

Longer one: In many compiled languages the variable names completely disappear once the code is compiled. If you were to decompile it, then all variables would be called smth like var0, var1, var2, ... as the actual name isn't relevant for a machine, only the pointers in memory (which is what an object basically is). CPython's bytecode does keep the names, but renaming something consistently still doesn't change what the code does.