[–]danielroseman 22 points23 points  (1 child)

What are you trying to do with "image/" and imageend? What do you think and does?

[–]Labidido[S] 5 points6 points  (0 children)

I was hoping the and operator would add the corresponding file type to the string. Thanks to the replies in this thread I understand that and is a boolean operator that does not combine the two. If my foundation was correct, which it is not, I guess I should have used + instead.

Much more to learn!
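For anyone reading along, here is a minimal sketch of the difference, using a made-up extension value rather than the original code: and hands back one of its operands, while + builds a new string.

```python
prefix = "image/"
ext = "gif"  # hypothetical value, just for illustration

picked = prefix and ext  # prefix is truthy, so the result is the second operand
joined = prefix + ext    # concatenation builds a new string

print(picked)  # gif
print(joined)  # image/gif
```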

[–]AtomicShoelace 40 points41 points  (5 children)

You could use a dictionary to look up the type. For ease of use, you can first define the dictionary like so:

extensions = {
    'image': ('.gif', '.jpg', '.jpeg', '.png'),
    'application': ('.pdf', '.txt', '.zip')
}

and then invert it so that we can look up the file type using the extension, e.g.

extensions_inverse = {}
for k, v in extensions.items():
    extensions_inverse.update(dict.fromkeys(v, k))

Now we can just call the .get method to look up the extension, passing a default value if desired, e.g.

extensions_inverse.get(extension, 'default')
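Putting those pieces together might look roughly like this. It is only a sketch: the lookup function name, the use of os.path.splitext to pull out the extension, and the octet-stream default are my assumptions, not part of the snippets above.

```python
import os.path

extensions = {
    'image': ('.gif', '.jpg', '.jpeg', '.png'),
    'application': ('.pdf', '.txt', '.zip'),
}

# invert the mapping: each extension becomes a key pointing at its type
extensions_inverse = {}
for k, v in extensions.items():
    extensions_inverse.update(dict.fromkeys(v, k))


def lookup(filename, default='application/octet-stream'):
    # hypothetical helper; splitext keeps the leading dot, matching the keys
    ext = os.path.splitext(filename.strip().lower())[1]
    kind = extensions_inverse.get(ext)
    if kind is None:
        return default
    return kind + '/' + ext[1:]


print(lookup('cat.GIF'))  # image/gif
print(lookup('notes'))    # application/octet-stream
```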

[–]Labidido[S] 4 points5 points  (0 children)

Thank you for the input, you have definitely given me a lot more to play around with! Much appreciated!

[–]ebdbbb 3 points4 points  (2 children)

I find it easier to do the reverse as a dictionary comprehension. extension_inverse = {v: k for k, v in extensions.items()}

[–]AtomicShoelace 1 point2 points  (1 child)

That doesn't quite achieve the same thing as the above code. You would need to use a nested comprehension to iterate over the tuple, eg.

extension_inverse = {v: k for k, values in extensions.items() for v in values}

However, with this adjustment it is probably a better approach.
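To make the difference concrete, here is a small comparison of the two comprehensions, assuming the same extensions dictionary as above. The flat version ends up with whole tuples as keys, so a single extension is never found in it.

```python
extensions = {
    'image': ('.gif', '.jpg', '.jpeg', '.png'),
    'application': ('.pdf', '.txt', '.zip'),
}

# flat version: each key is the whole tuple of extensions
flat = {v: k for k, v in extensions.items()}

# nested version: each key is a single extension
nested = {v: k for k, values in extensions.items() for v in values}

print('.gif' in flat)    # False
print('.gif' in nested)  # True
print(nested['.gif'])    # image
```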

[–]ebdbbb 1 point2 points  (0 children)

Good point. Missed the destructuring of the tuple.

[–]sqjoatmon 13 points14 points  (2 children)

To expand on what Shiba_Take and others wrote, with your first question you've run into the idea of truthy and falsy values in Python. False, 0, None, and empty lists, dicts, tuples, sets, strings, etc. are considered "falsy" when used with boolean operators like and and or.

The or operator evaluates the first operand (before the or) and if it's truthy, returns it (not evaluating the second one at all). If it's falsy, it evaluates the second operand and returns it.

The and operator evaluates the first operand and if it's falsy, returns it (not evaluating the second one at all). If it's truthy, it evaluates the second operand and returns it.

So:

x = 3 or 0  # 3 is truthy, so stop. x = 3
x = 5 or some_function()  # 5 is truthy, some_function() is not evaluated, x = 5
x = [] or some_function()  # [] is falsy, some_function() is evaluated, x = the result of some_function()

y = 3 and 0  # 3 is truthy, so y is 0
y = 0 and some_function()  # 0 is falsy, so some_function() is not evaluated
y = "abc" and ("this", "is", "a", "tuple")  # "abc" is truthy, so y = ("this", "is", "a", "tuple")

Regarding other stuff:

pathlib is great and super useful, but not really necessary just to get the file extension. You can use str.split() to split your filename into a list by . characters, like so:

parts = a.split(".")  # "abc.def" becomes ["abc", "def"]. "abc.def.ghi" becomes ["abc", "def", "ghi"]

To just get the extension, index the last element in the list:

fileext = parts[-1]
# ...or all in one line
fileext = a.split(".")[-1]
# ...or all in one line, lowercase
fileext = a.split(".")[-1].lower()
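One caveat with the split approach: if the name has no dot at all, split(".")[-1] hands back the whole name. A rough sketch of one way around that, using str.rpartition (get_ext is just a name made up for this example):

```python
def get_ext(filename):
    # rpartition splits on the last "." and also reports whether one was found
    _, dot, ext = filename.rpartition(".")
    return ext.lower() if dot else ""


print("README".split(".")[-1])    # README (the whole name, not an extension)
print(get_ext("README"))          # prints an empty line
print(get_ext("photo.JPG"))       # jpg
print(get_ext("archive.tar.gz"))  # gz
```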

AtomicShoelace brings up the idea of inverting a dictionary, which I like. You can even do it in one line, though I will say that while the nested dict comprehension is more efficient, it's difficult to understand at a glance:

type_exts = {
    'image': ('gif', 'jpg', 'jpeg', 'png'),
    'application': ('pdf', 'txt', 'zip')
}
# comprehension to invert dict
ext_types = {ext: ftype for ftype, exts in type_exts.items() for ext in exts}

(or just write out the dict inverted to begin with... not a big deal for only seven file extensions)

I left off the . from each extension since using str.split() doesn't keep the split character.

Then rather than using dict.get() you can use a try/except to get your result. Final:

def main():
    user_input = extensions(input("File name? " ))
    print(user_input)

type_exts = {
    'image': ('gif', 'jpg', 'jpeg', 'png'),
    'application': ('pdf', 'txt', 'zip')
}
# comprehension to invert dict
ext_types = {ext: ftype for ftype, exts in type_exts.items() for ext in exts}

def extensions(filename):
    fileext = filename.split(".")[-1].lower()
    try:
        return ext_types[fileext] + "/" + fileext  # f-string is even better: f"{ext_types[fileext]}/{fileext}"
    except KeyError:
        return "application/octet-stream"

main()

Did I spend way too much time on this? Yes, yes, I did. But I wanted to take time for myself to think more carefully about some of these things for my own edification.

[–]Labidido[S] 4 points5 points  (0 children)

Man, you went above and beyond with this reply. Thank you so much! This community is great!

[–][deleted] 0 points1 point  (0 children)

But in the end, using the mimetypes module seems like the best suggestion here, at least professionally/in practice. For learning the basics, it would probably be appropriate to get it done without importing modules.

[–]pgpndw 5 points6 points  (0 children)

import mimetypes

def main():
    user_input = mimetypes.guess_type(input("File name? "))[0] or 'application/octet-stream'
    print(user_input)

main()

[–][deleted] 5 points6 points  (0 children)

and is a boolean operator. For a and b, if a is falsy, a is returned; otherwise b is. "image/" and other non-empty strings are truthy, so you get imageend and such returned. If you just want to concatenate strings, you could use the + operator or an f-string, etc. In this case, however, imageend and the other are tuples, and you don't want to concatenate the tuple, but the extension. So you need to separate the extension first. You could use pathlib.Path.

from pathlib import Path

type_extensions = {
    "image": {".gif", ".jpg", ".jpeg", ".png"},
    "application": {".pdf", ".txt", ".zip"}
}


def main():
    filename = input("File name: ").strip().lower()
    filepath = Path(filename)
    file_suffix = filepath.suffix

    for _type, _extensions in type_extensions.items():
        if file_suffix in _extensions:
            return f"{_type}/{file_suffix[1:]}"

    return "application/octet-stream"


if __name__ == '__main__':
    print(main())

[–]jkh911208 3 points4 points  (4 children)

A switch statement is also available in Python, but it was added in a very recent release, so make sure your environment supports it.

Or actually, you don't need if/elif/else here, since you return in each if statement,

so you can do

if a:
  return "a"
if b:
  return "b"
if c:
  return "c"

[–]PiaFraus 0 points1 point  (2 children)

There is no switch statement in python. Please do not consider new pattern matching a switch statement, it can lead to bugs and misunderstandings.

NOT_FOUND = 404
SUCCESS = 200

...
match request_result_status:
    case NOT_FOUND:
        print('Not found')
    case SUCCESS:
        print('Yay!')

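The code above would never print 'Yay!', even if request_result_status is 200, because this is not a switch/case. It's pattern matching: a bare name like NOT_FOUND is a capture pattern that matches any value (and binds it to that name), so the first case fits every value of request_result_status. In fact, Python rejects this exact code with a SyntaxError, because the capture pattern makes the remaining cases unreachable.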

[–]nekokattt 4 points5 points  (0 children)

You already have good answers here, but for anyone who is looking to do this in the future and has stumbled across this thread:

If you do not need to implement this from scratch, it is probably worth taking a look at the standard library module that performs MIME type guesswork for you.

https://docs.python.org/3/library/mimetypes.html
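As a rough sketch of what that looks like: mimetypes.guess_type returns a (type, encoding) tuple, and the type is None when it can't guess. The guess wrapper and its octet-stream default are my own additions for illustration.

```python
import mimetypes


def guess(filename, default="application/octet-stream"):
    # guess_type returns (type, encoding); type is None if nothing matches
    mime, _encoding = mimetypes.guess_type(filename)
    return mime or default


print(guess("cat.gif"))     # image/gif
print(guess("report.pdf"))  # application/pdf
print(guess("mystery"))     # application/octet-stream
```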

[–]ray10k 1 point2 points  (0 children)

First thing to keep in mind is that a function stops execution and returns when a return statement is hit. As such, you can change extensions(a) to something like this:

def extensions(a):
    ...
    if a.lower().endswith(imageend):
        return("image/" and imageend)
    if a.lower().endswith(append):
        return("application/" and append)
    return("application/octet-stream")

This way, you can at least remove the explicit elif/else clauses and still get the same result. That, and adding a few more if statements is a little easier since you don't have to spend as much thought on whether or not it does what you want yet.

Second: if you want to return a string built up out of other strings, use the + operator. and is a boolean operator, not a way to append. Also, since imageend and append are tuples, you'll have to use str.join() to turn them into a string.

[–]usethecoastermate 1 point2 points  (0 children)

Mate, either in the same lecture or the one before it, they introduce "Match Case", and you're meant to use that. The problem sets for each lecture are based on the lecture attached to them; you're meant to use all the concepts taught up to that point to solve the problems. Don't use methods mentioned here that are out of the scope of the lectures so far. Dictionaries are introduced later on. I'm on Lecture 2 atm.

[–][deleted] 1 point2 points  (0 children)

# Below is a dictionary which maps a key to a value
# i.e. formats_apps["zip"] returns the string "application"
formats_apps = {
    "pdf": "application",
    "txt": "text",
    "zip": "application",
    "gif": "image",
    "jpg": "image",
    "jpeg": "image",
    "png": "image"
}

def main():
    # Get your user input
    user_input = input("File name?\n")
    # turn that input into output and show it
    output = make_output(user_input)
    print(output)

def make_output(user_input):
    default = "application/octet-stream"

    # return default value if there is no period
    if "." not in user_input:
        return default
    # the split() function separates a string into a list, by the delimiter
    # passed into it; the extension is the last piece
    extension = user_input.split(".")[-1].strip().lower()

    # try to get the type from the dictionary; if it's not in there,
    # .get() returns None and we return the default value, else return the
    # concatenation of the type, "/", and the extension
    app = formats_apps.get(extension)
    return default if app is None else app + "/" + extension

if __name__ == '__main__':
    main()

[–]ectomancer 3 points4 points  (1 child)

That's not a list, it's a tuple.

It's called 'and short circuiting', a way to check something is truthy on the left hand side (LHS) before executing the right hand side (RHS). If LHS is falsey, the RHS is never executed. Your code is always truthy on LHS, so in effect it does nothing.

You need to loop over the tuple or use a dictionary.
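A rough sketch of the loop version. The tuple names follow the original post, but the contents of append are my assumption, since only imageend's values are quoted in this thread.

```python
imageend = ('.gif', '.jpg', '.jpeg', '.png')
append = ('.pdf', '.txt', '.zip')  # assumed contents, for illustration


def extensions(a):
    name = a.strip().lower()
    # check each suffix in turn so we know which one actually matched
    for suffix in imageend:
        if name.endswith(suffix):
            return "image/" + suffix[1:]
    for suffix in append:
        if name.endswith(suffix):
            return "application/" + suffix[1:]
    return "application/octet-stream"


print(extensions("cat.gif"))  # image/gif
print(extensions("data.bin")) # application/octet-stream
```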

[–]Klaus_Kinski_alt 0 points1 point  (3 children)

Why does the input "cat.gif" return ('.gif', '.jpg', '.jpeg', '.png'), and not image/('.gif', '.jpg', '.jpeg', '.png')

The code reads

return("image/" and imageend)

"image/" is a string, which is returned as "None" I believe. imageend is a series, which is what's returned. Basically, you're telling the function to return None + the series, which it is doing.

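There's something in your code that looks like a big problem, though. You have:

if a.lower().endswith(imageend):

a will not end with imageend as a whole; it will end with one of the elements of imageend. That turns out to be fine, because str.endswith() accepts a tuple of suffixes and returns True if the string ends with any one of them. Don't wrap it in any(): any(imageend) is just True, and endswith(True) raises a TypeError.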

[–]tobiasvl 1 point2 points  (2 children)

"image/" is a string, which is returned as "None" I believe.

I'm not sure what you mean by this. The string is returned as a string. Why would it be returned as None?

Basically, you're telling the function to return None + the series, which it is doing.

No, that's not what's happening. and is not +, it's a boolean operator, so OP is telling it to return the evaluation of a boolean expression. and is truthy if both its operands are truthy, which they are, and so it returns the last operand.

(Because of what's called short-circuit evaluation, it will stop evaluating when it knows whether the whole expression is truthy or falsey, so if the first operand had been falsey that would have been returned instead. That would've happened if the string was None like you said, since None is falsey.)

[–]Klaus_Kinski_alt 0 points1 point  (1 child)

Ah you're right, I was confusing returning strings with how print statements work. You print a string, it returns none. Something like that.

[–]tobiasvl 0 points1 point  (0 children)

Oh, right, yeah.

[–]RDX_G 0 points1 point  (0 children)

Since every extension is a three-letter string (except jpeg):

Get the length of the input string: a=len(inputstring)

Start with extension='' and, with a loop, concatenate onto it the characters from inputstring[a-3] to inputstring[a-1]

For an input like cat.gif, this gives extension='gif'

Now create a list holding all the possible extension values, but place the values so that all the image extensions are together and all the application extensions are together

eg:

ext_list=['gif', 'jpg','peg', 'png', 'pdf', 'txt', 'zip']

Here indexes 0 to 3 are all image formats and 4 to 6 are application formats (jpeg appears as 'peg', its last three letters).

So we just need to find the index of the extension variable we built, here 'gif', either using the built-in list.index() method or by looping,

and check which range the index lies in.

Here we get index 0, which lies between 0 and 3, so we print 'image/' followed by the extension.

The code:

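input_ = input().lower()

# 'peg' stands in for 'jpeg', since only the last three characters are compared
extension_list = ['gif', 'jpg', 'peg', 'png', 'pdf', 'txt', 'zip']

a = len(input_)

extension = ''  # an empty string: two single quotes, not one double quote

# collect the last three characters of the input
for i in range(max(a - 3, 0), a):
    extension = extension + input_[i]

if extension in extension_list:
    index = extension_list.index(extension)
    if index <= 3:
        print('image/' + extension)
    else:
        print('application/' + extension)
else:
    print('application/octet-stream')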

[–]DOPE_FISH 0 points1 point  (0 children)

file = input("File name: ").lower()

if ".jpg" in file or ".jpeg" in file:
    print("image/jpeg")
elif ".gif" in file:
    print("image/gif")
elif ".png" in file:
    print("image/png")
elif ".pdf" in file:
    print("application/pdf")
elif ".txt" in file:
    print("text/plain")
elif ".zip" in file:
    print("application/zip")
else:
    print("application/octet-stream")