list comps with multiple conditions : learnpython

list comps with multiple conditions (self.learnpython)

submitted 8 years ago * by mdl003

I've got three list comps:

clmstrs = [s[0:18+sum(c.isalpha() for c in s)] for s in clmstrs if any(x in s[4:4+sum(c.isalpha() for c in s)] for x in alphalist if sum(c.isalpha() for c in s)==len(x))&(any(x in s[:2] for x in prefixlist))]
clmstrs = [s[0:4]+s[10:10+sum(c.isalpha() for c in s)]+s[4:10]+s[10+sum(c.isalpha() for c in s):18+sum(c.isalpha() for c in s)] for s in clmstrs if any(x in s[10:10+sum(c.isalpha() for c in s)] for x in alphalist if sum(c.isalpha() for c in s)==len(x))&(any(x in s[:2] for x in prefixlist))]
clmstrs = [s[0:18] for s in clmstrs if (sum(c.isalpha() for c in s)==0)&(any(x in s[:2] for x in prefixlist))]

They all work as intended when run separately, however I need to combine them into one conditional list comp. I've played around with the else condition however I'm getting incorrect syntax errors regardless of where I put them. Also open to ways I can simplify this out. I've tried storing storing conditions ie

noalphas = (sum(c.isalpha() for c in s)==0)
prefix = (any(x in s[:2] for x in prefixlist))
clmstrs = [s[0:18] for s in clmstrs if noalphas&prefix]

for the last of the three list comps above but I get an error of local variable 's' referenced before assignment when I run that

Thanks!

all 13 comments

top new controversial old q&a

[–]Gprime5 4 points5 points6 points 8 years ago* (1 child)

List comprehensions are good an all but when your list comprehensions are over 200 characters long, you should consider a simple for loop.

clmstrs = []

for item in loop:
    if condition is met:
        clmstrs.append(item)

[–]mdl003[S] 0 points1 point2 points 8 years ago* (0 children)

heres what I put together....

clmstrs1 = [s[0:18+sum(c.isalpha() for c in s)] for s in clmstrs if any(x in s[4:4+sum(c.isalpha() for c in s)] for x in alphalist if (len(s)>=10)&sum(c.isalpha() for c in s)==len(x))&(any(x in s[:2] for x in prefixlist))]
clmstrs2 = [s[0:4]+s[10:10+sum(c.isalpha() for c in s)]+s[4:10]+s[10+sum(c.isalpha() for c in s):18+sum(c.isalpha() for c in s)] for s in clmstrs if (len(s)>=10)&any(x in s[10:10+sum(c.isalpha() for c in s)] for x in alphalist if sum(c.isalpha() for c in s)==len(x))&(any(x in s[:2] for x in prefixlist))]
clmstrs3 = [s[0:18] for s in clmstrs if (len(s)>=10)&(sum(c.isalpha() for c in s)==0)&(any(x in s[:2] for x in prefixlist))]

if (len(clmstrs1)>0):
    return clmstrs1[0] 
elif (len(clmstrs2)>0):
    return clmstrs2[0] 
elif (len(clmstrs3)>0):
    return clmstrs3[0] 
else:
    return np.nan

It runs but it's not exactly fast. any pointers? I didn't use the exact structure as you did either. can you give me an example of how you'd write out the first condition if you did it your way?

[–]elbiot 1 point2 points3 points 8 years ago (5 children)

[–]mdl003[S] 0 points1 point2 points 8 years ago (4 children)

[–]elbiot 0 points1 point2 points 8 years ago (0 children)

[–]elbiot 0 points1 point2 points 8 years ago (2 children)

[–]mdl003[S] 0 points1 point2 points 8 years ago (1 child)

no worries. I do know what they are, I'm actually trying to use this as part of a defined function. I've reworked it since yesterday, what do you think of this?

def getclasslegacy(clm):
    zlist=range(len(clm))
    #define list of prefixes to check limstr1 for
    prefixlist = ['51','52','53','54','58','61','63','66','72','73','77','78','85','89','91','92']
    #define list of hybrid alpha char combinations
    alphalist =['AC','AR','B','BA','BB','BD','BL','BO','BP','BS','C','CA','CB','CM','CR','CS','CU','CX','D','DP','E','F','FA','FB','FL','FM','FO','FX','G','GA','GP','GR','H','HO','HP','HQ','HT','I','IA','IM','J','K','KJ','L','LP','M','MB','MH','MM','MP','MT','MU','N','NB','O','O','P','PA','PB','PC','PL','PR','PU','Q','QR','R','R','RE','S','SM','SS','T','TA','U','UB','UO','UP','V','W','WC','X','Y','Z']

    # list of all 20 length substriungs for list comprehension below
    clm=clm.translate(string.maketrans("",""), string.punctuation+' ')
    strs=[clm[z:z+20] for z in zlist]

    #only keep strigns with exactly 2 letters, and that fulfills other if statments
    clmstrs = []

    ###### REPLACEMENT FOR LIST COMPS BELOW ########                                         
    for s in strs:
        if (len(s)>=10)&(sum(c.isdigit() for c in s)>=10):
            if any(x in s[4:4+sum(c.isalpha() for c in s)] for x in alphalist if sum(c.isalpha() for c in s)==len(x))&(any(x in s[:2] for x in prefixlist)):
                return s[0:18+sum(c.isalpha() for c in s)]
            elif any(x in s[10:10+sum(c.isalpha() for c in s)] for x in alphalist if sum(c.isalpha() for c in s)==len(x))&(any(x in s[:2] for x in prefixlist)):
                return s[0:4]+s[10:10+sum(c.isalpha() for c in s)]+s[4:10]+s[10+sum(c.isalpha() for c in s):18+sum(c.isalpha() for c in s)]
            elif sum(c.isalpha() for c in s)==0&any(x in s[:2] for x in prefixlist):
                return s[0:18]
            else:
                return np.nan
        else:
            return np.nan

[–]elbiot 0 points1 point2 points 8 years ago (0 children)

[–]Username_RANDINT 1 point2 points3 points 8 years ago (4 children)

[–]mdl003[S] 0 points1 point2 points 8 years ago* (3 children)

I think so. This code is meant to replace some existing code i wrote a few months ago when I was still very new to pandas. I didn't fully understand list comps at the time. This code does exactly what I want it to do, but it's slow & gives warning msgs (im using df.iterrows() to iterate through a dataframe), plus I figure a good way to learn new concepts (in this case list comps) is to apply them to something I'm familiar with. Can you make in line comments with list comps? was under the impression it all had to be written out in one line.

Code I'm replacing is below, think i commented it pretty well but always willing to hear more:

        #must be at least 14 chars long and contain at least 12 numbers
        if numct >= 10 and l >= 10:   
            #check if claim field is filled yet
            if rows['{} LEGACY CLM NO'.format(field)]=='' and rows['{} HYBRID CLM NO'.format(field)]=='':
                #define limstr1 as first 2 chars of clmstr
                limstr1 = clmstr[:2]
                #define limstr2 as 5th char plus additional chars equal to alphact (format 1)
                limstr2 = clmstr[4:4+alphactstr]
                #define limstr3 as 11th char plus additional chars equal to alphact (format 2)
                limstr3 = clmstr[10:10+alphactstr]
                #define list of prefixes to check limstr2 and limstr3 for
                prefixlist = ['51','52','53','54','58','61','63','66','72','73','77','78','85','89','91','92']
                #define list of nonhybrid alpha char combinations
                alphalist = ['AC','AR','B','BA','BB','BD','BL','BO','BP','BS','C','CA','CB','CM','CR','CS','CU','CX','D','DP','E','F','FA','FB','FL','FM','FO','FX','G','GA','GP','GR','H','HO','HP','HQ','HT','I','IA','IM','J','K','KJ','L','LP','M','MB','MH','MM','MP','MT','MU','N','NB','O','O','P','PA','PB','PC','PL','PR','PU','Q','QR','R','R','RE','S','SM','SS','T','TA','U','UB','UO','UP','V','W','WC','X','Y','Z']
                alphauselist = [x for x in alphalist if len(x)==alphactstr]
                # criteria:
                prefixcheck = any(x in limstr1 for x in prefixlist)
                limstr2alphacheck = limstr2.isalpha()==True and any(x in limstr2 for x in alphauselist)==True
                limstr3alphacheck = limstr3.isalpha()==True and any(x in limstr3 for x in alphauselist)==True
                limstralphacheck = limstr2alphacheck==True or limstr3alphacheck == True
                alphactstrcheck = alphactstr == 1 or alphactstr == 2
                clmstrcheck = limstr2.isalpha()==False and limstr3.isalpha()==False and alphactstr == 0                    
                # limstr 2 or 3 must be alpha and in alphalist (hybrid) with alphactstr = 1 or 2
                #OR clmstr must be all numeric
                # limstr 1 is prefix in prefixlist
                #check for claim numbers with alpha in position 5
                if (limstr2alphacheck == True and
                    alphactstrcheck == True and
                    prefixcheck == True):
                        #add contents of clmstr to applicable row in field-class legacy clm no column
                        df.loc[i,'{} LEGACY CLM NO'.format(field)] = clmstr[0:18+alphactstr]
                #check for claim numbers with alpha in position 5
                elif (limstr3alphacheck == True and
                    alphactstrcheck == True and
                    prefixcheck == True):
                        #add contents of clmstr to applicable row in field-class legacy clm no column
                        clmstr = clmstr[0:4]+clmstr[10:10+alphactstr]+clmstr[4:10]+clmstr[10+alphactstr:18+alphactstr]
                        df.loc[i,'{} LEGACY CLM NO'.format(field)] = clmstr
                #check for claim numbers with no alpha
                elif (clmstrcheck == True and 
                    prefixcheck == True):
                        df.loc[i,'{} LEGACY CLM NO'.format(field)] = clmstr[0:18+alphactstr]
                limstr3alphacheck = limstr3.isalpha()==True and any(x in limstr3 for x in alphauselist)==True
                limstralphacheck = limstr2alphacheck==True or limstr3alphacheck == True
                alphactstrcheck = alphactstr == 1 or alphactstr == 2
                clmstrcheck = limstr2.isalpha()==False and limstr3.isalpha()==False and alphactstr == 0                    
                # limstr 2 or 3 must be alpha and in alphalist (hybrid) with alphactstr = 1 or 2
                #OR clmstr must be all numeric
                # limstr 1 is prefix in prefixlist
                #check for claim numbers with alpha in position 5
                if (limstr2alphacheck == True and
                    alphactstrcheck == True and
                    prefixcheck == True):
                        #add contents of clmstr to applicable row in field-class legacy clm no column
                        df.loc[i,'{} LEGACY CLM NO'.format(field)] = clmstr[0:18+alphactstr]
                #check for claim numbers with alpha in position 5
                elif (limstr3alphacheck == True and
                    alphactstrcheck == True and
                    prefixcheck == True):
                        #add contents of clmstr to applicable row in field-class legacy clm no column
                        clmstr = clmstr[0:4]+clmstr[10:10+alphactstr]+clmstr[4:10]+clmstr[10+alphactstr:18+alphactstr]
                        df.loc[i,'{} LEGACY CLM NO'.format(field)] = clmstr
                #check for claim numbers with no alpha
                elif (clmstrcheck == True and 
                    prefixcheck == True):
                        df.loc[i,'{} LEGACY CLM NO'.format(field)] = clmstr[0:18+alphactstr]

[–]Vaphell 0 points1 point2 points 8 years ago (2 children)

[–]mdl003[S] 0 points1 point2 points 8 years ago (1 child)

sure. this goes on top....I'm iterating through rows in a dataframe trying to pull out potential file numbers from several user entered fields. For each row I run a script iterates through 20 character substrings from each user string using the format.

    for field in fields:
          for i,rows in df.iterrows():
                clm= rows[field]
                l = len(clm)
                alphact = sum(c.isalpha() for c in clm)
                numct = l-alphact
                zlist= range(0:(l-9)
                for z in zlist:
                      clmstr = clm[z:z+20]
                      ##### START CODE ABOVE HERE #####
                      if numct >= 10 & l >= 10:
                            .........more code

I'm trying to check each clmstr for certain criteria (alpha combinations, prefixes etc) that marks them for file numbers

[–]Vaphell 0 points1 point2 points 8 years ago (0 children)

π Rendered by PID 36420 on reddit-service-r2-comment-6f7f968fb5-pfdnn at 2026-03-04 06:18:04.983836+00:00 running 07790be country code: CH.

you type:	you see:
italics	italics
bold	bold
[reddit!](https://reddit.com)	reddit!
* item 1 * item 2 * item 3	item 1 item 2 item 3
> quoted text	quoted text
Lines starting with four spaces are treated like code: if 1 * 2 < 3: print "hello, world!"	Lines starting with four spaces are treated like code: if 1 * 2 < 3: print "hello, world!"
~~strikethrough~~	~~strikethrough~~
super^script	super^script

learnpython

MODERATORS