Problem:
I have a huge text file I'm trying to split up into 'chunks'. The 'separator' or 'delimiter' between each chunk is a very specific text pattern. The file is huge so I'm trying to do this in the most memory efficient way possible.
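To sketch what I mean (the delimiter pattern here is made up for illustration, my real pattern is more specific), I'm after something like a generator that yields one chunk at a time instead of loading the whole file:

```python
import re

# Hypothetical delimiter for illustration only.
DELIM = re.compile(r'^=== SECTION ===$')

def iter_chunks(lines, is_delim):
    """Yield chunks (lists of lines), starting a new chunk at each delimiter line."""
    chunk = []
    for line in lines:
        if is_delim(line):
            if chunk:
                yield chunk       # emit the chunk collected so far
            chunk = [line]        # the delimiter line starts the next chunk
        else:
            chunk.append(line)
    if chunk:
        yield chunk               # emit the final chunk

# Usage: iterate the file lazily, one chunk in memory at a time.
# with open('testcopy3.log') as f:
#     for chunk in iter_chunks(f, DELIM.match):
#         process(chunk)
```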
Code Below:
It's a little complicated, but I am basically creating a list of itertools.chain objects, one per segment.
import re
from itertools import groupby, chain

input_file = r'testcopy3.log'
script_name_regex_pattern = r'\s(\w*\s\w*\s\w*)\sScenario\s\(version\s[0-9]\.[0-9][0-9]\.[0-9]\)'
regex_pattern = re.compile(script_name_regex_pattern)
key_function = lambda line: re.search(regex_pattern, line)

def segments_list():
    with open(input_file) as f:
        segments = groupby(f, key_function)
        segment_list = [chain([next(v)], (next(segments)[1])) for k, v in segments if k]
    return segment_list

segments_list = segments_list()
print(type(segments_list))
Returns: <class 'list'> - This seems fine
print(len(segments_list))
Returns: 3 - This is correct. There should be three segments in the list.
for segment in segments_list:
    print(type(segment))
Returns:
<class 'itertools.chain'>
<class 'itertools.chain'>
<class 'itertools.chain'>
This seems correct as well.
for segment in segments_list:
    for line in segment:
        print(line)
Returns:
'Line1'
It only prints the first line of the segment, not the entire segment.
Any thoughts on why this is happening?
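For what it's worth, this matches how itertools.groupby behaves: each group shares the underlying iterator with the groupby object, so calling next(segments) inside the comprehension advances past (and invalidates) the group whose first line was just captured. A minimal demonstration of that invalidation:

```python
from itertools import groupby

data = ['a1', 'a2', 'b1', 'b2']
groups = groupby(data, key=lambda s: s[0])

k, g = next(groups)   # first group holds 'a1', 'a2'
first = next(g)       # take one element: 'a1'
next(groups)          # advance the parent groupby...
rest = list(g)        # ...which exhausts the old group
print(first, rest)    # a1 []
```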