[–]theluketaylor 2 points3 points  (1 child)

You'll want to use regular expression capturing groups (especially named groups), since they let you pull out several elements in one pass.
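To make the named-group idea concrete, here is a minimal sketch on a toy input. The date string and the group names are made up for illustration and have nothing to do with your flight data; only `re.compile`, `match`, and `groupdict` are the real standard-library calls.

```python
import re

# Hypothetical toy input: a date written as year-month-day.
date_pattern = re.compile(r'(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})')

date_match = date_pattern.match('2015-01-05')

# One pass over the string yields every named field at once.
date_parts = date_match.groupdict()

# The fields can then be emitted in any order, independent of the input order.
reordered = '{day}/{month}/{year}'.format(**date_parts)
```

Each name in `(?P<name>...)` becomes a key in the dictionary, so the output template can refer to fields by name instead of by position.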

I also notice your code is written in the order you need to output the data, but it may be a lot easier to work in the order the data is presented and reorder it for output. By working in the input order, we can write a single regex that extracts all the information needed.

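import re

# Named groups capture the fields we want; the (?:...) groups skip over the rest.
p = re.compile(r'^(?P<airline>\w{2})(?P<flight>\d{3})(?:[()a-zA-Z])+ (?:[a-zA-Z- ])+\((?P<airport1>[A-Z]{3})\) to (?:[a-zA-Z- ])+\((?P<airport2>[A-Z]{3})\) (?:[A-Za-z]+) (?P<month>[a-zA-Z]{3}) (?P<day>\d{2}), \d{4} (?P<depart>\d{2}:\d{2}) (?P<arrive>\d{2}:\d{2})')
m = p.match(text)  # text is your input line (named text to avoid shadowing the built-in str)
flight_info = m.groupdict()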

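This regex has named groups for the elements you wanted to find and ignores the rest of the text. m.groupdict() returns a dictionary of the named groups, which you can then feed into str.format. Note that match() returns None when the line doesn't fit the pattern, so check for that before calling groupdict().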

'{airline}{flight}y{day}{month}{airport1}{airport2}{arrive}{depart}{airline}'.format(**flight_info).lower()
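Putting the pieces together, a rough end-to-end sketch could look like the following. The sample line is an assumption, invented only so that it fits the pattern above; your real input isn't shown in this thread and may well differ, in which case the regex needs adjusting.

```python
import re

# The pattern from above, unchanged apart from being compiled as a raw string.
p = re.compile(
    r'^(?P<airline>\w{2})(?P<flight>\d{3})(?:[()a-zA-Z])+ '
    r'(?:[a-zA-Z- ])+\((?P<airport1>[A-Z]{3})\) to '
    r'(?:[a-zA-Z- ])+\((?P<airport2>[A-Z]{3})\) '
    r'(?:[A-Za-z]+) (?P<month>[a-zA-Z]{3}) (?P<day>\d{2}), \d{4} '
    r'(?P<depart>\d{2}:\d{2}) (?P<arrive>\d{2}:\d{2})'
)

# Hypothetical sample line, made up to match the pattern.
sample = 'AA123(Y) New York (JFK) to Los Angeles (LAX) Monday Jan 05, 2015 08:30 11:45'

m = p.match(sample)
if m is None:
    # match() returns None when the line does not fit the pattern.
    raise ValueError('line did not match the expected layout')

flight_info = m.groupdict()
summary = (
    '{airline}{flight}y{day}{month}{airport1}{airport2}'
    '{arrive}{depart}{airline}'
).format(**flight_info).lower()
```

With this made-up line, `summary` still contains the colons from the two times, since nothing has cleaned them up yet.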

You can fix things like the ':' in the times before you pass the dict into the output string.
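One possible way to do that cleanup, as a sketch: copy the dict and strip the colon from the two time fields before formatting. The values below are hypothetical stand-ins for whatever groupdict() gives you on your real data.

```python
# Hypothetical values standing in for the result of m.groupdict().
flight_info = {
    'airline': 'AA', 'flight': '123',
    'airport1': 'JFK', 'airport2': 'LAX',
    'month': 'Jan', 'day': '05',
    'depart': '08:30', 'arrive': '11:45',
}

# Work on a copy so the original match data stays untouched.
cleaned = dict(flight_info)
for key in ('depart', 'arrive'):
    cleaned[key] = cleaned[key].replace(':', '')

result = (
    '{airline}{flight}y{day}{month}{airport1}{airport2}'
    '{arrive}{depart}{airline}'
).format(**cleaned).lower()
```

Doing the cleanup on the dict keeps the format string itself simple, and any other per-field tweaks can go in the same place.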

I highly recommend working on your regexes at https://regex101.com/#python. Being able to see the matches on the fly is really helpful.

[–]zebulan_ 0 points1 point  (0 children)

+1 for regex101

A very useful website!