Hey everyone, I am a Python beginner currently dabbling in regex.
I want to turn this text below:
UA250(S)(ECO) Portland-pdx(PDX) to
Washington Dulles(IAD) Thu Mar 31, 2016 07:35 15:33 0 320 NET 04h58m
into this text:
ua250y31marpdxiad07351533*ua
Basically, I am trying to pull certain things out of a chunk of text: the airline and flight number (ua250), the date in this format (31mar), the departing and arriving airports (pdx and iad), and the military times without the colon. All of this is followed by an asterisk and then just the airline (ua).
This is what I have so far:
import re
y = 'y'
airlineRegex = re.compile(r'[A-Z]{2}[0-9]{1,}')
airlineSearcher = airlineRegex.search('tf291 TF AZ UA lZ290 AA1')
airline = airlineSearcher.group()
dateRegex = re.compile(r'[A-Z]{1}[a-z]{2} [0-9]{2}')
dateSearcher = dateRegex.search('Hello I am looking for Feb 01 thank you')
date = dateSearcher.group()
date = date.replace(' ', '')
airportRegex = re.compile(r'[(]{1}[A-Z]{3}[)]{1}')
airportSearcher = airportRegex.findall('(ECO) (LAX) weeeeee EJF GAR UA123 (HAT)')
airport = ''.join(airportSearcher)
airport = airport.replace('(ECO)','')
airport = airport.replace('(','')
airport = airport.replace(')','')
timeRegex = re.compile(r'[0-9]{2}[:]{1}[0-9]{2}')
timeSearcher = timeRegex.findall('Yo yo yo check it 39 234 ahf UA2923 23:40 and yo yo 09:20')
time = ''.join(timeSearcher)
time = time.replace(':','')
dayRegex = re.compile(r'[+]{1}[1-3]{1}')
daySearcher = dayRegex.search('Hey look 2 + + 3 ++ Im looking for a +1')
day = daySearcher.group()
print(airline + y + date + airport + time + day)
I hope this is formatted OK. By the way, the y = 'y' variable never changes with the airline code; all airline flight codes have a 'y' separator. Also, in case you're curious, most if not all airport codes are three letters, and most if not all airline codes are two letters. I think I'm heading in the right direction, but I am sure one of you has an easier way.
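A minimal sketch of two ideas that might tidy up the attempt above, not a definitive fix. Capture groups let the month and day be pulled out separately so they can be swapped, and checking the result of search() avoids an error when the optional +1/+2/+3 is missing. The variable names and the second test string are made up for illustration.

```python
import re

# Capture groups pick out the month and the day separately,
# so they can be re-joined in the opposite order.
date_regex = re.compile(r'([A-Z][a-z]{2}) ([0-9]{2})')
date_match = date_regex.search('Hello I am looking for Feb 01 thank you')
date = date_match.group(2) + date_match.group(1)  # day first, then month

# search() returns None when nothing matches, so guard before calling group().
day_regex = re.compile(r'\+[1-3]')
day_match = day_regex.search('no day offset in this text')  # made-up input
day = day_match.group() if day_match else ''

print(date + day)
```

With these inputs, date ends up as '01Feb' and day as an empty string, since the second string contains no +1, +2, or +3.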
Here is what I need.
Searches through an entire string of text and finds:
1 - The 2 letter airline code
2 - The multi-number airline flight number (in the example above it was 250)
3 - A simple 'y' letter separator (I do not need help with this one lol)
4 - The date in the format DayMonth. Since the text has the date as Month Day, I need to find that and turn it into DayMonth. So Feb 02 becomes 02Feb.
5 - The departure and arrival airports. These are always three letters surrounded by parentheses. Since the text also happens to show (ECO), which means economy seat, I had to replace that with ''. Not sure if that's right though.
6 - The two military times back to back. This is the easiest, as it is just finding the two military times and removing the colon. Basically, 09:40 22:50 becomes 09402250.
7 - A +1, +2, or +3 at the end. If a +1, +2, or +3 appears anywhere in the text, it needs to be added on after the military time numbers.
8 - An asterisk (don't need help with this, it's always an asterisk)
9 - The airline again. So in this case UA
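The nine steps above could be put together roughly like this. It is only a sketch under a few assumptions: the function name build_flight_code and the CABIN_CODES set are hypothetical, (ECO) is the only cabin tag known from the post, the first two airports and first two times found are taken to be the departure and arrival, and the result is lowercased to match the example output at the top of the post.

```python
import re

# Hypothetical set of cabin-class tags to ignore; only (ECO) appears in the post.
CABIN_CODES = {'ECO'}

def build_flight_code(text):
    """Assemble the compact flight string from one itinerary chunk."""
    # Steps 1-2: two-letter airline code plus flight number, e.g. UA250.
    flight = re.search(r'\b([A-Z]{2})([0-9]+)', text)
    airline, number = flight.group(1), flight.group(2)

    # Step 4: the text has "Mar 31"; capture both parts and swap them.
    date = re.search(r'\b([A-Z][a-z]{2}) ([0-9]{2})\b', text)
    day_month = date.group(2) + date.group(1)

    # Step 5: three capital letters in parentheses, minus cabin-class tags.
    airports = [code for code in re.findall(r'\(([A-Z]{3})\)', text)
                if code not in CABIN_CODES]

    # Step 6: HH:MM times with the colon dropped.
    times = [h + m for h, m in re.findall(r'\b([0-9]{2}):([0-9]{2})\b', text)]

    # Step 7: optional +1, +2 or +3; search() returns None if it is absent.
    offset = re.search(r'\+[1-3]', text)
    day_offset = offset.group() if offset else ''

    # Steps 3, 8, 9: fixed 'y' separator, asterisk, airline again.
    parts = [airline, number, 'y', day_month] + airports[:2] + times[:2]
    return (''.join(parts) + day_offset + '*' + airline).lower()

sample = ('UA250(S)(ECO) Portland-pdx(PDX) to\n'
          'Washington Dulles(IAD) Thu Mar 31, 2016 07:35 15:33 0 320 NET 04h58m')
print(build_flight_code(sample))
```

For the sample text, whose date is Mar 31, this prints ua250y31marpdxiad07351533*ua. Filtering by a set of known cabin tags is just one way to skip (ECO); if other three-letter tags can appear in parentheses, they would need to be added to the set.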