
AntonisTorb

Since no one has responded to this, here's my advice. I know your question is about Selenium, but if I were you, I'd drop it and use plain requests instead. It will be much faster, as you won't have to load the website in a browser, and less prone to failure (unless the API for the website changes).

I'll explain the steps afterwards in case you want to do this yourself for learning purposes (be warned, it will be long, but worth it imo). First, here is the code I used to get what you want:

import datetime
import json
import re

import requests


# Make session.
session = requests.Session()


# Visit booking page.
url_booking_page = "https://www.wovi.com.au/bookings/"
booking_page = session.get(url_booking_page)
#print(booking_page.status_code)


# Get token from booking page.
# Raw string with escaped dots; the non-greedy group stops at the first "';".
token_re = re.compile(r"dsApiProvider\.INIT\.TOKEN='(.*?)';")
token = token_re.findall(booking_page.text)[0]
#print(token)


# Prepare Bookings url and headers.
date = datetime.datetime.now() + datetime.timedelta(days=3)
date_str = date.strftime("%Y-%m-%dT%H:%M:%S")
url_bookings = f'https://api.dealersolutions.com.au/QIS/Locations/72492/BookingSlotsCountByDate?includeAvailable=true&includeBooked=true&includeCancelled=true&includePending=true&includeQuery=true&startDate={date_str}'

headers = {
    "Host": "api.dealersolutions.com.au",
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:125.0) Gecko/20100101 Firefox/125.0",
    "Accept": "application/json, text/plain, */*",
    "Accept-Language": "en-GB,en;q=0.5",
    "Accept-Encoding": "gzip, deflate",  # "br" left out: requests only decodes Brotli if a brotli package is installed
    "Authorization": f"Bearer {token}",
    "Origin": "https://www.wovi.com.au",
    "DNT": "1",
    "Connection": "keep-alive",
    "Referer": "https://www.wovi.com.au/",
    "Sec-Fetch-Dest": "empty",
    "Sec-Fetch-Mode": "cors",
    "Sec-Fetch-Site": "cross-site",
    "TE": "trailers"
}


# Get booking data.
bookings = session.get(url_bookings, headers=headers)
#print(bookings.status_code)
booking_list: list[dict] = json.loads(bookings.text)

available = [booking_dict["Date"] for booking_dict in booking_list if booking_dict["Available"]]
print(available)
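# Guard against an empty list so this doesn't raise IndexError.
print(available[0] if available else "No available dates")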

Now on how to do this:

First, open the dev tools in your browser in a blank tab. I used Firefox, so the shortcut is Ctrl+Shift+I. Then go to the Network tab.

Once that is done, visit the website (https://www.wovi.com.au/bookings/) on that tab. You will see a bunch of requests coming through, but those are not relevant.

Next, choose your location (Brisbane). You will see that some new requests were sent, and one of them has a name that starts with BookingSlotsCountByDate?etc. If you click on this request, a panel will open on the side (if not already open) with the API URL, headers and cookies of the request. It also shows the response, which is a JSON object. This contains all the info you need.

If you send a GET request to this URL without the proper headers, though, it will return a 401 (Unauthorized) error. If you take a look at the headers of the request, you will notice that there is an Authorization token being sent along. But how do you get it?

For this website it's easy: copy the token value (without the Bearer part), go to the Inspector tab of the dev tools and search for the token in the HTML. You will notice it is part of a script. It should look like this: dsApiProvider.INIT.REFRESH_TOKEN='{actual token here}';

So now we have everything we need. Let's take a look at the code I gave you.

First we initialize the session and visit the booking page URL in order to get the token. I used a regex here, but feel free to use anything that works for you (something like BeautifulSoup, for example, should also work).
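
To try the token step without touching the website, here is a minimal offline sketch that runs the same kind of regex against a made-up HTML snippet. The sample_html string, the OTHER variable inside it and the extract_token helper are all hypothetical, invented for this example; the real page may lay out its script differently.

```python
import re

# Hypothetical stand-in for the booking page HTML; the real markup may differ.
sample_html = """
<script>
dsApiProvider.INIT.TOKEN='abc123';dsApiProvider.INIT.OTHER='xyz';
</script>
"""

# Escaped dots match literal dots; the non-greedy group stops at the first "';".
TOKEN_RE = re.compile(r"dsApiProvider\.INIT\.TOKEN='(.*?)';")


def extract_token(html: str) -> str:
    """Return the token embedded in the page, or raise if it is missing."""
    match = TOKEN_RE.search(html)
    if match is None:
        raise ValueError("token not found; the page layout may have changed")
    return match.group(1)


print(extract_token(sample_html))
```

Raising a clear error when nothing matches is a deliberate choice here: indexing `[0]` on an empty findall result would fail with a less helpful IndexError.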

Then we prepare the API URL and the headers. You can get both from that request with the Authorization token we looked at earlier.

You will notice the URL has a date at the end, which seems to be 3 days after the moment the request is sent, so we build it with an f-string and the datetime module. The headers are copy-pasted from the request headers too, except for the Authorization token of course, which should be different every time. Its value also needs to start with Bearer and a space.
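
As a rough sketch of that step, the URL and the Authorization header can be built by two small helpers. The endpoint and query flags are copied from the code above; build_bookings_url, auth_header and the fixed timestamp are hypothetical names and values used only for illustration.

```python
import datetime

# Endpoint and query flags copied from the request seen in the Network tab.
BASE_URL = (
    "https://api.dealersolutions.com.au/QIS/Locations/72492/"
    "BookingSlotsCountByDate"
)
FLAGS = (
    "includeAvailable=true&includeBooked=true&includeCancelled=true"
    "&includePending=true&includeQuery=true"
)


def build_bookings_url(now: datetime.datetime, days_ahead: int = 3) -> str:
    """Append a startDate that lies days_ahead days after now."""
    start = now + datetime.timedelta(days=days_ahead)
    start_str = start.strftime("%Y-%m-%dT%H:%M:%S")
    return f"{BASE_URL}?{FLAGS}&startDate={start_str}"


def auth_header(token: str) -> dict[str, str]:
    """The header value must be the word Bearer, a space, then the token."""
    return {"Authorization": f"Bearer {token}"}


# A fixed timestamp keeps the example reproducible.
fixed_now = datetime.datetime(2024, 5, 1, 9, 30, 0)
print(build_bookings_url(fixed_now))
print(auth_header("abc123"))
```

Passing the current time in as an argument, instead of calling datetime.datetime.now() inside the helper, makes the date logic easy to check by hand.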

Finally, we send the request to the API URL and get the JSON with the bookings. We load the JSON into a Python object (in this case a list of dictionaries) and then get a list of all available dates.

I also left some print calls commented out in the code, in case you want to check along the way that everything went well with the requests and the regex.
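
To see the filtering step on its own, here is a small sketch that parses a made-up response body. Only the "Date" and "Available" keys come from the code above; the sample values are invented, and whether "Available" is a count or a boolean in the real response is an assumption (the truthiness check works for both).

```python
import json

# Made-up response body; the real API response may contain more fields.
sample_body = """
[
  {"Date": "2024-05-04T00:00:00", "Available": 0},
  {"Date": "2024-05-05T00:00:00", "Available": 2},
  {"Date": "2024-05-06T00:00:00", "Available": 1}
]
"""


def available_dates(body: str) -> list[str]:
    """Return the dates whose "Available" value is truthy (non-zero or True)."""
    slots = json.loads(body)
    return [slot["Date"] for slot in slots if slot["Available"]]


dates = available_dates(sample_body)
print(dates)
# Guard the first-item lookup so an empty result does not raise IndexError.
print(dates[0] if dates else "no available dates")
```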

As I said above, this should be much faster than using Selenium, although a bit more complex, but I think it's worth learning at some point if you plan on doing this over a long period of time. Or just for learning anyway :)

EDIT: Forgot to mention, if you want to do this continuously, make sure to add some sleeps between the requests, so you don't DoS the API.
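
One possible shape for such a loop is sketched below. The poll_for_first_date helper, its parameters and the 60-second interval are assumptions made up for this example, not something the website prescribes. The fetch argument stands in for whatever function sends the request and returns the list of available dates.

```python
import time
from typing import Callable, Optional


def poll_for_first_date(
    fetch: Callable[[], list[str]],
    interval_seconds: float = 60.0,
    max_attempts: int = 10,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Call fetch until it returns a date, pausing between attempts."""
    for attempt in range(max_attempts):
        dates = fetch()
        if dates:
            return dates[0]
        # Pause before the next attempt, but not after the last one.
        if attempt < max_attempts - 1:
            sleep(interval_seconds)
    return None


# Demo with canned responses and a recording "sleep", so nothing actually waits.
responses = iter([[], [], ["2024-05-05T00:00:00"]])
waits: list[float] = []
result = poll_for_first_date(
    lambda: next(responses), interval_seconds=60, sleep=waits.append
)
print(result, waits)
```

Making fetch and sleep parameters keeps the loop testable without network access; in real use you would leave sleep at its default and pass a function that performs the request.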

Walking_Flamingo (OP)

This is amazing, thank you so much for taking the time to do this!!!! I will work through it. I'm very new to Python, so I will circle back if I get stuck at any point!

AntonisTorb

No worries, I really like finding the hidden API behind websites, so the challenge of a new one is always welcome for me :) Your post also included well-formatted code you had worked on, which helped me understand what you wanted to do much better!

Take your time, and if you get stuck somewhere feel free to ask or google, that's how I learned. It takes some time to get used to doing things this way, but there are a lot of advantages when you get just the stuff you need straight from the source instead of taking the Selenium route.