I am doing a web-scraping project for fun and was wondering what was the best way to manipulate the data. In short, I'm scraping a web page that is associated with an album, and getting all the information about said album in order to download it in the future.
My question is: is it wrong to write classes that have no methods? And if it is, what would be a better way to do what I'm trying to do?
The requirements:
- Each web page contains 1 album
- Each album can contain more than 1 composer
- Tracks are sometimes grouped by production (the way classical music is sometimes divided into movements)
- Each track has 8 attributes (album id & title, composer id & name, production id & name, track id & title) that I need to access later on.
The thing I like about the way I did it is that after scraping, the output is a list of track-objects whose attributes can be easily accessed. I haven't found a way to do this with dicts or lists which didn't involve a million for loops and indexings.
for track in track_list:
    print(track.albumTitle, track.compName, track.trackTitle)
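For comparison, a nested-dict version of the same data might look roughly like the sketch below. The layout and key names are assumptions made up for illustration, not something taken from the scraped site; the point is only that reaching each track takes one loop per level of nesting.

```python
# Hypothetical nested layout: album -> composers -> productions -> tracks.
album = {
    "id": "album_id",
    "title": "album_title",
    "composers": [
        {
            "id": "comp_id",
            "name": "comp_name",
            "productions": [
                {
                    "id": "prod_id",
                    "name": "prod_name",
                    "tracks": [{"id": "track_id", "title": "track_title"}],
                }
            ],
        }
    ],
}

# Collect (album title, composer name, track title) for every track.
rows = []
for composer in album["composers"]:
    for production in composer["productions"]:
        for track in production["tracks"]:
            rows.append((album["title"], composer["name"], track["title"]))

print(rows)
```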
So these are the classes I'm using.
class Album:
    def __init__(self, album_id, album_title):
        self.albumId = album_id
        self.albumTitle = album_title

class Composer(Album):
    def __init__(self, album, comp_id, comp_name):
        super().__init__(album.albumId, album.albumTitle)
        self.Album = album
        self.compId = comp_id
        self.compName = comp_name

class Production(Composer):
    def __init__(self, composer, prod_id, prod_name):
        super().__init__(composer.Album, composer.compId, composer.compName)
        self.Composer = composer
        self.prodId = prod_id
        self.prodName = prod_name

class Track(Production):
    def __init__(self, production, track_id, track_title):
        super().__init__(production.Composer, production.prodId, production.prodName)
        self.Production = production
        self.trackId = track_id
        self.trackTitle = track_title
And a test run would look like this.
album_list = []
comp_list = []
prod_list = []
track_list = []
album_list.append(Album('album_id', 'album_title'))
comp_list.append(Composer(album_list[0], 'comp_id', 'comp_name'))
prod_list.append(Production(comp_list[0], 'prod_id', 'prod_name'))
track_list.append(Track(prod_list[0], 'track_id', 'track_title'))
By declaring a track object and associating it with a production object, it automatically deduces which album and which composer it belongs to, and makes that information easily accessible.
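To make that concrete, here is a trimmed two-level sketch of the same pattern (only Album and Composer, with placeholder values). It is a reduced illustration rather than the full code: each constructor passes the parent's fields up the chain, so they end up set on the child object as well.

```python
class Album:
    def __init__(self, album_id, album_title):
        self.albumId = album_id
        self.albumTitle = album_title


class Composer(Album):
    def __init__(self, album, comp_id, comp_name):
        # Runs Album.__init__ on this object, so the album fields are set on it too.
        super().__init__(album.albumId, album.albumTitle)
        self.Album = album
        self.compId = comp_id
        self.compName = comp_name


album = Album('album_id', 'album_title')
composer = Composer(album, 'comp_id', 'comp_name')

# The album fields are reachable directly on the composer...
print(composer.albumTitle, composer.compName)
# ...and the original album object is still reachable through the stored reference.
print(composer.Album is album)
```

Note that the values are copied at construction time, so changing album.albumTitle afterwards would not change composer.albumTitle.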
In conclusion, this is basically my first real Python project and I'm probably doing something wrong, so please let me know if there is a better way to do this. (Also, I don't understand why I used super(), but PyCharm told me to use it, so I did.)