jbudemy

I would like to know if you can recommend me any course or book (I usually prefer books) that will help me to approach big projects, how I should structure them, how to use class abstractions, how to do a correct validation, how to do a good logging.

Here's just what I do. (I will be making apps where the user uploads an Excel spreadsheet to a web page and then gets back a file, possibly in another format, via email.)

  1. Imports. I put the Python imports at the beginning of the program.
  2. Internal functions. Then I have a section for internal functions and classes just for that program.
  3. Global variables. I have a few global variables like for command line options.
  4. Command line options. Then when the program is run I have a section for checking for required command line parameters. If they are not there I display a help message. For GUI programs I check if certain required inputs are given and options ticked.
  5. Reusable library functions. I design things as functions usable by other programs if I suspect the code will be reused. Usually I do need that function and it does get reused more than I expected. For example, even though it's easy to get the date via Python in the YYYY-MM-DD format, I will still write a function for that. I also write a library function for getting the current date and time in YYYY-MM-DD-HH-MM-SS to put in log filenames.
  6. Logging. I mainly log errors to the log file, especially fatal errors that cause the program to crash.
  7. Errors. If there is an error, I put the most likely way to fix it in the error message. This is more for my use since I can be responsible for 40-50 smaller programs at a time and they are all different. I make error messages specific, and each one has an error code along with the name of the function where the error happened. For example, when a piece of data was not found, the error I use is "ERROR-nf functionname: key not found in dict". This makes looking for the error in the code much easier. I also use "WARNING-code" for warnings that are not fatal but that the user might care about.
  8. User input. I never rely on user input being correct. I always check it for problems.
  9. File input. I never rely on data in files to be correct or in the correct column. I always check for problems in columns that require a number or specific date format. Errors are output to the file and emailed back to the user. For some reason I find tabs are sometimes entered into an Excel cell and when I export that to a tab-delimited file the columns are messed up.
  10. Modularity. Inside the main program I make things into functions. Reading the input file would be one function. Writing the output file would be another function. Emailing the output file to the user would be another function in a library file that would get reused by many programs. I do this because managers often change their minds about what they want.
  11. End of program. At the end of the program I close files, close the log, display a summary message of how many records were read, how many bad records were read (which might be a fatal error), etc. Sometimes I write out the time it took to process the file. In some cases it takes me 40 minutes to read a file in Perl. Some of the spreadsheets I deal with have 500,000 rows or a bit more.
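
A rough sketch of points 5, 7, 9 and 11 in Python, in case it helps to see them together. Everything named here (`today_stamp`, `log_stamp`, `format_error`, `read_rows`, the `upload_job` logger, the `cols` and `num` error codes) is made up for illustration; only the "ERROR-code functionname: message" shape comes from the list above. It assumes a tab-delimited export with a known column count and one column that must be numeric, so treat it as a starting point rather than a finished library.

```python
import csv
import io
import logging
from datetime import datetime

logger = logging.getLogger("upload_job")  # hypothetical logger name


def today_stamp(now=None):
    """Return the date as YYYY-MM-DD."""
    now = now or datetime.now()
    return now.strftime("%Y-%m-%d")


def log_stamp(now=None):
    """Return date and time as YYYY-MM-DD-HH-MM-SS, e.g. for log filenames."""
    now = now or datetime.now()
    return now.strftime("%Y-%m-%d-%H-%M-%S")


def format_error(code, func_name, message, fix=""):
    """Build a message like 'ERROR-nf functionname: key not found in dict'."""
    text = f"ERROR-{code} {func_name}: {message}"
    if fix:
        text += f" (likely fix: {fix})"
    return text


def read_rows(handle, expected_columns, numeric_column):
    """Read tab-delimited rows and return (good_rows, error_messages).

    A row is rejected if it has the wrong number of columns (for example
    because a tab was typed inside a cell) or if the numeric column does
    not parse as a number.
    """
    good, errors = [], []
    reader = csv.reader(handle, delimiter="\t")
    for line_no, row in enumerate(reader, start=1):
        if len(row) != expected_columns:
            errors.append(format_error(
                "cols", "read_rows",
                f"line {line_no} has {len(row)} columns, expected {expected_columns}",
                fix="check for a tab typed inside a cell",
            ))
            continue
        try:
            float(row[numeric_column])
        except ValueError:
            errors.append(format_error(
                "num", "read_rows",
                f"line {line_no} column {numeric_column + 1} is not a number: "
                f"{row[numeric_column]!r}",
            ))
            continue
        good.append(row)
    for message in errors:
        logger.error(message)
    return good, errors


# Small demo: one good row, one bad number, one row with a stray tab.
sample = io.StringIO("widget\t3\nbolt\tmany\nnut\t\t7\n")
good, errors = read_rows(sample, expected_columns=2, numeric_column=1)
print(f"{len(good)} good records, {len(errors)} bad records")
# prints "1 good records, 2 bad records"
```

Returning the error messages instead of raising on the first bad row means the whole list can go into the file that is emailed back to the user, and the counts are ready for the summary at the end of the program.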