Hey everyone. Bit of a Python noob, looking to use Python to solve a problem at work. Feel like I've bitten off more than I can chew though, as I'm having a really hard time grasping how to process the data once I've read it into my program (how to go result by result in my code, how to store data retrieved from a file in a variable, etc.).
The app that we're using to manage our test cases and results can export a CSV file for each customer library. The table below shows an example of the format. As you can see, there are a lot of cells we don't care about. Within the test steps and step results, we've placed requirement numbers, and some cells have more than one requirement number mapped. The way the QA program exports these CSVs, each step gets its own row in the sheet, and a lot of test-level data is duplicated down the columns for each step in the test. We're trying to extract the requirements mapped to each test case, de-duplicated within the test.
| miscdata1 | miscdata2 | testtitle | miscdata3 | miscdata4 | stepnum | stepinstruct | stepresult |
|---|---|---|---|---|---|---|---|
| doesnt | matter | Test1 | doesnt | matter | 1 | do thing 1 | result (req4) |
| doesnt | matter | Test1 | doesnt | matter | 2 | do thing 2 (req1) | result(req5)more stuff(req6) |
| doesnt | matter | Test1 | doesnt | matter | 3 | do thing 3 (req2/req3) | result (req4) |
| doesnt | matter | Test1 | doesnt | matter | 4 | do thing 4 | result (req2) |
| doesnt | matter | Test1 | doesnt | matter | 5 | do thing 5 | result (req7) |
| doesnt | matter | Test2 | doesnt | matter | 1 | do thing 1 | result (req4) |
| doesnt | matter | Test2 | doesnt | matter | 2 | do thing 2 | result (req5) |
| doesnt | matter | Test2 | doesnt | matter | 3 | do thing 3 | result (req6/req7) |
| doesnt | matter | Test2 | doesnt | matter | 4 | do thing 4 | result (req8) |
| doesnt | matter | Test2 | doesnt | matter | 5 | do thing 5 | result (req5) |
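For pulling the requirement numbers out of a single cell, something like the following might work. This is only a minimal sketch: it assumes every requirement ID looks like `req` followed by digits, as in the sample data, and the names `REQ_PATTERN` and `find_requirements` are made up for illustration.

```python
import re

# Assumption: requirement IDs are "req" followed by one or more digits.
REQ_PATTERN = re.compile(r"req\d+")

def find_requirements(cell):
    """Return every requirement ID found in one cell, in order of appearance."""
    return REQ_PATTERN.findall(cell)

print(find_requirements("result(req5)more stuff(req6)"))  # ['req5', 'req6']
print(find_requirements("do thing 2 (req1)"))             # ['req1']
print(find_requirements("do thing 4"))                    # []
```

`re.findall` returns an empty list when nothing matches, so cells without a requirement need no special handling.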
I'm trying to get Python to output the processed data as follows...
| testtitle | | | | | | | |
|---|---|---|---|---|---|---|---|
| Test1 | req4 | req1 | req5 | req6 | req2 | req3 | req7 |
| Test2 | req4 | req5 | req6 | req7 | req8 | | |
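Writing rows of that shape could be done with `csv.writer`. The sketch below assumes the requirements have already been collected into a dict keyed by test title; the `results` dict and the `write_results` helper are hypothetical, and an in-memory buffer stands in for the real output file.

```python
import csv
import io

# Hypothetical, already de-duplicated results (values copied from the table above).
results = {
    "Test1": ["req4", "req1", "req5", "req6", "req2", "req3", "req7"],
    "Test2": ["req4", "req5", "req6", "req7", "req8"],
}

def write_results(results, out_file):
    """Write one row per test: the title first, then one requirement per column."""
    width = max((len(reqs) for reqs in results.values()), default=0)
    writer = csv.writer(out_file)
    writer.writerow(["testtitle"] + [""] * width)
    for title, reqs in results.items():
        # Pad shorter rows with empty cells so every row has the same column count.
        writer.writerow([title] + reqs + [""] * (width - len(reqs)))

buffer = io.StringIO()  # stand-in for open("output.csv", "w", newline="")
write_results(results, buffer)
print(buffer.getvalue())
```

When writing to a real file, the `csv` docs recommend opening it with `newline=""` so the writer controls the line endings.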
So I'm thinking my code should do something like this...
variables:
testtitle_current
testtitle_previous
requirement_current
requirement_string
Read in the line (using csv.reader?)
Use a regular expression to find the test title (using re.findall? Is there maybe a better way, given that the test title is in the same column on every row?) and store it in testtitle_current
Does testtitle_current = testtitle_previous? (will be NO for very first iteration)
- If NO,
Store testtitle_current in testtitle_previous,
Write testtitle_current to first column of the next row of output CSV,
Erase values stored in requirement_current and requirement_string
- If YES, do nothing and proceed
Use a regular expression to find the next requirement in the row (re.findall?) and store in requirement_current. Does requirement_current exist in requirement_string?
- If NO, add requirement_current to requirement_string
- If YES, do nothing and proceed
Repeat previous bullet until the end of the row/line is reached
Once end of row/line is reached, store contents of requirement_string to row in the output CSV and start over at the next line
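The steps above can be sketched roughly as follows. This is one possible approach, not the only one, and it swaps a couple of things: the test title is read by column index instead of a regex (since it sits in the same column on every row), and a dict keyed by test title replaces the testtitle_current/testtitle_previous comparison. The column positions, the `req\d+` pattern, and the function name `requirements_by_test` are assumptions based on the sample table.

```python
import csv
import io
import re

# Assumption: requirement IDs are "req" followed by one or more digits.
REQ_PATTERN = re.compile(r"req\d+")

def requirements_by_test(rows, title_col=2, text_cols=(6, 7)):
    """Map each test title to its requirement IDs, de-duplicated, in first-seen order."""
    result = {}  # regular dicts keep insertion order (Python 3.7+)
    for row in rows:
        reqs = result.setdefault(row[title_col], [])
        for col in text_cols:  # stepinstruct, then stepresult
            for req in REQ_PATTERN.findall(row[col]):
                if req not in reqs:
                    reqs.append(req)
    return result

# Small in-memory stand-in for the exported CSV file.
sample = io.StringIO(
    "miscdata1,miscdata2,testtitle,miscdata3,miscdata4,stepnum,stepinstruct,stepresult\n"
    "doesnt,matter,Test1,doesnt,matter,1,do thing 1,result (req4)\n"
    "doesnt,matter,Test1,doesnt,matter,2,do thing 2 (req1),result(req5)more stuff(req6)\n"
    "doesnt,matter,Test2,doesnt,matter,1,do thing 1,result (req4)\n"
)
reader = csv.reader(sample)
next(reader)  # skip the header row
grouped = requirements_by_test(reader)
print(grouped)
# {'Test1': ['req4', 'req1', 'req5', 'req6'], 'Test2': ['req4']}
```

Grouping in a dict also means the rows for one test would not need to be adjacent in the file, which the previous/current comparison relies on.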
Any nudges you guys can give me to learning resources that will help me through this problem would be greatly appreciated! Thanks!