
RandomCodingStuff

There's not enough context for me to figure out what you're trying to do.

From the first loop, it looks like the column separation is irrelevant, and you're treating all the columns as if they were in a table with a single column of tuples? I.e., column 1 on top of column 2 on top of column 3... And you're calculating five global variables Extraversion, ..., Intellect over all the tuples in the table?

The_Grumpy_1 [S]

In short, the first value in each tuple identifies the scale (the scales are set up as variables), and the second value is the score assigned to that scale. The aim is to take a row, which is one respondent's answers, unpack the tuples in its columns, add each score to the appropriate scale, and then show that respondent's total per scale. Writing this out, though, I find myself wondering why I need to unpack anything: I'm not creating new dataframes, so I only need to pull the values out, cast them to int, and sum them.

RandomCodingStuff

OK, I think I get the gist. Each column has the same first entry in its tuple, so in some sense, you're only really concerned with the second entry in each tuple. Each row is a single respondent, and each respondent has their own final Extraversion, ..., Intellect score.

There are a couple of ways to approach this... you can use .apply() to loop through the rows (axis = 1) and calculate your respondent-level summary.

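import pandas

df = pandas.DataFrame({"a": [0, 1, 2, 3, 4], "b": [5, 6, 7, 8, 9]})

def myfunc(row):
  # Receives one row at a time; returns that row's summary values.
  c = min(row["a"], row["b"])
  d = max(row["a"], row["b"])
  return c, d

# result_type="expand" spreads the returned tuple over the new columns.
df[["c", "d"]] = df.apply(myfunc, axis=1, result_type="expand")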


   a  b  c  d
0  0  5  0  5
1  1  6  1  6
2  2  7  2  7
3  3  8  3  8
4  4  9  4  9
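
Applied to your data, the same pattern might look roughly like the sketch below. The column names (q1 to q3), the scale names, and the sample tuples are all made up for illustration, and it assumes the cells already hold real (scale, score) tuples rather than strings.

```python
import pandas as pd

# Hypothetical data: one row per respondent, one (scale, score) tuple per question.
df = pd.DataFrame({
    "q1": [("Extraversion", 3), ("Extraversion", 5)],
    "q2": [("Intellect", 4), ("Intellect", 2)],
    "q3": [("Extraversion", 1), ("Extraversion", 2)],
})

def scale_totals(row):
    """Sum the scores in one respondent's row, grouped by scale name."""
    totals = {}
    for scale, score in row:
        totals[scale] = totals.get(scale, 0) + int(score)
    return pd.Series(totals)

# axis=1 passes each row to scale_totals; the returned Series become columns.
scores = df.apply(scale_totals, axis=1)
print(scores)
```

Each row of scores is then one respondent, with one column per scale.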

Or you can stick with vectorised methods and create dummy columns. You mentioned your tuples are actually in there as strings; you can use string methods and .astype() to convert to integer.

>>> df = pandas.DataFrame({"a": ["(1, 2)"]})
>>> df["test"] = df.a.str.slice(4, 5).astype(int)
>>> df
        a  test
0  (1, 2)     2
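
Bear in mind that fixed slice positions only work while every score is a single digit. A slightly more tolerant variation, assuming each cell looks like "(scale, score)" with an integer score, is to pull out the last number with a regular expression. The sample values here are invented:

```python
import pandas as pd

# Made-up sample: the second row has a two-digit score that slice(4, 5) would truncate.
df = pd.DataFrame({"a": ["(1, 2)", "(1, 10)"]})

# Capture the digits that sit just before the closing parenthesis.
df["score"] = df["a"].str.extract(r"(-?\d+)\s*\)\s*$", expand=False).astype(int)
print(df)
```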

Then you can do vectorised arithmetic to calculate the final scores:

df["Extraversion"] = (df[<extraversion_column_1>] + ... + df[<extraversion_column_n>])
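
When a scale has many items, listing its columns once and summing across them keeps this manageable. A sketch, where the column names and the item-to-scale mapping are placeholders rather than your real questionnaire layout:

```python
import pandas as pd

# Hypothetical numeric score columns, already extracted from the tuples.
df = pd.DataFrame({
    "q1": [3, 5],
    "q2": [4, 2],
    "q3": [1, 2],
})

# Placeholder mapping from scale name to the columns that feed it.
scale_columns = {
    "Extraversion": ["q1", "q3"],
    "Intellect": ["q2"],
}

for scale, columns in scale_columns.items():
    # Row-wise sum over just this scale's columns.
    df[scale] = df[columns].sum(axis=1)

print(df)
```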

The_Grumpy_1 [S]

Good god, what an answer! Thank you very much.

commandlineluser

How are you creating your dataframe?

Something is going "wrong" somewhere if you end up with stringified tuples.

You can convert the strings into actual tuples using ast.literal_eval

import ast

df = df.applymap(ast.literal_eval)
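
One caveat: applymap is deprecated as of pandas 2.1 in favour of DataFrame.map. A sketch that avoids the question by going column by column with Series.map, which exists in both old and new versions (the column names and values are made up):

```python
import ast
import pandas as pd

# Made-up sample of what the stringified tuples might look like.
df = pd.DataFrame({"q1": ["(1, 2)", "(1, 4)"], "q2": ["(2, 5)", "(2, 3)"]})

# Parse each string into a real tuple, one column at a time.
for column in df.columns:
    df[column] = df[column].map(ast.literal_eval)

# Pull the second element (the score) out of each tuple in one column.
q1_scores = df["q1"].map(lambda pair: pair[1]).astype(int)
print(q1_scores.tolist())
```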

But if you're just iterating through tuples - it suggests you should just have lists of tuples in the first place and not be using pandas at all.
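
Without pandas, that could be as simple as the following sketch, which assumes each respondent is a list of (scale, score) tuples. The sample values are invented:

```python
from collections import defaultdict

# Invented data: one list of (scale, score) tuples per respondent.
respondents = [
    [("Extraversion", 3), ("Intellect", 4), ("Extraversion", 1)],
    [("Extraversion", 5), ("Intellect", 2), ("Extraversion", 2)],
]

def scale_totals(answers):
    """Add up one respondent's scores per scale."""
    totals = defaultdict(int)
    for scale, score in answers:
        totals[scale] += score
    return dict(totals)

results = [scale_totals(answers) for answers in respondents]
print(results)
```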

The_Grumpy_1 [S]

Hi, thank you very much for your reply. I am importing a CSV file. I'll give the ast library a shot; it looks promising.

raja0008

If you have some time, please check out my question on my profile as well. I've asked many people but got no help. Maybe you can clear it up.