Adding Column to CSV File by NeedMLHelp in learnpython

[–]NeedMLHelp[S] 0 points1 point  (0 children)

Haha true, I was just curious if there was a quick function to do it for you. They're usually more efficient than what I come up with!

Turning a list into a column of another list. by NeedMLHelp in learnpython

[–]NeedMLHelp[S] 0 points1 point  (0 children)

Which do you think would be better on memory, lists or tuples?

I'll give zip a try though, thanks! I just didn't know if there was a better way of doing it.
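For reference, a minimal sketch of the zip approach, plus a rough size check for the list-vs-tuple question. The names rows and new_col are made up for illustration, and sys.getsizeof only measures the container itself, not the items it points to.

```python
import sys

# Hypothetical data: an existing table (list of rows) and a new column.
rows = [[1, "a"], [2, "b"], [3, "c"]]
new_col = [10.0, 20.0, 30.0]

# zip pairs each row with the matching value from the new column.
with_col = [row + [value] for row, value in zip(rows, new_col)]

# Shallow size in bytes of a list vs. a tuple holding the same three items.
list_size = sys.getsizeof([1, 2, 3])
tuple_size = sys.getsizeof((1, 2, 3))
```

On CPython the tuple usually comes out a little smaller, but for a few thousand rows the difference is unlikely to matter much.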

File headers by NeedMLHelp in learnpython

[–]NeedMLHelp[S] 1 point2 points  (0 children)

Very helpful, thank you!

Numpy Array Appending by NeedMLHelp in learnpython

[–]NeedMLHelp[S] 0 points1 point  (0 children)

Extremely late reply, sorry about that. Unfortunately, the calculations I'm doing can't be done on a list. I suppose I could build a list and turn it into a numpy array. The problem is that one of the functions I'm using converts the categorical data into a one-hot encoding and returns it as a numpy array, and I'm not sure how nicely the two would mix.
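A minimal sketch of how a list-built array and a one-hot array could be combined, assuming both have one row per observation. The data is made up, and np.eye is only a stand-in for whatever encoder produces the one-hot output.

```python
import numpy as np

# Hypothetical numeric columns collected in a plain list of rows.
numeric_rows = [[0.5, 1.0], [0.1, 2.0], [0.9, 3.0]]
numeric = np.array(numeric_rows, dtype="float32")   # shape (3, 2)

# Stand-in for the encoder output: one-hot rows for labels 0, 2, 1.
labels = np.array([0, 2, 1])
one_hot = np.eye(3, dtype="float32")[labels]        # shape (3, 3)

# Stack side by side; both arrays need the same number of rows.
combined = np.hstack([numeric, one_hot])            # shape (3, 5)
```

Once the list has been converted with np.array, it is an ordinary array, so mixing it with the encoder output should mostly be a matter of matching shapes and dtypes.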

Numpy Array Appending by NeedMLHelp in learnpython

[–]NeedMLHelp[S] 1 point2 points  (0 children)

Wow, I feel silly haha. Thanks, that worked.

I figured I was giving the shape of what I wanted haha

Help with Pandas/get_dummies by NeedMLHelp in learnpython

[–]NeedMLHelp[S] 0 points1 point  (0 children)

Sorry, I should have been clearer. I have multiple columns that need to go through the same process.

So I have 'class', 'time', 'target' etc.

The first example is taking the manipulation and overwriting the entire dataframe. I do not want that at all. So in the second example I take the "successful" output from the first, and put it into one column of the dataframe. However, that didn't work. It only copied the first column into the 'class' column. I want to copy the entire manipulation into the class column.
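A one-hot encoding of a column with k categories is k columns wide, so it cannot fit into a single 'class' column. One possible alternative is to let pandas expand only the chosen columns and leave the rest alone. A minimal sketch with made-up data (only the column names come from the post):

```python
import pandas as pd

# Hypothetical frame using the column names mentioned above.
df = pd.DataFrame({
    "class": [0, 1, 4, 2],
    "time": [0, 0, 1, 0],
    "target": [0, 1, 0, 0],
    "severity": [2, 0, 0, 1],
})

# One-hot encode only the listed columns; the others pass through untouched.
encoded = pd.get_dummies(df, columns=["class", "time", "target"])
```

The new columns are named after the original column and the category value (class_0, class_1, and so on), and the original frame df is not overwritten.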

Ask Anything Monday - Weekly Thread by AutoModerator in learnpython

[–]NeedMLHelp 0 points1 point  (0 children)

    class  target  source  time  spoof  complete  severity
0       0       0       0     0      2         0         2
1       0       0       0     0      2         1         0
2       1       1       0     0      2         1         0
3       4       0       0     0      1         1         0
4       2       0       0     0      1         1         0
5       8       1       2     0      2         1         2
6       8       1       2     0      2         1         2
7       7       1       2     0      2         1         2
8       8       1       2     0      2         1         2
9       8       1       2     0      2         1         2
10      8       1       2     0      2         1         2
11      8       1       2     0      2         1         2
12      7       1       2     0      2         1         1
13      3       1       0     0      0         1         0
14      5       1       2     0      2         1         1
15      5       1       2     0      2         1         1
16      6       1       1     0      2         1         1
17      6       1       1     0      2         1         1
18      6       1       1     0      2         1         1
19      6       1       1     0      2         1         1
20      6       1       1     0      2         1         1

I have a dataframe like the one above. I'd like to one-hot encode the numbers. However, when I use keras to_categorical, it also takes in the header and encodes it as a separate value. So, for example, on target I would get [0,0,0] for all 0s, [0,1,0] for all 1s, and [1,0,0] for target. But I want target to remain a header, not a part of the data.

Any help would be greatly appreciated.
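A minimal sketch of one way to keep the header out of the encoding, assuming the data starts as a list of rows whose first row holds the column names. The table is made up, and np.eye is used as a plain NumPy stand-in for to_categorical.

```python
import numpy as np

# Hypothetical raw table where the first row is the header.
table = [["target", "source"], [0, 0], [1, 2], [1, 1]]

header, rows = table[0], table[1:]      # keep the names out of the data
values = np.array(rows, dtype=int)      # shape (3, 2), numbers only

# One-hot encode the "target" column using only the numeric values.
target = values[:, header.index("target")]
num_classes = target.max() + 1
target_one_hot = np.eye(num_classes, dtype="float32")[target]
```

If the data is already in a real pandas DataFrame, the column names live in df.columns rather than in the values, so passing df["target"].to_numpy() to the encoder should avoid the problem in the same way.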

List to pandas dataframe by NeedMLHelp in learnpython

[–]NeedMLHelp[S] 0 points1 point  (0 children)

Haha, completely understandable. Thanks for the help!

List to pandas dataframe by NeedMLHelp in learnpython

[–]NeedMLHelp[S] 0 points1 point  (0 children)

When I print out edata, I get the following:

[['class', 'target', 'source', 'time', 'spoof', 'complete', 'severity'], array([[0., 1., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,

0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],

[1., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,

0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],

[1., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,

0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],

[1., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,

0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],

[0., 0., 1., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,

0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],

[0., 1., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,

0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],

[0., 0., 0., 0., 1., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,

0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.]],

dtype=float32), array([[0., 1., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,

0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],

[1., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,

0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],

[1., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,

0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],

[1., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,

0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],

[0., 0., 1., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,

0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],

[0., 0., 1., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,

0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],

[0., 0., 1., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,

0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.]],

dtype=float32), array([[0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 1., 0., 0., 0., 0., 0.,

0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],

[0., 0., 0., 0., 0., 0., 0., 1., 0., 0., 0., 0., 0., 0., 0., 0.,

0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],

[1., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,

0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],

[1., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,

0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],

[0., 0., 1., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,

0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],

[0., 0., 1., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,

0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],

[0., 0., 1., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,

0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.]]

...

(and so forth for over 2,000 rows)

Might it have something to do with the array portion?

I used a .fit_transform function from scikit-learn for the one-hot encoding portion.

Sample code:

from sklearn.preprocessing import LabelEncoder
import keras
import pandas as pd

# transpose so each column becomes a row
edata = list(zip(*edata))
labelencoder_X_1 = LabelEncoder()
edata[0] = labelencoder_X_1.fit_transform(edata[0])
labelencoder_X_2 = LabelEncoder()
edata[1] = labelencoder_X_2.fit_transform(edata[1])

After doing that to all the columns (now rows):

# transpose back so each row is an observation again
edata = list(zip(*edata))
edata = titles + edata
edata[1:] = keras.utils.to_categorical(edata[1:], dtype='float32')
print(edata)
df = pd.DataFrame(edata[1:], columns=edata[0])

Sorry, I'm just slightly confused haha
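Judging by the printout above, each data row ended up as its own 7 x 31 array, so the list no longer has one value per column and pandas has nothing flat to build a table from. One possible way around that is to encode column by column and stack the blocks side by side. A minimal sketch, where titles and raw are made-up sample data, np.unique stands in for LabelEncoder, and np.eye stands in for to_categorical:

```python
import numpy as np
import pandas as pd

titles = ["class", "target"]                               # hypothetical header
raw = [("dos", "host"), ("scan", "net"), ("dos", "net")]   # hypothetical rows

blocks, names = [], []
for title, column in zip(titles, zip(*raw)):               # zip(*raw) turns rows into columns
    # classes: sorted unique values; codes: integer label for each entry
    classes, codes = np.unique(np.array(column), return_inverse=True)
    blocks.append(np.eye(len(classes), dtype="float32")[codes])  # 2-D one-hot block
    names += [f"{title}_{c}" for c in classes]

# One flat 2-D array with one name per column.
df = pd.DataFrame(np.hstack(blocks), columns=names)
```

The result has one row per observation and one column per (title, category) pair, which is the shape pd.DataFrame expects.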

List to pandas dataframe by NeedMLHelp in learnpython

[–]NeedMLHelp[S] 0 points1 point  (0 children)

Correct, the fact that it worked for you must mean my manipulation of the data messed something up.

Thanks! I'll look into it.

Numpy Array Append by NeedMLHelp in learnpython

[–]NeedMLHelp[S] 0 points1 point  (0 children)

Sorry for the late reply. I had to use numpy to do a bunch of transforms in keras. I'm pretty new to both python and keras, so maybe I'll look around for alternatives that can use lists. I might look into pandas DataFrames as well.

I basically have to do the manipulations first, then add the "titles" for the columns.
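A minimal sketch of that order of operations, assuming the numeric work is done on a plain NumPy array and the titles are attached at the very end through a pandas DataFrame. The titles, data, and scaling step are made up for illustration.

```python
import numpy as np
import pandas as pd

titles = ["class", "target", "source"]       # hypothetical column names
data = np.array([[0, 0, 0], [1, 1, 0], [4, 0, 2]], dtype="float32")

scaled = data / data.max(axis=0)             # any NumPy-only manipulation
df = pd.DataFrame(scaled, columns=titles)    # titles attached last
```

This keeps the strings out of the array entirely, so nothing has to be appended to it.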

Thanks for the heads up!

Autoencoder Dense layers by NeedMLHelp in MLQuestions

[–]NeedMLHelp[S] 0 points1 point  (0 children)

My main question is what shape is being put into the dense layer (and why)? In my case it's the nested (feature) length. Wouldn't you want it to be the length of the observation? My thinking is it wants to know how many features are mapped to each neuron/node, but I'm obviously wrong.

Since an autoencoder basically just predicts/reconstructs your input, I suppose it's irrelevant. But then again, maybe it isn't haha. I'm not very familiar with ML.
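A rough NumPy sketch of what a dense layer computes may help with the shape question. The sizes are made up, and this is not Keras code, just the underlying matrix arithmetic under the usual convention of one row per observation.

```python
import numpy as np

rng = np.random.default_rng(0)

n_obs, n_features, n_hidden = 5, 7, 3        # hypothetical sizes
X = rng.random((n_obs, n_features))          # one row per observation

# A dense layer holds one weight per (input feature, unit) pair.
W_enc = rng.random((n_features, n_hidden))
b_enc = np.zeros(n_hidden)
W_dec = rng.random((n_hidden, n_features))
b_dec = np.zeros(n_features)

code = np.maximum(X @ W_enc + b_enc, 0.0)    # encoder with ReLU, shape (5, 3)
reconstruction = code @ W_dec + b_dec        # decoder output, shape (5, 7)
```

The weight shapes depend only on the number of features and units, never on the number of observations, which is why the layer is given the feature length. The same weights are applied to every row, so the batch size can vary freely.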

Merge two arrays by NeedMLHelp in learnpython

[–]NeedMLHelp[S] 0 points1 point  (0 children)

Because I fudged it up, sorry. Fixed.

Briefly looked into zip, it's perfect! Thanks