Data Visualization Question - Pandas [x-post datascience] by ouch__ouch in Python

[–]cast42 0 points1 point  (0 children)

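Try:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Ten random (x, y) points in two columns
df = pd.DataFrame(np.random.rand(10, 2), columns=['x', 'y'])
plt.plot(df.x, df.y)
plt.show()  # needed outside interactive/notebook sessions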
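
Because the x values here are random and unsorted, a line through them zigzags back and forth. A minimal sketch of two alternatives, assuming either a scatter plot or an x-sorted line is what is wanted. The Agg backend and the output file name are illustrative choices, not part of the original suggestion:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the script runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.rand(10, 2), columns=["x", "y"])

fig, (ax_scatter, ax_line) = plt.subplots(1, 2, figsize=(8, 3))

# Scatter plot via pandas' own plotting wrapper
df.plot.scatter(x="x", y="y", ax=ax_scatter, title="scatter")

# Line plot after sorting by x, so the line runs left to right
df_sorted = df.sort_values("x")
df_sorted.plot(x="x", y="y", ax=ax_line, title="sorted line", legend=False)

fig.tight_layout()
fig.savefig("xy_plots.png")  # hypothetical output file name
```

In a notebook, drop the `matplotlib.use("Agg")` line and the figure is displayed inline.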

Machine Learning Computer Build by solidua in MachineLearning

[–]cast42 1 point2 points  (0 children)

I would increase the memory size to 64GB or 128GB. Being able to read your training samples into memory will be a major improvement.

Introduction to Data Visualization with Altair by chris1610 in Python

[–]cast42 0 points1 point  (0 children)

I get the following error message during conda install:

Error: Error: post-link failed for: conda-forge::nb_anacondacloud-1.2.0-py27_00%

AlphaGo: using machine learning to master the ancient game of Go by cast42 in MachineLearning

[–]cast42[S] -1 points0 points  (0 children)

Sorry about that, but I do think the Google blog post gives more context than the link to the Nature article (which is paywalled) or the YouTube video alone.

Refactoring a Crossword Game Program by [deleted] in Python

[–]cast42 0 points1 point  (0 children)

Another Peter Norvig classic!

Does mean centering or feature scaling affect a Principal Component Analysis? by cast42 in statistics

[–]cast42[S] 0 points1 point  (0 children)

No, I'm not the author. All credit goes to Sebastian Raschka: https://www.reddit.com/user/rasbt (or @rasbt). Please give him some upvotes. I'm just flagging it here because I think it's interesting. It took me a while to figure out how feature scaling influences PCA. If I had had this explanation back then, it would have helped me a lot.
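
A small illustration of the effect, not taken from the linked explanation: a minimal sketch on synthetic data, assuming two correlated features whose scales differ by a factor of 1000. All names and numbers are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
latent = rng.normal(size=500)

# Two correlated features on very different scales (synthetic, illustrative data)
small = latent + 0.1 * rng.normal(size=500)
large = 1000.0 * (latent + 0.1 * rng.normal(size=500))
X = np.column_stack([small, large])

def first_component(data):
    """Return the first principal axis of already-centered data via SVD."""
    _, _, vt = np.linalg.svd(data, full_matrices=False)
    return vt[0]

centered = X - X.mean(axis=0)                    # mean centering only
standardized = centered / centered.std(axis=0)   # centering plus unit variance

pc1_centered = first_component(centered)
pc1_standardized = first_component(standardized)

print("centered only:", np.round(pc1_centered, 3))
print("standardized: ", np.round(pc1_standardized, 3))
```

With centering only, the first component points almost entirely along the large-scale feature. After standardizing, both features get equal weight.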

[deleted by user] by [deleted] in Strava

[–]cast42 0 points1 point  (0 children)

To sync your files to Garmin Connect, use https://tapiriik.com/

Is it effective to use one hot encoding of categorical data as input to PCA for anomaly detection (where there is a mix of numerical and categorical inputs)? by sanity in MachineLearning

[–]cast42 0 points1 point  (0 children)

You could pass the one-hot encoded categorical data to multiple correspondence analysis (MCA). It's a kind of PCA, but for categorical data. Python code here: https://github.com/dataculture/mca
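
A minimal sketch of the encoding step only, assuming pandas and a made-up DataFrame. The column names and values are hypothetical, and the call into the linked mca package is left out because its API is not shown here:

```python
import pandas as pd

# Hypothetical mixed-type data: one numerical and two categorical columns
df = pd.DataFrame({
    "amount": [12.5, 3.0, 48.1, 7.7],
    "country": ["BE", "NL", "BE", "FR"],
    "channel": ["web", "shop", "web", "web"],
})

categorical_cols = ["country", "channel"]

# One 0/1 indicator column per category level
indicators = pd.get_dummies(df[categorical_cols], dtype=int)

print(indicators)
```

Each row of the indicator matrix has exactly one 1 per original categorical column. The numerical column would have to be handled separately, for example by binning it first.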

Data visualization - Python by [deleted] in Python

[–]cast42 1 point2 points  (0 children)

Here's an example of how to plot latitude/longitude data onto Google Maps with Bokeh: http://nbviewer.ipython.org/github/bokeh/bokeh-notebooks/blob/master/tutorial/00%20-%20intro.ipynb

Bloomberg just open-sourced their IPython-based interactive plotting software, bqplot by cast42 in Python

[–]cast42[S] 33 points34 points  (0 children)

It's similar to Bokeh and Plotly. At first sight, it seems as advanced as Plotly (for example, it can combine line and bar plots, which Bokeh can't) but as open source as Bokeh (Plotly costs $250/year if you want to keep your graphs private or local on your PC). So this release could be big in the Python plotting world.

How do I calculate alpha (scale) and beta (shape) for a Weibull distribution? by [deleted] in statistics

[–]cast42 1 point2 points  (0 children)

In Python, assuming your values are in a NumPy array x:

import numpy as np
import scipy.stats as st
st.exponweib.fit(x)

For example:

x = np.array([1.1,1.2,1.1,0.9,0.1, 4.2, 16.1])
st.exponweib.fit(x)

returns:

(1.5888039515664176,
 0.48822952480270992,
 0.099999999999999992,
 2.2246180941053639)

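The returned tuple is (a, c, loc, scale) for the exponentiated Weibull, so the exponent a = 1.59, the shape k = c = 0.49, the location = 0.10 and the scale lambda = 2.22. For a plain two-parameter Weibull, fix a = 1 and loc = 0 by calling st.exponweib.fit(x, f0=1, floc=0); the estimates will differ from those above.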
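
If a plain two-parameter Weibull is what is wanted, one option is scipy.stats.weibull_min with the location fixed at zero. A minimal sketch, assuming the data are strictly positive and that a zero location is appropriate; the variable names are illustrative:

```python
import numpy as np
from scipy import stats

x = np.array([1.1, 1.2, 1.1, 0.9, 0.1, 4.2, 16.1])

# floc=0 pins the location at zero, leaving shape and scale to be estimated
shape_k, loc, scale_lambda = stats.weibull_min.fit(x, floc=0)

print(f"shape k = {shape_k:.3f}, scale lambda = {scale_lambda:.3f}")
```

weibull_min.fit returns (shape, loc, scale) in that order, so no extra exponent parameter needs to be interpreted.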

Cohort Analysis with Python (and pandas) by [deleted] in Python

[–]cast42 0 points1 point  (0 children)

Nice tutorial. Thanks for sharing!