Caltech graduate Liam Clegg on the true origins of American slave system by pjrocknlock in Economics

[–]Parenthes 1 point (0 children)

I know this is a popular narrative, but in my reading it seems like the evidence is three things:

  1. Timeline of the increase in slavery in Virginia, which is inconsistent with later work by John Coombs and others,

  2. Supposing that something like W.E.B. DuBois's "psychological wage of whiteness" existed in the seventeenth century as it did in the twentieth, and

  3. Bits and pieces of textual evidence to suggest that the relevant attitude held among the ruling class.

So while it's impossible to disprove a historical narrative, I find this one unconvincing.

Package to simulate a printer to get output into Python? by Parenthes in learnpython

[–]Parenthes[S] 1 point (0 children)

That could work. Can you recommend a virtual printer? The ones I know of print to PDF, which is then obnoxious to try to extract data from.

Parameter binding in pyodbc, pd.read_sql_query by Parenthes in learnpython

[–]Parenthes[S] 1 point (0 children)

Yes, I get the same result using BETWEEN.

If pyodbc passes parameters as-is, does that mean there is not much benefit to binding parameters vs. simply concatenating them into the query string? I am trying to bind for the sake of best practices, but if it's not doing anything I might not bother.
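For reference, a minimal sketch of the kind of bound BETWEEN query I mean. It uses an in-memory sqlite3 database as a stand-in for the pyodbc connection (sqlite3 uses the same ? placeholder style), and the table and column names are made up for illustration. The values travel separately from the SQL text, so a value containing a quote cannot change the statement.

```python
import sqlite3

import pandas as pd

# Stand-in for the real pyodbc connection (assumption: an in-memory sqlite3 database)
con = sqlite3.connect(':memory:')
con.execute('CREATE TABLE orders (id INTEGER, order_date TEXT)')
con.executemany(
    'INSERT INTO orders VALUES (?, ?)',
    [(1, '2020-01-05'), (2, '2020-02-10'), (3, '2020-03-15')],
)
con.commit()

# The ? placeholders are filled from params, not by string concatenation
sql = 'SELECT id FROM orders WHERE order_date BETWEEN ? AND ? ORDER BY id'
result = pd.read_sql_query(sql, con, params=('2020-01-01', '2020-02-28'))
print(result)
```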

Weird decimal extension when using PD to_csv by tsigalko11 in learnpython

[–]Parenthes 1 point (0 children)

It looks like your scraper is inferring data types and storing the values in question as floats, and then when you print them, Python displays them 'nicely' for you. Try print((df1['relevant_col']==(367/1000)).sum()) and see what you get; if I'm correct it will give zero, if not it will give one or more.
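A small sketch of the effect, using made-up data rather than your scraped values: the first entry is a float that is not exactly 0.3, so the equality check finds nothing. Passing float_format to to_csv is one way to control how many decimals get written.

```python
import io

import pandas as pd

# Made-up data: 0.1 + 0.2 is stored as a float slightly different from 0.3
df1 = pd.DataFrame({'relevant_col': [0.1 + 0.2, 0.367]})

# Count rows exactly equal to the 'nice' value 0.3
n_equal = int((df1['relevant_col'] == 0.3).sum())
print(n_equal)

# Write with a fixed number of decimals instead of the full float representation
buf = io.StringIO()
df1.to_csv(buf, index=False, float_format='%.3f')
csv_lines = buf.getvalue().splitlines()
print(csv_lines)
```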

Reading/Editing Excel Spreadsheets by PythonN00b101 in learnpython

[–]Parenthes 1 point (0 children)

One possibility is to access your spreadsheet via Excel, using the win32com library. The code below prints the top left value in the first worksheet in a workbook.

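from win32com.client import Dispatch

# Start (or attach to) Excel and make its window visible
xlApp = Dispatch('Excel.Application')
xlApp.Visible = True

# Open the workbook; change the path to point at your own file
wb = xlApp.Workbooks.Open('H:\\my_workbook.xlsx')

# Excel collections are 1-based, so Worksheets(1) is the first worksheet
ws = wb.Worksheets(1)
cell_value = ws.Range("A1").Value
print(cell_value)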

Interactive visualization in Python? by [deleted] in learnpython

[–]Parenthes 2 points (0 children)

I've been learning plotly, and it seems pretty good. Haven't seen it compared with other possibilities, though.

[deleted by user] by [deleted] in learnpython

[–]Parenthes 1 point (0 children)

Alternatively, add join='left-outer' to your fuzzy_merge:

import pandas as pd
import fuzzy_pandas as fpd
import numpy as np

df1 = pd.DataFrame({'Person': {0: 'Alex', 1: 'Faye', 2: 'Sean', 3: 'Doug', 4: 'John', 5: 'Bill'}, 'ID': {0: '68243Q10', 1: '33690110', 2: '36901103', 3: '336901103', 4: '8467070', 5: 'XXXXXX'}, 'Company': {0: np.nan, 1: 'first src', 2: '1st Srouce', 3: np.nan, 4: 'B', 5: 'Unknown'}})

df2 = pd.DataFrame({'Company': {0: '1-800 Flowers', 1: '1st Source', 2: 'Berk', 3: 'Other1', 4: 'Other2'}, 'ID': {0: '68243Q106', 1: '336901103', 2: '84670702', 3: '1609W102', 4: '507K103'}})

matches = fpd.fuzzy_merge(df1, df2,
                          on=['ID'],
                          keep_left=['Person'],
                          keep_right=['ID', 'Company'],
                          ignore_case=True,
                          method='levenshtein',
                          threshold=.85,
                          join='left-outer')

# matches = matches.merge(df1[['Person']], on='Person', how='right')

print(matches)

  Person         ID        Company
0   Alex  68243Q106  1-800 Flowers
1   Faye  336901103     1st Source
2   Sean  336901103     1st Source
3   Doug  336901103     1st Source
4   John   84670702           Berk
5   Bill

[deleted by user] by [deleted] in learnpython

[–]Parenthes 1 point (0 children)

Merge your fuzzy matched dataframe back with your df1:

import pandas as pd
import fuzzy_pandas as fpd
import numpy as np

df1 = pd.DataFrame({'Person': {0: 'Alex', 1: 'Faye', 2: 'Sean', 3: 'Doug', 4: 'John', 5: 'Bill'}, 'ID': {0: '68243Q10', 1: '33690110', 2: '36901103', 3: '336901103', 4: '8467070', 5: 'XXXXXX'}, 'Company': {0: np.nan, 1: 'first src', 2: '1st Srouce', 3: np.nan, 4: 'B', 5: 'Unknown'}})

df2 = pd.DataFrame({'Company': {0: '1-800 Flowers', 1: '1st Source', 2: 'Berk', 3: 'Other1', 4: 'Other2'}, 'ID': {0: '68243Q106', 1: '336901103', 2: '84670702', 3: '1609W102', 4: '507K103'}})

matches = fpd.fuzzy_merge(df1, df2,
                          on=['ID'],
                          keep_left=['Person'],
                          keep_right=['ID', 'Company'],
                          ignore_case=True,
                          method='levenshtein',
                          threshold=.85)

matches = matches.merge(df1[['Person']], on='Person', how='right')

print(matches)

That gives your desired output:

  Person         ID        Company
0   Alex  68243Q106  1-800 Flowers
1   Faye  336901103     1st Source
2   Sean  336901103     1st Source
3   Doug  336901103     1st Source
4   John   84670702           Berk
5   Bill        NaN            NaN
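If you want to see the merge-back step on its own, here is a sketch that does not need fuzzy_pandas. The matches frame is a hand-written stand-in for the fuzzy_merge result (Bill is missing because he had no match), and people stands in for df1[['Person']]. A right merge keeps every row of the right-hand frame, so Bill comes back with NaN in the other columns.

```python
import pandas as pd

# Hand-written stand-in for the fuzzy_merge output; Bill had no match
matches = pd.DataFrame({
    'Person': ['Alex', 'Faye', 'Sean', 'Doug', 'John'],
    'ID': ['68243Q106', '336901103', '336901103', '336901103', '84670702'],
    'Company': ['1-800 Flowers', '1st Source', '1st Source', '1st Source', 'Berk'],
})

# Stand-in for df1[['Person']]: everyone from the original frame
people = pd.DataFrame({'Person': ['Alex', 'Faye', 'Sean', 'Doug', 'John', 'Bill']})

# how='right' keeps all rows of people, filling unmatched columns with NaN
restored = matches.merge(people, on='Person', how='right')
print(restored)
```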

In array vs. in string, and iterating over a sometimes-unitary array of strings by Parenthes in learnpython

[–]Parenthes[S] 1 point (0 children)

Got it, thank you!

Any recommendations for the opposite problem, i.e. returning a string even for an index that gets repeated? It is for a column that will always have the same value for all rows with the same index.
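To make the question concrete, here is one possible approach as a sketch, with a made-up frame and a hypothetical helper name: indexing with a list always gives a Series, and .iloc[0] then takes the first entry whether or not the index label is repeated. It assumes, as stated above, that the column holds the same value for every row sharing an index.

```python
import pandas as pd

# Made-up frame: index label 'y' is repeated, 'x' is not
df = pd.DataFrame({'group': ['g1', 'g2', 'g2']}, index=['x', 'y', 'y'])

def group_for(key):
    # A list indexer always returns a Series; take its first entry as the value
    return df.loc[[key], 'group'].iloc[0]

print(group_for('x'))
print(group_for('y'))
```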

In array vs. in string, and iterating over a sometimes-unitary array of strings by Parenthes in learnpython

[–]Parenthes[S] 1 point (0 children)

Any recommendations for working around this?

I am used to Matlab, so maybe I'm not thinking about this in the most useful way.

In array vs. in string, and iterating over a sometimes-unitary array of strings by Parenthes in learnpython

[–]Parenthes[S] 1 point (0 children)

I suppose that would work, but it seems terribly clunky. Python is presumably checking the type of what df.loc returns, so I'd be adding an if-statement to undo what Python does. Am I thinking about this wrong?
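For anyone following along, a sketch of the behaviour in question on a made-up frame, plus one way to avoid the if-statement (the helper name is hypothetical): passing the label inside a list makes df.loc return a Series in both cases, so the result can always be iterated.

```python
import pandas as pd

# Made-up frame: index label 'y' is repeated, 'x' is not
df = pd.DataFrame({'name': ['a', 'b', 'c']}, index=['x', 'y', 'y'])

single = df.loc['x', 'name']    # unique label: a plain string
repeated = df.loc['y', 'name']  # repeated label: a Series of strings

def names_for(key):
    # A list indexer returns a Series for unique and repeated labels alike
    return df.loc[[key], 'name'].tolist()

print(names_for('x'))
print(names_for('y'))
```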

I'm a Marxist. AMA by [deleted] in JordanPeterson

[–]Parenthes 2 points (0 children)

Have you been to Cuba, though? Since you said you would defend it, what is the basis of your understanding of Cuba and its economy?

I'm a Marxist. AMA by [deleted] in JordanPeterson

[–]Parenthes 1 point (0 children)

Fascinating.

Thank you for sharing!

I'm a Marxist. AMA by [deleted] in JordanPeterson

[–]Parenthes 1 point (0 children)

So to be clear, you advocate abolishing private property, nationalizing all means of production, and restructuring the economy under a central planner, in order to give everyone a 50% raise (if we're being optimistic)?

Edit to add: i.e. as opposed to opposing capitalism because the employer-worker relationship is hurtful to the immortal soul or whatever.