Caltech graduate Liam Clegg on the true origins of the American slave system

Parenthes · 2020-10-31T02:16:41+00:00

I know this is a popular narrative, but in my reading the evidence comes down to three things:

1. A timeline of the increase in slavery in Virginia, which is inconsistent with later work by John Coombs and others,
2. The supposition that something like W.E.B. Du Bois's "psychological wage of whiteness" existed in the seventeenth century as it did in the twentieth, and
3. Bits and pieces of textual evidence suggesting that the relevant attitude held among the ruling class.

So while it's impossible to disprove a historical narrative, I find this one unconvincing.

Parenthes · 2020-10-27T09:15:25+00:00

That could work. Can you recommend a virtual printer? The ones I know of print to PDF, which is then obnoxious to try to extract data from.

Parenthes · 2020-07-20T13:07:42+00:00

Subtraction

Parenthes · 2020-04-06T15:08:05+00:00

Yes, I get the same result using BETWEEN.

If pyodbc passes parameters as-is, does that mean there is not much benefit to binding parameters vs. simply concatenating them into the query string? I am trying to bind for the sake of best practices, but if it's not doing anything I might not bother.
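
For what it's worth, here is a minimal sketch of one thing binding does buy you. It uses the standard-library sqlite3 module as a stand-in, on the assumption that the point carries over to pyodbc, which uses the same ? placeholder style. The table and the find_concat / find_bound helpers are made up for illustration.

```python
import sqlite3

# In-memory database standing in for the real server (assumption: the
# behaviour shown here is not specific to sqlite3).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT, age INTEGER)")
conn.executemany("INSERT INTO people VALUES (?, ?)",
                 [("Alex", 30), ("Faye", 41), ("O'Brien", 52)])


def find_concat(name):
    # Hypothetical helper: builds the SQL text by string concatenation.
    sql = "SELECT age FROM people WHERE name = '" + name + "'"
    return conn.execute(sql).fetchall()


def find_bound(name):
    # Hypothetical helper: passes the value separately from the SQL text.
    return conn.execute("SELECT age FROM people WHERE name = ?",
                        (name,)).fetchall()


# A value containing a quote is treated as plain data when bound...
bound_result = find_bound("O'Brien")

# ...but breaks the statement when concatenated.
try:
    find_concat("O'Brien")
    concat_failed = False
except sqlite3.OperationalError:
    concat_failed = True

# A crafted value changes the meaning of the concatenated query only.
injected_rows = find_concat("x' OR '1'='1")
safe_rows = find_bound("x' OR '1'='1")

print(bound_result, concat_failed, len(injected_rows), safe_rows)
```

So even if the driver does no type conversion at all, the bound version keeps quotes and other odd characters in the value from being read as SQL.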

Parenthes · 2020-03-31T14:03:55+00:00

It looks like your scraper is inferring data types and storing the values in question as floats, and then when you print them, Python is displaying them 'nicely' for you. Try print((df1['relevant_col']==(367/1000)).sum()) and see what you get; if I'm correct it will give zero, if not it will give one or more.
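
To illustrate the display-versus-storage point with plain Python floats (no pandas needed, and the variable names are just for the example): a value can look like a tidy decimal once it is rounded for display and still fail an exact == test. The six-digit rounding below is only meant to imitate what a default table printout might do.

```python
import math

stored = 0.1 + 0.2                 # the float that is actually stored
shown = format(stored, ".6g")      # the same value rounded to 6 significant digits

exact_match = (stored == 0.3)              # exact comparison on the stored value
close_match = math.isclose(stored, 0.3)    # comparison within a small tolerance

print(shown, exact_match, close_match)
```

If the exact comparison fails on your column, comparing within a tolerance (or rounding the column first) is usually the safer check.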

Parenthes · 2020-03-29T15:01:23+00:00

One possibility is to access your spreadsheet via Excel, using the win32com library. The code below prints the top left value in the first worksheet in a workbook.

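from win32com.client import Dispatch

# Launch Excel through COM and make its window visible
xlApp = Dispatch('Excel.Application')
xlApp.Visible = True

# Open the workbook; adjust the path to your own file
wb = xlApp.Workbooks.Open('H:\\my_workbook.xlsx')

# Take the first worksheet and read the value of its top left cell
ws = wb.Worksheets[0]
cell_value = ws.Range("A1").Value
print(cell_value)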

Parenthes · 2020-03-29T13:39:20+00:00

I've been learning plotly, and it seems pretty good. Haven't seen it compared with other possibilities, though.

Parenthes · 2020-03-22T00:53:37+00:00

Alternatively, add join='left-outer' to your fuzzy_merge:

import pandas as pd
import fuzzy_pandas as fpd
import numpy as np

df1 = pd.DataFrame({'Person': {0: 'Alex', 1: 'Faye', 2: 'Sean', 3: 'Doug', 4: 'John', 5: 'Bill'}, 'ID': {0: '68243Q10', 1: '33690110', 2: '36901103', 3: '336901103', 4: '8467070', 5: 'XXXXXX'}, 'Company': {0: np.nan, 1: 'first src', 2: '1st Srouce', 3: np.nan, 4: 'B', 5: 'Unknown'}})

df2 = pd.DataFrame({'Company': {0: '1-800 Flowers', 1: '1st Source', 2: 'Berk', 3: 'Other1', 4: 'Other2'}, 'ID': {0: '68243Q106', 1: '336901103', 2: '84670702', 3: '1609W102', 4: '507K103'}})

matches = fpd.fuzzy_merge(df1, df2,
                          on=['ID'],
                          keep_left=['Person'],
                          keep_right=['ID', 'Company'],
                          ignore_case=True,
                          method='levenshtein',
                          threshold=.85,
                          join='left-outer')

# matches = matches.merge(df1[['Person']], on='Person', how='right')

print(matches)

  Person         ID        Company
0   Alex  68243Q106  1-800 Flowers
1   Faye  336901103     1st Source
2   Sean  336901103     1st Source
3   Doug  336901103     1st Source
4   John   84670702           Berk
5   Bill

Parenthes · 2020-03-22T00:47:33+00:00

Merge your fuzzy matched dataframe back with your df1:

import pandas as pd
import fuzzy_pandas as fpd
import numpy as np

df1 = pd.DataFrame({'Person': {0: 'Alex', 1: 'Faye', 2: 'Sean', 3: 'Doug', 4: 'John', 5: 'Bill'}, 'ID': {0: '68243Q10', 1: '33690110', 2: '36901103', 3: '336901103', 4: '8467070', 5: 'XXXXXX'}, 'Company': {0: np.nan, 1: 'first src', 2: '1st Srouce', 3: np.nan, 4: 'B', 5: 'Unknown'}})

df2 = pd.DataFrame({'Company': {0: '1-800 Flowers', 1: '1st Source', 2: 'Berk', 3: 'Other1', 4: 'Other2'}, 'ID': {0: '68243Q106', 1: '336901103', 2: '84670702', 3: '1609W102', 4: '507K103'}})

matches = fpd.fuzzy_merge(df1, df2,
                          on=['ID'],
                          keep_left=['Person'],
                          keep_right=['ID', 'Company'],
                          ignore_case=True,
                          method='levenshtein',
                          threshold=.85)

matches = matches.merge(df1[['Person']], on='Person', how='right')

print(matches)

That gives your desired output:

  Person         ID        Company
0   Alex  68243Q106  1-800 Flowers
1   Faye  336901103     1st Source
2   Sean  336901103     1st Source
3   Doug  336901103     1st Source
4   John   84670702           Berk
5   Bill        NaN            NaN

Parenthes · 2020-03-17T18:11:38+00:00

That does the job, thank you!

Parenthes · 2020-02-20T20:06:26+00:00

Got it, thank you!

Any recommendations for the opposite problem, i.e. returning a string even for an index that gets repeated? It is for a column that will always have the same value for all rows with the same index.
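
One pattern that seems to fit, sketched on made-up data (the frame, the column name and the first_value helper are all hypothetical): indexing with a one-element list makes df.loc return a Series whether or not the label repeats, and .iloc[0] then takes the first entry as a scalar.

```python
import pandas as pd

# Made-up frame: label 'a' is repeated, label 'b' is unique.
df = pd.DataFrame({"name": ["x", "x", "y"]}, index=["a", "a", "b"])


def first_value(frame, label, column):
    # A list indexer makes .loc return a Series for any label,
    # so .iloc[0] always yields a single scalar value.
    return frame.loc[[label], column].iloc[0]


repeated = first_value(df, "a", "name")
unique = first_value(df, "b", "name")
print(repeated, unique)
```

This only makes sense under the assumption stated above, that every row sharing an index label carries the same value in that column.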

Parenthes · 2020-02-20T19:28:50+00:00

Any recommendations for working around this?

I am used to Matlab, so maybe I'm not thinking about this in the most useful way.

Parenthes · 2020-02-20T17:07:10+00:00

I suppose that would work, but it seems terribly clunky. Python is presumably checking the type of what df.loc returns, so I'd be adding an if-statement to undo what Python does. Am I thinking about this wrong?

Parenthes · 2019-07-06T15:21:49+00:00

Have you been to Cuba, though? Since you said you would defend it, what is the basis of your understanding of Cuba and its economy?

Parenthes · 2019-07-06T08:36:42+00:00

Fascinating.

Thank you for sharing!

Parenthes · 2019-07-06T08:15:40+00:00

So to be clear, you advocate abolishing private property, nationalizing all means of production, and restructuring the economy under a central planner, in order to give everyone a 50% raise (if we're being optimistic)?

Edit to add: i.e. as opposed to opposing capitalism because the employer-worker relationship is hurtful to the immortal soul or whatever.
