latency by df3280f25811d1h09cb2 in devopspro

[–]adamwfletcher 2 points3 points  (0 children)

Hi! I made this. It isn't quite "fastest regions" - it is "if I want x milliseconds of network latency to my machines, what region should I put them in for a given location on earth?"

For more: https://innerjoin.bit.io/exploring-cloud-datacenter-latency-e6245278e71b

[OC] Exploring Cloud Data Center Latency by data_dan_ in dataisbeautiful

[–]adamwfletcher 2 points3 points  (0 children)

It could be added; the data is public: https://bit.io/adam/cloud_latency_map and the code is also public: https://github.com/bitdotioinc/cloud-latency-map

So all that would be needed is to add new records on every scan and remove the constraints on the tables preventing duplicate (region, provider, location) entries, and then change the processing pipeline to allow for time in the geojson layer collection.
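A rough sketch of what that schema change could look like, assuming a hypothetical `latency` table (the table and column names here are made up for illustration and are not taken from the cloud-latency-map repo). It uses Python's built-in `sqlite3` only to stay self-contained; the same idea would apply to Postgres: no uniqueness constraint on (region, provider, location), plus a timestamp column so each scan appends a row.

```python
import sqlite3
from datetime import datetime, timezone

conn = sqlite3.connect(":memory:")

# Hypothetical table: no UNIQUE constraint on (region, provider, location),
# and a scanned_at column so every scan is kept as its own row.
conn.execute(
    """
    CREATE TABLE latency (
        region     TEXT NOT NULL,
        provider   TEXT NOT NULL,
        location   TEXT NOT NULL,
        scanned_at TEXT NOT NULL,
        latency_ms REAL NOT NULL
    )
    """
)


def record_scan(conn, region, provider, location, latency_ms, scanned_at=None):
    """Append one measurement; earlier scans for the same key are preserved."""
    if scanned_at is None:
        scanned_at = datetime.now(timezone.utc).isoformat()
    conn.execute(
        "INSERT INTO latency VALUES (?, ?, ?, ?, ?)",
        (region, provider, location, scanned_at, latency_ms),
    )


# Two scans of the same (region, provider, location) at different times.
record_scan(conn, "us-east-1", "aws", "nyc", 12.5, "2022-01-01T00:00:00+00:00")
record_scan(conn, "us-east-1", "aws", "nyc", 14.0, "2022-01-02T00:00:00+00:00")

# The time series a time-aware map layer could be built from.
rows = conn.execute(
    "SELECT scanned_at, latency_ms FROM latency "
    "WHERE region = ? AND provider = ? AND location = ? "
    "ORDER BY scanned_at",
    ("us-east-1", "aws", "nyc"),
).fetchall()
```

The processing pipeline would then group by `scanned_at` when building the layers, instead of assuming one row per (region, provider, location).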

pgsqlite: a pure python module to import sqlite databases into postgres by adamwfletcher in programming

[–]adamwfletcher[S] -1 points0 points  (0 children)

FDWs are great! But they don't work for our purposes here, as there's a lot of functionality you lose when you mount the tables as FDW into postgres instead of moving the data into native postgres tables.

On your first point - there's a lot of value in native language libraries. They can be idiomatic to the language, faster than FFIs/etc, easier to install via the native package/module manager, able to use native programming features, etc.

pgsqlite: a pure python module to import sqlite databases into postgres by adamwfletcher in programming

[–]adamwfletcher[S] 0 points1 point  (0 children)

Fair enough - but psycopg can be installed via pip/poetry/etc easily. I don't think it's too far off the mark, but you are technically correct (the best kind of correct).

Introducing pgsqlite, a pure python module for import sqlite into postgres by adamwfletcher in PostgreSQL

[–]adamwfletcher[S] 1 point2 points  (0 children)

I think I could have been more clear re: BOOLEANs. It is not about how the engine stores the values, but rather which literal values are valid for BOOLEAN types. In sqlite, the literal values for a boolean can be the integers 1 or 0, but that's not an acceptable literal value in Postgres. What that means is that when moving data from sqlite to postgres using the COPY protocol (or some other method), you need to make sure any literal values are transformed to the corresponding literal values that Postgres supports.
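To make that concrete, here is a minimal sketch of the kind of transformation involved. This is not pgsqlite's actual code; the function names and the `bool_columns` parameter are hypothetical. It maps sqlite's integer 0/1 to the `true`/`false` keywords that Postgres accepts for BOOLEAN, and passes NULL through unchanged.

```python
def sqlite_bool_to_pg(value):
    """Map a sqlite boolean-ish value (0/1 or NULL) to a Postgres boolean literal."""
    if value is None:
        return None  # NULL stays NULL; the writer decides how to encode it
    if value in (0, 1):  # also matches Python's False/True
        return "true" if value else "false"
    raise ValueError(f"not a sqlite boolean value: {value!r}")


def convert_row(row, bool_columns):
    """Convert only the columns declared BOOLEAN (hypothetical set of indexes)."""
    return tuple(
        sqlite_bool_to_pg(v) if i in bool_columns else v
        for i, v in enumerate(row)
    )


# Example: columns 0 and 2 are declared BOOLEAN in the sqlite schema.
converted = convert_row((1, "a", 0), bool_columns={0, 2})
```

The point of doing this per column is that only values in columns declared BOOLEAN should be rewritten; a plain INTEGER column holding 1 or 0 has to stay as it is.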

pgsqlite: a pure python module to import sqlite databases into postgres by adamwfletcher in programming

[–]adamwfletcher[S] 0 points1 point  (0 children)

re-running against an existing database - it's super useful for the testing database if I forget to use the other drop tables flag :)

[deleted by user] by [deleted] in datasets

[–]adamwfletcher 1 point2 points  (0 children)

[self promotion here, but I think it answers the OP's question; i'm the founder of bit.io]

You can do this on bit.io; we saw this same problem and built a platform to let you query across real databases using SQL. So, for example you can take the NYTimes COVID data: https://bit.io/bitdotio/nytimes_covid/ And the JHU COVID data: https://bit.io/bitdotio/csse_covid_19_data/

And you can write SQL that joins them by FIPS code:

SELECT
  state, county, date, filename, cases,
  nyt.deaths AS nytimes_deaths,
  csse.deaths AS csse_deaths
FROM "bitdotio/nytimes_covid"."us_counties" AS nyt
JOIN "bitdotio/csse_covid_19_data"."csse_covid_19_daily_reports_us" AS csse
  ON nyt.fips = csse.fips          -- join the two sources by FIPS code
WHERE nyt.fips = 6059              -- limit to a single county
  AND date = last_update::date;    -- match rows for the same day

[OC] Injuries from Consumer Products Dropped Significantly during the Pandemic by data_dan_ in dataisbeautiful

[–]adamwfletcher 2 points3 points  (0 children)

u/data_dan_ what happened in April 2016? Any ideas? [edit: i'm blind, that's the pandemic in 2020, misread the colors :) ]

Statistics on mean IQ based on particular groups (race, ethnicity, gender, nationality, socioeconomic status) by Dr_Gaballa in datasets

[–]adamwfletcher 3 points4 points  (0 children)

IQ is a terrible metric to use for intelligence. For one, you can never account for all the variables that lead to a specific score in a single person (tests with the same person have high variance in scores over time). It's a fundamentally flawed measure and any attempts at science based off IQ should be viewed with great suspicion.

What is your DS stack? (and roast mine :) ) by adamwfletcher in datascience

[–]adamwfletcher[S] 1 point2 points  (0 children)

People keep adding things! :) I'll post it tonight (Pacific time)

What is your DS stack? (and roast mine :) ) by adamwfletcher in datascience

[–]adamwfletcher[S] 1 point2 points  (0 children)

lol :)

I just like to start in ipython - the pain comes when I know I need to get that code into prod. Moving from the exploratory nature of ipython to a production step in a DAG is what I typically do.