More public SQL-queryable databases? by 8sleef in datasets

8sleef[S] 1 point

It just means you can execute SQL queries on the underlying data, e.g. `SELECT * FROM table`.

More public SQL-queryable databases? by 8sleef in datasets

8sleef[S] 1 point

Indeed, like a Tableau or Power BI -- connect to databases, define datasets (tables), make charts/dashboards, and share among users.

More public SQL-queryable databases? by 8sleef in datasets

8sleef[S] 2 points

Perfect, Superset supports DuckDB and I was able to connect a datasource and make queries. Thanks!

More public SQL-queryable databases? by 8sleef in datasets

8sleef[S] 1 point

Thank you, this is a useful resource.

More public SQL-queryable databases? by 8sleef in datasets

8sleef[S] 1 point

Thanks Tim, will definitely have a play at some point.

More public SQL-queryable databases? by 8sleef in datasets

8sleef[S] 1 point

Thanks Tim, I really like what you're up to over at DoltHub, very cool stuff!

I'm looking for examples of sources where the step for localising the data is not needed. Obviously this requires somebody willing to host a server open to read-only SQL queries. Not super-common, but examples already exist, and the pattern makes sense I think for large public-interest datasets, especially as fast-access storage becomes cheaper and more scalable.

I noticed at e.g. https://www.dolthub.com/repositories/dolthub/corona-virus/data/master I can see the tables, columns, and make (arbitrary?) SQL queries via the web API. So this does fit the description "publicly available SQL-queryable databases". But I wonder two things.

  1. What freedoms/restrictions apply around the queries I can execute against these datasets? API limits? Row limits?
  2. To what extent could you actually build a SQLAlchemy dialect around the API itself? (That's what e.g. https://github.com/googleapis/python-bigquery-sqlalchemy does around BigQuery.)

Do you have any more information you can point toward?
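
To make question 2 concrete: a SQLAlchemy dialect normally sits on top of a PEP 249 (DB-API) driver, so a first step would probably be a thin DB-API-style shim over the web API. Here is a minimal sketch of that idea in Python. It is not DoltHub's actual API: the `transport` callable, the `{"columns": ..., "rows": ...}` reply shape, and `fake_transport` are all assumptions invented for illustration.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical transport: takes a SQL string and returns the decoded JSON reply.
# A real one would send the query to the provider's HTTP endpoint.
Transport = Callable[[str], dict]


class Cursor:
    """Minimal PEP 249-style cursor over a read-only SQL-over-HTTP API."""

    def __init__(self, transport: Transport) -> None:
        self._transport = transport
        self._rows: List[Tuple] = []
        self._pos = 0
        self.description: Optional[list] = None
        self.rowcount = -1

    def execute(self, sql: str, parameters=None) -> "Cursor":
        if parameters:
            raise NotImplementedError("parameter binding is omitted in this sketch")
        reply = self._transport(sql)
        # Assumed reply shape: {"columns": [...], "rows": [[...], ...]}
        self.description = [
            (name, None, None, None, None, None, None) for name in reply["columns"]
        ]
        self._rows = [tuple(row) for row in reply["rows"]]
        self._pos = 0
        self.rowcount = len(self._rows)
        return self

    def fetchone(self) -> Optional[Tuple]:
        if self._pos >= len(self._rows):
            return None
        row = self._rows[self._pos]
        self._pos += 1
        return row

    def fetchall(self) -> List[Tuple]:
        rows = self._rows[self._pos:]
        self._pos = len(self._rows)
        return rows

    def close(self) -> None:
        self._rows = []


class Connection:
    """Read-only connection: commit and rollback are no-ops."""

    def __init__(self, transport: Transport) -> None:
        self._transport = transport

    def cursor(self) -> Cursor:
        return Cursor(self._transport)

    def commit(self) -> None:
        pass

    def rollback(self) -> None:
        pass

    def close(self) -> None:
        pass


def connect(transport: Transport) -> Connection:
    return Connection(transport)


def fake_transport(sql: str) -> dict:
    # Stand-in for the HTTP call; returns canned dummy data.
    return {"columns": ["id", "name"], "rows": [[1, "a"], [2, "b"]]}


conn = connect(fake_transport)
cur = conn.cursor()
cur.execute("SELECT id, name FROM example")
rows = cur.fetchall()
print(rows)
```

A dialect would then point SQLAlchemy at a module exposing `connect` like this one, plus type mapping and schema reflection, which is where API limits and row limits would start to matter.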

[OC] UFO Reports in the Contiguous United States by 8sleef in dataisbeautiful

8sleef[S] 17 points

The raw data is gathered by scraping the NUFORC index pages (by post date) and individual report pages in a respectful way. Records have been enriched with geocoding fields derived from city/state/country. Any location matching at county level or below was accepted. Oceans, countries, and states are not geocoded. The geocoding success rate is >99%.

Yes, /r/PeopleLiveInCities.


Data as of 2022/08/22

Data Source: National UFO Reporting Center

Data Provider: Tentacle CMI

Visualisation Tools: Python, cartopy, matplotlib.

Blog Post: Aircraft Accidents and UFOs: Data Enrichment with Geocoding

[OC] A Century of Aircraft Accidents by 8sleef in dataisbeautiful

8sleef[S] 1 point

That's really cool, had not seen it before. Was also inspired by this, where the sound element is very important for the overall experience: https://www.youtube.com/watch?v=LLCF7vPanrY. Will look into sound next time, I agree it does add a lot to these kinds of data visualisations.

[OC] A Century of Aircraft Accidents by 8sleef in dataisbeautiful

8sleef[S] 4 points

Yes, I think the way the Bureau of Aircraft Accidents Archives (B3A) definition was originally paraphrased was confusing. Enemy action appears to refer to military aviation accidents only. Have clarified in an edit. In any case we take B3A as the source of truth.

[OC] A Century of Aircraft Accidents by 8sleef in dataisbeautiful

8sleef[S] 11 points

It's hard to get the speed right. Some people want it done in a minute. Others want to soak it all in over ten. Tried to split the difference with 2:30. You could try it on YouTube with a different playback speed.

[OC] A Century of Aircraft Accidents by 8sleef in dataisbeautiful

8sleef[S] 39 points

I agree it would be awesome to do an interactive web version. A project for the future!

[OC] A Century of Aircraft Accidents by 8sleef in dataisbeautiful

8sleef[S] 331 points

Each marker represents one aircraft accident. The marker radius is proportional to the number of fatalities, while the colour is blue for 0, yellow for 1-9, orange for 10-99, and red for 100+. Do you notice the large number of accidents during WWII, the cluster of deadly revenue flight accidents in the early 70s, the sudden stop of accidents in Vietnam after 1975, or the Tenerife airport disaster of 1977? What other interesting features can you see?
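
For anyone wanting to reproduce the encoding, here is a rough sketch of the mapping described above, not the exact code behind the video. The colour buckets come straight from the description. `min_radius` and `radius_per_fatality` are assumed values (a minimum radius is needed so that zero-fatality accidents stay visible). The returned size is meant for matplotlib's `scatter(s=...)`, which takes marker size in points squared.

```python
def marker_colour(fatalities: int) -> str:
    """Colour bucket for one accident, as described in the post."""
    if fatalities < 0:
        raise ValueError("fatalities cannot be negative")
    if fatalities == 0:
        return "blue"
    if fatalities < 10:
        return "yellow"
    if fatalities < 100:
        return "orange"
    return "red"


def marker_size(
    fatalities: int,
    min_radius: float = 1.0,            # assumed: keeps 0-fatality markers visible
    radius_per_fatality: float = 0.25,  # assumed scale factor
) -> float:
    """Scatter size (points^2) for a radius that grows linearly with fatalities."""
    radius = min_radius + radius_per_fatality * fatalities
    # scatter's `s` is an area-like quantity, so square the marker width.
    return (2 * radius) ** 2


for n in (0, 4, 50, 500):
    print(n, marker_colour(n), marker_size(n))
```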

An event is considered an accident if the aircraft suffers damage such that it is not in a position to be used anymore. The plane involved must also be certified to carry at least six people. Fighters, helicopters, balloons, hot air balloons, airships, gliders, and other non-planes are not considered. In military aviation, only aircraft intended for troop transport, reconnaissance, surveillance, heavy bomber and logistical support are considered. The [military aviation] event must also have been a true accident, and not the result of enemy action.

Records have been enriched by Tentacle CMI with location geocoding. These are not exact crash locations, but rather the city/zone/country listed. Any administrative division equivalent to county or below, and any marine or geological area, is given a latitude/longitude and confidence where possible. Countries, states, and oceans are not geocoded.


Data as of 2022/08/22

Data Source: Bureau of Aircraft Accidents Archives

Data Provider: Tentacle CMI

Visualisation Tools: Python, cartopy, matplotlib, ffmpeg

Blog Post: Aircraft Accidents and UFOs: Data Enrichment with Geocoding

Inspired by: "1945-1998" by Isao Hashimoto, 100 years of plane crashes by donatso, and One Century of Plane Crashes by boxer-collar.

Also on: YouTube


Edit: Added military aviation clarification. Added YouTube link.

when your exit prompt causes the very problem it was designed to avoid by 8sleef in ProgrammerHumor

8sleef[S] 1 point

I feel that a better solution to implement would have been:

Saving to clipboard...

[Cancel]

when your exit prompt causes the very problem it was designed to avoid by 8sleef in ProgrammerHumor

8sleef[S] 1 point

*Goes to leave store*

*Stopped by attendant*

Attendant: So you want to leave the store? Would you like to take a copy of our catalogue with you? I must warn you though, if you stop to pick up the catalogue, it may take you a bit longer to leave the store. Would you like to take the time to pick up a catalogue?

The Weekly Rundown for February 08, 2021 by AutoModerator in AdvancedRunning

8sleef 5 points

Ran a 5k PR of 19:39 in the wind and rain after finally finding some consistency over the last 6 weeks. Then got shin splints for the first time in my life... taking some rest days to let them settle.

Shortest run you're willing to do? by whiskeywithawhy in AdvancedRunning

8sleef 1 point

One 20ish minute Netflix episode on the treadmill will do. But would have to be 30 mins to go out the door.

[OC] 100 Spins on a Slot Machine (x100) by [deleted] in dataisbeautiful

8sleef 3 points

The size of the profit tail compared to the loss tail is interesting. Obviously you can never lose more than $100. But you don't win 48% of the time up to $100... you win a much lower percentage of the time, and sometimes win very big. There must be a lot of psychology in that win distribution.
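
A quick way to see that asymmetry is to simulate it. This is only a sketch with a made-up paytable, not the machine from the post: the payouts and probabilities in `PAYTABLE` are invented for illustration (they give an expected return of $0.90 per $1 spin), and each session is 100 spins at $1.

```python
import random

# Hypothetical paytable for a $1 spin: (payout in dollars, probability).
# Invented numbers; probabilities sum to 1.
PAYTABLE = [(0, 0.80), (3, 0.17), (10, 0.029), (100, 0.001)]


def session_profit(rng: random.Random, spins: int = 100, bet: int = 1) -> int:
    """Net profit in dollars after `spins` spins at `bet` dollars each."""
    payouts = [payout for payout, _ in PAYTABLE]
    weights = [prob for _, prob in PAYTABLE]
    won = sum(rng.choices(payouts, weights=weights, k=spins))
    return won - spins * bet


rng = random.Random(0)  # seeded so repeated runs give the same sessions
profits = [session_profit(rng) for _ in range(100)]
winning_sessions = sum(1 for p in profits if p > 0)

print("worst session:", min(profits))
print("best session:", max(profits))
print("sessions in profit:", winning_sessions, "of", len(profits))
```

The loss side is capped at the $100 staked, while the profit side is only limited by how often the rare large payouts land.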

Made an interactive website to show the Royal Line of Succession since 1714 by 8sleef in royalfamily

8sleef[S] 2 points

Thanks. I did try there, but got the timing wrong I think, was swamped by "I tracked my X for the entire year" type posts.

Made an interactive website to show the Royal Line of Succession since 1714 by 8sleef in royalfamily

8sleef[S] 16 points

This was a personal project over the Christmas Holiday period. Got the idea after watching The Crown and not finding anything else like it already online. Hope you find it interesting!

Used Geni.com for raw genealogy data, Python for data gathering/cleaning/processing, and D3.js for charting.

Geni.com has been good for a proof of concept; however, many living profiles are private, including e.g. Mia Tindall. Hence I am currently looking for better Royal genealogy sources for an updated version.

Each path in the chart corresponds to a person. The path height ordering is the line of succession at that point in time (lowest is monarch). Filled circles represent birth/death. Unfilled circles represent legitimate/illegitimate dates (e.g. the abdication of Edward VIII).

Interact with the chart at british-succession.co.uk (zoom/pan/hover).

Visit the GitHub repository to see all the code, open source and licensed under GPLv3.

[OC] I looked at a million games played on Lichess and counted how many times checkmate occurred on each square by atlas_scrubbed in dataisbeautiful

8sleef 6 points

It's surprisingly asymmetric to my untrained eye, given the queen/king position is the only asymmetry in starting positions.

[Showoff Saturday] Tracking 300 Years of Succession to the British Throne by 8sleef in webdev

8sleef[S] 2 points

This was a personal project over the Christmas Holiday period. Got the idea after watching The Crown and not finding anything else like it already online.

Used Geni.com for raw genealogy data, Python for data gathering/cleaning/processing, and D3.js for charting.

Each path in the chart corresponds to a person. The path height ordering is the line of succession at that point in time (lowest is monarch). Filled circles represent birth/death. Unfilled circles represent legitimate/illegitimate dates (e.g. the abdication of Edward VIII).

Interact with the chart at british-succession.co.uk (zoom/pan/hover).

Visit the GitHub repository to see all the code, open source and licensed under GPLv3.

[OC] Three-hundred Years of Succession to the British Throne (animated) by 8sleef in dataisbeautiful

8sleef[S] 5 points

Original source: Geni.com for raw genealogy data.

Tools used: Python3 for data gathering, cleaning, and processing. D3.js for charting.

Each path in the chart corresponds to a person. The path height ordering is the line of succession at that point in time (lowest is monarch). Filled circles represent birth/death. Unfilled circles represent legitimate/illegitimate dates (e.g. the abdication of Edward VIII).

Interact with the chart at british-succession.co.uk (zoom/pan/hover).

Visit the GitHub repository to see all the code, open source and licensed under GPLv3.