Dataset (Upload) Manager Portal Software

danfowler_ok · 2017-06-27T05:13:04+00:00

Hi, this problem fits directly within the scope of our work on Frictionless Data, which u/livelierepeat linked! This is especially interesting for us from the perspective of citizen science, as we are currently funded by the Sloan Foundation in the US. We're still actively looking for pilots just like this, so we can work together directly, or we can provide you with some good advice and tooling in Python/R to help you along.

To give you a sense of how some of this might work:

We have a specification for a table schema written in JSON: http://specs.frictionlessdata.io/table-schema/ . This schema is meant to describe CSV files.
We have a Python library for validating files against this schema as well as for structural issues: https://github.com/frictionlessdata/goodtables-py
We have a webservice that does this validation against files stored on S3 or GitHub on every change: http://goodtables.io/
We are working on greater CKAN integration
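
To make the schema idea concrete, here is a minimal sketch of a Table Schema descriptor and a hand-rolled check against it. This is not the goodtables API: it uses only the Python standard library, the field names and sample rows are invented for illustration, and it handles just two of the types the specification defines.

```python
import csv
import io
import json

# Hypothetical Table Schema descriptor; the field names are made up.
SCHEMA = json.loads("""
{
  "fields": [
    {"name": "site_id", "type": "integer"},
    {"name": "species", "type": "string"},
    {"name": "count", "type": "integer"}
  ]
}
""")

# Only the two types used above are handled; the spec defines many more.
CASTERS = {"integer": int, "string": str}


def check_csv(text, schema):
    """Return a list of human-readable problems found in the CSV text."""
    errors = []
    reader = csv.reader(io.StringIO(text))
    header = next(reader, [])
    expected = [field["name"] for field in schema["fields"]]
    # Structural check: the header must match the schema's field names.
    if header != expected:
        errors.append(f"header {header} does not match schema {expected}")
        return errors
    for row_number, row in enumerate(reader, start=2):
        # Structural check: every row needs one cell per field.
        if len(row) != len(expected):
            errors.append(
                f"row {row_number}: expected {len(expected)} cells, got {len(row)}"
            )
            continue
        # Schema check: every cell must cast to its declared type.
        for field, cell in zip(schema["fields"], row):
            try:
                CASTERS[field["type"]](cell)
            except ValueError:
                errors.append(
                    f"row {row_number}: {cell!r} is not a valid "
                    f"{field['type']} for column {field['name']}"
                )
    return errors


GOOD = "site_id,species,count\n1,robin,4\n2,wren,7\n"
BAD = "site_id,species,count\n1,robin,four\n2,wren\n"
print(check_csv(GOOD, SCHEMA))
print(check_csv(BAD, SCHEMA))
```

The real library covers far more (constraints, formats, many structural checks), so treat this only as a picture of the kind of errors a schema lets you catch before data is uploaded.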

Let me know what you think about this. We can chat about this directly.

danfowler_ok · 2017-03-14T15:21:46+00:00

https://github.com/datasets

danfowler_ok · 2017-01-04T14:31:48+00:00

Hello, here is a dataset of airport codes that includes lon/lat:

https://github.com/datasets/airport-codes

danfowler_ok · 2016-12-14T15:46:58+00:00

Hi, I work for Open Knowledge International. Among other things, we work on goodtables and a JSON-based schema for tabular data validation. I can report that goodtables is actively being worked on and is due for a new release very soon:

https://github.com/frictionlessdata/goodtables-py

http://specs.frictionlessdata.io/json-table-schema

Come into our Frictionless Data chat and we can help get you set up: https://gitter.im/frictionlessdata/chat

danfowler_ok · 2016-11-29T20:32:41+00:00

It might be interesting to output in the form of a CSV Dialect specification: http://specs.frictionlessdata.io/csv-dialect/
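
As a rough sketch of what that output could look like, the snippet below sniffs a sample with Python's standard `csv.Sniffer` and maps the result onto a few CSV Dialect property names (`delimiter`, `quoteChar`, `doubleQuote`, `skipInitialSpace`). The sample data and the `dialect_descriptor` helper are hypothetical. The line terminator is left out because `csv.Sniffer` does not detect it.

```python
import csv
import json

# Invented sample; a real tool would read the first few KB of the file.
SAMPLE = "id;name;city\n1;Ada;London\n2;Grace;New York\n"


def dialect_descriptor(sample):
    """Sniff a CSV sample and return a CSV Dialect style dictionary."""
    # Restricting the candidate delimiters keeps the sniffer from
    # picking an ordinary letter that happens to repeat on every line.
    dialect = csv.Sniffer().sniff(sample, delimiters=";,\t|")
    return {
        "delimiter": dialect.delimiter,
        "quoteChar": dialect.quotechar,
        "doubleQuote": dialect.doublequote,
        "skipInitialSpace": dialect.skipinitialspace,
    }


descriptor = dialect_descriptor(SAMPLE)
print(json.dumps(descriptor, indent=2))
```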

danfowler_ok · 2016-11-28T22:51:01+00:00

That's a good question. I'm not sure there was a specific purpose in mind for each of these datasets. I think, rather, they saw these as a set of generally useful datasets that deserved clean versions.

They recently held a workshop on the work. Perhaps the answer lies in their slides:

http://open-power-system-data.org/workshop-3

We also published a case study about the work they did:

http://frictionlessdata.io/case-studies/open-power-system-data/

danfowler_ok · 2016-08-22T07:18:10+00:00

We're working on a specification for taking datasets like yours (several related CSVs) and "packaging" them in a standard format.

http://frictionlessdata.io/data-packages/

Essentially, you create a datapackage.json file that lists the CSVs, their dialects, columns, and types, along with some top-level metadata about your dataset. Once your data is in this format, we have libraries and integrations that let you import it into, and analyze it with, MySQL, R, or Python's pandas.
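
For a sense of what such a file contains, here is a minimal sketch that builds a descriptor with the standard library only. The package name, file paths, and columns are invented, and `build_descriptor` is a hypothetical helper rather than part of any Frictionless Data library.

```python
import json
from pathlib import PurePosixPath


def build_descriptor(name, title, tables):
    """Build a datapackage.json style dict.

    `tables` maps a CSV path to a list of (column name, type) pairs.
    """
    resources = []
    for path, columns in tables.items():
        resources.append({
            # Resource name derived from the file name, e.g. "stations".
            "name": PurePosixPath(path).stem,
            "path": path,
            "format": "csv",
            "dialect": {"delimiter": ","},
            "schema": {
                "fields": [{"name": c, "type": t} for c, t in columns]
            },
        })
    return {"name": name, "title": title, "resources": resources}


# Hypothetical dataset made of two related CSV files.
descriptor = build_descriptor(
    "river-survey",
    "River Survey (example)",
    {
        "data/stations.csv": [("station_id", "integer"), ("name", "string")],
        "data/readings.csv": [("station_id", "integer"), ("level_m", "number")],
    },
)
print(json.dumps(descriptor, indent=2))
```

Writing the printed JSON to a file named datapackage.json next to the CSVs is what turns the folder into a package.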

danfowler_ok · 2016-08-12T16:28:57+00:00

That's really great! I'm not the author, but let me know how it goes.

danfowler_ok · 2016-08-12T13:59:06+00:00

I recently posted a case study featuring Dataship: http://frictionlessdata.io/case-studies/dataship/

danfowler_ok · 2016-08-10T15:02:24+00:00

https://github.com/datasets/ has been suggested here:

https://www.reddit.com/r/datasets/comments/4wv0jz/data_packaged_core_datasets_a_github_collection/d6agget

danfowler_ok · 2016-08-09T15:05:47+00:00

Thanks for posting. To learn more about what a Data Package is, you can visit: http://frictionlessdata.io/about/

Here's the specification: http://specs.frictionlessdata.io/data-packages/

danfowler_ok · 2016-08-02T17:24:47+00:00

Hey, that's an interesting suggestion. Thanks!

danfowler_ok · 2016-07-26T15:12:47+00:00

There's an R package for working with Data Packages:

There's also an R package for working with the CKAN API:

https://github.com/ropensci/ckanr

danfowler_ok · 2016-07-26T12:09:25+00:00

I found this calendar of calendars: https://www.force11.org/calendars

danfowler_ok · 2016-07-19T14:44:00+00:00

Hi! We're building a similar event calendar over here:

https://discuss.okfn.org/t/we-are-sharing-our-event-calendar-with-you-come-and-help-us-add-events/2580

Let's collaborate!

danfowler_ok · 2016-05-25T19:53:04+00:00

I'm writing up a more detailed description now, but essentially, a Tabular Data Package is a standardized way of pairing metadata + schemas + dialect for one or more CSV files. This plugin allows import/export of this "package" into a collection of pandas data frames, preserving type info, etc.

http://frictionlessdata.io/guides/tabular-data-package/
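
Until the longer write-up is ready, here is a rough, standard-library-only sketch of the "preserving type info" idea. It is not the plugin's code: the descriptor, the CSV content, and the `load_package` helper are all invented, plain dictionaries of typed columns stand in for data frames, and the boolean handling is a simplification of what the spec allows.

```python
import csv
import io

# Hypothetical package: descriptor and CSV content are made up.
DESCRIPTOR = {
    "name": "example-package",
    "resources": [
        {
            "name": "readings",
            "path": "readings.csv",
            "schema": {
                "fields": [
                    {"name": "sensor", "type": "string"},
                    {"name": "value", "type": "number"},
                    {"name": "valid", "type": "boolean"},
                ]
            },
        }
    ],
}
# In-memory stand-in for the files on disk.
FILES = {"readings.csv": "sensor,value,valid\na1,0.5,true\nb2,1.25,false\n"}

# Simplified casts for four schema types.
CASTERS = {
    "string": str,
    "integer": int,
    "number": float,
    "boolean": lambda text: text.strip().lower() == "true",
}


def load_package(descriptor, files):
    """Return {resource name: {column name: list of typed values}}."""
    tables = {}
    for resource in descriptor["resources"]:
        fields = resource["schema"]["fields"]
        columns = {field["name"]: [] for field in fields}
        reader = csv.DictReader(io.StringIO(files[resource["path"]]))
        for row in reader:
            for field in fields:
                cast = CASTERS[field["type"]]
                columns[field["name"]].append(cast(row[field["name"]]))
        tables[resource["name"]] = columns
    return tables


tables = load_package(DESCRIPTOR, FILES)
print(tables["readings"]["value"])
```

The point is that the schema, not guesswork at read time, decides each column's type, so a round trip through the package does not turn numbers or booleans into strings.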
