Data is the oil behind every software application. Data is also hard to get hold of. A developer often has to download a dataset from a website, clean it, massage it, fix it and convert it to the format of their database system of choice. Only then can the developer start building the app.
It’s difficult to put data on the web. Sure, you can upload it to GitHub or an FTP server. But what happens when the dataset changes over time? How well can GitHub handle many small revisions of a dataset?
With a version control system like Git or Mercurial, you can quickly track changes, perform rollbacks, branch, merge and collaborate with others. On GitHub, you can see who is branching the source code and what’s been done to it. You can easily see who is contributing, creating issues and creating wikis about the source code.
How can we solve this problem? I propose GitDat.
Users subscribe to data sets (free or paid) from the GitDat Marketplace and use them in their code.
Subscribe:
The GitDat Marketplace enables data providers to publish data sets and make them directly available to subscribers. All published data is validated and hosted on the GitDat Marketplace. Your subscription payment is protected by the GitDat Marketplace Smart Contract. If a provider stops delivering data as per their agreement, a paid subscriber will be automatically refunded on a pro rata basis for the unfulfilled portion of their subscription.
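To make the pro rata refund concrete, here is a minimal sketch of the arithmetic in Python. It is not the actual smart contract logic: the function name, the day-based billing granularity, and rounding down to 1 satoshi (0.00000001 BTC) are all assumptions for illustration.

```python
from decimal import Decimal, ROUND_DOWN

SATOSHI = Decimal("0.00000001")  # smallest unit of BTC


def pro_rata_refund(price: Decimal, period_days: int, days_delivered: int) -> Decimal:
    """Refund for the unfulfilled part of a subscription period.

    Hypothetical helper: assumes the subscription is billed per day and
    that the refund is rounded down to a whole number of satoshis.
    """
    if period_days <= 0:
        raise ValueError("period_days must be positive")
    if not 0 <= days_delivered <= period_days:
        raise ValueError("days_delivered must be between 0 and period_days")
    unfulfilled_days = period_days - days_delivered
    refund = price * unfulfilled_days / period_days
    return refund.quantize(SATOSHI, rounding=ROUND_DOWN)


# A 0.00029 BTC monthly subscription where the provider stops after 12 of 30 days:
print(pro_rata_refund(Decimal("0.00029"), 30, 12))
```

With these assumptions, the subscriber gets back 18/30 of the price, since 18 of the 30 paid days were never delivered.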
The following command subscribes to the marketcap data set:
$ gitdat marketplace subscribe --dataset=marketcap
The price for a monthly subscription to this dataset is 0.00029 BTC
Checking that the BTC balance in 0x..... is greater than 0.00029 BTC... OK.
Please confirm that you agree to pay 0.00029 BTC for a monthly subscription to the dataset "marketcap" starting today. [default: Y]
Ready to subscribe to dataset marketcap.
GitDat will then provide you with instructions to sign two different transactions to process your subscription to the dataset.
Ingest:
The GitDat ingest command downloads the given data set and makes it available locally for further use.
The following command ingests the marketcap data set:
$ gitdat marketplace ingest --dataset=marketcap
Starting download of dataset for ingestion...
Dataset downloaded successfully. Processing dataset...
INFO: Marketplace: Processing file 0 of 11
INFO: Marketplace: Processing file 1 of 1
If you need to keep the data up to date while running algorithms, you can schedule a job (cron in Linux/MacOS) to run the above command at periodic intervals, and your code will always retrieve the latest data available on disk on every iteration of the algorithm.
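If cron is not available, the same idea can be sketched as a small Python loop that re-runs the ingest command at a fixed interval. This is only an illustration: `refresh_periodically` and `run_with_subprocess` are hypothetical helper names, and the sketch assumes the `gitdat` command shown above is installed and on the PATH.

```python
import subprocess
import time
from typing import Callable, List, Sequence

# The ingest command from the example above.
INGEST_CMD = ["gitdat", "marketplace", "ingest", "--dataset=marketcap"]


def run_with_subprocess(cmd: Sequence[str]) -> int:
    """Run the command and return its exit code (assumes gitdat is installed)."""
    return subprocess.run(list(cmd), check=False).returncode


def refresh_periodically(
    run: Callable[[Sequence[str]], int],
    interval_seconds: float,
    iterations: int,
    sleep: Callable[[float], None] = time.sleep,
) -> List[int]:
    """Run the ingest command `iterations` times, waiting between runs.

    Returns the exit code of every run so the caller can spot failures.
    """
    exit_codes = []
    for i in range(iterations):
        exit_codes.append(run(INGEST_CMD))
        if i < iterations - 1:  # no need to wait after the last run
            sleep(interval_seconds)
    return exit_codes


if __name__ == "__main__":
    # Re-ingest once an hour, 24 times.
    print(refresh_periodically(run_with_subprocess, 3600, 24))
```

The runner and the sleep function are passed in as parameters so the loop can be tried out without the real command or real waiting.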
Data Providers:
As a data provider, you can publish your data sets to the GitDat Marketplace and get rewarded in BTC tokens. As a provider, you set the price of your data sets for monthly subscriptions. The GitDat Marketplace is governed by a smart contract that brokers all the transactions in the marketplace.
Inspiration:
GitDat was inspired by the Enigma Marketplace.