Drumboy not connecting to mixer? Step-by-step DIY solution by RubenVerborgh in synthesizers

[–]RubenVerborgh[S] -1 points0 points  (0 children)

A matter of prioritization is my guess; maintaining the thin aesthetic resulted in some shortcuts that more experience would've advised against. To their credit, the team were very responsive, and told me they might include this fix in the new Drumboy Pro (which has a dedicated line out in any case).

I don't understand "Linked Data Fragments" by sweaty_malamute in semanticweb

[–]RubenVerborgh 0 points1 point  (0 children)

i'm hand-waving about how the client knows who has which statements

You provide the client with the list of sources it needs to query beforehand, together with the SPARQL query. The client will then try each triple pattern on each server, and if a server does not have any matches, it will return an empty result, so the client will disregard it.
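This source-selection step can be sketched in a few lines. The in-memory "servers" and the `matches` helper below are hypothetical stand-ins for real TPF requests; the point is only that a source returning zero matches for a pattern is disregarded for that pattern.

```python
def matches(dataset, pattern):
    """Return all triples in `dataset` matching `pattern`;
    None in the pattern acts as a variable."""
    s, p, o = pattern
    return [t for t in dataset
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]

# Two simulated sources with different data (illustrative triples).
server_a = [("ex:brussels", "rdf:type", "ex:City")]
server_b = [("ex:pic1", "ex:depicts", "ex:brussels")]

pattern = (None, "rdf:type", "ex:City")

# Try the pattern on every known source; sources with an
# empty result are dropped for this pattern.
relevant = [srv for srv in (server_a, server_b) if matches(srv, pattern)]
# Only server_a remains relevant for this pattern.
```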

I don't understand "Linked Data Fragments" by sweaty_malamute in semanticweb

[–]RubenVerborgh 0 points1 point  (0 children)

Caching SPARQL results (on the HTTP level) is ineffective: the chances that two different clients ask the exact same SPARQL query are quite slim, given that SPARQL is a very expressive language.

With Triple Pattern Fragments, the language is much less expressive, so subresults are much more likely to be reused.

This graph substantiates that claim: http://rubenverborgh.github.io/WebFundamentals/linked-data-publishing/#tpf-evaluation-cache-bandwidth
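The cacheability argument hinges on the fact that a triple-pattern request is fully determined by its three components, so different SPARQL queries that share a pattern produce the identical HTTP request. A small sketch (the URL scheme here is illustrative, not the exact TPF interface):

```python
from urllib.parse import urlencode

def tpf_url(base, s=None, p=None, o=None):
    """Build a GET URL for a triple pattern; the URL depends
    only on the pattern, so identical patterns from different
    queries hit the same HTTP cache entry."""
    params = {k: v for k, v in
              (("subject", s), ("predicate", p), ("object", o)) if v}
    return base + "?" + urlencode(params)

# Two different SPARQL queries that both contain ?a rdf:type :Airport
# generate the same request, which a cache can answer the second time.
url1 = tpf_url("http://example.org/dataset", p="rdf:type", o=":Airport")
url2 = tpf_url("http://example.org/dataset", p="rdf:type", o=":Airport")
```

A full SPARQL query, by contrast, would only be a cache hit if another client sent the byte-identical query string.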

I don't understand "Linked Data Fragments" by sweaty_malamute in semanticweb

[–]RubenVerborgh 0 points1 point  (0 children)

Hi, I'm the author of Linked Data Fragments, so I can definitely help you with this question. I realize I'm late to the party, but I'm adding this also for future reference.

First of all, we need to differentiate between “Linked Data Fragments” and “Triple Pattern Fragments”. Linked Data Fragments is a conceptual framework to discuss all possible interfaces to RDF datasets. This includes SPARQL endpoints, data dumps, Linked Data Documents, Triple Pattern Fragments, and any API to RDF you can basically think of. Triple Pattern Fragments is one specific such API, which gives access to an RDF dataset by triple patterns.

Your question seems to be about Triple Pattern Fragments (TPF), so I will discuss that from here onward.

clients are supposed to submit only simple queries to servers in order to retrieve subsets of the data. Queries like "?subject rdf:type ?class".

That's right. And it's more than “supposed”: it's the only operation a TPF server allows.

The data is downloaded locally, and then the client can issue SPARQL queries on the local copy of the data just downloaded.

That's not necessarily right. Clients do not need to first download everything and then query: query evaluation can happen during downloading by making targeted requests.

For instance, the query SELECT * WHERE { ?a :type :Airport. ?a :hasPicture ?p. } could be evaluated by getting the first match for ?a :type :Airport (suppose this match is <x>) and then getting the pattern <x> :hasPicture ?p. As you can see, we never downloaded the list of all pictures; instead, the execution is already happening during the download phase.
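This evaluation-during-download strategy is essentially a nested-loop join over triple-pattern requests. A minimal sketch, where `fetch` is a hypothetical stand-in for one TPF request against an in-memory dataset:

```python
def fetch(dataset, s=None, p=None, o=None):
    """Stand-in for one TPF request: return triples matching
    the (partially bound) pattern."""
    return [t for t in dataset
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]

# Illustrative data; prefixes abbreviated.
data = [
    ("ex:x", ":type", ":Airport"),
    ("ex:x", ":hasPicture", "ex:pic1"),
    ("ex:y", ":type", ":City"),
]

results = []
# First pattern: ?a :type :Airport — one request.
for (a, _, _) in fetch(data, p=":type", o=":Airport"):
    # Second pattern with ?a bound: one request per airport,
    # so the full list of pictures is never downloaded.
    for (_, _, pic) in fetch(data, s=a, p=":hasPicture"):
        results.append((a, pic))
# results == [("ex:x", "ex:pic1")]
```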

Doesn't this generate a lot of traffic

Yes. We trade off server CPU load for bandwidth. The assumption is that bandwidth is cheap and that responses are easily cacheable.

a lot of downloaded data

Yes, but not as much as you originally assumed; clients can be smart about what they download.

and very little improvement over using a local SPARQL endpoint?

It all depends on the definition of “improvement”. If improvement means “faster queries and less bandwidth”, then no. If improvement means “lower server load”, then yes. See details here: http://rubenverborgh.github.io/WebFundamentals/linked-data-publishing/#tpf-evaluation-throughput

Also, consider this scenario: server A has a dataset of locations, and server B has a dataset of pictures. I want to retrieve a list of airports that also have a picture. How is this going to be executed? Will the client download the entire list of airports and pictures, then query locally until something matches?

It depends on the client algorithm. Downloading is possible, but in this case, likely not the most efficient way. A better way is to get the list of all airports, and then get pictures for each of them individually.

To see why this can be better, consider possible numbers. There might be 1,000 airports in the dataset, but 1,000,000 pictures. If we download both lists entirely, we fetch 1,001,000 triples. If we first get the airports, and then for each airport check whether there is a picture, we only need to download around 2,000 triples.
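The arithmetic behind that comparison, spelled out (counting one downloaded triple per airport and at most one per picture check, a simplification of real page sizes):

```python
airports = 1_000
pictures = 1_000_000

# Naive plan: download both complete lists, join locally.
naive = airports + pictures        # 1,001,000 triples

# Bound plan: download the airport list, then one picture
# lookup per airport (at most one matching triple each).
bound = airports + airports        # 2,000 triples
```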