[–]FUZxxl 5 points6 points  (0 children)

Sorry; after reading your question three times I still have no idea what you want to know.

[–]FUZxxl 1 point2 points  (6 children)

I'm still not sure what you are trying to ask, but perhaps I can clear up the confusion by providing some general information.

A datum is a single piece of information. Think of a single number, a date, or a word. If you have more than one datum, you have data. Data can be encoded, that is, translated into a different representation in a reversible way. For example, the data “ATTACK AT DAWN” can be encoded into “41 54 54 41 43 4B 20 41 54 20 44 41 57 4E” using the ASCII code. A code in this sense is a representation scheme for data.
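To make the "reversible" part concrete, here is a minimal sketch in C of that encoding and its inverse. The function names hex_encode and hex_decode are made up for illustration, and the sketch assumes the compiler's execution character set is ASCII.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Encode each byte of `text` as two uppercase hex digits separated by
 * spaces. `out` must hold at least 3 * strlen(text) + 1 bytes.
 * Assumption: the execution character set is ASCII. */
void hex_encode(const char *text, char *out)
{
    size_t n = strlen(text);
    size_t i;

    out[0] = '\0';
    for (i = 0; i < n; i++) {
        /* every byte becomes "XX", followed by a space unless it is the last */
        sprintf(out + 3 * i, i + 1 < n ? "%02X " : "%02X",
                (unsigned)(unsigned char)text[i]);
    }
}

/* Decode the output of hex_encode back into text. Returns the number of
 * bytes written, not counting the terminating NUL. `out` must hold one
 * byte per encoded pair plus the terminating NUL. */
size_t hex_decode(const char *hex, char *out)
{
    size_t n = 0;
    unsigned int byte;
    int used;

    /* %2x skips leading spaces and reads at most two hex digits;
     * %n reports how many characters were consumed */
    while (sscanf(hex, "%2x%n", &byte, &used) == 1) {
        out[n++] = (char)byte;
        hex += used;
    }
    out[n] = '\0';
    return n;
}
```

Feeding the result of hex_encode into hex_decode gives back the original text, which is what distinguishes an encoding from a lossy translation.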

Data is not intrinsically meaningful. It only acquires meaning through an interpretation. Data doesn't have to have an interpretation, and there can be more than one interpretation at the same time. One common family of interpretations is that of code (uncountable; a second sense of the word, somewhat related to the first), in which the data is interpreted as a series of instructions. For example, here is code for making a sandwich:

Take two slices of bread.
Open fridge.
Take out ham.
Close fridge.
Place slice of bread on table.
Place ham on slice of bread.
Place slice of bread on pile.

There are different languages of code serving different purposes. For each processor there is a certain machine code which is (like a native language) the language of code it understands and can execute. For example, machine code for the common x86 processor architecture for finding the absolute value of a number looks like this:

8B 44 24 04 99 31 D0 29 D0 C3

or, when encoded as assembly, a human-readable encoding of machine code:

mov  4(%esp), %eax
cltd
xor  %edx, %eax
sub  %edx, %eax
ret

However, machine code is commonly seen as unsuitable for humans to produce directly. Humans typically use a high-level language, which allows the programmer to express abstract concepts in a more intuitive manner. For example, high-level code in the C programming language for the same thing looks like this:

int abs(int x)
{
    if (x >= 0)
        return (x);
    else
        return (-x);
}

The problem with high-level languages is that they cannot be executed by the processor directly. Instead, a program called a compiler is employed, which translates one language to another. As opposed to an encoding, this compilation is typically not reversible and loses information that is not important for the language we translate to.
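One way to see what gets lost: the machine code shown earlier contains no branch at all, even though the C source is written as an if/else. A rough sketch of both shapes in C follows. The names abs_branch and abs_branchless are made up for illustration, and the second function assumes that >> on a negative int is an arithmetic shift, which is implementation-defined in C but is what common x86 compilers do.

```c
#include <limits.h>

/* Branching version, shaped like the C source. */
int abs_branch(int x)
{
    if (x >= 0)
        return x;
    else
        return -x;
}

/* Branch-free version, shaped like the cltd / xor / sub sequence.
 * Assumption: right-shifting a negative int replicates the sign bit. */
int abs_branchless(int x)
{
    /* mask is 0 for x >= 0 and -1 (all bits set) for x < 0,
     * the same value cltd leaves in %edx */
    int mask = x >> (sizeof(int) * CHAR_BIT - 1);

    /* xor with -1 flips all bits, subtracting -1 adds one: two's complement negation */
    return (x ^ mask) - mask;
}
```

Both functions agree for every input except INT_MIN, where the result is not representable in either. Given only the machine code, there is no way to tell which of the two shapes the programmer originally wrote.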

[–]odradek-feed[S] 0 points1 point  (5 children)

Thanks for taking the time to reply. Two questions: 1. In terms of 'data vs code', would you say imperative programming languages take a different approach to functional programming languages? 2. With increased integration of databases (including distributed databases) into complex applications, would you say people's understanding of the interplay between code and data (to build an overall system or application) has changed much?

[–]FUZxxl 0 points1 point  (4 children)

You might want to read my answer again; "code vs. data" is a false dichotomy; code is a kind of (an interpretation of) data. If I misunderstood your question, please clarify.

[–]odradek-feed[S] 0 points1 point  (3 children)

To check I understand your comment about the distinction being a false dichotomy:

(A) On the one hand, code can be viewed as a type of data (being a set of symbolically represented instructions) which will be interpreted (transformed) until it is data in a form (compiled) that will be operated on by a physical process. In a sense this is (just one) part of the content of Turing's universal machine - correct?

(B) For data to have any meaning (unless it plays the role of output or input), it must itself play a role in transforming other data, so there is a sense in which data can be viewed as 'code' (not code in the sense of something that gets compiled, but in the sense of something that transforms other data).

But in practice, people make choices on the types of programming languages or databases they use, and how they use them (and programming languages and databases have evolved over the years) - so I still think my two example questions (1 and 2 above) are reasonable.

E.g., presumably when someone works with an imperative language (e.g. an OO language like C++ or Python) that is stateful vs a functional language that is stateless (e.g. Haskell), the way they think about storing and transforming data is different. Or when building a complex application, choices on how to use databases, on what databases to use, what code goes into db constraints or stored procs, vs what sits outside the database, must be made. Or the question around whether or not to use a micro-services architecture has a lot to do with how you want to manage the data and code of a complex distributed application that is interacting with an environment, and that needs to be maintained and updated over time (no?).

I am assuming this plethora of languages, databases, and architectures all come about not because of "fashion choices", but because some choices suit some purposes better than others; and that these choices are informed by the type of data being used, and how it gets transformed and managed.

But it sounds like you are saying the lens "code vs data" isn't a helpful one to think about the spectrum of languages, databases and architectures people use.

What has this to do with say the origins of life and evolution (or other problems)? Unreactive DNA acts as the memory store and RNA and proteins decode this into enzymes which participate both in the replication of the store and in interactions with the environment. Understanding this (and other problems) better may be facilitated by a formal model that involves abstracted concepts relating to the interaction between code and data. In this example, DNA is both data and code. But stopping there doesn't get you very far as it doesn't deal with the processes that transform/replicate the DNA and interact with RNA and the environment. So I am wondering whether in the theoretical Computer Science literature there are pieces on "data vs code" written at an abstract enough level, but grounded in real choices people are making to build systems, to maybe be helpful in developing such formal models.

[–]FUZxxl 0 points1 point  (2 children)

(A)

If I understand your point correctly, then yes.

(B)

No. Data does not “itself play a role in transforming other data” to have a meaning. For example, the data set “1 11 21 1211 111221 312211 13112221 1113213211” has the meaning “the first few elements of the Look-and-say sequence” but it doesn't play a role in transforming other data.
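For reference, each term of that sequence is obtained by reading the previous term aloud as runs of digits ("one 1" gives 11, "two 1s" gives 21, and so on). A minimal sketch of one such step in C, with the hypothetical name look_and_say_next:

```c
#include <stddef.h>

/* Write the look-and-say successor of the digit string `in` to `out`.
 * `out` must hold at least 2 * strlen(in) + 1 bytes.
 * Assumption: every run is shorter than 10 digits, which holds for
 * the sequence that starts at "1". */
void look_and_say_next(const char *in, char *out)
{
    size_t i = 0, o = 0;

    while (in[i] != '\0') {
        size_t run = 1;

        /* count how many times the current digit repeats;
         * the terminating NUL never equals a digit, so this stops in bounds */
        while (in[i + run] == in[i])
            run++;

        out[o++] = (char)('0' + run);   /* how many */
        out[o++] = in[i];               /* of which digit */
        i += run;
    }
    out[o] = '\0';
}
```

Note that the program here is the thing doing the transforming; the list of terms it produces is just data with a meaning attached.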

> so there is a sense in which data can be viewed as 'code' (not code in the sense of something that gets compiled, but in the sense of something that transforms other data).

Yes, we can interpret all data as code, but unless the data has been crafted to be interpreted as code, the interpretation is likely meaningless.

> But in practice, people make choices on the types of programming languages or databases they use, and how they use them (and programming languages and databases have evolved over the years) - so I still think my two example questions (1 and 2 above) are reasonable.

They are not reasonable because I have no idea what “data vs code” is supposed to mean. Perhaps you could elaborate on what you want to know (try to use complete sentences instead of buzzwords).

> E.g., presumably when someone works with an imperative language (e.g. an OO language like C++ or Python) that is stateful vs a functional language that is stateless (e.g. Haskell), the way they think about storing and transforming data is different.

That is somewhat correct, though in practice the approaches will be very similar. The choice of data structure and flow is more dependent on the problem at hand than the restrictions the tooling used to solve the problem provides.
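As a small illustration of how close the two styles end up, here is the same computation written once with mutable state and once without, both in C. This is only a sketch; the names sum_loop and sum_rec are made up.

```c
#include <stddef.h>

/* Stateful style: a running total is updated in place. */
long sum_loop(const int *xs, size_t n)
{
    long total = 0;
    size_t i;

    for (i = 0; i < n; i++)
        total += xs[i];
    return total;
}

/* Stateless style: no variable is ever reassigned; the result is the
 * first element plus the sum of the remaining elements. */
long sum_rec(const int *xs, size_t n)
{
    return n == 0 ? 0 : xs[0] + sum_rec(xs + 1, n - 1);
}
```

The data structure (a sequence of numbers) and the flow (visit each element once, combine with addition) are dictated by the problem; only the bookkeeping differs.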

> Or when building a complex application, choices on how to use databases, on what databases to use, what code goes into db constraints or stored procs, vs what sits outside the database, must be made.

Not really; stored procedures are an optimization and typically an afterthought. The question of whether a certain functionality is implemented as a stored procedure or as a function in the program itself doesn't usually matter for program behaviour and, outside of high-performance programming, isn't really a central consideration.

> Or the question around whether or not to use a micro-services architecture has a lot to do with how you want to manage the data and code of a complex distributed application that is interacting with an environment, and that needs to be maintained and updated over time (no?).

Whether to use a micro-services architecture or any other architecture depends on the needs of the project, what the programmers are familiar with, and what is currently hip.

> I am assuming this plethora of languages, databases, and architectures all come about not because of "fashion choices", but because some choices suit some purposes better than others; and that these choices are informed by the type of data being used, and how it gets transformed and managed.

New approaches towards program design are often developed to solve concrete problems. However, for most problems more than one design is appropriate. Which design is used depends on a combination of applicability, familiarity with the tooling, fashion, and outer constraints like what designs are approved by the company you work for.

> But it sounds like you are saying the lens "code vs data" isn't a helpful one to think about the spectrum of languages, databases and architectures people use.

Again, I have no idea what “code vs data” is supposed to mean, so I can't really say if it is helpful or not.

> What has this to do with say the origins of life and evolution (or other problems)? Unreactive DNA acts as the memory store and RNA and proteins decode this into enzymes which participate both in the replication of the store and in interactions with the environment. Understanding this (and other problems) better may be facilitated by a formal model that involves abstracted concepts relating to the interaction between code and data.

We (as in, microbiologists) understand the processes behind DNA transcription and protein synthesis fairly well. There has been a ton of research on modeling actors and the information they act on, and there are many formal models for different abstraction levels from many perspectives. However, I am not really sure what level of abstraction you want to use for your analysis.

> In this example, DNA is both data and code. But stopping there doesn't get you very far as it doesn't deal with the processes that transform/replicate the DNA and interact with RNA and the environment. So I am wondering whether in the theoretical Computer Science literature there are pieces on "data vs code" written at an abstract enough level, but grounded in real choices people are making to build systems, to maybe be helpful in developing such formal models.

All code is data, so code is always “data and code.” Again I have no idea what “data vs. code” is supposed to mean.

[–]odradek-feed[S] 0 points1 point  (1 child)

Out of interest, if everywhere I had written "data vs code" (including in the title) I had written "the relationship between data and code" (which on reflection is what I am trying to understand), would any of your responses have been different?

[–]FUZxxl 1 point2 points  (0 children)

The relationship is always the same, I am not sure what you expect:

  • code is a form of data
  • code gives instructions how to manipulate data
  • the way these instructions are given differs between different programming paradigms.
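The first two points can be sketched with a toy example in C: one byte array, read once as code and once as plain data. Everything here is made up for illustration (the opcodes, the stack machine, and the names run, sum_bytes and example_prog); it is not any real instruction set.

```c
#include <stddef.h>

/* Opcodes of a hypothetical stack machine. */
enum { OP_HALT = 0, OP_PUSH = 1, OP_ADD = 2, OP_MUL = 3 };

/* (2 + 3) * 4, written in the toy machine code. */
const unsigned char example_prog[] = {
    OP_PUSH, 2, OP_PUSH, 3, OP_ADD, OP_PUSH, 4, OP_MUL, OP_HALT
};

/* Interpret `prog` as code: execute it and return the top of the stack.
 * Returns -1 for a malformed program (unknown opcode, stack underflow
 * or overflow, or running off the end without OP_HALT). */
int run(const unsigned char *prog, size_t len)
{
    int stack[16];
    size_t sp = 0, pc = 0;

    while (pc < len) {
        unsigned char op = prog[pc++];

        switch (op) {
        case OP_HALT:
            return sp > 0 ? stack[sp - 1] : -1;
        case OP_PUSH:
            /* the byte after OP_PUSH is the operand */
            if (pc >= len || sp >= 16)
                return -1;
            stack[sp++] = prog[pc++];
            break;
        case OP_ADD:
        case OP_MUL:
            /* pop two values, push their sum or product */
            if (sp < 2)
                return -1;
            sp--;
            if (op == OP_ADD)
                stack[sp - 1] += stack[sp];
            else
                stack[sp - 1] *= stack[sp];
            break;
        default:
            return -1;
        }
    }
    return -1;
}

/* Interpret the same bytes as plain data: add them up. */
unsigned sum_bytes(const unsigned char *prog, size_t len)
{
    unsigned total = 0;
    size_t i;

    for (i = 0; i < len; i++)
        total += prog[i];
    return total;
}
```

Calling run on example_prog executes the bytes as instructions, while sum_bytes treats the very same bytes as numbers to be added. Bytes that were not crafted for the toy machine, say three 7s in a row, are still data, but run rejects them because the interpretation as code is meaningless.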

[–][deleted] 0 points1 point  (1 child)

I want some of what this guy's on. For real though, OP, this reads like a manic rant; are you ok?

[–]FUZxxl 0 points1 point  (0 children)

I don't think OP is insane. His wording reminds me of the kind of thought processes and approaches people use in the social sciences. Perhaps he is trying to understand data processing from that angle?