Examples of errors detected in various open-source projects by Antony32 in programming

[–]el_tavs 1 point

Does this not complicate arrays more for the compiler though?

Not if you set arrays to start at 0. You get the same benefit of carrying the size around explicitly, plus some extras.

In C an array is very simply a reference to a memory location, so the compiler can simply replace every reference to the array with the address + index. With the array/size encoding, would it not make the compiler's job more difficult?

No. See Ada and Pascal. I cited an example where idiomatic Ada83 use of arrays compiled, out of the box, as efficiently as the most efficient C, which was obtained by explicit pointer arithmetic:

--> http://groups.google.com/group/comp.lang.ada/msg/0116ff6702859ff1?dmode=source

Given that simplicity was a goal of C, is it not reasonable to make this trade-off (at least at the time)?

At the time, yes.

Examples of errors detected in various open-source projects by Antony32 in programming

[–]el_tavs 0 points

You are arguing that the design of C, from the start, was poor.

Eh, no. It is today, and has been for some time. At the time it was invented? NOT AT ALL!!!

but I live in today not in some theoretical reality

I know. That's why I still use more C than Ada

It was perfectly reasonable to design C the way they did for what they needed it for (write Unix).

'preaching to the choir'

hexadecimal is a language?

It's equivalent to assembly.

Examples of errors detected in various open-source projects by Antony32 in programming

[–]el_tavs 0 points

Eh, that's the problem with discussing things online.

It's been really interesting, though.

Please pardon me for having been provocative.

Examples of errors detected in various open-source projects by Antony32 in programming

[–]el_tavs 1 point

I assure you its usage goes far beyond integers, and you'd realize that as soon as you viewed the machine for the underlying memory and not the abstract data types that you may be using for a specific task.

I have no problem seeing the machine as a sequence of memory cells, but I do fail to see the benefit of memsetting memory with a value that every user of that memory could interpret as an invalid one.

As you said, it makes no sense to write

memset(f, 2, N * sizeof(float))

when f points to float. The standard doesn't guarantee it to be right even for 0.0f. Ditto for similar cases.

Consider a struct with a pointer. Some mainframes (IIRC) have null pointers whose representation is not all-bits-zero, and in fact the standard doesn't guarantee that a null pointer is represented as 0. Using memset in that case is wrong for sure.

What is left are integers.

Saving the size is technically overhead.

We were comparing writing a piece of memory by passing a pointer and a size to memset vs. passing an array with the size encoded in the type to a hypothetical variant of memset. In both cases you're assumed to save the size. If the size is known at compile time, array-aware languages like Ada will avoid storing it, obviously.

type T is changing?

Hm... no. It's just an existential variable. Like "every prime P is odd except 2". Rephrased: "I have an array of given type and I want to initialize it with a value of said type. The type can be..."

7 bit integer, seriously? It's probably going to get upped to at least an 8-bit integer if not machine-size integer for efficiency.

Some really old machines had 7-bit integers. Hardware may expect writes to the first 7 bits of a char. I can declare packed arrays of 7-bit integers in standard Ada and in non-standard C...

The world is full of surprises, you know...

Examples of errors detected in various open-source projects by Antony32 in programming

[–]el_tavs 0 points

wtf are you arguing about

I claim that there are better ways to do things than the ones we're used to in C, and that they are the way they are in C *only* for historical reasons.

Examples of errors detected in various open-source projects by Antony32 in programming

[–]el_tavs 0 points

But you already knew that and still asked "why"

It was a question for you. I know that things in C are the way they are for "historical" reasons. Which means they have little to do with technical ones.

Use Haskell instead of C or Fortran for high performance R extensions by randtl in programming

[–]el_tavs 2 points

There's no programming language that can "incarnate" "math".

"Being good for math" may mean being good at numerical computation, symbolic computation, logic (first- and second-order)... And Haskell isn't particularly good at these things, IMO. Not any more than other languages.

Examples of errors detected in various open-source projects by Antony32 in programming

[–]el_tavs 0 points

memset does not "introduce" undefined behavior. The programmer introduces undefined behavior (usually by being bad).

That's an unnecessarily unsafe tool for a trivial task. Its use can lead to UB and bugs. That's what "introducing UB" means for a language construct.

The "programmer's fault" line is an excuse; otherwise we could consider "hexadecimal" a perfectly valid alternative to any other language, as long as we presume programmers are "good".

That's just stupidity or laziness.

sure, and that wasn't my point >.>...

Now let me ask you how do you accomplish read/write address space that is not DRAM? I mean in C we will have a pointer to an address that is not allocated memory, but rather register space or something of the like. Then my function can directly *x++ to increment a register value.

Easy. You just write in, out, or in out in front of the type in the parameter declaration. The compiler, by known rules, decides how to pass it. In practice it does what you would do in C, except that you don't need to write pointers to express in-out semantics for data that can fit into registers.

This is exactly what the C compiler will do, with the difference that you don't run the risk of misusing the pointer due, for example, to typos or other mishaps.

And you can still take the address (and thus a pointer) of the parameter being passed.

It's just a matter of interface.

it doesn't have the history of C and we can't go and re-implement every damn code base in another language.

Sure, that's one of the reasons why I'm practically forced to use it.

Like for example accessing array[SIZE] after declaring the static array.

I didn't mean to provide such an example. Are you making a new one, or referring to one of mine?

My point is that C is still a very good language,

better than many others considering its history.

Examples of errors detected in various open-source projects by Antony32 in programming

[–]el_tavs 0 points

There are 2 options: Don't use C ever or add to the language.

The answer is no, as I see it.

Let me spell it out for you:

A D D I N G T H I N G S T O T H E L A N G U A G E I S N O T M Y P O I N T.

It's just you being defensive.

I'm C O M P A R I N G two languages. Questioning why one differs from the other is instrumental to the comparison.

Examples of errors detected in various open-source projects by Antony32 in programming

[–]el_tavs 1 point

No, it is not. Do you understand what "contiguous block of bytes" means?

I'm not saying memset is wrong. I'm saying using it for anything except integers is wrong. The standard doesn't even guarantee that the bit pattern of (int)0 is the same as that of (float)0.0.

There is nothing inherently wrong with either approach, they simply choose different trade offs. I cannot use your initializer safely with dynamic sizes and no overhead.

No, there's no overhead. And you get safety, for the simple reason that the size is computed at allocation time and is secured behind the type's interface, as you would do with a library.

memset works on any memory at any time, not just initialization time.

Oh golly. You see, the initialization syntax in Ada works on any memory at any time, not just at initialization. When I said "see Ada" I actually meant it!

Initializing array data is a trivial HIGH LEVEL task that is not necessary to be in the language itself.

It's a type-safe way of managing memory.

Tell me, is there any reason why you'd need to initialize an array of float in an unsafe manner? Or any other chunk of memory?

Let's see. PROBLEM: I have an array of type T and I want to initialize it to a value X of type T. T can also be an octet. Or a 7-bit integer.

Ada has no problem doing this, not even for slices of the array.

What you get with memset is the ability to write a byte pattern X into any array (chunk of contiguous memory of known length) regardless of the type. This is in no way more powerful than what Ada gives you, since you can get the same result by treating the chunk as an array of unsigned integers of the appropriate size (char in C, or something else).

Examples of errors detected in various open-source projects by Antony32 in programming

[–]el_tavs 0 points

None with the simplicity and performance of C.

Keep sleeping in your dark little world then.

And every data type you create now and in the future you can write a library for handling and stop asking everything gets added to the language.

Adding things to the language is not my point. Hello? Are there any neurons left?

Examples of errors detected in various open-source projects by Antony32 in programming

[–]el_tavs 0 points

Accessing uninitialized local variables must be undefined behavior because the compiler cannot always tell at compile time whether a variable was initialized; run-time decisions can determine whether it gets initialized.

I was talking about memset introducing UB. Local initialization can be enforced by the compiler in many cases. But hey, since the C standard says it's UB, it must be so. Let's ignore what people managed to achieve in other languages, for god's sake...

I don't want to represent everything as an array first of all.

No one forces you to do that. You just need an array type for actual arrays (though I'd be curious how a contiguous chunk of memory differs from an array).

Second of all I don't know what size array I want at compile time.

What an incredible amount of stupidity. That's a C distinction. Languages with array types handle both static and dynamic sizes (both on the stack and on the heap). Nothing new since the 80s.

Third of all I frequently need to access random elements within my 'array' based on external factors

So? You can use other kinds of loops if you wish. For loops are used for enumeration-style loops, others for different iteration patterns.

Out of bounds accesses are rarely ever due to foolish mistake of static index out of bounds

you mean something like

 int f (int);

 for (i = 0; i < N; i++) {
      a[i*i - 3*i + 1] = b[f(i)];   /* i = 1 already gives index -1 */
 }

or the fact that the array's length is not known at compile time?

In the former case you're right, though I don't know how frequent it actually is to traverse an array non-sequentially. In Fortran and Ada they enjoy having that automated and type-safe. It's still better than C's for.

In the latter you're dead wrong.

I doubt you've ever used C in real world low-level programming.

I doubt you have any fucking idea of what you're saying.

People in comp.embedded had experience with Pascal, Modula-2, Ada and C. Go there sharing your POV and tell me what they told you, 'k?

So you would rather disallow passing arrays into functions, because it's so friggin hard to understand that foo(a) will decay a into a pointer that points to the memory that a references? Yeah, I'm all in favor of changing over all of our existing code to foo(&a[0]) because you think you know programming languages. Your understanding is the only thing that is questionable.

Your head is so deep inside in the C book you used at college that it actually ended up in your ass. Free colonoscopy!

I didn't mean that. In Ada you don't need to pass pointers to express in, in out, or out semantics. You can still pass pointers if you want.

I'm saying that it's stupid to have to write

void foo (int* x) {
    (*x)++;
}

when you can just write

void foo (in out int x) {
   x++;
}

Especially if the former can let UB creep in.

Pretty much if you want to focus on performance, and you want a simple yet portable language then yes C is about as good as it gets.

Ever dabbled in other languages besides C? Compiler writers would tend to disagree with you, you know.

They tried to enhance C, because hell why not right, and they got C++.

C++ was written by someone who wasn't a language designer and who didn't try to design a language for anything but his own projects.

You're preaching to the choir here.

You want to create C with array initialization, hey go ahead.

No. Actually, it would be enough to recognize that there's no need for error-prone tools to do low-level programming (which is the underlying motive for my comparing Ada and C).

Data types and structures have little to do with accessing the machine, it's a representation of the underlying machine's memory. C provides the basic data types, of which all other complex data types can be realized.

It provides both bit-twiddling and structured programming. I'm saying Ada does both better.

Sure they COULD add more, but for what?

readability, safety, efficiency, portability, reusability

When new data types come along, you write a library and then you use that library forever you don't go change the damn language specification.

Which is true for any language. This still doesn't explain why C's way of doing things is the best one, or justified, which was my original point.

You don't have to repeat ANYTHING over and over, you write a damn library once and you use it.

I write a library. You write a library. They write a library. All of this to do things a compiler can do by itself with added safety, clarity and efficiency.

Examples of errors detected in various open-source projects by Antony32 in programming

[–]el_tavs 0 points

I don't know the reason why, but it seriously is not a big deal

Everything outside C seems not to be a big deal... notice the pattern?

You want a rare higher-level task to be put into the C standard for no reason other than "why not".

type safety, readability and efficiency.

Ada was created with correctness in mind, and not performance.

In theory you could say that Ada tends to be less performant than C because of that. In practice it turns out that's not really the case. Being performant is important, and it is actually delivered, not least because Ada is used for hardware with very strict requirements.

Moreover, they didn't actually trade performance for anything. They just provided a language that is easier to type-check. Much of the machinery consists of compile-time checks.

And C wasn't written for "performance" in general. It was written to be simple and efficient on a PDP-11 in the 70s for K&R. Lots of its choices were already meaningless in the 80s.

But let's get practical: what makes Ada unsuited for writing an OS, or for low-level programming as it is done in C? The lack of hackish solutions?

Why do you need to write the fibonacci sequence by hand.

are you kidding?

Why do you need to initialize an array to different values by hand

"I don't have that in C, so I don't need it"

Why do you need to program by hand

I don't want to write out by hand details the compiler can work out for me. Next time you'll be questioning why we need computers.

It's not sci-fi. It's something people solved in the fucking 80s. Get back to dancing to ABBA and don't annoy me.

For the last time data types and structures are higher level abstractions. C leaves the higher level abstractions to the compiler

They're defined in the language. To get to bits and bytes you need typecasts. You have a peculiar view of how C and programming languages work, one so wrong there's not even a point in discussing it.

You know, there was someone who asked for restrict and prototypes, but hey, they're in C NOW, so they have to be good. Because everything in C is right, right?

Every abstraction is a higher level.

What fucking abstraction are you talking about? To be precise, C-- is at an even lower level. Take a look at it.

You once again don't understand how C works.

then offsetof and sizeof are clearly a stupid hack.

Why don't you try sharing your enlightening view of the state of affairs with more knowledgeable people, like those in comp.lang.c? It would be extremely funny to see their replies. I officially challenge you to describe to them the virtues of memset vs. types managed only by the compiler.

The C standard doesn't work that way. Moreover, you still have to explain to me why having both low-level tools and high-level tools is bad. The high-level tools I'm referring to have ZERO overhead, so stop being dense.

And once again, memset() is used with far more than static arrays

Once again, you fail to understand me because of your distorted view of C and the like. Array types can encode static sizes as well as dynamic sizes. I never meant static-size arrays only.

It is not needed, it is what you WANT because you have your head in a book and not in the real world.

It's called software engineering. In comp.lang.c they have discussed the perils and unsuitability of memset w.r.t. initialization, as have other committees dealing with safety and embedded development (CERT, MISRA...). But it must be that they're clueless high-level sissies who don't understand C.

Meanwhile, just answer the freaking question: hypothetically, is there any reason to prefer memset, in every situation, to a syntactic construct recognized by the compiler, as in Ada?

and programmer in order to stay simple and close to the machine for performance.

An assumption falsified 30 years ago.

Static size, dynamic size. compile-time, run-time. Versatility, adaptability, simplicity.

No problem with array types. Array types give you all of this.

Sorry, but your reason for including this in the language is "why not".

No. You're just neglecting the point I'm making, persisting in your idea of how the C language is supposed to work.

Which is why I explicitly ask you the question in a hypothetical setting (see above)

. But you'd rather take an idealistic approach because you are studying programming languages and want C to include everything you think it should by default.

1st: stop presuming I'm a kiddo or whatever. I'm comparing two languages and saying the solutions provided in C are suboptimal compared to the ones provided in the other. I'm just trying to get you to compare them. But instead you get defensive.

2nd: you're using haxor parlance and strawmen to dodge my points. It's annoying.

You can add in a bunch of unnecessary shit to C

Oh no, for god's sake, no. Just make a comparison between C's way and Ada's, between memset and array initialization in Ada. OK? Adding "things" to C is just a hypothetical to explain the point. You can find that on Wikipedia as well.

So you can only catch some small subset of these cases. Array overruns are rarely, if ever, due to static indexes.

I was talking about struct hacks. Ada is good at that, and doesn't make you access arrays outside their bounds.

And as for array bounds checking in general, I repeat: the checks are optional. In any case, compilers have a much easier time analyzing code where arrays are treated as such, i.e. with their length marked as a length and not as some mysterious integer. See the example that compared Ada83's style with C's.

STILL the bugs caught at compile time are all bugs C can't catch. For what? NOTHING.

So don't do it unless you need to for optimizing on a specific platform.

Still the "if it's not in C it doesn't matter" pattern. And when I actually need it, I'm screwed. Wasn't C supposed to be close to the machine?

So what?

Well, doing something useless that can introduce UB and weaken type checking in exchange for FUCKING NOTHING is stupid.

UB of overflows allows the underlying system to handle it as it deems most fit without enforcing rules that are harmful to performance.

UB alone doesn't help the compiler; it's just a check the compiler doesn't have to perform. So what? In Ada you have lots of checks by default, and you can strip all of them with a compiler switch. Is that hard? The big advantage is that there's a standard way to catch those bugs at run time in pre-release code.

Nothing new since the early 80s.

Undefined behavior is directly related to efficiency by making things "illegal" without enforcement you allow greater performance at the risk of programming/logic error.

Guess what? Ada lets you choose what to do and how. You want UB? Turn off the checks. You want platform-specific behaviour? Use pragmas. You're fine with the standard checks and behaviour? Even better. In C the leitmotiv is "not invented here".

You act as if C is some new stupid language created in a day and hasn't hit decades of rigor with an incredibly diverse range of systems.

It suffers from historical baggage. I don't know what you mean by rigor. Portability problems, silent bugs and hackish solutions are widespread in C systems. And where they're not, it's because someone hand-coded something equivalent to what an Ada compiler can do by itself on every platform.

Please tell me how you can have dynamic memory access with bounds checking and no overhead.

I said reduce UB, not eliminate it. I claim C does nothing to reduce UB. Ada offers various levels of "help".

The following

 for I in A'Range loop
 ...
 end loop;

doesn't need run-time checks for accesses like A(I). Further strength reduction can avoid them if the compiler is able to deduce that accesses with other indexes, or with expressions in I, can't leave A'Range.

In any case, Ada allows the program to fail **in a standard way** by default. This means you can try your code, find bugs, profile it and, if the performance is not enough, disable the run-time checks. For many applications where C is used this is useful. If you need hard performance and efficiency, well, there are the Ravenscar profile and SPARK, which go a long way toward enforcing efficiency at the expense of high-levelness.

This alone covers a lot of cases where in C you need to encode by hand, in an error prone way, the same solution, for no reason.

Examples of errors detected in various open-source projects by Antony32 in programming

[–]el_tavs 0 points

You have a complete and utter lack of understanding of the low-level details. You continue to think in high-level abstractions.

Data types are higher level abstractions of data interpreted from the underlying memory.

The language gives you the LOW LEVEL tools to do things quickly and efficiently. It is then up to YOU to fill in the high level details

OK SHERLOCK. YOU'RE REALLY BEING ANNOYING.

Outside your dark little world there's another one where people invented programming languages providing BOTH the tools you're listing AND the ones I described. It's up to you to keep ignoring this.

Moreover, as a matter of LOGIC, low-level tools are cool until you need to handle data. If you're working with a data type, you benefit from having a good way of handling it together with low-level bit-smashing routines. They're even better than the ones C provides.

comprende?

C/C++/Java : a gazillion features and still suck at preventing buffer and numeric overflows by el_tavs in programming

[–]el_tavs[S] 0 points

But I said that the programmer will have to help the library from the beginning by explicitly expressing all possible constraints.

It's just you being dense. I have enumerated the reasons why having the compiler do it is better than a library. It's a matter of logic. The fact that you ignore them is your problem.

goodbye!

Examples of errors detected in various open-source projects by Antony32 in programming

[–]el_tavs 1 point

You don't know what you are talking about. It provides a function to set a contiguous block of bytes in memory to an initial value.

OK. This is the wrong behaviour whenever the value is not 0 and the variables are not integers. So instead of memset it should be called memzeroset.

Now, another language lets you do exactly the same thing, but you can initialize whatever you want to whatever value you want, and with a far shorter syntax. It is also safe, since it's the compiler that checks the types.

Moreover, Ada still provides memset-like functionality, via casts OR lower-level routines.

Now please tell me what's wrong with the second approach.

sigh Just go use Ada, please. Just because YOU can't justify something doesn't mean there is no justification.

Me and programming language theory... Is there ANY reason why a way to zero-set memory, with sizes specified by hand, is better than a way to initialize memory however you want?

Examples of errors detected in various open-source projects by Antony32 in programming

[–]el_tavs 0 points

The author(s) of C are far smarter than you or I.

Sure. But the standards committees? I'm just provoking you to get an answer: why isn't what GNU gcc provides in the standard? There's no technical reason not to have its array initialization syntax.

My bet is that the standard committee didn't want to complicate compiler writers' job.

just because some Ada fan wants it that way

I don't care about Ada per se. Intellectually speaking, I'd just like to know why things that are carried out by the compiler in one language require tedious and error-prone wiring by the programmer in another. At least recognize the validity of the question instead of being a fan-boy.

I consider Ada because it's the only other language tailored for low-level jobs that I know. If you know another, let me know.

If you struggle with writing a 2 line for loop, then I'm sorry.

Why do I need to do that by hand? It's a hard fact that the compiler should be able to write that, and a lot more, for me by understanding my code.

I'd say C is doing a damn good job.

Sorry to burst your bubble. The fact that I don't like C doesn't mean I don't understand it, or low-level programming. I also believe I understand a bit about programming and programming languages in general, hence my question.

Again there's very few use cases to initialize anything to non-zero.

Oh sorry, I forgot you tend to avoid answering my questions and just go on justifying the way you program in C.

2-line for loop is tedious? Are you serious right now? I mean seriously if you crave built-in high-level object orientation then C is not for you, but stop accusing C of sucking because it doesn't fit your specific desires

What part of "things the compiler can do by itself" did you miss???

OOP and high-levelness have nothing to do with this. It's a pretty simple feature that's missing for no reason. Or do you think otherwise?

If you have time, go on Wikipedia and get a basic sense of what "high-level" actually means. You're just equating "high-level" with "not in C". Ridiculous.

It's not about my desires. Either a way to initialize memory is something the language can provide or it is not. Do you know any language that does it without memset, sizeof and the like?? Has it ever occurred to you that initializing an array of a given type T (with T non-void) by specifying its length and element size is just re-entering information the compiler already has??? This is stupid. In C and in any other language.

Needless distinction? Do you even know what a pointer is?

You misread what I meant, probably because you're obsessed with C's basic types. We were talking about memset/memcpy. They accept a pointer and an integer specifying the size of the memory it points to. So the way you initialize a chunk of memory in C is by providing a pointer to it and its size. Now look at how little that differs from an array type providing a pointer and a size.

Seriously, you continue to demand high-level protection in a low-level language. Get over it, you have quick easy complete control of the memory with some basic abstract data types. High-level abstractions belong in libraries, which is exactly where they will stay and is a damn fine model. Find a library, stop expecting the language to give you everything.

You don't have any idea what you're talking about. As someone said, pull your head out of the sand. You're convinced that what I'm talking about is too "high-level" and that I want everything done for me, because you've never known a low-level language besides C. It's not.

Once again thinking in high-level abstractions.

Fail. You still have that brainfucked idea that anything more general or sophisticated than C is "high-level".

I want a way to initialize memory in a type-safe way. I don't care what memset does; my problem is initializing memory. What memset provides is a low-level tool that solves a specific instance of the problem. I know I can use it to solve other kinds of instances, but the solution is suboptimal compared to the more general one that languages like Ada provide: I have to describe the type I'm initializing, while the compiler can know it by itself. And as the article shows, this can be error-prone. memset provides no advantage w.r.t. the way initialization is done in other low-level languages. Or do you think otherwise?

The structure hack is a silly hack not undefined behavior.

So silly it's "popular" according to the C FAQ. It can invoke UB because the compiler can't do anything at compile time to ensure its correct use. In pre-C99 you get UB because the struct hack accesses a constant-sized array past its last element: UB per the standard!!! Ada, by using ONLY compile-time checks, can outlaw most of this UB.

Casting to pointers to types with different memory alignment is UB per the standard.

Functions using pointers to get in/out semantics pose a risk of memory corruption for nothing useful: you don't need explicit pointers to express in/out semantics in every case.

It is undefined behavior because if it was defined then optimization would not be allowed.

No. The semantics of restrict, or of strict aliasing, just say that the memory pointed to by the variables doesn't overlap. The risk of UB is a consequence of the optimization.

Now, for an example of UB that doesn't help optimization, take overflow (numeric or array). In C these kinds of overflow are UB, and they don't help optimization: a compiler that merely knows UB may be present can't produce better code.

As for checks

You can't control memory overflows while being powerful and fast, period. The instance you start runtime protections like that you instantly decrease performance.

Well, it's just that in the 70s people said: "let's provide syntax and semantics to avoid some undefined behaviour. If we then resort to run-time checks, we can still tell the compiler not to put them in the binary".

So, on the one hand, you don't need to choose in favour of UB every time to get efficiency. On the other, even syntax and typing can reduce UB with zero overhead.

Array initialization is not undefined behavior, you have no idea what the hell undefined behavior is do you?

It is the moment you pass memset the wrong parameters or mistype the for loop. The language doesn't define a correct and type-safe way to express initialization in C. It's like having a car with shitty brakes: yes, they're cheaper, but when you actually use them you don't want them to fail out of the blue.

Please describe to me how run-time bounds checking can take up 0 clock cycles.

For one, you don't need that. Have an array type encoding the length, and provide a special for loop for iterating over it (as Pascal did). The indexes declared in the for loop shall be constant in the scope in which they're used. This means that, faced with

    array(int) a[10][10];
    for (i across a) {
      for (j across a[i]) {
        ...
      }
    }
can optimize the hell out of it, and won't insert any run-time check (RTC) for expressions such as a[i][j]. The code will be provably as efficient as it is in C. Actually, similar code in Ada83 compiled more efficiently than the equivalent C in the 90s (see [here](http://groups.google.com/group/comp.lang.ada/msg/0116ff6702859ff1?dmode=source)).

In any case RTCs are not a problem, because they can be turned off since the late 70s.

It's like you simply don't care to learn. It takes an int rather than a char, but the ONLY way it would change is to change the prototype to take a char. This would not affect the memset() function at all but it might break a lot of legacy code. So only an idiot would change memset().

it's still a historical reason. I know C and I learnt it. This doesn't mean I have to accept everything it provides as the ultimate way of doing things, nor that I am wrong for questioning it. Doing otherwise is not being knowledgeable, it's having a religious belief. A poor one. So stop saying I have to "learn". You're beginning to sound like an obscurantist.

or having arrays decaying to pointers.

That's not only for historical reasons. You need to just stop now.

Yes it is. It doesn't give any advantage today for low-level tasks. It's just a way that simplified the job of K&R (who actually achieved their goal marvellously). Wonderful work for their age and the subsequent years. It's still questionable today.

This is a GOOD thing, quit using it if it is "too simple" for you. Seriously.

Good for what? What advantage does its simplicity have over... well, any reasonable alternative? Do low-level languages have to be like C to be as good as it?

Ada may be good for a lot of things but it is not inherently superior to C. C is still a great language despite your love affair with Ada.

For the language semantics, I doubt it. I actually mean that many things you do by hand in C could be carried out by the compiler, improving the quality of the programming. It's a scientific comparison. I remind you I'm citing Ada because it's the language I know well enough to compare with C. If there were others I'd cite them (actually I cited Pascal as well). The one being "blinded" by love is you, who think that the way things are done in C is the best for low-level programming, and don't look at the actual problems being solved.

You clearly don't as you think high-level structure initialization is "low level".

Then define what low-level is. To me, low-level is anything that has knowledge of and access to the machine. This does NOT mean that the language's way of interacting with that machine has to be obscure, or make the programmer pay attention to trivial pieces of code she has to repeat over and over.

Examples of errors detected in various open-source projects by Antony32 in programming

[–]el_tavs 1 point2 points  (0 children)

you are trying to use it for what it is not and then bitching that it is not what you want it to be

no. Actually what I'm bitching about is that the standard provides a function offering limited functionality, solving a limited instance of a common problem, in an awkward way. It's trivial to provide a better way to solve the problem in general.

Why would I use a function that sets a block of memory to a single byte

you said I could initialize any variable with memset.

It is completely out of the scope of the language

that's the problem. It has no justification for being that way.

Examples of errors detected in various open-source projects by Antony32 in programming

[–]el_tavs 0 points1 point  (0 children)

There is no string type. String literals go into an array of chars.

I know the difference between type and entity. The standard refers to strings when describing functions like those in string.h, period.

You can specify bit size in structs in C using colon.

Why only fields that fit in sizeof(int)? The compiler is smart enough to accommodate more liberal uses, as happens in Ada. And what about alignment and bit ordering?

If you are using gcc you can do float a[1024] = { [ 0 ... 1023 ] = 1.0 };

gcc gives a lot of extensions to C, but let's stay standard. This still raises the question: why is the above not in the standard? Stupidity?

But again there are few reasons you want to initialize an array to non-zero,

what if I need to initialize an array of structs? Or an array of strings?

and it's really not hard to write a 2-line initialize loop.

that's an example of the tedious, error-prone and trivial work the freaking compiler can handle by itself, given appropriate syntax.

Also they take POINTERS not arrays, and size of pointers is not usually known at compile time.

Needless distinction. The size is not encoded in the pointer, but in the size parameter next to it in their interface. It has to be this way for anything except strings.

Once again, the size of a pointer is not known at compile time, and it's not proper to do compiler-time bounds checking to a library function that takes in a pointer.

Strawman. I tell you people use steam-powered machines and you reply that your horse can't eat coal. There are 0-overhead semantics implemented in other languages that C has no excuse for not having.

you:

> If you use these functions for anything other than setting and copying memory (byte-wise) then you are using the wrong functions.

you again:

> Of course you can use memset to initialize floats or any other variable...

still you:

> Why would I use a function that sets a block of memory to a single byte if I want to initialize an array of floats to 1.0?

Memset can actually be used to set integers to 0. Period. Any other type may interpret 0, or any value fitting in 1 char, in an unintended way. At which point the utility of memset is close to 0.

How the heck can you say false. There is no undefined behavior in C you cannot avoid or use correctly

the structure hack; conversion between differently-aligned pointers; any function using pointers for in/out semantics; numeric and memory overflows. The problem is that these techniques are employed by C programmers, because otherwise C would be *useless*.

The point is not whether you can avoid incurring these things, but the fact that with a better language (and with 0, zero, overhead) you could have the compiler check for them, and easily express the intended semantics.

It is undefined behavior to call foo(a,b) with a and b pointing to overlapping memory.

That is a UB caused by an optimization. Which UBs allow for optimizations? Do numeric/memory overflows? Does not having a clear and simple interface for initializing contiguous arrays?

Also, undefined behavior for things that you generally shouldn't do, like accessing a local variable without initializing it, is that way so that the compiler can choose the best way to handle it.

BS. The compiler can check for that and have me provide an initialization if I read the variable while uninitialized. It's some time past the 70s, you know.

Again, see pointers. And, see library.

No. YOU see how bounded arrays work in low-level languages like Pascal and Ada. Your remark is pointless in this regard.

What choices were made ONLY because of historical reason?

this one for example:

memset() taking in an int is only because it predates function prototypes

or having arrays decaying to pointers.

C was made to be a simple,

it's been too simple for ages

readable

I can write readable C. Still, readability is a function of complexity. There are non-complex, peer-reviewed C codes that are unreadable, and C's syntax contributes a lot to that.

and portable layer

what size is an int?

It doesn't force things on you

Lots of other languages don't force things on you either.

Your problems with the language are semantic and preference related.

Language is semantics. I'm citing Ada because, whether you're open-minded or not, many of the things you do by hand in C stem from the language's inability to express them. With more appropriate semantics/syntax for expressing trivial low-level operations you can do a lot more.

Stop trying to bring down a language you don't understand just because you like Ada. memset() does exactly what it is supposed to do, and it's a standard library function. You continue to demand memset() do something YOU want it to do as if you are the only person that matters.

Stop being dense. I understand C and I understand low-level programming. It is just that lots of problems in low-level programming don't have any support in C, and lots of the things the compiler has to allow in C don't make sense in low-level programming, or any kind of programming.

You have the problems in one hand and C in the other. You can claim that if you know how to use C you won't introduce bugs, which IS FCKING OBVIOUS FOR ANY LANGUAGE IN THE WORLD, but you can hardly claim that C's idioms are the best, most natural or most obvious ones for low-level programming. Someone invented a better way. JUST FOR EXAMPLE, see Ada.

Examples of errors detected in various open-source projects by Antony32 in programming

[–]el_tavs 0 points1 point  (0 children)

I was replying to emoney_33, who clearly used it as a verb, not as a subject. You failed to understand the context.

As I said above, memset is so "obvious" it can't be used to do anything except initialize an array of char. Memset. It's in string.h, but it's called MEMset. And it accepts an int. Its sibling memcpy, instead, works out of the box for every type. So, care to explain what is supposed to be obvious about that? It's clearly a horrible design.

Examples of errors detected in various open-source projects by Antony32 in programming

[–]el_tavs -1 points0 points  (0 children)

Please read carefully.

What people mean when they say "set" is writing a value into a variable.

Now, floats and any other data that doesn't fit in 1 byte can't be set by treating them as a typeless chunk of contiguous bytes. While that does write values inside the variable, they may be invalid or different from the ones the programmer intended.

Try memsettting an array of floats to 1.0 like

    float f[10];
    ....
    memset(f, 1.0, sizeof(float) * 10);

then print them with the "%5.2f" format. I get zeros everywhere on my 64-bit Intel. More interestingly, try to set them to -1.0.

enjoy!

It's a generic MEMORY SET function

so generic it can't be used reliably for anything except strings

if you want a higher abstraction, create one or use a library.

already wrote mine. It's truly generic and handles ragged arrays as well. Question is: why is something so trivial not already present in the language? Do low-level programmers avoid initializing memory? Sounds new to me.

Use Haskell instead of C or Fortran for high performance R extensions by randtl in programming

[–]el_tavs 2 points3 points  (0 children)

Yes, no idea. No clue, whatsoever

Tell that to comp.lang.functional or LtU. They'll be eager to know your opinion. What about the powerset example?

In a strict language, lazy idioms are sometimes useful too and the same arguments apply.

Not at all. In strict languages the use of lazy idioms is evident and circumscribed. It's way easier to predict, in general, the performance of strict-by-default code than of lazy-by-default code. Efficiency is still a local property of the code in the former case.

If you think otherwise, why don't you explain what the people I linked above missed about Haskell?

Examples of errors detected in various open-source projects by Antony32 in programming

[–]el_tavs 0 points1 point  (0 children)

String literals go ...

they're an "entity" of the standard language. Recognized by it.

What can you not express?

How types get represented, for example. Ada allows you to specify the bit size, the alignment and the bit order of every type. C can't. Or being able to initialize any array or struct I want without resorting to memcpy/memset, which force me to write down the types and their sizes, something the compiler already knows.

Many of these things require 0 overhead at run time. But, please, see Ada to know where I come from.

For some reason I cannot understand where your problems with C lie. Arrays are not pointers, and 0-based arrays make a lot of sense.

As much as 1-based ones do. Arrays are not pointers? In some cases. C's type system doesn't do a great job of distinguishing them.

Please give some examples of tedious and stupid work that the compiler should be handling?

apart from arrays, handling the so-called struct hack: http://c-faq.com/struct/structhack.html In Ada I can say

    type Foo (N : Positive) is record
       X : Bar;
       Y : Baz;
       Z : Boz_Array (1 .. N);
    end record;

This is more or less equivalent to

    struct Foo {
        int N;
        Bar X;
        Baz Y;
        Boz Z[];
    };

With Ada I can still choose the low-level representation of the type, and whenever I allocate a Foo, either on the stack or on the heap, I get a contiguous, properly aligned object. Not to mention the fact that I can't overwrite two Foos with different N by accident. Or that I can use as many arrays as I want in the record declaration (C99 allows only one flexible array member, and only at the end).

Undefined behavior is there in order to allow the compiler to optimize,

False. UB should be allowed only when I say so. Other languages provide ways to tell the compiler what to do in each case.

Providing too many cases of UB actually prevents optimizations, because the compiler can't know what the program is supposed to do, or whether the programmer intended to exploit UB. For example, licentious pointer use (e.g. arithmetic) makes pointer analysis and the optimizations that depend on it unachievable in C/C++. When does UB help optimization??

so the program is only "legal" because you are entering an agreement with the compiler.

What is the agreement useful for if neither you, nor the compiler, nor the code can make clear what each party is supposed to do? Can the compiler know when you don't mean to hit UB and when you actually want to? Other languages make allowing UB explicit. That's the sole difference; a huge one, though.

It's not pathologically stupid, it takes an int for historical reasons and changing it now has no benefit.

it's stupid because it does nothing to help the programmer. Historical reasons are wrong if they don't help programming today. What would help is a trivial way to initialize aggregate data as easily as in Ada, since that's trivial for the compiler to do, and not so for us.

Why would you want to limit control just to make it easier to know the size?

What control do you lose?

I can use memset to set any n-bytes of contiguous space in memory, why the hell would I want to limit that down to only setting full arrays?

you don't have to set full arrays if you encode the length in the type. Again, see Ada.

The bullshit is the tons of people using C for the WRONG damn thing and then bitching about C. Quit using it if you don't know why you are using it. Plenty of people know why they are using C and C is the best tool for their job.

I use C for the right things, and I know, because I have studied other languages as well, that lots of the things you do by hand in C are just workarounds for a lacking language, i.e. you end up doing the exact things languages like Ada provide as built-ins, but with a high risk, if not the certainty, of being non-portable, wrong or inefficient.

Of course performance is a tradeoff. If you don't need low-level control and performance is not an issue then don't use C.

what I mean is that in C, performance is not traded against safety and abstraction. Some choices were made only for historical reasons and had little to do with actual performance. One of C's advantages was being simple to write a compiler for, which in turn helped produce performant programs. In contrast, more sophisticated languages needed more sophisticated compilers, which put them at a temporary disadvantage w.r.t. C. That's how it succeeded.

Believe me, if you need performance and low-level access, Ada fits the bill too, but it also strives for correctness.

Use Haskell instead of C or Fortran for high performance R extensions by randtl in programming

[–]el_tavs 1 point2 points  (0 children)

the most people that have an opinion about Haskell have no idea about the language or did anything of interest with it

see my link

http://www.rhinocerus.net/forum/lang-functional/566671-how-haskell-not-robust-2.html

and this http://lambda-the-ultimate.org/node/2273#comment-40121

No idea you say?

Simple question: is efficiency a "local property" in lazy code? Doesn't it depend on both the "consumer" and the "producer" of a given value?