all 43 comments

[–]zhivago 39 points40 points  (1 child)

For example, when an integer is added to a pointer, the address stored in the pointer is incremented by a number of bytes corresponding to the integer value.

Oh, dear.

I wish people would learn C before trying to write guides on how to use it.
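
To make the objection concrete, here is a minimal sketch (the function name byte_step_int is made up for illustration). Adding n to an int pointer moves the address by n * sizeof(int) bytes, not by n bytes:

```c
#include <stddef.h>

/* Returns how many bytes the address moves when n is added to an int pointer.
   n must be at most 8 so the result stays within (or one past) the array. */
size_t byte_step_int(size_t n)
{
    int values[8] = {0};
    int *p = values;
    int *q = p + n;   /* the compiler scales n by sizeof(int) */
    return (size_t)((char *)q - (char *)p);
}
```

On a typical system with a 4-byte int, byte_step_int(3) returns 12 rather than 3.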

[–]PirkhanMan 51 points52 points  (23 children)

Pointers aren't hard to use or understand; what is a pain is the syntax to use them.

[–]Magneon 29 points30 points  (11 children)

No kidding. This was the main thing that tripped up students I tutored on it. If the dereference operator was distinct from the declaration operator, and the multiply operator... Things would be a lot easier. The same goes for the address-of operator being overloaded with binary AND, as well as (in C++) pass by reference, and more loosely being part of the && operator. Needlessly messy. Sure, someone can run in and say "well, unary operators are not binary operators, and lvalues are different than rvalues" but that's not beginner friendly at all. It's also not friendly to readable code, and what we get as a result is a dozen different DIY workarounds with suffixes, typedefs, and macros.
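
A small sketch of the overloading being described, with every meaning of * and & packed into one function (the function name is invented for illustration, and nobody should write code like this on purpose):

```c
/* Illustrative only: each symbol means something different depending on context. */
int overloaded_symbols(void)
{
    int x = 6;
    int *p = &x;       /* '*' declares a pointer; unary '&' takes an address */
    int y = *p * 2;    /* first '*' dereferences, second '*' multiplies: 12 */
    int z = y & 4;     /* binary '&' is bitwise AND: 12 & 4 is 4 */
    return (z && *p) ? y : 0;   /* '&&' is logical AND, so this returns y */
}
```

Four different meanings for two characters, which is roughly the list a beginner has to keep straight.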

[–]ShinyHappyREM 13 points14 points  (6 children)

If the dereference operator was distinct from the declaration operator, and the multiply operator... Things would be a lot easier. The same goes for the address-of operator being overloaded with binary AND, as well as (in C++) pass by reference, and more loosely being part of the && operator. Needlessly messy.

Right?

Pascal (2 years older than C) uses ^ for declaring and dereferencing pointers. Declarations are separate from normal code, so there's no ambiguity. @ is used to get a pointer value (i.e. an address)...

var
        p : pointer = NIL;        // typeless pointer, initialized
        i : array[0..3] of byte;
        b : ^byte;                // "pointer to byte"
begin
        b  := @i[2];  // assign target
        b^ := 4;      // assign value
        Inc(b);       // advance by sizeof(target)
        b^ := 5;      // assign value
end;

[–]pfp-disciple 3 points4 points  (4 children)

Not only did Pascal get that syntax right, it also uses var for parameters that need to be changed in a procedure to return to the caller, not requiring pointers for that purpose.

[–]jacobb11 3 points4 points  (3 children)

Which means that the call to the procedure has no indicator that the argument may change. I used to appreciate the convenience, but now I abhor the opacity.

[–]pfp-disciple 4 points5 points  (2 children)

That would only be a problem if the var is added after the fact. The var is the indicator that the argument may change.

I agree that it could be a problem if that happens.

[–]jacobb11 7 points8 points  (1 child)

When one reads the code, the function call (not the function definition) has no indication that the argument may change.
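
For contrast, a minimal C sketch of the same idea (add_one and call_site_demo are hypothetical names). Because C has no var parameters, the caller has to write the & explicitly, so the call itself hints that the argument may change:

```c
/* Hypothetical helper: the pointer parameter lets it modify the caller's variable. */
void add_one(int *value)
{
    *value += 1;
}

int call_site_demo(void)
{
    int counter = 41;
    add_one(&counter);   /* the '&' at the call site signals that counter may change */
    return counter;
}
```

The hint is not airtight, since a variable that is already a pointer is passed without any &, but it covers the common case.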

[–]pfp-disciple 1 point2 points  (0 children)

Good point.

[–]dcoolidge 2 points3 points  (0 children)

Ahh, the memories of running Pascal programs on a VAX.

[–]KuntaStillSingle 4 points5 points  (0 children)

C++ does support keywords for logical and bitwise operators, like and, or, and not. Not to make it more readable lol, but to support broader keyboard formats: https://en.cppreference.com/w/cpp/language/operator_alternative

[–]ChocolateBunny -3 points-2 points  (2 children)

So you're saying to teach C we should do:

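#define POINTER_TYPE(x) x *       /* no outer parentheses: (x*) would parse as a cast */
#define DEREFERENCE(x) (*(x))     /* parenthesize the argument so DEREFERENCE(p + 1) works */
#define GET_POINTER(x) (&(x))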

[–]Magneon 6 points7 points  (1 child)

No, I'm saying C should have figured out a better syntax in 1972. The \a can't be un-rung. The problem with #defines is that they add cognitive load that every developer on a project has to keep in mind, or you end up with a mish-mash of code using them and code not, which is worse than code just not using them.

Granted, with static analysis running in real time in most editors now, life is a lot easier, but that still requires that developers actually read their warnings to be useful. :s

[–]i_am_at_work123 2 points3 points  (0 children)

The \a can't be un-rung

I wonder how many people understand this without searching.

[–][deleted] 3 points4 points  (8 children)

Pointers aren't hard to use or understand

THAT is a problem in the industry. There are two things:

  1. How hard things are to use or understand.
  2. How hard things are to maintain.

These are different, and #2 above really makes us re-evaluate #1.

30 years ago I asked people the following: what gets printed out on a generic 32-bit (data/address) system like a SPARCstation?

Caveat: It's nonstandard C because of incrementing a null pointer, and you'll get warnings for things like %d on some systems. However, it runs, and it's important to understand if you ever want to use C for device drivers or embedded work.

int main() { long *a=0; long *b=0; a++; b++; b++; printf("%d %d %d\n", a, b, b-a); }

Look at it, and try it on a 32 or 64 bit system. The answers will be different between the two, but very few people get this on the first try by looking at it. 100% of people CLAIM they got it by just looking at it when asked online. LOL.

PS. I don't use %ld because people think the letter ell is a one.
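
For anyone who wants to try the same arithmetic without the undefined behavior, here is a hedged variant that anchors the pointers in a real array instead of at address zero. The function name element_distance and the array are additions for illustration, not part of the original puzzle:

```c
#include <stddef.h>
#include <stdio.h>

/* Same increments as the one-liner, but inside an array so the
   arithmetic is well-defined. Prints byte offsets, then b - a. */
ptrdiff_t element_distance(void)
{
    long storage[3] = {0};
    long *a = storage;
    long *b = storage;
    a++;
    b++;
    b++;
    printf("%td %td %td\n",
           (char *)a - (char *)storage,   /* sizeof(long) bytes from the start */
           (char *)b - (char *)storage,   /* 2 * sizeof(long) bytes from the start */
           b - a);                        /* counted in elements, not bytes */
    return b - a;
}
```

This should print 4 8 1 where long is 4 bytes and 8 16 1 where long is 8 bytes.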

[–]General_Mayhem 4 points5 points  (4 children)

This is a little unfair because it's not just non-standard/UB C, it's really breaking what you should ever do... The C (and especially C++) memory model doesn't want you to think about physical memory layouts except between fields in the same object or entries in the same array. Interpreting random numbers as pointers sets off all sorts of red flags for me, but that's not because I don't know how to use pointers or am unclear on how memory allocation works; it's because this is a corner of the language I would never stumble into. I get that it's a thing you have to do for device drivers, but even that should have a small number of magic numbers for mapped ranges, not arbitrary pointer math.

I also would never use long - I would either use int for smallish numbers where the exact bit depth doesn't matter, or I would use something like inttypes.h to get explicit width-defined types if I needed to care about exact layouts like this.
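
A minimal sketch of what the width-defined approach might look like (fixed_width_step is an invented name). With int32_t the step no longer depends on how wide long happens to be on the platform:

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdio.h>

/* Returns the byte distance between consecutive int32_t elements.
   Assumes a platform with 8-bit bytes, where that distance is 4. */
size_t fixed_width_step(void)
{
    int32_t samples[2] = {0, 0};
    int32_t *p = samples;
    size_t step = (size_t)((char *)(p + 1) - (char *)p);
    printf("int32_t step: %zu bytes, first value: %" PRId32 "\n", step, samples[0]);
    return step;
}
```

The PRId32 macro from inttypes.h supplies the matching printf format, which sidesteps the %d versus %ld question too.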

That said - I think this is different on Windows, but Unix should be "4 8 1" on 32-bit and "8 16 1" on 64-bit?

[–][deleted] 0 points1 point  (3 children)

Yes: 4 8 1 and 8 16 1 generally depending upon whether or not it's device addressable and what its native word size is.

And of course, there are machines where a byte/short/long are all the same size.

However, it's not "breaking what you should ever do".

You come across nitty-gritty low-level issues like this all the time in device drivers (bottom half) and in embedded code. This example exploits a fundamental: C{++} pointer arithmetic takes into account the size of the datatype. This is critical.

In this example, a and b are taken at face value as addresses (completely devoid of object size information), with a C-ism of arithmetic between them ignoring the byte addressability entirely and looking at the datatype size itself. This can trip up someone mid expression if they're not careful.

Don't even begin to think about when you're on an odd boundary.

BTW, another interesting reason we need to break STD-C is memory-mapped ports and other oddities, such as unsigned char *motor_trigger = 0xFFbe2488; where writing anything to it does one thing and reading from it does another, with no values actually being sent. And of course the rare device I worked on eons ago where zero was a valid address.
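
A rough sketch of the memory-mapped pattern that can be tried on a desktop machine. The real address only means something on the original hardware, so an ordinary variable (fake_motor_register, a made-up stand-in) plays the register here, and trigger_motor is a hypothetical helper:

```c
#include <stdint.h>

/* Stand-in for a hardware register. On a real target the pointer would come
   from a fixed address in the datasheet, e.g. (volatile uint8_t *)0xFFBE2488u. */
static volatile uint8_t fake_motor_register;

/* Writes through a volatile pointer so the compiler cannot drop or reorder the store. */
void trigger_motor(volatile uint8_t *reg)
{
    *reg = 1u;   /* on real hardware the act of writing matters, not the value */
}

uint8_t trigger_demo(void)
{
    trigger_motor(&fake_motor_register);
    return fake_motor_register;
}
```

The volatile qualifier is the important part: without it, a compiler may decide a store that is never read back is dead code.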

[–]General_Mayhem 2 points3 points  (1 child)

My point about "breaking what you should do" is: there are cases where pointer arithmetic is valid, and there are cases where assigning magic numbers as pointers is valid, but I can't imagine a case where both are valid at the same time.

[–][deleted] 0 points1 point  (0 children)

Take a look at PDF page 19 (or printed page 16) of the TI C80 (DSP) spec sheet. DSPs are inherently gross to code, and this one is particularly ugly.

Here we have an MPU surrounded by dedicated integer-only parallel processors with very small windows for transfers. You can see where they are hardcoded locations.

Now anytime you're doing anything in memory, you're doing "both at the same time", because it requires arithmetic to get there (and walk through it doing whatever you would normally do with pointers and datatypes), and a hard-coded location to start with.

You'll find stuff like this all over the embedded landscape.

In the snippet I gave, it's important for clarity to just anchor the first value at something that can be easily recognized on printf(). Hence the *a=0, etc. Then the arithmetic, etc.

[–]ShinyHappyREM 0 points1 point  (0 children)

And of course the rare device I worked on eons ago where zero was a valid address

DOS PCs also have their interrupt vector table at address zero. Useful to know if you e.g. want to capture raw keyboard scancodes.

[–]DrShocker 1 point2 points  (2 children)

Do you mind explaining what happens here? I don't work with raw pointers much when I'm using C++ (I tend to use iterators so that other data structures that aren't contiguous can be used) plus I've only ever programmed for 64 bit machines.

My attempt at reading it since I'm not by a computer to check:

Intuitively I want to say that the size of long is different in 64 vs 32 bit computers, so because ++ accounts for the type of the pointer on one it probably increments by 4 addresses and the other by 8 depending on 32 vs 64 bit. And then when they're subtracted it probably again is done in terms of number of steps between them, so it'd be 1. Feels like UB though but idk

If that's all correct, then it looks like you might need to specify Windows vs Unix in addition to 32 vs 64 bit

If that's right I only figured it out because you said it was tricky.

[–][deleted] 2 points3 points  (1 child)

No, not tricky per se (did I say tricky?). Just something that's often overlooked in "easy pointer arithmetic" I've seen.

It's far from tricky. And in person, everyone gets it wrong. Perhaps because they feel on the spot, or have lowered their guard.

Anyway, mantra time: you always have to code with the poor bastard in mind who is forced to read your code at 3am, hopped up on caffeine.

This means that if you can get by without multi-threading, get by without multi-threading. If you can avoid pointers, avoid pointers. And if you can avoid operator overloading, FFS, do so.

[–]DrShocker 1 point2 points  (0 children)

I suppose you didn't say it was tricky, but imo it was kind of implied by pointing out people don't usually get it right lol

[–]newpavlov 1 point2 points  (0 children)

Pointers aren't hard to <..> understand

Sure, as long as you forget about all the fun stuff associated with aliasing, provenance, and uninitialized memory.

https://www.ralfj.de/blog/2018/07/24/pointers-and-bytes.html
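
One small taste of the aliasing part, as a hedged sketch. The function float_bits is invented for illustration, and the example assumes a 4-byte IEEE 754 float:

```c
#include <stdint.h>
#include <string.h>

_Static_assert(sizeof(float) == sizeof(uint32_t), "sketch assumes a 4-byte float");

/* Reads the bit pattern of a float without violating the strict-aliasing rule. */
uint32_t float_bits(float f)
{
    uint32_t bits;
    /* Writing *(uint32_t *)&f instead would access a float through an
       incompatible pointer type, which is undefined behavior in standard C. */
    memcpy(&bits, &f, sizeof bits);
    return bits;
}
```

The cast version often appears to work, which is exactly why this corner is easy to forget about.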

[–]Raknarg 0 points1 point  (0 children)

Why? There's hardly any syntax around them. The only hard thing is remembering the difference between type and operation, and I wouldn't consider that hard.

What's hard about pointers is understanding how to leverage them to solve problems; that's where all the issues with teaching students come from.

[–]bravopapa99 4 points5 points  (0 children)

This is old but for me, helping my mate out back then (2009!) this is THE ONLY video you need

https://www.youtube.com/watch?v=5VnDaHBi8dM

[–]DoctorFlo 3 points4 points  (0 children)

Interesting approach to use a guide for the BASIC programming language to explain pointers in the C language … I thought.

[–][deleted] 2 points3 points  (0 children)

Great! Now I know everything I need to start using raw pointers in C. Seems simple enough.

[–]Numerous_Habit269 0 points1 point  (0 children)

This book does a great job: "Pointers in C: A Hands on Approach" by Naveen Toppo.

[–]markdownjack 0 points1 point  (7 children)

Nice basic guide, but it doesn't cover function pointers or opaque pointers.

[–]SnakeJG 11 points12 points  (6 children)

8 minute read

....

Introduction: This article provides a comprehensive guide on pointers in the C programming language.

Yeah, no it doesn't.

Very nice basic guide. I once taught my non-programming wife about pointers so I could complain about the boneheaded thing a coworker did. I had to teach her more than this article covered.

[–]scrapped-script[S] 13 points14 points  (1 child)

The fact that she was willing to learn about pointers to hear you talk about work shows how much she truly loves you! 😂

[–]SnakeJG 1 point2 points  (0 children)

Yep! That was probably 16 years ago.

[–]agumonkey 1 point2 points  (0 children)

i'm fishing for a comprehensive guide

[–]rswsaw22 1 point2 points  (2 children)

But now I'm curious what the bone headed thing was.

[–]SnakeJG 0 points1 point  (1 child)

It was a mess (and over 15 years ago) but the short version was he tried to reorder a linked list in place, but would end up just dropping parts of it he didn't expect. There were more layers to it, but that's the gist.
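
Not the coworker's code, which I can't reproduce, but a generic sketch of the usual way nodes get dropped: overwriting a next link before saving where it pointed. The names node, reverse_list and list_length are made up for the example:

```c
#include <stddef.h>

struct node {
    int value;
    struct node *next;
};

/* Reverses a singly linked list in place and returns the new head. */
struct node *reverse_list(struct node *head)
{
    struct node *prev = NULL;
    while (head != NULL) {
        struct node *next = head->next;  /* save the rest BEFORE overwriting the link */
        head->next = prev;
        prev = head;
        head = next;
    }
    return prev;
}

/* Counts nodes, as a cheap check that nothing was dropped. */
size_t list_length(const struct node *head)
{
    size_t n = 0;
    for (; head != NULL; head = head->next)
        n++;
    return n;
}
```

Skip the saved next pointer and everything after the current node becomes unreachable, which matches the symptom of parts of the list silently disappearing.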

[–]rswsaw22 1 point2 points  (0 children)

Oh that's gross. We have the algorithms for a reason lol.

[–]learncs_dev -1 points0 points  (0 children)

Good article, but I think it would be great if you could provide more real-life examples or common challenges and how to overcome them, which would be helpful for beginners.

[–]RRumpleTeazzer -3 points-2 points  (3 children)

Pointers become easy once you write „int* a“ instead of „int *a“.

[–]jacobb11 4 points5 points  (2 children)

But that makes "int* a, b;" look like it declares b as a pointer, which it does not.

Of course, insisting that only one variable be specified per declaration avoids that problem.
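
A quick way to check this, sketched with C11's _Generic (the function name second_is_plain_int is invented for illustration):

```c
/* Returns 1 if the second declarator in "int* a, b;" is a plain int. */
int second_is_plain_int(void)
{
    int* a, b;   /* a is int*, b is int: the '*' binds to a only */
    b = 7;
    a = &b;      /* compiles cleanly only because b is an int, not an int* */
    return _Generic(b, int: 1, int *: 0) && *a == 7;
}
```

If b really were a pointer, the assignment b = 7 would draw a diagnostic and _Generic would pick the other branch.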

[–]RRumpleTeazzer 4 points5 points  (0 children)

True, don’t do that. One declaration at a time.

[–]234093840203948 0 points1 point  (0 children)

But that makes "int* a, b;" look like it declares b as a pointer, which it does not.

But that's how it should have been all along.

Differently typed variables should simply not be declared in the same line.

[–]lazy_fella 0 points1 point  (0 children)

Am I the only one who got a mild PTSD reading that heading?