This is what you get when a person who is mostly a C++ coder (as per his bio) wr...

bigdict · on Aug 10, 2020

I never understood the reasoning behind this, is it that "the asterisk is part of the type, so we group it that way"?

That misses the point of the C declaration syntax: you write an expression that when used on its own will recover the basic type. So the asterisk goes with the symbol name, because that's how you dereference a pointer.

Further, it doesn't work if you want to declare more than one pointer like so:

  int* a, b;  /* wrong */
  int *a, *b; /* correct */
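
A minimal sketch of the point above, written as C++11 only so that `decltype` and `static_assert` can ask the compiler which types the two declarations produce. The names `a1`, `b1`, `a2`, `b2` and the helper function are made up for illustration:

```cpp
#include <cassert>
#include <type_traits>

// "int* a, b;" declares ONE pointer and ONE plain int.
int* a1, b1;
static_assert(std::is_same<decltype(a1), int*>::value, "a1 is a pointer to int");
static_assert(std::is_same<decltype(b1), int>::value, "b1 is a plain int");

// "int *a, *b;" declares two pointers.
int *a2, *b2;
static_assert(std::is_same<decltype(a2), int*>::value, "a2 is a pointer to int");
static_assert(std::is_same<decltype(b2), int*>::value, "b2 is a pointer to int");

// Hypothetical helper: reports at run time what the static_asserts check at compile time.
bool second_name_is_plain_int() {
    return std::is_same<decltype(b1), int>::value &&
           std::is_same<decltype(b2), int*>::value;
}
```

If the file compiles, the claims hold. Changing the second assertion to expect `int*` for `b1` should make the build fail.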

rootbear · on Aug 10, 2020

I think the issue is that programmers want a type "pointer to int", for example, but C doesn't directly provide that type. It has a type, int, with a modifier (asterisk) that can be applied to a declarator to make it a pointer to that type.

One way to create a pointer type in C would be to declare it using typedef:

    typedef int * int_p;

then one can write:

    int_p p, q, r;

and declare three pointers to int with perfect clarity, whereas using the C++ style, we'd get:

    int* p, *q, *r;

which is very confusing, or,

    int* p;
    int* q;
    int* r;

which is very verbose. I honestly don't know how C++ programmers typically handle this situation.

I have seen some code that strikes a middle ground:

    int * p;

which is a little more clear, but doesn't address the multiple declarator situation.

So why don't C++ programmers use typedef? I don't know, other than I understand Stroustrup doesn't like it (not without reason).

(Edited for formatting and minor clarity corrections.)
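
A rough sketch of the typedef approach, assuming C++11 so the compiler can confirm the types. `int_p` is the typedef from the comment; `read_through_typedef` is a made-up helper. The last two assertions show one known cost of pointer typedefs: `const` applies to the pointer itself, not to the `int` it points to.

```cpp
#include <cassert>
#include <type_traits>

typedef int* int_p;   // the typedef from the comment

int_p p, q, r;        // all three are pointers to int
static_assert(std::is_same<decltype(p), int*>::value, "p is int*");
static_assert(std::is_same<decltype(q), int*>::value, "q is int*");
static_assert(std::is_same<decltype(r), int*>::value, "r is int*");

// Caveat: "const int_p" is a const POINTER to int, not a pointer to const int.
static_assert(std::is_same<const int_p, int* const>::value, "const binds to the pointer");
static_assert(!std::is_same<const int_p, const int*>::value, "not pointer-to-const");

// Hypothetical helper: point q at a value and read it back through the pointer.
int read_through_typedef(int value) {
    q = &value;
    int result = *q;
    q = nullptr;      // do not leave a dangling pointer to the parameter
    return result;
}
```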

implicit · on Aug 10, 2020

Most modern C++ code that I've seen restricts code to declaring a single variable per statement.

It's not really a big deal because things are also always introduced at the latest possible position. Each is also typically given an initializer. I'd consider it suspicious if I were to see C++ that declared 3 uninitialized pointers back to back like this.

And then you get to the codebases where the authors have chosen to embrace auto and type inference... :)

optymizer · on Aug 10, 2020

"pointer to int" is a type, because 'pointer to' is not a type qualifier like 'const'. You can call it a modifier but you'll quickly start making exceptions for any type that has a modifier to explain away why it's actually a different type with a different size.

For example, sizeof(x) takes a type. The type argument for sizeof(int) is different than sizeof(int*) and the results are different. sizeof(int*[3]) is different as well. These are all different types where pointers change the type. It's not the same type with a pointer modifier, there is no such thing.
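
A small sketch of that claim, assuming C++11. The actual sizes are platform-dependent, so it only checks what is guaranteed: the three types are distinct, and an array of three pointers is three times the size of one pointer. The helper name is made up:

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// int, int* and int*[3] are three distinct types.
static_assert(!std::is_same<int, int*>::value, "int and int* differ");
static_assert(!std::is_same<int*, int*[3]>::value, "int* and int*[3] differ");

// An array of three pointers occupies exactly three pointers' worth of storage.
static_assert(sizeof(int*[3]) == 3 * sizeof(int*), "array of three int*");

// Hypothetical helper so the value can also be inspected at run time.
std::size_t size_of_three_int_pointers() {
    return sizeof(int*[3]);
}
```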

0xffff2 · on Aug 10, 2020

I handle it by outlawing multiple declaration on a single line. It may be more verbose, but it's much easier to read IMO. This has been a somewhat common rule in my experience.

kazinator · on Aug 10, 2020

This is not "C++ style"; it's just a poorly considered style perpetrated by coders who are not familiar with the grammar.

The C++ syntax is the same as C in this regard: a declaration has specifiers, and then one or more declarators.

The exception is function parameters, where you have (at most) one declarator.

> why don't C++ programmers use typedef?

C++ programmers do use typedef. For instance:

  typedef std::map<from_this_type, to_this_type> from_to_map;

C++ programmers probably use typedef a bit less than they used to, because of features like auto.

When a C++ class/struct is declared, its name is introduced into the scope as a type name. Therefore, this C idiom is not required in C++:

  typedef struct foo { int x; } foo;

that cuts down some typedefs. If you used a typedef for a C++ class that isn't just a "POD", you have issues, because the typedef name doesn't serve as an alias in all circumstances.

  typedef class x { x(); } y;

  y::y() // cannot write x constructor this way
  {
  }

Y_Y · on Aug 10, 2020

It's worth noting that this nice C logic falls apart when you do something like

    f(int &a);

to mean "by reference" instead of what it should be, which is "get the address of a, and that will be an int", which is, of course, nonsense.

kazinator · on Aug 10, 2020

I don't follow. The above is not C. It's a C++ extension over C declaration syntax in such a way that the & is part of the declarator just like *.

  // Inexcusable trompe l'oeil:
  int& a, b;

  // OK:
  int &a, &b;

Here, the mistake may be harder to catch, because the expressions a and b are both of type int, either way.

  // Intent: b is an alias of a.
  // Reality: b is a new variable, holding a copy of x.

  int& a = x, b = a;
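
A short sketch of the difference, with made-up function names. The first function uses the misleading declaration, the second binds both names as references:

```cpp
#include <cassert>

// "int& a = x, b = a;": a is a reference to x, b is a separate int.
int value_of_b_when_b_is_a_copy() {
    int x = 1;
    int& a = x, b = a;   // b is initialized with a copy of x's value
    a = 2;               // modifies x through the reference
    return b;            // b is unaffected by the write
}

// "int &a = x, &b = a;": both a and b are references to x.
int value_of_b_when_b_is_a_reference() {
    int x = 1;
    int &a = x, &b = a;  // b binds to the same object as a
    a = 2;               // modifies x through the reference
    return b;            // b sees the write
}
```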

I think what you mean is that the "declaration follows use" principle falls apart for C++ references.

That is necessarily true because no operator is required at all to use a C++ reference, whereas the explicit & type construction operator is required in the declarator syntax to denote it.

However, it has little to do with the issue that & is part of the declarator and not of the type specifiers.

Declaration follows use also falls apart for function pointers in C, because while int (*pf)(int) can be used as result = (*pf)(arg), it is usually just used as result = pf(arg).

Declaration follows use also falls apart for the -> notation. A pointer ptr is always being used as ptr->memb, but declared as struct foo *ptr which looks nothing like it.

And of course, arrays can be used via pointer syntax, and pointers via array syntax, also breaking declaration follows use.

Declaration follows use is only a weak principle used to help newbies get over some hurdles in C declaration syntax.
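
The three cases above, sketched with made-up names (`twice`, `foo`, `memb`, and the helper functions are all illustrative). In each case the usage that mirrors the declaration and the usage people actually write give the same result:

```cpp
#include <cassert>

int twice(int n) { return 2 * n; }

// Function pointer: declared as int (*pf)(int), usually called as pf(arg).
bool call_forms_agree(int arg) {
    int (*pf)(int) = twice;
    return (*pf)(arg) == pf(arg);
}

struct foo { int memb; };

// Struct pointer: declared as foo *ptr, usually used as ptr->memb.
bool member_access_forms_agree() {
    foo object = {42};
    foo *ptr = &object;
    return (*ptr).memb == ptr->memb;
}

// Arrays and pointers: each can be used with the other's syntax.
bool array_and_pointer_forms_agree() {
    int values[3] = {10, 20, 30};
    int *ptr = values;
    return values[1] == *(values + 1) && ptr[2] == *(ptr + 2);
}
```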

UncleMeat · on Aug 10, 2020

> I honestly don't know how C++ programmers typically handle this situation.

The verbose one. A few extra lines rarely matter. I think the number of times this has come up in my code base is very very small, maybe a few dozen extra lines across hundreds of thousands.

sacado2 · on Aug 11, 2020

A C++ function declaring several uninitialized pointers one after the other is very suspicious anyway. It looks like the programmer is trying to write old-school (pre-C99) C code (declaring all variables at the top of the function) in C++.

Typedeffing pointers, especially for the sole purpose of "being less verbose" when declaring uninitialized pointers, is a red flag too.

0xffff2 · on Aug 10, 2020

I'm a C++ guy who prefers `int*`. I also hate multiple declarations on a single line so I just outlaw that entirely so I never have an issue with declaration correctness.

ramzyo · on Aug 10, 2020

Three cheers for pointing out that this boils down to preference. In years programming C in teams and projects of various sizes and criticality, I've never come across a situation where the asterisk position actually made a difference, other than to incite grumbles from those who prefer it whatever which way. I sincerely look forward to a situation where it does make a difference to the maintainability or correctness of code in a way that can't be addressed with other language syntax, or impedes a team's ability to deliver reliable production software against whatever style guide or coding standard they've adopted.

Ragnarork · on Aug 10, 2020

I'm in the same boat, and I'd like to add that I agree that using `int* p, *q, *r;` is not pretty, and could possibly lead to ending up mistakenly with `int* p, q, r;`.

But we're in 2020 and we've learned to avoid declaring stuff without initializing it at the same time, to avoid the stupid mistake of using something uninitialized.

Interestingly enough, testing with clang shows that uninitialized variables get their warning, but uninitialized raw pointers don't.

secondcoming · on Aug 10, 2020

> I never understood the reasoning behind this, is it that "the asterisk is part of the type, so we group it that way"?

Yes. It just looks better IMO. Now, east-const vs west-const?

MarekKnapek · on Aug 10, 2020

    int* a, b;  /* wrong */
    int *a, *b; /* also wrong */


    /* correct */
    int* a;
    int* b;

drummer · on Aug 10, 2020

    int* a, b;  /* wrong */
    int *a, *b; /* also wrong */


    /* correct */
    int* a;
    int* b;

    /* moar correct */
    int* a{nullptr};
    int* b{nullptr};

CodeArtisan · on Aug 10, 2020

The choice between ``int* p;'' and ``int *p;'' is not about right and wrong, but about style and emphasis. C emphasized expressions; declarations were often considered little more than a necessary evil. C++, on the other hand, has a heavy emphasis on types.

A ``typical C programmer'' writes ``int *p;'' and explains it ``*p is what is the int'' emphasizing syntax, and may point to the C (and C++) declaration grammar to argue for the correctness of the style. Indeed, the * binds to the name p in the grammar.

A ``typical C++ programmer'' writes ``int* p;'' and explains it ``p is a pointer to an int'' emphasizing type. Indeed the type of p is int*. I clearly prefer that emphasis and see it as important for using the more advanced parts of C++ well.

https://www.stroustrup.com/bs_faq2.html#whitespace

saagarjha · on Aug 10, 2020

Hacker News has made your comment fairly difficult to understand ;)

Y_Y · on Aug 10, 2020

This should be mandatory reading for everyone who ever writes code I have to look at or debug.

_wldu · on Aug 10, 2020

I would not cringe at this. It's just his preference. They have the exact same meaning. Other than personally preferring one way or the other for readability, there is no difference.

lokedhs · on Aug 10, 2020

The problem is that if they forget to include the correct header file that declares malloc, then the cast will hide any warnings.

If the cast is not there, it will tell you that there is a type mismatch.

rootbear · on Aug 10, 2020

I would assert that usability and clarity are very much a difference.

mbreedlove · on Aug 10, 2020

I've always struggled with this, I usually use the first style. To me, Bark and Bark* are both types, while the name p_dog is just a name regardless of what it's referring to.

I read the first as 'a struct of type Bark pointer named p_dog'.

How do you read the second example in your head?

bigdict · on Aug 10, 2020

https://eigenstate.org/notes/c-decl

rootbear · on Aug 10, 2020

Pointer to type "struct Bark".

bschwindHN · on Aug 10, 2020

I like to use clang-format so I don't have to even think about this, but my real solution is to use Rust when I can, which is pretty much always these days.

jaydem · on Aug 10, 2020

Why is that? I learned C++ first, but spend most of my time now using C. It makes more sense to me that the type should be one thing (struct Bark*) and the identifier a different thing (p_dog).

Also, a take from Stroustrup since I found it. https://www.stroustrup.com/bs_faq2.html#whitespace

MaxBarraclough · on Aug 10, 2020

The problem is that it doesn't reflect how the C language really works. More specifically, it is misleading in multiple declarations, see the sibling reply at https://news.ycombinator.com/item?id=24109267

Ragnarork · on Aug 10, 2020

One could argue it's wrong to do multiple declarations like that, and the whole argument collapses to mostly a matter of preference.

MaxBarraclough · on Aug 10, 2020

Even if your coding style prohibits multiple declarations, you still don't escape the fact that in C, the asterisk binds to the variable identifier, not to the type. It crops up again in the function pointer syntax:

    void *(*foo)(int *);

foo is a pointer to a function which accepts a pointer-to-int and returns a pointer-to-void.

( Taken from https://www.cprogramming.com/tutorial/function-pointers.html )
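
A minimal sketch of that declaration in use, assuming C++11. `as_void_pointer`, `int_ptr_to_void_ptr_fn` and `round_trips` are made-up names; only `foo` comes from the comment. The alias spells the same type without a declarator wrapped around the name:

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical function with the matching signature: takes int*, returns void*.
void *as_void_pointer(int *p) { return p; }

// The declaration from the comment: foo is a pointer to such a function.
void *(*foo)(int *) = as_void_pointer;

// The same type spelled as an alias, with no variable name in the middle.
using int_ptr_to_void_ptr_fn = void *(*)(int *);
static_assert(std::is_same<decltype(foo), int_ptr_to_void_ptr_fn>::value,
              "foo is a pointer to a function taking int* and returning void*");

// Calls through foo and checks that the same address comes back.
bool round_trips(int *p) {
    return foo(p) == p;
}
```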

tgb · on Aug 10, 2020

Thanks, this is the best example I've seen.

saagarjha · on Aug 10, 2020

You can call a feature of a language misguided, but you can't call it wrong; that's just how C does identifiers. Sure, in your "cut out the parts of C that don't fit into my model" you can make it work, but that's not C.

bigdict · on Aug 10, 2020

https://eigenstate.org/notes/c-decl

zzo38computer · on Aug 10, 2020

I think the bad part of C is the confusing syntax for types. In a declaration of the initial value, you are assigning the value of "p_dog" and not of "*p_dog". I tend to omit the spaces on both sides of the asterisk, though, and don't declare multiple pointer variables in the same line; I will use separate lines if the type isn't something as simple as "int" or "unsigned char". If I need especially complex types, then I will use typedef.

optymizer · on Aug 10, 2020

I find your comment inflammatory and your zealousness amusing, but I'll bite and try to be reasonable by assuming you're willing to entertain another point of view.

The whole "declaration follows usage" is just a bad tradeoff. It makes it easier to parse expressions. That's my understanding of why they did it. It makes it _objectively_ harder to read, because some declarations follow this easy pattern of "name on the right and type on the left", while for some other declarations you have to employ the spiral reading pattern (e.g. for function pointers, arrays, with variable declarations being the easiest one).

You know what's easier than all that? Type on the left, name on the right.

  int a;
  int* a;
  int[] a;
  unsigned int(int) f;

Notice that you can probably easily guess what that last declaration represents, without having to consult a wise old man. I think you simply got used to how C does it, so now the actual sane way is weird to you personally, but you have to recognize that it's one additional thing _everyone_ has to learn because it's counter-intuitive. Hundreds of thousands of developers had to learn some weird spiral reading rule because 1 compiler writer found it easier to reuse a yacc rule.

That's why Java, Go and D changed this nonsense. Java supports both "int[] a" and "int a[]". They support both to appease everyone, but they went the extra mile to support "int[] a". D changed it to "int[] a" and called it a day, and Go introduced this novel [5]int syntax, which is different, but clearly easy to read ("array of five int").

Again, type on the left, name on the right (or in Go's case, name on the left, type on the right - but at least it's not a mixed bag). Once you see it that way (i.e. you give up on declaration follows usage rule), it's not weird, it's not wrong, it's intuitive, and it makes the language easier to learn and use.

I thought this is a matter of preference until I actually wrote a C compiler for fun and have permanently solidified my opinion on this issue.

The asterisk is part of the type, it's not just some random symbol, it's not a type qualifier like 'const' or 'volatile'. It's a type token that builds a distinct type, e.g. "int **" is a type that spells "pointer to pointer to int", it's not an 'int' type with some flags attached to it.

Let's take the concrete example of 'struct Bark* p_dog'.

A typical compiler will tokenize 'struct', 'Bark', '*' and 'p_dog' and will group them as ('struct', 'Bark', '*') to derive the type "pointer to struct Bark" and ('p_dog') to derive the name of the symbol when it adds an entry into its symbol table. In other words, the compiler itself splits that line so that types are on the left, and names on the right.

rootbear · on Aug 10, 2020

It wasn't my intention to be inflammatory and I don't consider myself a zealot. My comment has generated a lot of interesting discussion and that's a good thing, and probably worth the down votes.

I wrote C for a long time before I ever even heard of the spiral rule. I do think that the "declaration follows usage" idea worked a lot better before C declarations became so complex. I'm not claiming that the C declaration syntax is wonderful, it isn't. I just think that the C++ style of pretending that int* is a pointer type is misleading, since that's not really what's happening grammatically. Yes, * is a type token, as you say, but it modifies the identifier, not the type specifier. But since the style now is to only declare one variable per declaration, and to always initialize it, then in practice it isn't actually all that confusing.

I have always wanted to write a compiler and I'm sure I would get a new perspective on these matters if I did. Go has done a lot to clean up C's messes. I haven't used it much, but I'd like to know it better.

bigdict · on Aug 10, 2020

Agree 100%. It is a tradeoff that detracts from ease of human interpretation.

Because C makes that tradeoff,

  int *a;

is idiomatic C and

  int* a;

isn't.

For C programming to be pleasant, you have to understand and agree with the philosophy at least while you're writing C.

Go developers changed this in a particularly elegant way. Here's what Rob Pike has to say: https://blog.golang.org/declaration-syntax.

I too gained a better understanding of the problem after going through a compiler-writing exercise.

tobyhinloopen · on Aug 10, 2020

Why put the asterisk on the variable though? It’s part of its type, isn’t it?

stephencanon · on Aug 10, 2020

Because in the C (and C++) grammar,

    int *a, b;

declares a pointer-to-int `a` and an int `b`, rather than two pointers-to-int. The better solution to this problem is "don't do that", but C (and C++) programmers have a fetish for terseness.

secondcoming · on Aug 10, 2020

> but C (and C++) programmers have a fetish for terseness.

Indeed, Linus Torvalds has a recent rant about people still adhering to max 80-column width code. It's pointless in the day of massive monitors.

qppo · on Aug 10, 2020

I use a rather large 4K monitor as my daily driver and maintain 80 column width code where possible, to avoid wraparound when I have many windows open.

It's not pointless unless you're using your massive monitor like you did your tiny one thirty years ago. I use mine like a bunch of tiny monitors, not one big one.

bigdict · on Aug 10, 2020

https://eigenstate.org/notes/c-decl

account42 · on Aug 10, 2020

  struct Bark * p_dog = p_cat;

staycoolboy · on Aug 10, 2020

This blew my mind when I started using clang-format. There are only two options in all of clang-format's 100+ customization variables: move the star next to the type, or don't touch the star.