Unexpected behavior 2!

Consider the code below:

int total;

std::cin >> total;
int ary[total];

for(int i=0;i<total;i++)
    ary[i] = i;

for(int i=0;i<total;i++)
    std::cout << ary[i] << std::endl;


I think that when creating arrays on the stack, the size must be a constant value known at compile time. But here I think the size will not be determined until run time. Am I wrong? If I am right, then why does the program compile?

You are right, C++ requires your array size to be a compile-time constant.

It compiles because you're relying on a vendor-specific extension.

If you care about portable code, then always compile with the strict flags the compiler offers.
To size something at run time, you need to use dynamic memory. So a pointer would work, e.g.
int *ary = new int[value];

But you don't want to deal with dynamic memory yourself: it is easy to make mistakes and a bit 1990s. You want to use C++ containers that do this low-level work for you, which is why we have <vector> and friends.
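As a rough sketch of the container approach (make_sequence is a made-up helper name, not something from the posts above), the loop from the question could be written with std::vector, whose size only needs to be known at run time:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: builds the same 0, 1, 2, ... sequence as the
// question's loop, but with storage sized at run time by std::vector.
std::vector<int> make_sequence(std::size_t total)
{
    std::vector<int> ary(total);   // heap-allocated, released automatically
    for (std::size_t i = 0; i < total; ++i)
        ary[i] = static_cast<int>(i);
    return ary;
}
```

Calling make_sequence(total) after reading total from std::cin would take the place of int ary[total], and no delete[] is needed afterwards.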

This is one of about 10 things that many, many compilers let you do with default settings. Others are things like the union hack (using a union to be different types on the same memory), void or even double type on main, including things you didn't actually include (eg allowing cout without an explicit include of iostream), and so on. These compilers are sort of trying to be nice to you, but in doing so, they are letting you write broken code and reinforce bad habits.
This feature is called a variable-length array (VLA). It was introduced in C99 and has been an optional feature in C since C11. It's not a standard feature in C++, but some compilers that have implemented the feature for C have decided to also allow it in C++ by default.

As Salem C said, if you care about writing portable code you should not use this feature. GCC/Clang will warn about the usage of this feature if you use the -pedantic flag.

Also note that ary is being allocated on the stack, and the size of the stack is often pretty limited, so unless you guard against too large (or negative values) it's very easy for the user to cause stack overflow/program crash.
Thank you @Salem, @jonnin and @Peter :)

I've already got the precise info I need from @Peter; with it I turned that extension off, and now my compiler obeys that C++ rule. Though there are so many good points brought up here that I want to ask questions about them:

1: @Peter wrote
and the size of the stack is often pretty limited
Can we know how limited it is, size-wise? For example, if the RAM is 4 gigabytes, is the stack 2 GB? Or is there a minimum or maximum value for it?

2: @Peter wrote
so unless you guard against too large (or negative values)...
Do you mean that I need to check whether the user input is valid, that is, that it is not negative, not a letter, and so on?

3: @jonnin wrote
This is one of about 10 things that many, many compilers let you do with default settings
I would like to know every single one of those, so that I can turn them off. I already did that for the VLA feature.

4: @jonnin wrote
void or even double type on main

Do you mean I can do this:

void/double main()
{
    // Do things
}


5: @jonnin wrote
the union hack (using a union to be different types on the same memory)
I cannot seem to build an example here

ninja01 wrote:
1. Can we know how limited it is, size-wise? For example, if the RAM is 4 gigabytes, is the stack 2 GB? Or is there a minimum or maximum value for it?

There is no standard way of getting or setting the stack size.

On Linux I can get the default stack size by running "ulimit -s" from the terminal. For me it's 8 MB. I believe it's smaller on Windows.

It is possible to increase the stack size but I haven't had the need to do so. I have only played around with it a little, and I know that at least on Linux it is possible to set the stack size to "unlimited" but that has some consequences such as running out of RAM within seconds if I accidentally happen to write an infinite recursion.
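For what it's worth, on POSIX systems the same limit that "ulimit -s" reports can also be queried from inside a program with getrlimit. This is only a sketch: stack_soft_limit is a made-up helper name, the code will not build with Microsoft's compiler, and the value applies to the main thread's stack.

```cpp
#include <optional>
#include <sys/resource.h>   // POSIX only; not available with MSVC

// Hypothetical helper: returns the soft stack limit in bytes, or
// std::nullopt if the query fails. The value may equal RLIM_INFINITY
// when the limit has been set to "unlimited".
std::optional<rlim_t> stack_soft_limit()
{
    struct rlimit limit{};
    if (getrlimit(RLIMIT_STACK, &limit) != 0)
        return std::nullopt;
    return limit.rlim_cur;
}
```

Note that getrlimit reports bytes, while "ulimit -s" prints the limit in units of 1024 bytes.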

ninja01 wrote:
2. Do you mean that I need to check whether the user input is valid, that is, that it is not negative, not a letter, and so on?

In a real program you should always do that. Never trust input from the user.

I guess when dealing with dynamically allocated memory it can be OK to just let the program crash if you run out of memory, because that rarely happens and the consequences are often not serious. But here you're dealing with the stack, which has a very limited size, so the chances of running into problems with an array size that is too large are much higher. I don't know if all implementations guard against stack overflow with VLAs (does anyone know?), but if they don't, that could become a security hole that an attacker might be able to exploit.
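A possible way to do that check, as a sketch only: parse_count and its max_count parameter are made-up names, and the upper limit is something you have to pick yourself.

```cpp
#include <optional>
#include <sstream>
#include <string>

// Hypothetical helper: parses `text` as an array size and accepts it only
// if it is a whole number in [0, max_count]. Letters, negative values,
// trailing junk and out-of-range numbers all yield std::nullopt.
std::optional<int> parse_count(const std::string& text, int max_count)
{
    std::istringstream in(text);
    long long value = 0;
    if (!(in >> value))
        return std::nullopt;          // not a number (or overflowed)
    in >> std::ws;                    // skip trailing whitespace, if any
    if (!in.eof())
        return std::nullopt;          // trailing junk such as "12abc"
    if (value < 0 || value > max_count)
        return std::nullopt;          // negative or too large
    return static_cast<int>(value);
}
```

With a helper like this, the program from the question could read a line with std::getline, pass it to parse_count together with a limit of your choosing, and refuse to continue when it gets std::nullopt back.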

ninja01 wrote:
3. I would like to know every single one of those, so that I can turn them off. I already did that for the VLA feature.

I think -pedantic goes a long way to warn about non-standard features. If you want to turn them into errors rather than warnings you can use -pedantic-errors instead.

Note that GCC defaults to a "GNU dialect" of C++.
GCC 12 defaults to -std=gnu++17.
If you want to use standard C++17 you should instead use -std=c++17.
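If you want to see from inside the code which dialect you ended up with, something like the following might help. It is a sketch: strict_iso_mode and language_version are made-up names, and __STRICT_ANSI__ is a GCC/Clang predefined macro that other compilers may not define.

```cpp
// Reports which dialect the translation unit was compiled with.
// __STRICT_ANSI__ is a GCC/Clang predefined macro; other compilers may
// not define it, so treat the result as a hint only.
constexpr bool strict_iso_mode()
{
#ifdef __STRICT_ANSI__
    return true;     // e.g. compiled with -std=c++17
#else
    return false;    // e.g. compiled with -std=gnu++17
#endif
}

// Value of __cplusplus, e.g. 201703L for C++17.
constexpr long language_version()
{
    return __cplusplus;
}
```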

ninja01 wrote:
4. Do you mean I can do this: void/double main()...
I think Microsoft's compiler allows void main() but GCC and Clang don't (at least not on Linux). I wonder what compiler allows double main().

ninja01 wrote:
5. I cannot seem to build an example here
#include <iostream>
#include <iomanip>
#include <cstdint>

union S
{
	int value;
	std::uint8_t bytes[sizeof(value)];
};

int main()
{
	S s;
	s.value = 0x12345678;
	std::cout << std::hex;
	for (std::uint8_t byte : s.bytes)
	{
		std::cout << int(byte) << "\n";
	}
}

This program writes a value (0x12345678) to one member of the union (value) and reads from another member (bytes).

For me, on a little-endian machine, this prints
78
56
34
12

Many compilers guarantee this to work but according to the C++ standard this is technically undefined behaviour.
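If you want to look at the bytes without relying on that, copying them out with std::memcpy is well-defined. A minimal sketch (Bytes, to_bytes and from_bytes are made-up names, and the byte order still depends on the machine):

```cpp
#include <array>
#include <cstdint>
#include <cstring>

using Bytes = std::array<unsigned char, sizeof(std::uint32_t)>;

// Copies the object representation of `value` into a byte array.
// The byte order still depends on the machine's endianness.
Bytes to_bytes(std::uint32_t value)
{
    Bytes bytes{};
    std::memcpy(bytes.data(), &value, sizeof value);
    return bytes;
}

// Inverse of to_bytes on the same machine.
std::uint32_t from_bytes(const Bytes& bytes)
{
    std::uint32_t value = 0;
    std::memcpy(&value, bytes.data(), sizeof value);
    return value;
}
```

On a little-endian machine to_bytes(0x12345678) gives the same 78 56 34 12 order as the union version, but without reading an inactive union member.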
I think -pedantic goes a long way to warn about non-standard features. If you want to turn them into errors rather than warnings you can use -pedantic-errors instead.

Note that GCC defaults to a "GNU dialect" of C++.
GCC 12 defaults to -std=gnu++17.
If you want to use standard C++17 you should instead use -std=c++17.

I already figured that out, plus other new things :)

I think Microsoft's compiler allows void main()
I checked, and you're right.

but GCC and Clang don't (at least not on Linux). I wonder what compiler allows double main().
I checked that too, but not with Clang.

The one about the union is out of my league.

The rest I understand.

I cannot thank you enough, @Peter. I always read your replies with joy.
Many compilers guarantee this to work but according to the C++ standard this is technically undefined behaviour.


If that ever stops working as expected in C++ (it came from C), there will be many, many programs that cease to work as expected. It won't happen... The C++ standard should guarantee that this keeps working the way it works now.
It would be nice if they'd just allow it. It'd be one less thing we'd have to be pedantically naggy about :D
It's not as straightforward as it looks because it interferes with the aliasing rules. I'm not even sure about my own example any more (since it's using an array and is "accessed" behind the scenes by the range-based for loop).

If you have a function like the following:
double foo(int& x, float& y)
{
	x = 2;
	y = 5.0;
	return x;
}
then the compiler can assume that the return value is always going to be 2, because an int is normally not allowed to alias a float. I.e. modifying y should not affect the value of x. The compiler should even be allowed to reorder the two assignments or execute them simultaneously.

But imagine if you called the function as:
union { int i; float f; } u; 
std::cout << foo(u.i, u.f);
Should this be UB? I think so, otherwise I don't see how the compiler will be able to take advantage of the aliasing rules while optimizing (except in very specific cases).

I don't know the exact rules but if I remember correctly compilers that allow this require that you access the union members directly from a union object so that the compiler can see that it is a union member. In C++ we have references so it's not always clear if something is accessed directly so I can imagine this feature being a bit error-prone, more so than in C.