Is it possible not to define the size of a 2D array at the beginning?

I want to import coordinates from a CSV file and store that data in a 2D array, or maybe a vector; I don't know which one is better suited to my task. The problem is that the number of rows of coordinates could vary, and in one of the C++ tutorials I watched, the guy said it's not possible to change the size of an array after definition. Now I also read somewhere that arrays would need fewer memory resources than a vector, is that true? Because I'm going to get a lot of coordinates and I also need to do some calculations with them.
Use vector. It does its own memory management and can be resized, but if you can take a reasonable guess at its size up front, that would be better.
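A minimal sketch of that, just to illustrate (the names and the guessed size are arbitrary):

#include <vector>

int main()
{
   std::vector<double> v;    // size is not fixed at definition
   v.reserve( 1000 );        // optional: a reasonable guess avoids repeated reallocations
   v.push_back( 3.14 );      // grows on demand
   v.resize( 10 );           // or resize explicitly
}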

it's not possible to change the size of an array after definition


That's right.


Now I also read somewhere that arrays would need fewer memory resources than a vector, is that true?


Not really. There is some overhead, but it is not worth worrying about.
Depends a bit on what "calculations" you need to do on them and whether you can "pre-read" the file to count the number of rows.

However, for most purposes, and lacking much information as to what you intend to do with them, if you NEED to store them then it would probably be best to store them in a vector of structs, the struct containing the x,y,z coordinates.

Note the "need". For some operations - e.g. summing columns - you don't need any more than the latest read row and a running sum.

So you need to be more explicit about what "some calculations" means.
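A minimal sketch of the vector-of-structs idea mentioned above, assuming a simple x,y,z line format (the Point name and the file name are made up for illustration):

#include <fstream>
#include <vector>

struct Point { double x, y, z; };

int main()
{
   std::vector<Point> pts;              // each row stored as one struct
   std::ifstream in( "coords.csv" );
   char comma;
   for ( Point p; in >> p.x >> comma >> p.y >> comma >> p.z; )
      pts.push_back( p );               // append one row at a time
}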
@lastchance

I think I'm able to count the number of rows. I've tried it with vectors by reading the file row by row and each time redefining the vector. But now that I have to do some calculations, I need to store them first.

As for the calculations: I need to find the MAX and MIN values and maybe calculate the average.
If you are only calculating max, min and average then you can keep rolling variables (max, min, sum and count) - you don't actually need to store the lot.

But if you've got other things to do then you can use a vector with push_back() to store values. That is appending to, not "redefining", a vector. That is the principal advantage of a vector.
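For the max/min/average case, a minimal sketch of the "rolling variables" approach, shown here for x only and assuming an x,y,z line format (the file name is illustrative):

#include <algorithm>
#include <fstream>
#include <iostream>

int main()
{
   std::ifstream in( "coords.csv" );
   double x, y, z, xmin = 0, xmax = 0, sum = 0;
   int count = 0;
   char comma;
   while ( in >> x >> comma >> y >> comma >> z )
   {
      if ( count == 0 ) xmin = xmax = x;      // first row sets the extremes
      xmin = std::min( xmin, x );
      xmax = std::max( xmax, x );
      sum += x;                               // the same idea applies to y and z
      ++count;
   }
   if ( count ) std::cout << xmin << ' ' << xmax << ' ' << sum / count << '\n';
}

Nothing is stored except a handful of variables, however many rows the file has.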
Now I also read somewhere that arrays would need fewer memory resources than a vector, is that true?


For the actual storage of the data, the same. The issue is that if adding data to a vector would exceed its capacity, the vector will automatically reallocate its data. During this reallocation additional memory is required (possibly as much as 2-3 times the current size, depending upon the reallocation formula used). If this memory is not available, then the vector operation will fail. If dealing with a vector with a large number of elements, it's advisable to initially reserve the number of elements that are going to be used.

A vector has both a capacity and a size. Capacity is how many elements can be added without reallocation; size is the number of elements actually stored. When the size would exceed the capacity, reallocation occurs. Note that memory is allocated for the capacity of the vector: if you give a vector a capacity of 10 ints but only store 2, memory is still allocated for 10 ints. This may be one reason why it's said that a vector may need more memory than an array.
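A small sketch of the size/capacity distinction and of reserving up front (the counts are arbitrary):

#include <iostream>
#include <vector>

int main()
{
   std::vector<int> v;
   v.reserve( 10 );                     // capacity is now at least 10, size is still 0
   v.push_back( 1 );
   v.push_back( 2 );
   std::cout << "size: " << v.size()                      // 2 elements stored
             << ", capacity: " << v.capacity() << '\n';   // memory for at least 10
}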
@OP
std::vector is the better way to go simply because the standard containers are designed to be about as efficient as C-style arrays while being far less prone to errors - i.e. they are designed to be user friendly; otherwise, in the extreme, we'd still be programming with 0s and 1s or, even worse, voltages. Besides, they have a wide range of added functionality, as mentioned above.

However, there is nothing much to your immediate problem of handling a pile of coordinates. So, the answer to your question is yes, you can.

An easy and probably quick solution using a dynamic array (new/delete etc.) is to first count the number of rows (or whatever the format dictates) in the file, then prepare the array, then 'rewind' the file back to the start and read the coordinate values in.

You would need the proverbial shed-load of data on the file to make a significant dent in your time.
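A rough sketch of that two-pass idea with a manually allocated array (the file name and the x,y,z format are assumptions, and it assumes no blank lines; a vector is still the simpler option):

#include <cstddef>
#include <fstream>
#include <string>

int main()
{
   std::ifstream in( "coords.csv" );

   // Pass 1: count the rows.
   std::size_t rows = 0;
   for ( std::string line; std::getline( in, line ); ) ++rows;

   // Prepare the array, then 'rewind' the file back to the start.
   double (*coords)[3] = new double[rows][3];
   in.clear();                          // clear the end-of-file state
   in.seekg( 0 );                       // back to the beginning

   // Pass 2: read the coordinate values in.
   char comma;
   for ( std::size_t r = 0; r < rows; ++r )
      in >> coords[r][0] >> comma >> coords[r][1] >> comma >> coords[r][2];

   // ... calculations ...

   delete [] coords;
}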
The costs of manual memory management for a container outweigh, IMO, the overhead of using a C++ container like std::vector. Expanding or contracting a manually allocated container on demand can be as memory- and performance-intensive as what the C++ standard library does "behind the curtain", assuming the manual allocations/deallocations are correct.

Increasing the dimensions with a regular container (2D array vs. 1D) ramps up the issues with manual memory management, making it easier to accidentally make a mistake.

The ease of letting the C++ container do all the housekeeping for you far outweighs the work needed to manage memory manually for a regular non-C++ container.

If the data you are wanting to stuff into your 2D container is "ragged", using a 2D vector is almost a no-brainer.

"ragged" data is when a block of 2D data doesn't fill the column fully, or has extended data for one or more rows exceeding the usual column lengths. When displayed in a 2D layout the "holes" or extra data looks ragged.

1 2 3
4 5
6 7 8
9 0 1 2
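That ragged layout maps directly onto a vector of vectors, for example:

#include <vector>

int main()
{
   // Each inner vector can have its own length.
   std::vector<std::vector<int>> ragged = { { 1, 2, 3 },
                                            { 4, 5 },
                                            { 6, 7, 8 },
                                            { 9, 0, 1, 2 } };
   ragged[1].push_back( 99 );           // rows can even grow independently
}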
A valarray may be more appropriate.
http://www.cplusplus.com/reference/valarray/valarray/

Unless there are reasons why not, if dynamic memory is needed (i.e. the dimensions/size are not known at compile time) then use a standard container (or another 3rd-party container).
Can std::valarray be used to construct a 2D container?

I guess it can, when thrown into a specialized class used to represent a matrix.

https://stackoverflow.com/questions/2187648/how-can-i-use-a-stdvalarray-to-store-manipulate-a-contiguous-2d-array

That approach seems to be more work than simply using a 2D std::vector, but all the std::valarray abilities to slice and dice do make using one at least a viable option if the slicing is needed.

I'll be honest, I have never used a std::valarray before, never really did any testing on what it offers over a C++ container.
Can std::valarray be used to construct a 2D container?

One of its methods - actually more than one - 'slices' the data into '2D' for you. This may or may not be sufficient for what you need. It certainly is fine for basic 2D operations.
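A minimal sketch of slicing a flat std::valarray as if it were 2D (the row/column counts are arbitrary):

#include <cstddef>
#include <iostream>
#include <valarray>

int main()
{
   const std::size_t rows = 3, cols = 4;
   std::valarray<double> a( rows * cols );               // one flat block of memory

   // std::slice( start, size, stride ): row 1 is elements 4,5,6,7.
   std::valarray<double> row1 = a[ std::slice( 1 * cols, cols, 1 ) ];

   // Column 2 is elements 2, 6, 10 (stride of one whole row).
   std::valarray<double> col2 = a[ std::slice( 2, rows, cols ) ];

   std::cout << row1.size() << ' ' << col2.size() << '\n';   // 4 3
}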
You don't always want 2D containers like vector-of-vectors or pointer-to-pointers, because they can fragment the memory and cause page faults - or, if you care, you at least want to keep a weather eye on that. 2D arrays are a solid block, and 1D arrays and pointers can be force-cast back to 2D as a solid block. This works fine if the sizes never change (as in most matrix math: once a matrix is sized it rarely grows, the exceptions being things like appending a vector for RREF, or appending the identity and row-reducing to get an inverse). There are a LOT of ways to either use 1D as if it were 2D or to make solid 2D blocks - or, if you don't care, anything that solves your problem goes. I prefer addressing 1D as if it were 2D (see the sketch below), but that has its own aggravations.
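A sketch of the "1D addressed as 2D" idea (the dimensions are arbitrary):

#include <cstddef>
#include <vector>

int main()
{
   const std::size_t rows = 100, cols = 3;
   std::vector<double> flat( rows * cols );     // one contiguous block

   // Element (r, c) lives at index r * cols + c.
   std::size_t r = 7, c = 2;
   flat[ r * cols + c ] = 42.0;
}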

valarray is powerful but a little weird. It takes some getting used to.

If you have enough data to call it "a lot", we can bounce ideas around with you. Note that "a lot" today is really quite a lot. I would not even begin to sweat it until you have multiple GB worth of data loaded into memory. My home PC has 64 GB of RAM, most have 16+ GB, so using even half of that is a good bit of data. I have actually never used more than about 10 GB at once, and THAT was some XML spew from a large database, bloated by the XML and covering several million records.
#include <iostream>
#include <fstream>
#include <string>
#include <algorithm>
using namespace std;


struct Stats
{
   double xmin, xmax, ymin, ymax, zmin, zmax;
   double sumx, sumy, sumz;
   int num = 0;

   Stats( double x, double y, double z );
   Stats( const string &csvfile );
};


Stats::Stats( double x, double y, double z )
{
   xmin = xmax = sumx = x;
   ymin = ymax = sumy = y;
   zmin = zmax = sumz = z;
   num = 1;
}


Stats::Stats( const string &csvfile )
{
   ifstream in( csvfile );
   bool first = true;
   char comma;
   for ( double x, y, z; in >> x >> comma >> y >> comma >> z; )
   {
      if ( first )
      {
         *this = Stats( x, y, z );
      }
      else
      {
         xmin = min( x, xmin );
         xmax = max( x, xmax );
         ymin = min( y, ymin );
         ymax = max( y, ymax );
         zmin = min( z, zmin );
         zmax = max( z, zmax );
         sumx += x;
         sumy += y;
         sumz += z;
         num++;
      }
      first = false;
   }
}


int main()
{
   Stats S( "test.csv" );
   if ( S.num )
   {
      cout << "Coordinates read: " << S.num << '\n';
      cout << "Bounds:\n"
           << "x: " << S.xmin << " - " << S.xmax << '\n'
           << "y: " << S.ymin << " - " << S.ymax << '\n'
           << "z: " << S.zmin << " - " << S.zmax << '\n';
      cout << "Averages:\n"
           << "x: " << S.sumx / S.num << '\n'
           << "y: " << S.sumy / S.num << '\n'
           << "z: " << S.sumz / S.num << '\n';
   }
   else
   {
      cerr << "No meaningful data read\n";
   }
}


test.csv
1,2,3
10,20,30
5,6,7


Output:
Coordinates read: 3
Bounds:
x: 1 - 10
y: 2 - 20
z: 3 - 30
Averages:
x: 5.33333
y: 9.33333
z: 13.3333



Topic archived. No new replies allowed.