bool hashtable::getFileContent(string fileName)
{
    ifstream in(fileName.c_str());
    // Check if the stream is valid
    if (!in)
    {
        cerr << "Cannot open the file: " << fileName << endl;
        return false;
    }
    string str;
    // Read the next line from the file until it reaches the end.
    while (getline(in, str))
    {
        // If the line is non-empty, save it in the table
        if (str.size() > 0)
            this->fileInsert(str);
    }
    // Close the file
    in.close();
    return true;
}
bool hashtable::fileInsert(string str)
{
    int key1 = 0;
    string key, value;
    stringstream tmpstream(str);
    if (!getline(tmpstream, key, ','))
    {
        cerr << "no delimiter key" << endl;
        return false;
    }
    key1 = atoi(key.c_str()); // string to int
    if (!getline(tmpstream, value))
    {
        cerr << "no value" << endl;
        return false;
    }
    hashtable::insert(key1, value);
    return true;
}
In fact, if I insert 500 lines the CPU rises to 100%.
If you are on Linux, I would consider low-level parsing (along with memory-mapped files; mapped files are not faster per se, they just replace a stream with raw data). On Windows it isn't worth it because of \r\n (carriage return + line feed) line endings. So I recommend you keep your code as it is, because your code is sort of safe, which is important (only "sort of" safe because you don't actually handle the error, but at least you print it).
If you could change how the data is stored, there is room for improvement; you could do something really low level with binary data (though that is awkward because of the strings). But if you plan on modifying the file with a text editor, I recommend using INI or JSON files instead, not for speed (they could be faster than your code, who knows), but because they are convenient to modify, and plenty of other code can benefit from a format like JSON, for example a configuration file.
One of the first things I see that could use improvement is that you are passing std::strings by value instead of by reference/const reference.
Next, you have several unnecessary comparisons and conversions. There is no need to convert the std::string into a C-string; just ensure you're compiling as "Modern" C++ (C++14 or higher).
Instead of retrieving your number into a string, retrieve it into a variable of the proper type, and avoid the atoi() function whenever possible. That C function is frowned upon even in modern C because it can silently fail.
bool hashtable::getFileContent(const std::string& fileName)
{
    ifstream in(fileName);
    if (!in)
    {
        cerr << "Cannot open the file: " << fileName << endl;
        return false;
    }
    string str;
    // Read the next line from the file until it reaches the end.
    while (getline(in, str))
    {
        fileInsert(str);
    }
    // in.close(); Not needed: the destructor closes the file.
    return true;
}
bool hashtable::fileInsert(const std::string& str)
{
    int key;
    char delimiter;
    string value;
    stringstream tmpstream(str);
    tmpstream >> key >> delimiter;
    getline(tmpstream, value);
    // If either conversion failed then the stream is in a fail state.
    if (!tmpstream)
    {
        cerr << "no value" << endl;
        return false;
    }
    hashtable::insert(key, value);
    return true;
}
atoi and >> into an int are both sluggish; they invoke a complicated parser that has to handle a bunch of formats and do a lot of work. A custom high-speed version that works for YOUR format can be significantly faster than either. I got a major lift in one of my programs doing this, and it's trivial to code.
Thousands of lines should not even be notable; how long does this thing take for your big file, and when it does that, where is it spending the time?
Is 100% CPU use actually a problem? An idle core/CPU is doing nothing at all; it's waiting on the user to type something, the disk to spin up, the network to connect, or the like.
If processing 500 lines takes more than a couple of milliseconds, something is wrong somewhere. But it's not in the code we can see, or at least I don't think so... that hash insert may be worth a look. That is easy to test: take the insert line out and see how fast it runs then.