hello everyone,
I encountered a problem. When I was performing string matching, I used the strncmp function. I found through the flashgraphs that it consumed a lot,consumption accounts for 10% of the total program. How should I optimize it.
(The length of each string shall not exceed 10)
It's difficult to say without knowing what problem you're trying to solve. If you need to actually compare strings then there may not be any faster method. Any optimization would involve avoiding performing comparisons, but they can only work if your workload follows certain patters.
For example, you could hash your strings into single unique integers. But the hashing itself takes time, so it's only worthwhile if you're going to compare the same strings again and again. There are special data structures designed to hold collections of strings and efficiently perform certain operations, but again, building them takes time.
Is it fast enough to do what you want, in the time you're willing to wait for it to do the work?
If the answer is yes, then relax and work on the next interesting feature/bug of your program.
If the answer is no, like "man, it takes an hour and I want it less than a minute tops", then 10% saving just isn't going to cut it. Your 10% in strcmp is the tip of the iceberg. The real problem is much higher up in your program with your poor choice of algorithm and/or data structure.
Every program will spend it's time 10% somewhere, because ultimately it's 100% everywhere.
For the sake of argument, if you achieve an O(1) string compare over the O(N) you have at the moment, the 10% will just move to somewhere else in the program. Are you back asking the same question. How deep is this rabbit hole?
To reduce the amount of compares you might want organize your data differently. For instance as a binary tree. It depends on what you are doing with the data. So sorting and binary search might also be an option.
A single invocation of strncmp() should be very fast, especially if the string length is as small as 10 characters. So, if your program spends a significant amount of time in strncmp() function, then you probably have a huge number of string comparisons. Either that, or your program simply doesn't do much besides strncmp() ;-)
The best you can do is to avoid string comparisons! For example, if you need to find a specific string, use an std::unordered_set (hash map) instead of scanning through a list if strings element-by-element.
General rule: If you want to optimize the performance of your program, then first try to find the "hot" path, i.e. the code path where your program spends most of its CPU time. That is where optimizations (even small ones) can give a lot of speed-up for the overall program. In "cold" code, i.e. code that is hardly ever executed, even a huge speed-up will be insignificant for the overall program. Furthermore, using a better "algorithm" to solve a problem can give you way more speed-up than trying to micro-optimize a bad (inefficient) algorithm...
(Worrying about a function that consumes "only" 10% of the CPU time may not be worth it)
comparison of strings is one of the most expensive things we do regularly.
recall that strings are really blocks of characters, and that a comparison is iterating character by character until it finds a difference. If the strings are long and very similar, and you have a lot of them, it will grow costly.
one thing you can do is numerical comparison. If you pad strings with null characters or spaces so that the size is a multiple of 8, you can cast a pointer to their guts of a 64 bit integer and compare those, checking 8 letters at a time instead of 1 by one. I think some of the built in tools already do this, though. That runs, naturally, 8 times faster + the cost of handling the padding.
There are other things you can do as well depending on exactly what you are trying to do, as noted above the best solution to the problem is to avoid as much of it as possible: don't search some sort of container for a value as the big one.
given your 10 char limit AND that you use strNcmp, if you compare 8 or less the int trick seems the way to go. something like
uint64_t sp = * ((uint64_t*)&stringdata); //sp can be reused for various strings
sp >> somevalue; //shift the data to the number of bytes being used. shift is very fast.
if(sp == otherstringint)...