How to ensure log folder is created before any log calls are made?

The actual code I'm working with is complicated (and something I can't post) so I'm going to use a proxy example.

I have a Visual Studio Solution that contains a handful of projects (GameCore, GameInstaller, GameService):

AsteroidsSolution
|_GameCore
  |_Header
    |_File1.h
    |_...
  |_Source
    |_File1.cpp
    |_...
|_GameInstaller
  |_Header
    |_File11.h
    |_...
  |_Source
    |_File11.cpp
    |_...
|_GameService
  |_Header
    |_Logger.h
    |_File21.h
    |_...
  |_Source
    |_Logger.cpp
    |_File21.cpp
    |_...


GameCore and GameInstaller both reference (#include) GameService's Header/Logger.h and Source/Logger.cpp. Logger.h defines a basic logging class that logs to files.

GameCore is an application. GameService is a Windows System Service that runs at startup, and GameInstaller is a .dll that is invoked by Windows LogonUI at the logon screen. All of this is to say that the order of log calls from any particular .cpp file, in any particular project, is not guaranteed.

It's possible that any .cpp file, from any project, can create its own logfile. But assume the complete list of logfiles and where they're written to is known e.g.,

File1.cpp uses the logfile C:\Asteroid\GameCoreLogs\File1.log
File11.cpp uses C:\Asteroid\GameInstallerLogs\File11.log
File21.cpp uses C:\Asteroid\GameServiceLogs\File21.log
(and assume no other log files are created anywhere else)

The Problem

Since we don't know which project's code will run first, we can't put the code that creates all of the necessary log folders (GameCoreLogs, GameInstallerLogs, GameServiceLogs) in any particular project source file. For example, if we put the folder-creation logic in GameInstaller, but the code for GameService happens to run first, log function calls made in GameService will generate errors because the GameServiceLogs folder has not been created yet.

My Thoughts

We know that all 3 projects depend on Logger.h and Logger.cpp, so perhaps we can use the fact that static variables are initialized before any class objects are instantiated [1] to force the creation of those 3 log folders. Something like this:

[Inside Logger.h]

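class Logger
{
public:
   // Writes a log message out to the file logfilePath
   static void Log(string msg, string functionName, string lineNumber, string logfilePath);

private:
   static bool CreateAllLogFolders();  // Returns true if successful

   // Needs C++17 (inline static) so the initializer can live in the class body;
   // declared after CreateAllLogFolders() so the name is visible here.
   inline static bool logFoldersCreated = CreateAllLogFolders();
};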

[Inside Logger.cpp]

bool Logger::CreateAllLogFolders() 
{
   // Checks if each folder already exists. If so, return true. 
   // Creates all log folders, returns true if successful.
}


Even if each project "shares" the Logger.cpp instruction code, each might have its own copy of "logFoldersCreated" in its own virtual address space (i.e., the log folder creation status is not shared among project code). But because "logFoldersCreated" is static, it's guaranteed that whichever project runs first will have to check for the creation of all log folders before any of its .cpp files calls Logger::Log. In the other projects that run subsequently, the call to CreateAllLogFolders() will find the folders already present and return true.
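A minimal sketch of the "initialize once per process before the first log call" idea, using a function-local static instead of a static class member so that the initialization order is tied to the first call. The names LogRoot and LogFoldersReady are hypothetical, and the root is a temporary directory rather than C:\Asteroid so the sketch can run anywhere. It assumes C++17 for std::filesystem.

```cpp
#include <filesystem>
#include <initializer_list>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical root folder. The thread uses C:\Asteroid; a temp directory
// is used here only so the sketch does not depend on that path.
inline fs::path LogRoot() {
    return fs::temp_directory_path() / "AsteroidSketch";
}

// Creates the three log folders named in the question.
// Returns true if none of the create calls reported an error.
inline bool CreateAllLogFolders() {
    bool ok = true;
    for (const char* name : {"GameCoreLogs", "GameInstallerLogs", "GameServiceLogs"}) {
        std::error_code ec;
        fs::create_directories(LogRoot() / name, ec);  // an existing folder is not an error
        ok = ok && !ec;
    }
    return ok;
}

// Function-local static: initialized on the first call, once per process,
// and (since C++11) safely even if several threads make that first call.
inline bool LogFoldersReady() {
    static const bool ready = CreateAllLogFolders();
    return ready;
}
```

A static Log function could call LogFoldersReady() as its first statement. Each process still performs its own check, which matches the point above that the flag is not shared between processes.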

One thing that troubles me is whether it's possible for multiple processes to have the same file handle (to C:\Asteroid) in which to create the log folders. I'm assuming the OS will do the job of giving the handle to the first process (project code) but blocking subsequent processes from access to that handle until the first one has released it.

[1] SO: When are static C++ class members initialized?
https://stackoverflow.com/a/1421780
Why can't File1/File11/File21/... create their own directory/file?

Maybe they create their own logger instance and everything is done at that moment.
If you're writing to one file from multiple places you need to serialize the file writing using locks etc. The same applies to creating/opening the file. All places use the same serialised code so that the first to run the code creates etc and then the others just open. It then doesn't matter which place first tries to access the file.

Re static initialisation, see:
https://www.cppstories.com/2018/02/static-vars-static-lib/
https://www.cppstories.com/2018/02/staticvars/
If you have a number of separate processes, then you are going to need something like a named Mutex or a named Semaphore to synchronize them, so that only the first process does the initialization work. Not only must the other processes not repeat the initialization work, they also need to wait until the first process is completely done with it!

On Windows:
https://learn.microsoft.com/en-us/windows/win32/api/synchapi/nf-synchapi-createmutexw
https://learn.microsoft.com/en-us/windows/win32/api/synchapi/nf-synchapi-createsemaphorew

On Linux/Unix:
https://man7.org/linux/man-pages/man3/sem_open.3.html

___

Probably you want to put the initialization into a library (DLL) that is used by all processes (applications).

Then, in the library code, you'd do something like:
int do_initialization(void) {
    int result = 0;

    HANDLE mutex = CreateMutexW(NULL, TRUE, L"put-some-unique-id-here");
    if (!mutex) {
        return -1; /* fatal error */
    }

    if (GetLastError() == ERROR_ALREADY_EXISTS) {
        /* case #1: we are *not* the first process, so wait until the other process has completed initialization */
        if (WaitForSingleObject(mutex, INFINITE) != WAIT_OBJECT_0) {
            CloseHandle(mutex);
            return -1; /* fatal error */
        }
    } else {
        /* case #2: we *are* the first process (and initially own the mutex), so do the actual initialization here */
        result = _do_actual_initialization();
    }

    ReleaseMutex(mutex);
    CloseHandle(mutex);

    return result;
}

CreateMutexW function

Note: If the mutex is a named mutex and the object existed before this function call, the return value is a handle to the existing object, and the GetLastError function returns ERROR_ALREADY_EXISTS.


Each process calls do_initialization() on startup, but only the first process will actually do the initialization.

____

If you do not have separate processes, but it's just multiple threads inside of one process, then have a look at:
https://en.cppreference.com/w/cpp/thread/call_once
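
For the single-process case, a minimal sketch of std::call_once might look like the following. DoActualInitialization and EnsureInitialized are hypothetical names, and the counter exists only to make the "runs once" behaviour observable.

```cpp
#include <atomic>
#include <mutex>

std::once_flag g_initFlag;        // one flag per piece of one-time work
std::atomic<int> g_initRuns{0};   // counts how often the initializer actually ran

// Hypothetical stand-in for the real initialization (e.g. creating log folders).
void DoActualInitialization() {
    ++g_initRuns;
}

// Safe to call from any thread, any number of times:
// only the first call runs DoActualInitialization, later calls return
// after that first run has completed.
void EnsureInitialized() {
    std::call_once(g_initFlag, DoActualInitialization);
}
```

This only coordinates threads inside one process; it does nothing for separate processes.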
coder777 wrote:

Why can't File1/File11/File21/... create their own directory/file?
Maybe they create their own logger instance and everything is done at that moment.


Apologies - I should have mentioned that the call to log a message is static. No instances are created. It'd be a considerable (but not impossible) effort to switch to a more capable logging framework - it's just outside of the scope of the bug I'm assigned to fix.

seeplus wrote:

If you're writing to one file from multiple places you need to serialize the file writing using locks etc. The same applies to creating/opening the file. All places use the same serialised code so that the first to run the code creates etc and then the others just open.


Logger.h does serialize writes to the logfile. There is a utility function someone else had written that creates a directory but it doesn't appear to be protected from simultaneous access by multiple processes. I'm not sure if protection is necessary if the OS offers some protection (i.e., it'll cause the simultaneous access to throw an exception). Exception handling might be important.
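
One way to sidestep the "who creates the directory first" question is to make the creation call tolerate an existing directory and then check the end state. A minimal sketch, assuming C++17 std::filesystem is available; EnsureDirectory is a hypothetical helper, not the existing utility function mentioned above.

```cpp
#include <filesystem>
#include <system_error>

// Hypothetical helper: returns true if `dir` exists as a directory when the
// call returns, no matter whether this call or someone else created it.
bool EnsureDirectory(const std::filesystem::path& dir) {
    std::error_code ec;
    // Non-throwing overload: a directory that already exists is not reported
    // as an error, so losing a creation race is harmless.
    std::filesystem::create_directories(dir, ec);

    // Judge success by the end state rather than by who created the folder.
    std::error_code statEc;
    return std::filesystem::is_directory(dir, statEc);
}
```

Because the function can be called repeatedly with the same result, it does not need a cross-process lock just to create folders.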

@kigar
Thanks for the suggestion. Will have to play with this. Just so I understand - a named mutex is identified by a string and is visible to all processes.
Logger.h does serialize writes to the logfile. There is a utility function someone else had written that creates a directory but it doesn't appear to be protected from simultaneous access by multiple processes. I'm not sure if protection is necessary if the OS offers some protection (i.e., it'll cause the simultaneous access to throw an exception). Exception handling might be important.

If you actually have multiple processes, then any kind of intra-process synchronization primitives, such as std::mutex or CRITICAL_SECTION will not help at all! You need a named Mutex or named Semaphore to synchronize between separate processes!

It is, of course, possible to write to the same file from different concurrent threads/processes, but the small chunks of data that are written by the concurrent threads/processes will be "interleaved" in an arbitrary way, likely producing "garbage" in the file 🙁

In order to be sure that only one process/thread at a time writes to the file, additional synchronization is needed. Again, within a single process you can use std::mutex or CRITICAL_SECTION, but for separate processes you need named Mutex or named Semaphore.
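
As an illustration of the single-process half of that statement, here is a minimal sketch that serializes appends with std::mutex. AppendLine and g_logMutex are hypothetical names; this is not the code from Logger.h, and it gives no protection against writers in other processes.

```cpp
#include <fstream>
#include <mutex>
#include <string>

std::mutex g_logMutex;  // guards all appends made from this process

// Appends one line to the log file while holding the mutex, so lines written
// by different threads of the same process cannot interleave.
void AppendLine(const std::string& logfilePath, const std::string& msg) {
    std::lock_guard<std::mutex> lock(g_logMutex);
    std::ofstream out(logfilePath, std::ios::app);  // open in append mode
    out << msg << '\n';
}   // the stream is closed (and flushed) before the lock is released
```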

____

File locking may be another option, i.e. lock the file before you write to it, so that no other thread/process can write to it concurrently:

https://learn.microsoft.com/en-us/windows/win32/api/fileapi/nf-fileapi-lockfile
https://learn.microsoft.com/en-us/windows/win32/fileio/appending-one-file-to-another-file

Note: LockFile() can lock beyond the current end of the file, so you can simply lock the area that you are going to write to!

Example:
if (!LockFile(h, 0U, 0U, MAXDWORD, MAXDWORD)) {
    /* error handling */
}

____

Just so I understand - a named mutex is identified by a string and is visible to all processes.

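A named Mutex or named Semaphore is identified by a string of your choice. All processes that want to synchronize on the same Mutex or Semaphore must specify the same string (name). Because the name is visible to unrelated applications too, you should pick one that is unique, so that your application will not clash with another application! Also note that, by default, the name is scoped to the current logon session; if a service and a desktop application need to share the object, prefix the name with Global\. A randomly generated GUID is a good choice for the rest of the name: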

https://www.guidgenerator.com/