As you may know, code is shared among JEDI projects, and so is part of the code that I am going to describe in this post. A few years ago I got involved in the JCL project and contributed code that I had written quite a while before. One of the things I always found puzzling was that you were supposed to create hardlinks via the backup APIs (requiring the respective privileges) on NT4. I never understood why anyone would want to do it that way. CreateHardlink() was only introduced in Windows 2000, and at the time I hadn’t yet discovered something that I am going to mention at the end of this post.

Now, CreateHardlink() on Windows 2000 and later doesn’t have any special requirements regarding privileges, so there should be a way to mimic the behavior on Windows NT 4.0. And it turns out there is. In fact, it seems that the same implementation could have been used if Microsoft had bothered to add this function to NT4. As we all know, they didn’t, and so we have to provide a little workaround; but this workaround simply mimics what can be seen in a debugger when looking at CreateHardlink() in Windows 2000 and later. In fact, it resembles the behavior so closely that even the error codes should be the same if the function fails for the same reason.

Now, as I mentioned, this was contributed to the JCL, not to this project, but I was always tempted to include it in one of the units here. This twist in the story is the reason why you will find the declarations of the functions and structures (records) it uses within the Hardlinks.pas unit itself, instead of Hardlinks.pas using JwaWindows.pas (specifically its JwaNative.pas part). Sorry about that, but perhaps someone will get around to changing that.

Anyway, I don’t want to bore you with the implementation details of the Delphi implementation for CreateHardlink(), since the code is well commented. Instead I invite you to a little journey into the details behind NTFS hardlinks.

In DOS times, people were always afraid of so-called “cross-linked” files: multiple files, possibly in different directories, pointing to the same entry in the FAT (“file allocation table”). And rightly so, since on FAT file systems this is a corruption. Even viruses such as the aptly named “Creeping Death” ([1], [2]) used the idea of cross-linked files. In this case the virus would infect any executable and make it dependent on the virus being resident in memory. How? Well, the virus redirected all requests for those files into its own implementation of a file allocation table, so all those executables would point to the same FAT entry if you booted the system from a clean boot disk.

While “Creeping Death” was still alive, Microsoft was already working on NTFS, which was roughly modelled after the HPFS (High Performance File System) included in OS/2. In NTFS those cross-linked files are no longer a corruption but a feature. They are only allowed for files, not for directories (I leave the reasoning behind that as an exercise to my readers ;) ) and they cannot span partitions. On NTFS the table responsible for holding all those file records is called the MFT (Master File Table). A directory entry is merely a pointer to an MFT entry containing the details about the data streams. So to make a long story short, hardlinks are similar to cross-linked files in many respects, but they are a supported feature of NTFS.

Now what about the title of the post? Some readers will have an educated guess by now! If the directory entry is merely a pointer to a file, there has to be a reference count, i.e. a count of how many directory entries point to a particular data record. And there is. DeleteFile() and the underlying native function unlink a directory entry from the data, and once the reference count drops to zero, the MFT record is marked as free. And guess what, CreateHardlink() merely bumps the reference count with every new link you create to a particular data record. Anyone interested in an efficient move operation would obviously use that to their advantage, and so did the architects of NTFS. If you move a file within the same partition, a new link to the data record is created in the target location and the old one is removed afterwards.
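
If you want to watch that reference count yourself, the documented Win32 call GetFileInformationByHandle() reports it in the nNumberOfLinks field. The following is only a minimal sketch: the program name ShowLinkCount and the helper GetLinkCount are made up for this illustration, and I am assuming that the Windows unit of your Delphi version declares FILE_SHARE_DELETE.

```pascal
program ShowLinkCount;

{$APPTYPE CONSOLE}

uses
  Windows, SysUtils;

// Hypothetical helper: returns the number of directory entries (hardlinks)
// that refer to the file's data, or 0 if the file cannot be opened or queried.
function GetLinkCount(const FileName: string): DWORD;
var
  FileHandle: THandle;
  Info: TByHandleFileInformation;
begin
  Result := 0;
  // A desired access of 0 is enough for querying metadata and keeps the
  // open from conflicting with other users of the file.
  FileHandle := CreateFile(PChar(FileName), 0,
    FILE_SHARE_READ or FILE_SHARE_WRITE or FILE_SHARE_DELETE, nil,
    OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, 0);
  if FileHandle = INVALID_HANDLE_VALUE then
    Exit;
  try
    if GetFileInformationByHandle(FileHandle, Info) then
      Result := Info.nNumberOfLinks;
  finally
    CloseHandle(FileHandle);
  end;
end;

begin
  if ParamCount <> 1 then
  begin
    Writeln('Usage: ShowLinkCount <file>');
    Halt(2);
  end;
  Writeln('Link count: ', GetLinkCount(ParamStr(1)));
end.
```

Run it against a file before and after creating an additional hardlink to it; on NTFS the reported count should go up by one for each link and back down for each name you delete.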

Now it’s time for you to open JwaNative.pas from the Win32API folder of the JWA distribution in your favorite editor. Search for FileRenameInformation and duly note the FileLinkInformation right below. Now continue by searching for FILE_LINK_INFORMATION. Surprise surprise – the two structures corresponding to the operation for renaming (aka moving) and hardlinking a file happen to be the same.

_FILE_LINK_RENAME_INFORMATION = record // Info Classes 10 and 11
  ReplaceIfExists: ByteBool;
  RootDirectory: HANDLE;
  FileNameLength: ULONG;
  FileName: array[0..0] of WCHAR;
end;
FILE_LINK_INFORMATION = _FILE_LINK_RENAME_INFORMATION;
PFILE_LINK_INFORMATION = ^FILE_LINK_INFORMATION;
FILE_RENAME_INFORMATION = _FILE_LINK_RENAME_INFORMATION;
PFILE_RENAME_INFORMATION = ^FILE_RENAME_INFORMATION;
TFileLinkInformation = FILE_LINK_INFORMATION;
PFileLinkInformation = ^TFileLinkInformation;

This is no coincidence. In many cases these two operations are largely the same, and the underlying native function to set this information is the same anyway (NtSetInformationFile). However, the similarities go beyond the equivalence of the structures. It turns out there was a function that was always capable of creating hardlinks, even without CreateHardlink() being available on NT4. The name of the function? It’s MoveFileEx(). That’s right. The extended version of the function to move files is capable of creating hardlinks too, even on NT4. There was no need to jump through all the hoops I went through in order to create the Delphi implementation of CreateHardlink(). So was it all in vain? Of course not. I learned a lot and so will you, if you read the source code for it (btw: a C version of the implementation is included too).
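
For the curious, here is roughly what filling that shared, variable-length structure looks like before it would be handed to NtSetInformationFile. This is a sketch under assumptions, not the code from Hardlinks.pas: the type and function names (TFileLinkRenameInformation, AllocLinkInfo) and the example path are invented for illustration, and the record is redeclared locally so that the snippet does not depend on JwaNative.pas. The actual native call is left out on purpose.

```pascal
program BuildLinkInfo;

{$APPTYPE CONSOLE}

uses
  Windows; // only for THandle

type
  // Local stand-in for the record used by info classes 10 and 11.
  TFileLinkRenameInformation = record
    ReplaceIfExists: ByteBool;
    RootDirectory: THandle;
    FileNameLength: Cardinal;          // length in BYTES, not characters
    FileName: array[0..0] of WideChar; // start of the variable-length name
  end;
  PFileLinkRenameInformation = ^TFileLinkRenameInformation;

// Hypothetical helper: allocates and fills the buffer for a given target
// name. The caller releases it with FreeMem.
function AllocLinkInfo(const NtTargetPath: WideString; Replace: Boolean;
  out BufferSize: Cardinal): PFileLinkRenameInformation;
var
  NameBytes: Cardinal;
begin
  NameBytes := Length(NtTargetPath) * SizeOf(WideChar);
  // The record already contains one WideChar, so this slightly
  // over-allocates, which is harmless.
  BufferSize := SizeOf(TFileLinkRenameInformation) + NameBytes;
  GetMem(Result, BufferSize);
  FillChar(Result^, BufferSize, 0);
  Result^.ReplaceIfExists := Replace;
  Result^.RootDirectory := 0; // 0: FileName is treated as a full path
  Result^.FileNameLength := NameBytes;
  if NameBytes > 0 then
    Move(NtTargetPath[1], Result^.FileName, NameBytes);
end;

var
  Info: PFileLinkRenameInformation;
  Size: Cardinal;
begin
  // Example path only; the native API expects NT-style names.
  Info := AllocLinkInfo('\??\C:\Temp\link.txt', False, Size);
  try
    Writeln('Buffer size: ', Size, ' bytes, name length: ',
      Info^.FileNameLength, ' bytes');
  finally
    FreeMem(Info);
  end;
end.
```

The same buffer layout serves both purposes; whether it ends up renaming the file or adding a hardlink depends only on which of the two information classes is passed along with it.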

In conclusion, one could probably say that Microsoft forgot about the implementation of hardlink creation in MoveFileEx(), since they still state as of this date that the flag MOVEFILE_CREATE_HARDLINK is “reserved for future use”. Don’t be fooled by that: the functionality has been available ever since NT4 and may have been included as early as Windows NT 3.51. Instead of using this (e.g. as the underlying implementation for CreateHardlink(), introduced with Windows 2000), they chose to create a separate implementation. I leave it to you to find out the subtle differences, but rest assured that they aren’t as big as one would think. If you simply need to create a hardlink and your application has to be compatible with NT4, use MoveFileEx(). The question as to “Why?” is probably better placed with Raymond Chen or other Microsoft folks.
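
To make the MoveFileEx() suggestion concrete, here is a minimal sketch. The wrapper name CreateHardlinkViaMove and the program name are invented for this example. The constant is declared locally with the value from the Windows SDK headers, in case the Windows unit of your Delphi version lacks it. Keep in mind that the flag is officially “reserved”, so treat this as relying on undocumented behavior that may differ between Windows versions.

```pascal
program HardlinkViaMoveFileEx;

{$APPTYPE CONSOLE}

uses
  Windows, SysUtils;

const
  // Value taken from the Windows SDK headers; declared here because older
  // Windows units may not define it.
  MOVEFILE_CREATE_HARDLINK = $00000010;

// Hypothetical wrapper: makes NewLink an additional name for ExistingFile.
// Both paths have to be on the same NTFS volume.
function CreateHardlinkViaMove(const ExistingFile, NewLink: WideString): Boolean;
begin
  Result := MoveFileExW(PWideChar(ExistingFile), PWideChar(NewLink),
    MOVEFILE_CREATE_HARDLINK);
end;

var
  LastErr: DWORD;
begin
  if ParamCount <> 2 then
  begin
    Writeln('Usage: HardlinkViaMoveFileEx <existing file> <new link>');
    Halt(2);
  end;
  if CreateHardlinkViaMove(ParamStr(1), ParamStr(2)) then
    Writeln('Hardlink created.')
  else
  begin
    // Capture the error code before any other call can overwrite it.
    LastErr := GetLastError;
    Writeln('MoveFileEx failed: ', SysErrorMessage(LastErr));
    Halt(1);
  end;
end.
```

If the call succeeds, the original name stays in place and the new name refers to the same data, which is exactly what distinguishes this from an ordinary move.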

I hope you enjoyed reading this post and will provide the feedback necessary to improve future posts.

// Oliver

PS: Note that I had to simplify the explanations, obviously. There are several books about the details of NTFS, but the projects concerning NTFS support on Linux are probably the best freely available source of information.