
Largefile Support Problems

The Unix98 standard requires largefile support, and many of the latest operating systems provide it. However, some systems still choose not to make it the default, which results in two compilation models: some parts of the system use the traditional 32-bit off_t, while others are compiled with a largefile 64-bit off_t. Mixing such libraries and plugins is not a good idea.

64on32

While systems like FreeBSD and Darwin simply use a 64-bit off_t as the default, there are also systems like Linux and Solaris that do not. Instead, they implement what the largefile specifications call the "transitional API": many calls are given a 64-bit cousin, so that the C library contains both "open" and "open64", as well as "lseek" and "lseek64".

Using a define like -D_FILE_OFFSET_BITS=64 will bring about some magic so that the traditional calls are remapped to the transitional API: your source code might read "open(...)" and "lseek(...)", but it will really be linked to the symbols "open64" and "lseek64", and off_t will become a 64-bit entity.

Headers

As a result, however, it is highly dangerous to use off_t in header files on largefile-sensitive systems. Most software writers do not expect that an integral type like off_t can change its size, so it gets used in exported interfaces, as in declaring a new call "off_t my_lseek(int, off_t, int)".

In reality, the situation is similar to the old DOS memory models, with a "small" mode and a "large" mode for compiling source code. The library code might be compiled with a 64-bit off_t while the application code using the library is compiled with a 32-bit off_t, possibly ending up with a call-frame mismatch.

A similar problem arises when using off_t in exported structures, as these can end up with different sizes and different offsets for their member variables. A library maker should take measures to defend against improper off_t sizes, possibly by providing dual-mode func/func64 interfaces, as the C library does. Unfortunately, many software writers have not been aware of the problem.

The seek problem

Another problem is described in the section of the largefile documents that deals with holes in the protection system. It stems from the fact that some file descriptors might be opened in largefile mode while others are not, and they can even be transferred from a non-largefile application into largefile libraries, and vice versa.

The 64on32 transitional API tries to support this scheme, mostly by introducing a new error code, EOVERFLOW, which is returned when a "small"file application accesses a file that has grown beyond the two-gigabyte limit due to calls from other software parts compiled as "large"file.

However, most "small"file software does not expect this error code, and many software writers do not check the return value of lseek. This can easily lead to data corruption when the file pointer is not actually moved.

Mixing it up

Most of the software problems arise on the side of "small"file applications. Generally, one should compile all software as largefile as soon as the system provides these interfaces. This is pretty easy: AC_SYS_LARGEFILE in autoconfed software can do it, or a -D_FILE_OFFSET_BITS=64 added to the compile flags somewhere.

A lot of software, however, is not aware of a need to enable largefile mode somewhere. Hundreds of Open Source applications are compiled with 32bit off_t by default. It's simply been forgotten, and it would take a lot of work and publicity to make everyone aware, with the only result that the next new developer would miss it again.

Because of this, we should use tools to track down the problem area: places where compiled code from sides that support largefile is mixed with code from sides that do not yet do so.

Checking mismatches

A Perl script to do this can be fetched from http://ac-archive.sf.net/largefile/. It tries to classify binaries and libraries as "-32-" or "-64-" mode by looking for fopen() vs. fopen64() in the list of dynamic symbols. Each binary given as an argument is checked, along with its dynamic dependencies. If there are mismatches, a list of them is printed.

Furthermore, the script can detect when a library presents itself as dual-mode, exporting both func() and func64() calls (libgstreamer is an example of a library that does this). For these, it is okay for software to be in either -32- or -64- mode when linking to them, so only three combinations are rejected: a -64- binary that depends on a -32- library, a -32- binary that depends on a -64- library, and a 3264 dual-mode library that depends on a simple -32- library.

The distro problem

When the script is run on /usr/lib/*.so (or just /usr/bin) on a contemporary Linux system, it detects a lot of largefile mismatches. The common user will not experience any problems with that, so long as no file being handled is larger than two gigabytes. (Note that Unix98 mandates that base utilities like "cat" and "cp" be compiled with largefile support.)

Open Source OS distributions, however, carry a lot of code from many different sources. In particular, there are several graphical frontends of the filemanager type which are not compiled in largefile mode. Sooner or later, the problem will come up. It would be best if no rpm/deb/whatever binary package has a largefile mismatch in the first place.

This can be done if packagers and distro makers check binary packages while making them. It would be easy to integrate the checking routine into the set of post-%files tools (as they are called in RPM), which need to check the libraries and binaries anyway for dependent libraries (and do a "chrpath" on them, since they have been relinked in the DESTDIR staging area).

The future

The future should see all packages compiled in largefile mode, eliminating any problems with mixing libraries from different sides. A distro maker can ensure that, and if it means a few patches, that's good, since it makes the software more portable to FreeBSD/Darwin.

At some point, one should really think about dropping the 32-bit off_t default altogether, as was done with FreeBSD. Linux 2.4 and glibc 2.2 should be ready for this step, leaving the days of "small"files behind.

Recent comments

26 Jan 2003 03:21 jdassen

LFS sanity checking in Debian

It would be best if no rpm/deb/whatever binary package has a largefile mismatch in the first place.

This can be done if packagers and distro makers check binary packages while making them.

The appropriate place for the checking routine in Debian would be in the lintian and linda tools; I've filed wishlist bug reports against the packages (linda and lintian) requesting that this checking be added.

26 Jan 2003 05:24 slarty2

Other OSs
Is largefile support in (for example) Windows any better? Win32 provides support for applications to use files >2GB, and it is AFAIK supported on NT4 with NTFS; however, I rather suspect that a lot of applications cannot deal with it correctly.

The same is probably true of libraries on other platforms too.

26 Jan 2003 06:01 davidawheeler

Propose it to the Linux Standards Base (LSB), esp appchk.
If you want applications to widely enable largefile support, propose it to the Linux Standards Base (LSB) group, www.linuxbase.org. If they accept that idea, they could add your test tool to their general application checking tool, appchk.

26 Jan 2003 08:28 Avatar guidod

Re: Other OSs
The Win32 API does not use off_t; all POSIX-like calls use `long` directly. It therefore does not suffer from dual-mode problems, and compiling programs on Win64 makes them use 64-bit offsets, just as common Win32 programs will be using 32-bit offsets. The Win API has a different `transitional API` using lseeki64/telli64/fstati64; note the "i64" instead of just "64". The absence of off_t makes it impractical to add a shape-shifting -D define somewhere that could make standard POSIX-ish programs use 64-bit offsets on platforms otherwise using 32-bit ones. The 64on32 extensions and modifications of the Unix98 API were introduced after the Win95 API was made up, and Microsoft never cared to adapt by introducing "off_t" as well. The MSDN still lists "lseek" and friends as using plain `long` instead of having a `typedef long off_t` around. At the same time, I cannot recall any contemporary Unix-compatible system that is missing `off_t`, even though it is not impossible in theory; see autoconf's AC_TYPE_OFF_T, which would add a `typedef long off_t` in that case. I guess the only Unix-compatible systems missing off_t would be those that do not use 64-bit file access internally, contrary to Win32 platforms.

26 Jan 2003 08:38 Avatar guidod

Re: Other OSs
To get to the point: I do not expect any programmer to use the `transitional API` directly, suddenly calling tell64/telli64. The off_t magic allows running 64on32 with just a -D define; one should just take care to add that define everywhere. That keeps the software compatible with very old systems that do not use 64-bit file access internally. However, it also makes it use plain 32-bit file access on systems that never introduced off_t, thereby missing the chance to change the default sizeof(off_t) later.

26 Jan 2003 16:25 dmaas

Re: Other OSs
Guido's info applies to the Win32 stdio implementation; however, I think lots (if not most) Windows software uses the Win32 file I/O API (CreateFile, ReadFile, etc.). (In fact, stdio on Windows is implemented using CreateFile internally; there is no open()/read()/write() at the kernel level in Windows.)

The Win32 file I/O API has always used explicit-width file offsets, with type 'long' (which is *always* 32 bits on Windows, even on 64-bit CPUs - not sure where the other respondent was coming from). To deal with files longer than 2^31 bytes, many Windows API functions take file size/offset arguments as a pair of longs (lo/hi words). Some of these functions seem to have been retro-fitted for lo/hi arguments - witness GetFileSize(), which returns the lo word on the stack and the hi word through a pointer argument!

You might say that the large-file issue on Windows is clearer than on POSIX, since 64-bit file sizes are explicit rather than being hidden behind a #define.

26 Jan 2003 16:48 dmaas

signedness and off_t
One area where the UNIX file API has always been inconsistent is the signedness of file sizes and offsets. Naturally a file size is an unsigned quantity, but consider read() and write(), which want to return -1 for errors, or lseek(), which naturally should allow seeking backwards. I think one can build a case either for changing the APIs to use unsigned arguments everywhere, or for giving up a bit and using signed arguments everywhere. But either of those has got to be better than the status quo, where you can't seek to an offset greater than 2^31 or 2^63 in one step, or write more than 2^31 or 2^63 bytes at once. (Granted, nobody is likely to want to write 2^64 bytes in one call, but 2^32 could happen today - and there is nothing in the API that would indicate it will fail, aside from the return value being interpreted incorrectly.)

I should add that very few programmers are careful enough to write code that will continue to work when the file size creeps up into the sign bit. (e.g. this is a problem with 32-bit AVI files on Windows - the AVI standard consistently works for file sizes between 2^31 and 2^32, but lots of AVI-parsing software breaks when the size exceeds 2^31).

One problem with redefining off_t is that lots of software uses other types internally. e.g.

int my_get_filesize(const char *path)
{
    struct stat ss;
    stat(path, &ss);
    return ss.st_size;
}

which is never going to work for 64-bit files (because it returns an 'int'). With luck the compiler will notice the 64->32 conversion and alert the programmer. However, you can't assume that by re-defining off_t most software will "just work."

26 Jan 2003 16:49 NelsonIS

What's the problem exactly?
I've dealt with this a number of times and I've always thought it was kind of easy. The bigger problem I've seen is that people make calls to read, write, seek, etc with int instead of off_t. Then even if you compile the code properly it still screws up. Using the proper types I've had no problems getting it to work.

There may be some issues exporting support in libraries and such but for most apps the ability to read a file with a 64bit offset is self contained.

26 Jan 2003 17:20 Avatar guidod

Re: Other OSs

> many Windows API functions take file size/offset
> arguments as a pair of longs (lo/hi words).

Thanks, the msdn page about "lseek" does not have a crosslink to "SetFilePointer", so I did miss this one. However, the handling of hi/lo values is error-prone (admitted in msdn docs), and the msdn even hints that:

Windows 95/98/Me: If the pointer lpDistanceToMoveHigh is not NULL, then it must point to either 0, INVALID_SET_FILE_POINTER, or the sign extension of the value of lDistanceToMove. Any other value will be rejected.

It happens that WinXP/2K has a function SetFilePointerEx using a LARGE_INTEGER instead, which will ensure that programmers handle largefile values correctly.

Still, the win-APIs are not subject to the off_t shapeshifting and callframe mismatch problems of 64on32 systems. That's my primary concern that made me write the article.

26 Jan 2003 19:06 hubertf

it's not only the kernel - g4u vs. linux
I have this small harddisk image cloning utility that uploads a gzipped disk image to an FTP server, g4u (check freshmeat ;). I get lots of complaints from people that they have problems transferring images >2GB. This happens only on Linux, and the problem is not only the kernel (which should be ok starting with 2.4), but also the application that accesses the file, in my case the ftp daemon. It needs to support "large" files too.

<ob-rant>
Now if people would just use NetBSD they wouldn't have such problems, but life would be SO boring...
</ob-rant>

- Hubert

26 Jan 2003 20:32 jfroebe

why not...
Why not follow IBM's lead with AIX? All calls made to lseek() are automatically 64bit at the kernel level. No recompilation necessary.

Basically, make it completely transparent to the application.

lseek and lseek64, scan and scan64, etc. should always be 64-bit. That way, no one needs to worry about compiling for >2GB files.

jason

27 Jan 2003 02:09 Avatar guidod

Re: What's the problem exactly?
Using "int" instead of "long" is another problem; see also the discussions about "64bit and data size neutrality".

27 Jan 2003 03:07 gvy

Re: Propose it to the Linux Standards Base (LSB), esp appchk.

> propose it to the Linux Standards Base (LSB)

Will it help? LSB is *so* incredibly crappy for a "community standard" at times that it has distracted myself completely.

Take a look at ugly self-promo with say /lib/ld-lsb.so. Why change things that aren't broken? :(

27 Jan 2003 18:49 jamesh

Re: Propose it to the Linux Standards Base (LSB), esp appchk.

> Take a look at ugly self-promo with say
> /lib/ld-lsb.so. Why change things that
> aren't broken? :(

Well, you need to choose some name for the dynamic linker. Since I started using linux, it has changed from /lib/ld.so (a.out) to /lib/ld-linux.so.1 (libc5 elf) to /lib/ld-linux.so.2 (glibc). Choosing a different name means that if glibc changes in an incompatible way, distros can distribute a separate dynamic linker for LSB apps, that will load up LSB compliant shared libraries from another location. Would you prefer if the LSB stood in the way of Linux improving?

27 Jan 2003 19:12 gvy

Re: Propose it to the Linux Standards Base (LSB), esp appchk.

> % Take a look at ugly self-promo with say
> % /lib/ld-lsb.so. Why change things that
> % aren't broken? :(
> Choosing a different name
> means that if glibc changes in an incompatible
> way, distros can distribute a separate dynamic
> linker for LSB apps, that will load up LSB compliant
> shared libraries from another location. Would you
> prefer if the LSB stood in the way of Linux improving?

Hmm... somehow it seemed to me to be not a "compat" option but rather the "mainstream" one. Then it makes sense; thanks for the explanation of what I hadn't finished reading in anger.

Improving: well but _not_ collecting crap. Still 3rd party software has its price to pay, this way too.

28 Jan 2003 19:27 bootswork

Re: why not...

> Why not follow IBM's lead with AIX? All
> calls made to lseek() are automatically
> 64bit at the kernel level. No
> recompilation necessary.
> jason

Because, obviously, it can't be done only in the kernel. What do you want to have happen when the application stores the result of lseek() into a 32-bit variable?

28 Jan 2003 19:31 bootswork

thanks
Nice summary.
