`FileStream` not suitable for `FileIO` on POSIX systems

`_io.FileIO` is implemented mostly on top of `FileStream` to access files in the OS file system. Unfortunately, this class does not work well when there are multiple simultaneous writers. This is possibly a Win32 legacy: according to the documentation, simultaneous writes to a file may cause an exception during a write through another handle. I have not observed exceptions, but I have noticed that simultaneous writes overwrite each other. This is not POSIX behaviour, which safely allows multiple writes through the same descriptor, a duplicated descriptor, or another descriptor opened on the same file, provided that appropriate file mode flags are used (e.g. `O_APPEND`).

Consider the following example:
```c#
using System;
using System.IO;
using System.Text;
using System.Threading.Tasks;

// Test code that accesses one file opened in Append mode simultaneously on two threads
string filePath = Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.UserProfile), "testfile.txt");

if (File.Exists(filePath)) {
    File.Delete(filePath);
}
// Number of writes
const int ndata = 100200300;

Task task1 = Task.Run(() => WriteToFile(filePath, Encoding.ASCII.GetBytes("xxxxxxxxx\n")));
Task task2 = Task.Run(() => WriteToFile(filePath, Encoding.ASCII.GetBytes("zzzzzzzzz\n")));

Task.WaitAll(task1, task2);

void WriteToFile(string name, byte[] data) {
	using (var fs = new FileStream(name, FileMode.Append, FileAccess.Write, FileShare.Write)) {
		for (int i = 0; i < ndata; i++) {
			fs.Write(data, 0, data.Length);
		}
	}
}
```

This snippet runs two tasks that each perform 100200300 writes of 10 bytes, so each task produces 1002003000 bytes. Together the two tasks should produce a file twice that size, that is, 2004006000 bytes. However, the file created is only 1002003000 bytes long (sometimes a bit more) and contains a mixture of `x`'s and `z`'s, a clear sign that the tasks are overwriting each other's data.

For comparison, here is the equivalent example in Python:
```python
import os
import threading

file_path = os.path.join(os.path.expanduser("~"), "testfile.txt")
if os.path.exists(file_path):
    os.remove(file_path)

# Number of writes
ndata = 100200300

def write_to_file(file_path, data):
    with open(file_path, 'ab') as f:
        for _ in range(ndata):
            f.write(data)

thread1 = threading.Thread(target=write_to_file, args=(file_path, b"xxxxxxxxx\n"))
thread2 = threading.Thread(target=write_to_file, args=(file_path, b"zzzzzzzzz\n"))

thread1.start()
thread2.start()

thread1.join()
thread2.join()
```

This code, when run with CPython on Linux or macOS (not Windows), correctly produces a file that is 2004006000 bytes long. IronPython, obviously, does not.
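
A scaled-down check of the same behaviour can be sketched with raw descriptors opened with `O_APPEND`. The helper names (`append_lines`, `run_check`) and the default count of 20000 are my own choices for illustration, and the result is only expected to hold with CPython on a local POSIX file system:

```python
import os
import tempfile
import threading


def append_lines(path, line, count):
    # Each thread opens its own descriptor with O_APPEND, so the kernel
    # positions every write at the current end of the file.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
    try:
        for _ in range(count):
            os.write(fd, line)
    finally:
        os.close(fd)


def run_check(count=20000):
    # Returns (actual size, expected size, whether every line is intact).
    lines = (b"xxxxxxxxx\n", b"zzzzzzzzz\n")
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "testfile.txt")
        threads = [
            threading.Thread(target=append_lines, args=(path, line, count))
            for line in lines
        ]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
        size = os.path.getsize(path)
        with open(path, "rb") as f:
            intact = all(row in lines for row in f)
    expected = 2 * count * len(lines[0])
    return size, expected, intact


if __name__ == "__main__":
    size, expected, intact = run_check()
    print(f"size={size} expected={expected} intact={intact}")
```

If the writers were overwriting each other, as with `FileStream`, the reported size would fall short of the expected size.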

I am considering the following possible solutions:
1. In place of `System.IO.FileStream`, use `Mono.Unix.UnixStream` (which operates directly on the file descriptor) for all file access in IronPython when run on POSIX OSes. However:
    1. `UnixStream` is unbuffered, which changes the runtime profile of IronPython. This may actually not be a bad thing, since at this level `FileIO` is supposed to provide "raw" (unbuffered) access to the file. Nevertheless, it is a change, and it relies on the buffered wrappers above it doing a good job of buffering.
    2. The OS errors inside `UnixStream` are translated to native CLR exceptions as much as possible. This is not desirable for IronPython, which, to match CPython, should produce `OSError` with an appropriate errno code.
    3. `UnixStream` does not support the efficient `ReadOnlySpan<byte>` interfaces of .NET.
  All three concerns can be addressed in various ways (proxy class, exception unpacking, etc.).
2. Write a dedicated stream class that makes low-level OS calls to perform IO operations (e.g. using `Mono.Unix.Native`). Such a class can be easily integrated into the rest of the IronPython runtime.
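
Whichever option is chosen, the CPython-visible behaviour that has to be preserved can be pinned down with a small script. This is only a sketch of two of the points above (raw, unbuffered `FileIO` and `OSError` carrying an errno), and the helper names are hypothetical:

```python
import errno
import io
import os
import tempfile


def is_raw_fileio(path):
    # With buffering=0, open() returns the raw FileIO object itself,
    # with no buffered wrapper in between.
    with open(path, "ab", buffering=0) as raw:
        return type(raw) is io.FileIO, isinstance(raw, io.BufferedIOBase)


def errno_for_missing_file(directory):
    # Opening a nonexistent file must surface as OSError with an errno
    # code, not as a platform-specific exception type.
    missing = os.path.join(directory, "does-not-exist.bin")
    try:
        io.FileIO(missing, "r").close()
    except OSError as exc:
        return exc.errno
    return None


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(is_raw_fileio(os.path.join(tmp, "probe.bin")))
        print(errno_for_missing_file(tmp) == errno.ENOENT)
```

Running the same script under IronPython with a candidate stream implementation would show whether the errno mapping and the unbuffered layering still match.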
