Windows I/O completion - One little trick
I’ve been learning how to deal with I/O completion ports for my latest project and found a few libraries that manage it all for me, but I was getting strange behavior, so I ended up having to dig deep enough to understand what was happening. I didn’t find a really clear post, so here is my attempt.
When I was reading some of the code I found that each library had a slightly different way of detecting a completed I/O call. The two libraries I was referencing were Rust’s Mio crate and Go’s winio.
Understanding that they were accomplishing the same task in different ways was key:
- The winio library treats the read as a synchronous call. When you call Read or Write on the file it issues the call, then waits until the async operation completes. This means that if you wish to use it asynchronously you should call it from a goroutine.
- The Mio library creates an event loop, and the read/write processing should be handled once the readiness signal is returned. It also converts Windows I/O completion into a readiness signal using an internal buffer.
I/O completion’s one little trick
Those are the two key differences in the way each library approaches doing I/O but I was still confused as to how the program “wakes” back up after the I/O completes.
Let’s take a look at the winio code that runs when the system finishes an async operation and GetQueuedCompletionStatus returns. Note that the system call to getQueuedCompletionStatus suspends the thread that calls it until a completion is available.
// ioOperation represents an outstanding asynchronous Win32 IO.
type ioOperation struct {
	o  syscall.Overlapped
	ch chan ioResult
}
func ioCompletionProcessor(h syscall.Handle) {
	for {
		var bytes uint32
		var key uintptr
		var op *ioOperation
		err := getQueuedCompletionStatus(h, &bytes, &key, &op, syscall.INFINITE)
		if op == nil {
			panic(err)
		}
		op.ch <- ioResult{bytes, err}
	}
}
What is going on here? How does the operating system call know how to fill in an op *ioOperation, and how can we then pass data into the channel?
To figure this out we need to see how I/O is “prepared” and then invoked. To prepare the I/O we create an I/O operation and this is where a channel is created:
func (f *win32File) prepareIO() (*ioOperation, error) {
	f.wgLock.RLock()
	if f.closing.isSet() {
		f.wgLock.RUnlock()
		return nil, ErrFileClosed
	}
	f.wg.Add(1)
	f.wgLock.RUnlock()
	c := &ioOperation{}
	c.ch = make(chan ioResult)
	return c, nil
}
Then we issue the Read, passing a reference to the ioOperation (through its Overlapped field, &c.o), and wait for it to complete in asyncIO. Note that even though this is called asyncIO, it is a blocking operation. The thread that gets suspended inside the system call isn’t this one; it is the one running the goroutine with the ioCompletionProcessor loop.
...snip...
var bytes uint32
err = syscall.ReadFile(f.handle, b, &bytes, &c.o)
n, err := f.asyncIO(c, &f.readDeadline, bytes, err)
runtime.KeepAlive(b)
...snip...
Inside asyncIO we find that we are waiting for the channel to be filled:
...snip...
var r ioResult
select {
case r = <-c.ch:
	err = r.err
	if err == syscall.ERROR_OPERATION_ABORTED { //nolint:errorlint // err is Errno
		if f.closing.isSet() {
			err = ErrFileClosed
		}
	} else if err != nil && f.socket {
		// err is from Win32. Query the overlapped structure to get the winsock error.
		var bytes, flags uint32
		err = wsaGetOverlappedResult(f.handle, &c.o, &bytes, false, &flags)
	}
case <-timeout:
...snip...
}
}
...snip...
If you read the rest of the code you will not find anything in the read path that sends on that channel!
But as you might have guessed by now, the channel we saw in the ioCompletionProcessor is the same one. So how do the two ends get linked together?
The key is a little trick that is used extensively when working with Windows I/O completion ports. When we issue the I/O we pass a pointer to an Overlapped structure, and getQueuedCompletionStatus later hands that same pointer back. The struct we passed is actually a wrapper with the Overlapped as its first field:
type ioOperation struct {
	o  syscall.Overlapped
	ch chan ioResult
}
We set up the channel during prepareIO and then passed the pointer to the Read syscall, and the OS only fills in the bits of the Overlapped struct. So when the completion thread is resumed, it gets back a pointer to the very struct we prepared: the ioOperation, channel included. Because Overlapped is the first field, a pointer to it is also a pointer to the whole ioOperation. The ioCompletionProcessor goroutine can then pass the result through the channel (which the asyncIO function is waiting on) and the read completes!
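Here is a portable sketch of the trick with the Windows parts simulated, so it runs on any platform. All names are hypothetical: header stands in for syscall.Overlapped, a Go channel stands in for the completion port, and the recovery of the wrapper is done with an explicit unsafe.Pointer conversion, which relies on header being the first field.

```go
package main

import (
	"fmt"
	"unsafe"
)

// header stands in for syscall.Overlapped: the only part the fake "OS" sees.
type header struct {
	internal uintptr
}

type result struct {
	bytes uint32
}

// operation mirrors ioOperation. header must stay the first field so that a
// pointer to the header is also a pointer to the whole operation.
type operation struct {
	h  header
	ch chan result
}

// completion is what the fake port delivers: a byte count plus the same
// header pointer that was supplied when the I/O was issued.
type completion struct {
	bytes uint32
	h     *header
}

// completionProcessor plays the role of ioCompletionProcessor.
func completionProcessor(port <-chan completion) {
	for c := range port {
		// Recover the wrapper from the pointer to its first field.
		op := (*operation)(unsafe.Pointer(c.h))
		op.ch <- result{bytes: c.bytes}
	}
}

// issueRead pretends to start an overlapped read of n bytes and then waits
// on the operation's channel, like prepareIO followed by asyncIO.
func issueRead(port chan<- completion, n uint32) result {
	op := &operation{ch: make(chan result)}
	// The fake OS only ever receives &op.h, never the channel.
	go func() { port <- completion{bytes: n, h: &op.h} }()
	return <-op.ch
}

func main() {
	port := make(chan completion)
	go completionProcessor(port)
	fmt.Println(issueRead(port, 42).bytes) // prints 42
}
```

The processor never learns about the channel through the port; it only finds it because the header pointer it receives is the address of the wrapper.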
This little trick is also used in the Mio project, but slightly differently. Since Mio has created an event loop, it doesn’t actually wait for the read; it just needs to know which event the completion is associated with (in fact it does copy the buffer internally, but that is slightly different from the application doing the reading). The read by the end program will happen at another time. So the structure looks a little different, but the same trick is used:
#[repr(C)]
pub(crate) struct Overlapped {
    inner: UnsafeCell<OVERLAPPED>,
    pub(crate) callback: fn(&OVERLAPPED_ENTRY, Option<&mut Vec<Event>>),
}
In this case they’ve made it a generic callback function that can do anything.
In other cases you might just store some basic information in the wrapper and not a callback or channel. It really is up to your use case.
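As a rough illustration of those variations, here is the same wrapper idea in Go carrying a callback in one case and a plain token in the other. This is a sketch under assumptions, not Mio's or winio's code: header again stands in for the OVERLAPPED structure, and callbackOp, tokenOp, dispatch, and tokenOf are hypothetical names.

```go
package main

import (
	"fmt"
	"unsafe"
)

// header stands in for the OVERLAPPED structure the OS would hand back.
type header struct {
	internal uintptr
}

// callbackOp carries a callback next to the header, in the spirit of Mio's
// wrapper. The header must stay the first field.
type callbackOp struct {
	h        header
	callback func(bytes uint32)
}

// tokenOp carries only plain data: an identifier for the finished operation.
type tokenOp struct {
	h     header
	token int
}

// dispatch recovers a callbackOp from its header pointer and invokes it.
// The caller must know the header belongs to a callbackOp.
func dispatch(h *header, bytes uint32) {
	op := (*callbackOp)(unsafe.Pointer(h))
	op.callback(bytes)
}

// tokenOf recovers a tokenOp from its header pointer and returns its token.
// The caller must know the header belongs to a tokenOp.
func tokenOf(h *header) int {
	return (*tokenOp)(unsafe.Pointer(h)).token
}

func main() {
	var total uint32
	cb := &callbackOp{callback: func(n uint32) { total += n }}
	dispatch(&cb.h, 10)
	dispatch(&cb.h, 5)
	fmt.Println(total) // prints 15

	tk := &tokenOp{token: 3}
	fmt.Println(tokenOf(&tk.h)) // prints 3
}
```

Note that nothing in the header says which wrapper it belongs to, so a real program would use a single wrapper type per completion port or tag the wrapper in some way.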
Conclusion
It took me a while to figure out how these calls came together, and it was hard to find it explicitly called out anywhere. Hopefully this helps someone who is struggling to figure out the “one little trick” being used here.
I did eventually find this in a few resources on the topic. I highly recommend reading the following, which cover this process in much more detail:
- https://cfsamsonbooks.gitbook.io/epoll-kqueue-iocp-explained/
- https://dschenkelman.github.io/2013/10/29/asynchronous-io-in-c-io-completion-ports/
- https://leanpub.com/windows10systemprogramming (Chapter 11 File and Device I/O)