Windows I/O completion - One little trick

I’ve been learning how to deal with I/O Completion ports for my latest project and found a few libraries that manage it all for me but I was getting strange behavior, So I ended up having to dig deep enough to understand what was happening. I didn’t find a really clear post so here is my attempt.

When I was reading some of the code I found all had slightly different ways of accomplishing detection of a completed I/O call. The two libraries I was referencing were Rust’s Mio crate and go’s winio.

Understanding that they were accomplishing the same task in different ways was key:

  • Winio library is treating the read as a synchronous call. When you call Read or Write on the file it will issue the read call, then wait till the async operation completes. These means if you wish to use this as an async call you should do it on a Go Routine.
  • Mio library creates an event loop and the read/write processing should be handled once the readiness signal is returned. It also converts Windows IO completion into a readiness signal using an internal buffer.

I/O completion’s one little trick

Those are the two key differences in the way each library approaches doing I/O but I was still confused as to how the program “wakes” back up after the I/O completes.

Let’s take a look at the winio code that returns after the system finished the async call to GetQueuedCompletionStatus. Note that the system call to getQueuedCompletionStatus will suspend the thread that calls it.

// ioOperation represents an outstanding asynchronous Win32 IO.
type ioOperation struct {
	o  syscall.Overlapped
	ch chan ioResult
}

func ioCompletionProcessor(h syscall.Handle) {
	for {
		var bytes uint32
		var key uintptr
		var op *ioOperation
		err := getQueuedCompletionStatus(h, &bytes, &key, &op, syscall.INFINITE)
		if op == nil {
			panic(err)
		}
		op.ch <- ioResult{bytes, err}
	}
}

What is going on here? How does the Operating System call know how to fill in an op *ioOperation and how can we then pass data into the channel?

To figure this out we need to see how I/O is “prepared” and then invoked. To prepare the I/O we create an I/O operation and this is where a channel is created:

func (f *win32File) prepareIO() (*ioOperation, error) {
	f.wgLock.RLock()
	if f.closing.isSet() {
		f.wgLock.RUnlock()
		return nil, ErrFileClosed
	}
	f.wg.Add(1)
	f.wgLock.RUnlock()
	c := &ioOperation{}
	c.ch = make(chan ioResult)
	return c, nil
}

Then we issue the Read passing the reference to the ioOperation and wait for it to complete in ;’asyncIO’. Note that even though this is called asyncIO it is a blocking operation. The thread that gets suspended isn’t this one, it is the one running the go routine with ioCompletionProcessor loop.

	...snip...
	var bytes uint32
	err = syscall.ReadFile(f.handle, b, &bytes, &c.o)
	n, err := f.asyncIO(c, &f.readDeadline, bytes, err)
	runtime.KeepAlive(b)
	...snip...

Inside the `asyncIO we find we are waiting for the channel to be filled:

	...snip...
	var r ioResult
	select {
	case r = <-c.ch:
		err = r.err
		if err == syscall.ERROR_OPERATION_ABORTED { //nolint:errorlint // err is Errno
			if f.closing.isSet() {
				err = ErrFileClosed
			}
		} else if err != nil && f.socket {
			// err is from Win32. Query the overlapped structure to get the winsock error.
			var bytes, flags uint32
			err = wsaGetOverlappedResult(f.handle, &c.o, &bytes, false, &flags)
		}
	case <-timeout:
	    ...snip...
		}
	}
	...snip...

If you read the rest of the code you will not find that channel being used anywhere!

But as you might have guessed by now that channel we saw in the ioCompletionProcessor is the same! How do the two channels get linked together?

The key is a little trick that is used extensively when working with Windows I/O completion ports. When calling getQueuedCompletionStatus we are passing a pointer to the structure Overlapped. The struct we passed look is actually a wrapper:

type ioOperation struct {
	o  syscall.Overlapped
	ch chan ioResult
}

Since we set up the channel during prepareio then passed the pointer to the Read sys call and the OS only fills in the bits for the Overlapped struct when we get the notification that the thread is unsuspended we now have a pointer the the struct that we prepared: ioOperation with a go routine. Then we can pass the value through the channel (which is waiting in the asyncIO function) and the read completes!

This little trick is also used the in the Mio project but slightly differently. Since the Mio project has created an event loop it doesn’t actually wait for the read it just needs to know the event it is associated too (in fact it does copy the buffer internally but that is slightly different than the application doing the reading). The read by the end program will happen at another time. So instead the structure looks a little different but the same trick is used:

#[repr(C)]
pub(crate) struct Overlapped {
    inner: UnsafeCell<OVERLAPPED>,
    pub(crate) callback: fn(&OVERLAPPED_ENTRY, Option<&mut Vec<Event>>),
}

In this case they’ve make it a generic callback function that can be filled with anything.

In other cases you might just have some basic information in and not a call back or channel. It really is up to your use case.

Conclusion

It took me awhile to figure how these calls came together and it was hard to find it explicitly called out anywhere. Hopefully this helps someone who is struggling to figure out the “one small trick” being used here.

I did find eventually find this in a few resources on the topic. I highly recommend reading the following which go over the details of this process in much more detail:

  • https://cfsamsonbooks.gitbook.io/epoll-kqueue-iocp-explained/
  • https://dschenkelman.github.io/2013/10/29/asynchronous-io-in-c-io-completion-ports/
  • https://leanpub.com/windows10systemprogramming (Chapter 11 File and Device I/O)

Rust Traits are not interfaces (and a little on Lifetimes)

I have been learning rust lately. I eventually came to some clarity (hopefully) after reading lots of different sources so I’ve compiled there here with some commentary.

Note: I am still learning so please let me know if I’ve got something not quite right. This is my current understanding and your mileage may vary.

Using Traits for dynamic implementations

My first approach to using traits was to use them exactly like I had with interfaces from other languages. This led me to using things like Box and dyn to create trait objects.

We create a trait object by specifying some sort of pointer, such as a & reference or a Box smart pointer, then the dyn keyword, and then specifying the relevant trait. (We’ll talk about the reason trait objects must use a pointer in Chapter 19 in the section “Dynamically Sized Types and the Sized Trait.”) ^[1](https://doc.rust-lang.org/book/ch17-02-trait-objects.html#defining-a-trait-for-common-behavior)

When implementing it I found I was forced to use the Box and pass the types around since their `Size` was unknown:


trait Listener {
    fn Accept(&mut self) -> Result<Box<dyn Connection>>, io::Error>;
}

impl Listener for PipeServer {
    fn Accept(&mut self) -> Result<Box<dyn Connection>>, io::Error>;
}

fn start_handler(conn: Box<dyn Connection>){
  ...snip...
}

This has it’s place in implementations where you don’t know the types ahead of time, such as libraries where you don’t know the types that will be used. There isn’t anything particularly wrong with this but there can be a slight performance hit due to using dynamic dispatch instead of static dispatch:

monomorphization process performed by the compiler when we use trait bounds on generics: the compiler generates non-generic implementations of functions and methods for each concrete type that we use in place of a generic type parameter. The code that results from monomorphization is doing static dispatch, which is when the compiler knows what method you’re calling at compile time. This is opposed to dynamic dispatch, which is when the compiler can’t tell at compile time which method you’re calling. In dynamic dispatch cases, the compiler emits code that at runtime will figure out which method to call. ^[1]https://doc.rust-lang.org/book/ch17-02-trait-objects.html#trait-objects-perform-dynamic-dispatch

Monomorphization approach with Traits

To get around this and avoid the additional runtime overhead, I added an Associated type and then specified the type on the implementation which allows the compiler to need the box. I could also use impl Trait in the parameter position instead of needing the Box<dyn > type:

trait Connection {
    fn close(&mut self) -> io::Result<()>;
}

trait Listener {
    type Type: Connection; // specify trait restriction
    fn Accept(&mut self) -> Result<Self::Type, io::Error>;
}

impl Listener for PipeServer {
    type Type = PipeInstance;  // this is concrete type that implements trait Connection
    fn Accept(&mut self) -> Result<Self::Type, io::Error> {
      ...snip...
    }

fn start_handler(con: impl Connection) {
  ...snip...
}

Note that this is using the impl Trait in the parameter location which is syntactic sugar for Trait Bound Syntax. And so the start_handler could have been writing like:

fn start_handler<T: Connection>(con: T) {
  ...snip...
}

Although they are similar there are two slight differences:

  • If you have two (or more) parameters then when using Trait Bound Syntax the two parameters must be the same. With impl Trait they could be different types
  • With Trait Bound syntax you can use the turbofish syntax (start_handler::<PipeInstance>) and with impl Trait you cannot use that syntax.

Lifetimes with Traits

Now that I removed the overhead of using Trait objects I ran into an issue with Lifetimes I didn’t have when I wasn’t using traits since start_handler actually spins off a thread:

fn start_handler(con: impl Connection) -> thread::JoinHandle<()>
 {
    let newconnection = con;
    
    let h = thread::spawn(move || {
      ...snip...
    });
    h
  }

  !!! doesn't compile:

  error[E0310]: the parameter type `impl Connection` may not live long enough
  help: consider adding an explicit lifetime bound...
   |
41 | fn start_handler(con: impl Connection + 'static) -> thread::JoinHandle<()>

The above using impl Trait doesn’t compile but interesting if I use the concrete type it does(!?!):

fn start_handler(con: PipeInstance) -> thread::JoinHandle<()>
 {
    let newconnection = con;
    
    let h = thread::spawn(move || {
      ...snip...
    });
    h
  }

So what is going on here and why is 'static required? Specifically is using 'static bad? As a newbie I really thought it was something to avoid but found a few articles that helped me understand this.

T vs &T

I found two existing questions that were similar to my situation (1, 2) in the Rust help that eventually led to more understanding. It took me some time to parse the answers so let’s break it down:

From the first answer we learn that since the complier doesn’t know what Connection is, it treats it as type T. Type T is encompasses borrowed &T. This means that we could possibly “smuggle a borrowed item to the thread”.

The second answer gives us an example of the difference between T and borrowed &T when running on a thread (try it out on the rust playground)

fn test<T: Send + Sync + std::fmt::Display>(val: T) {
    thread::spawn(move || println!("{}", val));
}

fn evil() {
    let x = 10;
    return test(&x);
}

!!! doesn't compile:

error[E0597]: `x` does not live long enough
17 |     return test(&x);
   |            -----^^-
   |            |    |
   |            |    borrowed value does not live long enough
   |            argument requires that `x` is borrowed for `'static`
18 | }
   |  - `x` dropped here while still borrowed

This makes sense since it would create a dangling pointer to the x when x goes out of scope on function evil. So the Compiler doesn’t allow us to do this.

So if T is encompasses &T that means we could pass in a reference that would go out of scope. So we need the 'static' to ensure that the variable lives at least as long as the program. This makes sense intuitively once you know the rule. On top of that we also understand that if the struct that T represents contains a borrowed pointer this would also be a problem.

But then comes the question, isn’t 'static something that should be avoided?

Using ‘Static

As a beginner, I thought using static would be wrong but lets look into this a bit more. This is pretty common mis-conception as pointed out by the link provided in the second answer:

They get told that “str literal” is hardcoded into the compiled binary and is loaded into read-only memory at run-time so it’s immutable and valid for the entire program and that’s what makes it ‘static. These concepts are further reinforced by the rules surrounding defining static variables using the static keyword

If we continue reading the post a bit more we learn a type can also be bounded by a 'static lifetime which is different from a the case where a variable is compiled into the binary. (I highly recommend reading the post for other insights beyond this).

a type with a ‘static lifetime is different from a type bounded by a ‘static lifetime. The latter can be dynamically allocated at run-time, can be safely and freely mutated, can be dropped, and can live for arbitrary durations.

This means static isn’t all that bad; It is ensuring that we don’t have any borrowed items when we are using the object, because if we did the thread could point at something that goes out of scope (creating a dangling pointer).

Ok so that means we can add 'static and it will compile:

fn start_handler(con: impl Connection + 'static) -> thread::JoinHandle<()>
 {
    let newconnection = con;
    
    let h = thread::spawn(move || {
      ...snip...
    });
    h
  }

but why doesn’t the example where PipeInstance is passed need the static declaration?

This is because it is an Owned Type and doesn’t contain Borrowed types! This means that the lifetime is in the current Scope. Because it is a concrete type, the compiler can find the type and confirm this that it does not contain any borrowed types. Once it is switched to a impl Trait it can’t be confirmed by the compiler, so I needed to make it explicit.

As an example, If I add a Borrowed type on the struct of the PipeInstance, I would need to annotate the type of Lifetimes. In order to use it in the start_handler function I would then need to ensure that the lifetime of the borrowed item was at least 'static!

This is very nicely described in the Misconception #6.

Another point for 'static lifetimes not being terrible in this scenario is the definition of spawn function itself:

pub fn spawn<F, T>(f: F) -> JoinHandle<T>
where
    F: FnOnce() -> T,
    F: Send + 'static,
    T: Send + 'static,

The ‘static constraint means that the closure and its return value must have a lifetime of the whole program execution. The reason for this is that threads can outlive the lifetime they have been created in.

Indeed if the thread, and by extension its return value, can outlive their caller, we need to make sure that they will be valid afterwards, and since we can’t know when it will return we need to have them valid as long as possible, that is until the end of the program, hence the ‘static lifetime.

So there we have it, 'static lifetimes aren’t something that we need to be afraid of.

Elision Hints

So all this lifetime stuff can be a bit confusing, especially due the fact that many Lifetimes can be Elided so we don’t seem them when we are doing basic work.

One thing I’ve done during this learning period was to turn on rust-analyzer.inlayHints.lifetimeElisionHints.enable in VSCode Extension. At least initially these helps me see the various lifetimes that my program. I might turn it off in the future but for now it helps me understand the implicit lifetimes.

One thing to note, which was initially confusing for me, is that a signature like fn start_handler(con: PipeInstance) -> thread::JoinHandle<()> won’t have a Lifetime! This was initially confusing because I expected to see 'static but now see that it is an Owned type (with no borrowed types in the struct) its life time is the current scope and so the Lifetime annotations aren’t required to track it.

Conclusion

Well hope that helps someone (most likely myself when I forget all this in 6 months). This is my current understand as of the writing of this post, so something might change or be incorrect. Look forward to learning and sharpening this understanding, so leave a comment if something seems off.

Running Windows Unit tests for Kubernetes on Windows

Most of the kubernetes build tooling doesn’t work on Windows but we do have unit tests written for Windows components. These unit tests must run on a Windows machine. So how do you run them if the build tooling doesn’t work?

Manually

From a Windows machine you can manually run them:

go test . -mod=mod -run listContainerNetworkStats

If you want to build build them on Linux then run them on Windows.

On Linux, cd to the folder where your tests are and build a test executable:

cd pkg\kubelet\stats\
GOOS=windows go test -c .

Then on Windows, the test executable can then be run (where listContainerNetworkStats is the name of the test to focus on):

#copy test executable to windows and run:
stats.test.exe -test.run listContainerNetworkStats

Goland

To enable running them in Goland on Windows:

Then you can run the test (and debug!) via the UI.

VSCode

To enable running them in VS Code on Windows add the following to your settings file. It has to be done in the file and as of the writing can’t be configured in the UI.

{
    "go.testFlags": [
        "-mod=mod"
    ]
}

Then you can run the test (and debug!) via the UI.

Helpful Cluster API commands for Devs

If you are doing development or working with Kubernetes cluster api these are some helpful tips.

Handy tools

  • clusterctl - helps with cluster creation but more importantly cluster debugging
  • kubie - allows you to connect to multiple clusters at same as well as switch namespaces and clusters quickly. Helpful becuase it lets you be connected to both the management cluster and the workload cluster.

Getting a kind management cluster kubeconfig

kind get kubeconfig --name capz-e2e > kubeconfig.e2e

Getting a Workload Cluster kubeconfig

When connected to management cluster

kubectl get clusters                        
NAME               PHASE
capz-conf-q6vvi1   Provisioned

Now download the clusters kubeconfig:

clusterctl get kubeconfig capz-conf-q6vvi1 > kubeconfig.e2e.conformance.windows

Viewing the state of the cluster

https://cluster-api.sigs.k8s.io/clusterctl/commands/describe-cluster.html

clusterctl describe cluster capz-conf-q6vvi1
NAME                                                                 READY  SEVERITY  REASON                   SINCE  MESSAGE                                                                               
/capz-conf-q6vvi1                                                    True                                      2m53s                                                                                        
├─ClusterInfrastructure - AzureCluster/capz-conf-q6vvi1              True                                      6m16s                                                                                        
├─ControlPlane - KubeadmControlPlane/capz-conf-q6vvi1-control-plane  True                                      2m53s                                                                                        
│ └─Machine/capz-conf-q6vvi1-control-plane-845sj                     True                                      2m55s                                                                                        
└─Workers                                                                                                                                                                                                   
  ├─MachineDeployment/capz-conf-q6vvi1-md-0                                                                                                                                                                 
  └─MachineDeployment/capz-conf-q6vvi1-md-win                                                                                                                                                               
    └─2 Machines...                                                  False  Info      WaitingForBootstrapData  3m15s  See capz-conf-q6vvi1-md-win-645fbb7c79-jhjjt, capz-conf-q6vvi1-md-win-645fbb7c79-vf44j

Using kubie

Use the kubeconfigs above to load a cluster:

kubie ctx -f kubeconfig.e2e

or if the cluster is in your default kubeconfig:

kubie ctx kind-capz

ssh’ing to capz machines

ssh’ing made easy in cluster api for azure (capz) VM’s and VMSS:

https://github.com/kubernetes-sigs/cluster-api-provider-azure/tree/main/hack/debugging#capz-ssh

Find the machine:

kubectl get azuremachine
NAME                                 READY   STATE
capz-cluster-0-control-plane-5b5fc   true    Succeeded
capz-cluster-0-md-0-fljwt            true    Succeeded
capz-cluster-0-md-0-wbx2r            true    Succeeded

Now ssh to it:

kubectl capz ssh -am capz-cluster-0-md-0-wbx2r            

Deleting all of your dev clusters

Don’t do this in prod! :-)

kubectl delete cluster -A --all --wait=false

Add vim to Windows nodes

Sometimes you need to edit the files while developing:

iwr -useb get.scoop.sh | iex
scoop install vim

What else?

What are your favorites? What am I missing?

Track active file in Goland

In a large code base it can be hard to figure out where files are. I’ve always enabled the “track active file in project drawer” in Visual Studio and VS Code. I’ve become a big fan Goland recently and found it hard to find this setting. So here it is:

goland track the active file