C# 4.0 introduces the notion of covariance and contravariance of generic type parameters for interfaces and delegate types. Eric Lippert has put together a series of posts that go into the details of the why and how: an excellent read, but not for the faint of heart. I would strongly suggest reading these posts to get a firm grounding and a better appreciation of this feature. It took me a while to get my head wrapped around this, especially since none of the VS2010 betas were out at the time and there was no way to try out any code. Now that I have Beta 2 of VS2010 at hand, I decided to try out some code samples first hand and solidify my understanding of the feature.
Suppose we have the following inheritance hierarchy:
and a processor to process animals:
It would be natural to be able to create a Processor to process any kind of animal, but the following code does not compile today on the C# 3.0 compiler:
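A minimal sketch of this setup, assuming an IAnimal interface with a single Name property. The Name member and the ProcessorDemo wrapper are illustrative and not taken from the original post; only IAnimal, Giraffe, Whale, Processor and Process are named in the text.

```csharp
using System.Collections.Generic;

public interface IAnimal { string Name { get; } }

public class Giraffe : IAnimal { public string Name { get { return "Giraffe"; } } }
public class Whale : IAnimal { public string Name { get { return "Whale"; } } }

public class Processor
{
    // Accepts any sequence of animals and returns the names it visited.
    public List<string> Process(IEnumerable<IAnimal> animals)
    {
        var visited = new List<string>();
        foreach (IAnimal animal in animals)
            visited.Add(animal.Name);
        return visited;
    }
}

public static class ProcessorDemo
{
    public static List<string> Run()
    {
        var giraffes = new List<Giraffe> { new Giraffe(), new Giraffe() };
        var processor = new Processor();

        // C# 3.0 rejects this call: a List<Giraffe> is not an IEnumerable<IAnimal> there.
        // One common workaround was a per-element cast via LINQ:
        //     processor.Process(giraffes.Cast<IAnimal>());
        // From C# 4.0 on, the call compiles as written.
        return processor.Process(giraffes);
    }
}
```

Calling ProcessorDemo.Run() from a Main method returns a list containing "Giraffe" twice.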
In C# 4.0 no such cast is needed and everything just works as expected. Why? Because in .NET 4.0 IEnumerable<T> is covariant in its type parameter T: the interface is now declared as IEnumerable<out T>.
The out modifier means that you can pass in any type that is more derived than T as the type argument. Hence we can pass an IEnumerable<Giraffe> or an IEnumerable<Whale> to the Process method, which accepts an IEnumerable<IAnimal>, and everything works as expected in a type-safe way.
You can define your own covariant types to allow for such conversions. Consider the following generic interface definition for creating instances of a specified type:
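Here is one way such a covariant type might look. The names IFactory, Factory and FactoryDemo are hypothetical; the point is only that T is marked out and appears solely as a return type.

```csharp
public interface IAnimal { }
public class Giraffe : IAnimal { }

// 'out' marks T as covariant: T may only appear in output positions.
public interface IFactory<out T>
{
    T Create();
}

// A simple implementation that relies on a public parameterless constructor.
public class Factory<T> : IFactory<T> where T : new()
{
    public T Create() { return new T(); }
}

public static class FactoryDemo
{
    public static IAnimal Run()
    {
        IFactory<Giraffe> giraffeFactory = new Factory<Giraffe>();

        // Covariance: an IFactory<Giraffe> can stand in for an IFactory<IAnimal>.
        IFactory<IAnimal> animalFactory = giraffeFactory;

        return animalFactory.Create();
    }
}
```

The object returned by FactoryDemo.Run() is statically typed as IAnimal but is a Giraffe at runtime.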
Note that it is illegal for a covariant type parameter to occur in an input position; it can only occur in output positions, such as the return type of a method. This restriction is key to making covariance work: allowing the type parameter in an input position would break type safety. How, you might ask? Imagine that the following illegal code were legal and you could specify T in an input position:
It is precisely for this reason that a covariant type parameter can only occur in output positions (hence the keyword out), and what would have been a runtime error is now a compile-time error.
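To see the kind of runtime error being avoided, arrays make a handy comparison, because array covariance in C# does allow writes. This is my own illustration rather than the sample from the post, and the Animal, Giraffe and Tiger classes are assumed for the sketch.

```csharp
using System;

public class Animal { }
public class Giraffe : Animal { }
public class Tiger : Animal { }

public static class UnsafeInputDemo
{
    // Returns true when the runtime rejects the write into the covariant array.
    public static bool Run()
    {
        Animal[] animals = new Giraffe[1];   // legal: arrays are covariant

        try
        {
            animals[0] = new Tiger();        // compiles, but the store is really a Giraffe[]
            return false;
        }
        catch (ArrayTypeMismatchException)
        {
            return true;                     // the type check only happens at runtime
        }
    }
}
```

UnsafeInputDemo.Run() returns true: the assignment compiles but throws an ArrayTypeMismatchException. With an out type parameter, the equivalent mistake cannot even be written.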
But what if we want the type parameter in an input position and still get the benefits variance has to offer? Enter contravariance, the dual of covariance: a type argument can safely be replaced by a more derived type, but only when the type parameter occurs in input positions.
If T appears ONLY in input positions (such as a parameter of the methods of IProcessor), then we can mark T as contravariant using the in keyword, allowing us to substitute any subtype of T.
This makes assignments such as the following legal.
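A sketch of what that can look like. IProcessor is named in the text, but its Process member, the AnimalProcessor class and the Name property are assumptions made for the example.

```csharp
using System.Collections.Generic;

public interface IAnimal { string Name { get; } }
public class Giraffe : IAnimal { public string Name { get { return "Giraffe"; } } }

// 'in' marks T as contravariant: T may only appear in input positions.
public interface IProcessor<in T>
{
    void Process(T item);
}

// Handles any animal, so it can certainly handle giraffes.
public class AnimalProcessor : IProcessor<IAnimal>
{
    public readonly List<string> Seen = new List<string>();

    public void Process(IAnimal item) { Seen.Add(item.Name); }
}

public static class ContravarianceDemo
{
    public static List<string> Run()
    {
        var animalProcessor = new AnimalProcessor();

        // Contravariance: an IProcessor<IAnimal> is usable as an IProcessor<Giraffe>.
        IProcessor<Giraffe> giraffeProcessor = animalProcessor;
        giraffeProcessor.Process(new Giraffe());

        return animalProcessor.Seen;
    }
}
```

The assignment is safe because everything the processor does with its argument is defined on IAnimal, and every Giraffe is an IAnimal.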
Finally, let's look at variance in delegate types. Consider the following delegate definition and its subsequent usage:
Now let's combine both co- and contravariance. Consider the following delegate:
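One possible delegate of this shape, with a hypothetical name (Transformer) and hypothetical helper methods. The input parameter is marked in and the result is marked out.

```csharp
public interface IAnimal { }
public class Giraffe : IAnimal { }

// Hypothetical delegate: contravariant in its input, covariant in its result.
public delegate TResult Transformer<in TInput, out TResult>(TInput input);

public static class DelegateVarianceDemo
{
    // Accepts any animal and always hands back a giraffe.
    private static Giraffe ToGiraffe(IAnimal animal) { return new Giraffe(); }

    public static IAnimal Run()
    {
        Transformer<IAnimal, Giraffe> general = ToGiraffe;

        // Input narrowed (IAnimal -> Giraffe) and result widened (Giraffe -> IAnimal)
        // in a single assignment.
        Transformer<Giraffe, IAnimal> specific = general;

        return specific(new Giraffe());
    }
}
```

Both conversions are safe for the same reasons as before: the method is handed something at least as specific as it asked for, and the caller receives something at least as specific as it expects.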
Covariance and contravariance may sound like a mouthful and daunting to understand, but the idea is quite simple once you understand its purpose: it lets you write generic code more naturally and allows safe type conversions. That's it for today. Happy coding.
Posted on Sunday, January 10, 2010 12:50 PM
C# 4.0 introduces covariance and contravariance (together, variance) for generic type parameters. The two concepts are closely related: each allows a derived or base class from a class hierarchy to be used in place of the named type.
An easy way to understand the difference between the two concepts is to consider the activity of the user of the variables passed to or from the generic method (on an interface or delegate).
Contravariance
If the method implementation is only using variables passed with the parameter for read activity – then the generic parameter is a candidate for contravariance. The ‘read only’ role of the parameter type can be formalised by marking it with the in keyword – i.e. it is an input to the implementation.
Any generic type parameter marked with the in keyword will be able to match to types that are derived from the named type. In this case the implementation is the user of the variables. It makes sense to allow derived classes as any read activities that are available on the base class will also be available on any derived class.
Covariance
Similarly – if a method implementation is only using variables passed with the parameter for write activity – then the generic parameter is a candidate for covariance. The ‘write only’ role of the parameter type can be formalised by marking it with the out keyword – i.e. it is an output from the implementation.
Any generic type parameter marked with the out keyword will be able to match to types that are base classes of the named type. In this case the calling / client code can be considered the user of the variables – so again it makes sense. Any operation (read or write) that is available on the base class that the client code has requested will also be available on the actual (derived) type that the implementation instantiates / sends as output.
Issues to consider
Once type parameters have been marked with the in or out keyword the compiler will validate the interface / delegate compliance with the assigned variance. E.g. if the first example is changed to return the parameter passed, then the compiler will report an error – as the parameter is not being used for input only.
Variance uses reference type conversion – so it will not work with value types. Even though within the type system int inherits from object, the following will not compile.
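A small sketch of the restriction, using only framework types. The class and variable names are made up for the example, and the failing line is left as a comment so that the rest compiles.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class ValueTypeVarianceDemo
{
    public static int Run()
    {
        IEnumerable<string> words = new List<string> { "a", "b" };
        IEnumerable<object> objects = words;            // fine: string is a reference type

        IEnumerable<int> numbers = new List<int> { 1, 2, 3 };
        // IEnumerable<object> broken = numbers;        // does not compile: int is a value type

        // An explicit per-element boxing conversion is needed instead.
        IEnumerable<object> boxed = numbers.Cast<object>();

        return objects.Count() + boxed.Count();
    }
}
```

Run() returns 5 (two strings plus three boxed integers). The variant conversion reuses the same object reference, which is why it cannot apply when each element would have to be boxed.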
Covariance and contravariance have had a tweak in C# 4.0. But before we can start using the benefits in our code, we need to know exactly what they are! The subject can be a little bit tricky when you first encounter it, so in this post I'm going to introduce it in a way that I hope is easy to follow.
Almost all of the information in this post comes from Eric Lippert's excellent series of posts, and I recommend you go and take a look at his blog right now (links to the whole series are at the bottom of this post.)
If you 'get it' straight away, great - stick with that, he's the expert. If you'd like a slightly gentler introduction, avoiding (I hope!) common 'gotchas' for learning the subject, read my post (this post) first.
What I have done is to explain the same concepts using the same examples, but in an order and with an emphasis which I think make it much easier to understand the basics. This is particularly good for you if you are a programmer coming at it from the cold, i.e. you haven't encountered covariance and contravariance before, or you have but you don't understand them yet.
When you're done here I'd suggest you go back and read Eric's posts in order - they should be much easier for you to read by then. Eric's posts will flesh out all the interesting details, and continue on to discuss more advanced topics.
Inheritance and assignability
We're not going to begin talking about covariance and contravariance straight away. First, we're going to make a distinction between inheritance and assignability.
As Eric points out, for any two types T and U, exactly one of the following statements is true:
- T is bigger than U.
- T is smaller than U.
- T is equal to U.
- T is not related to U.
Now, one of the things that seems to have caused some confusion on Eric's blog (see the comments) is usage of the phrase 'is smaller than'. It is used frequently, and is key, so I want to make its definition crystal clear now before we move on. Eric says:
'Suppose you have a variable, that is, a storage location. Storage locations in C# all have a type associated with them. At runtime you can store an object which is an instance of an equal or smaller type in that storage location.'
In simple scenarios, this is something so familiar to programmers that it's barely worth mentioning. We all know that, looking at the list above, only in the middle two scenarios is T assignable to U. The 'smaller than' relation:
This is the first thing that comes to mind, right? An inheritance hierarchy.
But Eric didn't mention inheritance hierarchies. Sure, an inheritance hierarchy is one way to make a T which is assignable to a U, but what about this one:
.. or, the same statement using classes from the animals hierarchy:
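A sketch of the array assignment in question, assuming the usual Animal and Giraffe classes. The BaseType lookup is my own addition to show that the two array types are not linked by inheritance.

```csharp
using System;

public class Animal { }
public class Giraffe : Animal { }

public static class ArrayAssignabilityDemo
{
    public static Type Run()
    {
        // A Giraffe[] is assignable to an Animal[] variable...
        Animal[] animals = new Giraffe[2];

        // ...yet the runtime type, Giraffe[], does not derive from Animal[].
        // Its base type is System.Array.
        return animals.GetType().BaseType;
    }
}
```

Run() returns typeof(Array), not typeof(Animal[]).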
The type Giraffe inherits from Animal, but the type Giraffe[] doesn't inherit from Animal[]. They are assignable, but not linked by inheritance, and this tells us something about what 'is smaller than' means: 'T is smaller than U' can be read as 'T is assignable to U'.
You can visualise it this way:
As we have seen, in some cases this direction of assignability may be because of an inheritance relationship, but in others it is simply because the CLR and languages (C#, Java etc.) happen to support that particular assignment operation.
There is still an inheritance hierarchy involved, i.e. this wouldn't work:
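As a hedged illustration (the Tiger class and the reflection checks are my own, not the sample from the post), the element types still have to be related for the array assignment to be allowed:

```csharp
public class Animal { }
public class Giraffe : Animal { }
public class Tiger : Animal { }

public static class UnrelatedArraysDemo
{
    public static bool[] Run()
    {
        // Giraffe[] giraffes = new Tiger[2];   // does not compile: a Tiger is not a Giraffe

        return new[]
        {
            // Element types are related by inheritance, so the arrays are assignable.
            typeof(Animal[]).IsAssignableFrom(typeof(Giraffe[])),
            // Sibling element types: no assignability between the arrays.
            typeof(Giraffe[]).IsAssignableFrom(typeof(Tiger[]))
        };
    }
}
```

Run() returns true for the first check and false for the second.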
But the key thing is that there is a difference between inheritance and assignability: they are not the same thing.
I'll say it one more time (for good luck!): The phrase 'is smaller than' refers to assignability, not inheritance. The direction of assignability always flows from the smaller type to the larger type. We'll come back to this in a moment.
Covariance and Contravariance
Eric's second post discusses the array assignment operation (the one I used in the Animal/Giraffe example above), and the problems with it. It's definitely worth reading, but park it for now, because things really come alive in post number three.
Eric's example uses delegate methods, and I'll use a simplified version of it here, just to get us started.
It is clear why this is a legal operation:
Notice that in the assignment operation, Animal is on the left and Giraffe is on the right. That is, the declared type is based on Animal and the assigned type is based on Giraffe.
Now let's look at another example:
Notice that Giraffe is on the left and Animal is on the right. That is, the declared type is based on Giraffe and the assigned type is based on Animal.
The Func<out T> assignment operation supports covariance. The Action<in T> assignment operation supports contravariance.
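Here is roughly what the two method group assignments look like. MakeGiraffe and Feed are placeholder method names chosen for this sketch.

```csharp
using System;

public class Animal { }
public class Giraffe : Animal { }

public static class MethodGroupVarianceDemo
{
    private static int fedCount;

    private static Giraffe MakeGiraffe() { return new Giraffe(); }
    private static void Feed(Animal animal) { fedCount++; }

    public static int Run()
    {
        fedCount = 0;

        // Covariant: a method returning Giraffe fits a delegate that returns Animal.
        Func<Animal> makeAnimal = MakeGiraffe;

        // Contravariant: a method accepting Animal fits a delegate that accepts Giraffe.
        Action<Giraffe> feedGiraffe = Feed;

        Animal made = makeAnimal();
        feedGiraffe((Giraffe)made);
        return fedCount;
    }
}
```

In the first assignment Animal is on the left and Giraffe on the right; in the second they swap sides. Run() returns 1, since Feed is invoked once through the Action<Giraffe>.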
What does that mean?
Have a quick look at this summary: (remember to read < as 'is smaller than' and 'is assignable to')
Now read Eric's definition of covariance and contravariance, from the first post in his series (here, the 'operations' which manipulate types are the two assignment operations):
'Consider an 'operation' which manipulates types. If the results of the operation applied to any T and U always results in two types T' and U' with the same relationship as T and U, then the operation is said to be 'covariant'. If the operation reverses bigness and smallness on its results but keeps equality and unrelatedness the same then the operation is said to be 'contravariant'.'
Hopefully it should start to become clear. In line 4 above, the direction of assignability with respect to the original types, was preserved, while in line 5 it was reversed!
Line 4 represents a covariant operation, and line 5 represents a contravariant operation.
The main heuristic
Let's put it back to C# code so that we can see it with the right-to-left assignability we are used to (now the smaller types are on the right):
Notice how in the covariant operation, Animal and Giraffe are on the same sides as in the basic type assignment operation. And notice how in the contravariant operation, they are on opposite sides - the operation 'reverses bigness and smallness'.
In both cases, the opposites are illegal. As Eric puts it in post number five:
'Stuff going 'in' may be contravariant, stuff going 'out' may be covariant', but not vice-versa:
And by the way, if there's one heuristic you remember as a result of reading this post, it's probably best to make it the one above!
I'll repeat it later in this article.
Hang on, methods aren't types!
A quick aside - at this stage you might be asking why I'm referring to methods as though they were types. The straight answer is, I'm copying Eric. His caveat: 'A note to nitpickers out there: yes, I said earlier that variance was a property of operations on types, and here I have an operation on method groups, which are typeless expressions in C#. I’m writing a blog, not a dissertation; deal with it!'
Can't argue with that.
What's new in C# 4.0?
Well, 'new' is the wrong word since the stable release of C# 4.0 was two years ago! But all of the types of variance we've looked at so far in this post have been supported since C#2 or before.
We as developers didn't really have to think about those types of variance to use them, because it wasn't exposed syntactically. In other words, we didn't have to write anything different to make it happen, it's just what is and what isn't supported by C# compilers and the CLR.
In post numbers four and six, Eric discusses types of variance which went on to become part of the specification for C# 4.0, and it's those types of variance that I'll discuss now.
Real delegate variance
The first one is easy, and it's discussed in post number four. It's simply about taking the operations which were already legal in terms of method groups and making the same operations legal in terms of typed expressions.
Take our covariant example from earlier:
Well, in C#3 this essentially equivalent operation was illegal, whereas in C#4 it is legal:
In fact because of lambda syntax and inferred typing, it can be shortened to:
You can now do with typed expressions what you could already do with method groups. Simple.
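A minimal sketch of the typed-expression version, assuming the same Animal and Giraffe classes as before:

```csharp
using System;

public class Animal { }
public class Giraffe : Animal { }

public static class TypedDelegateVarianceDemo
{
    public static Animal Run()
    {
        Func<Giraffe> makeGiraffe = () => new Giraffe();

        // Illegal in C# 3, legal from C# 4: Func<out TResult> is covariant,
        // so a Func<Giraffe> converts to a Func<Animal>.
        Func<Animal> makeAnimal = makeGiraffe;

        return makeAnimal();
    }
}
```

Here the right-hand side is a variable with a delegate type of its own, not a method group, which is exactly the case C# 3 refused.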
But here's where it makes sense to quickly explain something I breezed over earlier.
Covariance and Contravariance, at once
Take a look again at the heuristic: 'Stuff going 'in' may be contravariant, stuff going 'out' may be covariant.'
So what happens when you are dealing with a type which has both an 'in' and an 'out'?
The short answer is: it can be covariant, contravariant, both, or neither. But it's easier than that makes it sound!
Take a look at this example. It's a Func that accepts a Mammal and returns a Mammal:
Now here are some assignment operations:
.. and, well, I'm sure I don't need to spell out the neither!
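For reference, a sketch of the three legal cases, assuming a hierarchy in which Mammal derives from Animal and Giraffe derives from Mammal. The variable names are mine.

```csharp
using System;

public class Animal { }
public class Mammal : Animal { }
public class Giraffe : Mammal { }

public static class FuncBothWaysDemo
{
    public static Mammal[] Run()
    {
        Func<Mammal, Giraffe> smallerOutput = m => new Giraffe();
        Func<Animal, Mammal> biggerInput = a => new Mammal();
        Func<Animal, Giraffe> bothAtOnce = a => new Giraffe();

        Func<Mammal, Mammal> covariant = smallerOutput;     // the 'out' side varies
        Func<Mammal, Mammal> contravariant = biggerInput;   // the 'in' side varies
        Func<Mammal, Mammal> both = bothAtOnce;             // both sides vary

        // Neither: a Func<Giraffe, Animal> cannot be assigned to a Func<Mammal, Mammal>,
        // because it needs a more specific input and promises a less specific output.

        var mammal = new Mammal();
        return new[] { covariant(mammal), contravariant(mammal), both(mammal) };
    }
}
```

Each delegate can be called with a Mammal and each result is usable as a Mammal, which is all the declared type Func<Mammal, Mammal> promises.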
Interface variance
The other new feature, as discussed in post number six, is the extension of variance to interfaces. There's not much to add here - it's just the same thing, but using interfaces. Eric gives a really nice example of the practical benefit here, and I'm going to repeat it almost verbatim.
Take a look at this code block. This is another example of something which is illegal in C#3, and legal in C#4:
Just as earlier on, when we call FeedAnimals(IEnumerable<Animal> animals) we are assigning a 'smaller' type to a 'larger' type:
Of course, anywhere else that you reference that assigned-to variable (IEnumerable<Animal>), what comes out will be typed as Animal. All pretty uncontroversial.
In and out
But finally, let's look at the in and out keywords, and how they fit in when designing your own interfaces (or using the upgraded C#4 ones.) Recall one more time the heuristic:
'Stuff going 'in' may be contravariant, stuff going 'out' may be covariant.'
In C# 4.0, IEnumerable<T> has become IEnumerable<out T>. The out marks the IEnumerable as supporting covariance on the T. This means that, as in the example above, you can assign based on something smaller than T.
But it also means that the interface cannot accept the type T as an input! It will only allow the interface to send T out, in whatever fashion you like - but it will never accept a T in. If you try it, the compiler won't allow it. Hence the name: out.
Reading through this code block should make it clear why:
Think of it this way - an IEnumerable<Animal> variable can have an IEnumerable<Giraffe> assigned to it and it will churn out Giraffes typed as Animals all day long. Because of how it's declared, users of the IEnumerable<Animal> variable expect to be dealing with Animals.
But a Tiger is also an animal. What would happen if there were a method on the interface that allowed a user to put an Animal in?
The user could put a Tiger in instead, and the backing store - IEnumerable<Giraffe> - wouldn't be able to cope.
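You can see the same reasoning at work in the framework's own interfaces. IList<T> has an Add(T) method, so it takes T in and is not variant, while IEnumerable<out T> only sends T out. A small sketch of my own, using runtime type checks:

```csharp
using System.Collections.Generic;

public class Animal { }
public class Giraffe : Animal { }

public static class InputPositionDemo
{
    public static bool[] Run()
    {
        // Held as object so that both checks are evaluated at runtime.
        object giraffes = new List<Giraffe>();

        return new[]
        {
            giraffes is IEnumerable<Animal>,   // output only: the conversion is allowed
            giraffes is IList<Animal>          // has Add(T): someone could add a Tiger, so it is refused
        };
    }
}
```

Run() returns true for IEnumerable<Animal> and false for IList<Animal>.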
The same in reverse
Now here's a similarly invalid code block, this time using the in keyword:
So when a type is marked as out, it's out only. And when a type is marked as in, it's in only too! A type parameter can't be both in and out.
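For a legal use of an in-only type, IComparer<in T> from the framework is a convenient example. The Weight field and the ByWeight class below are made up for this sketch.

```csharp
using System.Collections.Generic;

public class Animal { public int Weight; }
public class Giraffe : Animal { }

// Compares any two animals; T only ever goes in.
public class ByWeight : IComparer<Animal>
{
    public int Compare(Animal x, Animal y) { return x.Weight.CompareTo(y.Weight); }
}

public static class ComparerDemo
{
    public static int Run()
    {
        var giraffes = new List<Giraffe>
        {
            new Giraffe { Weight = 900 },
            new Giraffe { Weight = 700 }
        };

        // IComparer<in T> is contravariant, so an Animal comparer serves as a Giraffe comparer.
        IComparer<Giraffe> comparer = new ByWeight();
        giraffes.Sort(comparer);

        return giraffes[0].Weight;
    }
}
```

Run() returns 700, the weight of the lighter giraffe after sorting. The comparer only ever receives giraffes and never hands one back, which is why the conversion is safe.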
How to read it
So when you read an out type in an interface, read it this way:
And for an in type in an interface:
If it helps, try reading those again - but this time, with the out T interfaces read T as Animal and <=T as Giraffe.
And with the in T interfaces read T as Giraffe and >=T as Animal.
Or, more concisely
Here's out again more concisely:
And for in:
I hope that helps!
The payoff
As Eric points out, the only way to make the above example of FeedAnimals work in C#3 is to use a 'silly and expensive casting operation':
He goes on:
'This explicit typing should not be necessary. Unlike arrays (which are read-write) it is perfectly typesafe to treat a read-only list of giraffes as a list of animals'
And the example which Eric suggests hypothetically in that post, Matt Hidinger later demonstrates for us using C#4!
The full series
That's about as much as I want to write on the subject!
Below are links to the full series. Bear in mind that these explanatory posts were written prior to the release of C# 4.0. But they are still an excellent programmer's introduction, with much more info than I have covered in this post: