C#: Nullable Reference Types (aka How The Fuck Was This Working Before?)

Tags: c#, dotnet, swearing

Reference types 101

Firstly, let's make sure we're all on the same page of what a reference type even is.

C# has two kinds of types: Reference types and Value types.

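In short, reference types are classes (plus interfaces, delegates and arrays). Every type in C# ultimately derives from System.Object, value types included, so inheritance alone does not tell you which kind you have.

Value types, on the other hand, are structs and enums. int, bool and DateTime are some common value types.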

C# lets you assign null to variables of reference types. It does not allow you to assign null to variables of value types. Go ahead and try it:

int a = 0; // works fine!
int b = null; // CS0037 Cannot convert null to 'int' because it is a non-nullable value type

object x = new Object(); // works fine!
object y = null; // also works fine!

The reason for this is that a variable of a reference type is just a thing that points to where the (instance of the) data is stored, while a variable of a value type holds the data itself.

Without going into too much detail, this is kind of how C#, or rather the underlying runtime, can manage memory allocations for us.
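
To make the difference concrete, here is a minimal sketch of what happens when you copy each kind of variable. PointStruct and PointClass are made-up types for illustration only; they are identical apart from the struct/class keyword.

```csharp
using System;

// Copying a value type copies the data itself.
PointStruct s1 = new PointStruct { X = 1 };
PointStruct s2 = s1;      // s2 is an independent copy
s2.X = 99;
Console.WriteLine(s1.X);  // 1: the original is untouched

// Copying a reference type copies the reference, not the data.
PointClass c1 = new PointClass { X = 1 };
PointClass c2 = c1;       // c1 and c2 point at the same instance
c2.X = 99;
Console.WriteLine(c1.X);  // 99: both variables see the change

// Hypothetical types, identical apart from struct vs class.
struct PointStruct { public int X; }
class PointClass { public int X; }
```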

But reference types are already nullable?

That's what I thought when nullable reference types were announced. What on earth do you mean I can enable nulls? Literally a few sentences ago I said that C# lets you assign null to reference types. What gives?

It turns out that nullable reference types are actually a compiler analysis feature, which lets the compiler determine, and subsequently tell you, whether you've done your due diligence of checking for nulls.

Personally, I think it's a poorly named feature. In fact, the concept of null is pretty crappy, especially compared to the Option type of F# or the Option enum of Rust. But that's a story for another day.

Let's look at an example

Consider the following code snippet:

Person p = new Person();

Console.WriteLine(p.FirstName.ToUpper());

class Person
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
}

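Assuming you're using .NET 6 or later and nullable reference types are switched off (careful: the newer project templates switch them on by default), this code will compile with zero warnings or errors. And yet, when you actually run the program, it will explode with a NullReferenceException.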

This is not news. We didn't assign the FirstName property, so it has its default value of null (as all reference types do).

Now let's enable nullable reference types and see what happens with the same code:

#nullable enable
Person p = new Person();

Console.WriteLine(p.FirstName.ToUpper());

class Person
{
    public string FirstName { get; set; } // Non-nullable property 'FirstName' must contain a non-null value when exiting constructor. Consider declaring the property as nullable.
    public string LastName { get; set; } // Non-nullable property 'LastName' must contain a non-null value when exiting constructor. Consider declaring the property as nullable.
}

We have some new yellow squigglies in our editor which means the compiler is warning us about something. In fact, when you enable nullable reference types, you're going to see this warning a lot. Let's unpack it:

Non-nullable property 'FirstName' must contain a non-null value when exiting constructor. Consider declaring the property as nullable.

Firstly, we are told which property has the problem, and we are told it is non-nullable. Strings are reference types, and that means we are allowed to assign null to them, and their default value is null. Yet the compiler says non-nullable. This is what turning on nullable reference types does: reference types are now implicitly non-nullable.

Secondly, we are told the property must contain a non-null value when exiting the constructor.

What the hell? We didn't write a constructor, so why are we being told about the constructor?

Well, it's because the compiler creates a default constructor behind the scenes for any class that doesn't declare one, since constructors are a necessary part of C# plumbing. Without a constructor, you could not use new. When you define a constructor yourself, the compiler uses yours instead of making one for you.

This part of the warning is simply telling us that by the time whichever constructor exits, the property value must not be null.

Finally, the last part: consider declaring the property as nullable. This is the escape hatch: if this property can possibly contain null, then we should indicate that by declaring it as nullable. We can do that by putting a ? after the type, for example, string?.

Let's see what happens if we silence these warnings by declaring both properties as nullable.

#nullable enable
Person p = new Person();

Console.WriteLine(p.FirstName.ToUpper()); // CS8602: Dereference of a possibly null reference.

class Person
{
    public string? FirstName { get; set; }
    public string? LastName { get; set; }
}

The yellow squigglies we had before are gone, but we have a new one where we access FirstName. Let's take a closer look at the warning:

Dereference of a possibly null reference.

What's a "dereference"? Think about the name reference types. Remember that a reference type is literally a thing that refers to some data. Calling .ToUpper() on a thing that refers to something else does not make a lot of sense; what we actually want to do is call that method on the thing that the reference is referring to. The act of accessing the data that a reference refers to is called dereferencing.

Now think about what would happen if we tried to find out what null is referring to. It makes no sense. Null does not refer to anything. And this is the source of a NullReferenceException.

So basically, the compiler has figured out that since FirstName has been marked as nullable, it is possible that FirstName might be null. It's telling us that we might have a bug.

And it is right, in this case, because when we run the program, it explodes with a NullReferenceException. Good job compiler!

Now, how do we fix the bug? One easy solution is to check for null:

#nullable enable
Person p = new Person();

if (p.FirstName != null)
{
    Console.WriteLine(p.FirstName.ToUpper());
}

class Person
{
    public string? FirstName { get; set; }
    public string? LastName { get; set; }
}

Now when we compile this, the warning is gone. The compiler is smart enough to realise that we added a check that FirstName is not null, so it can reasonably guarantee that FirstName will not be null when we dereference it by calling .ToUpper().

And of course, we don't get an exception when we run this now.
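
An explicit if is not the only way to keep the compiler happy. As a rough sketch (same Person class as above, nothing else assumed), the null-conditional operator ?. and the null-coalescing operator ?? do the check inline:

```csharp
#nullable enable
using System;

Person p = new Person();

// ?. short-circuits to null instead of throwing when FirstName is null.
string? upper = p.FirstName?.ToUpper();

// ?? substitutes a fallback when the left-hand side is null.
string display = p.FirstName?.ToUpper() ?? "(no first name)";

Console.WriteLine(upper is null);  // True
Console.WriteLine(display);        // (no first name)

class Person
{
    public string? FirstName { get; set; }
    public string? LastName { get; set; }
}
```

Which one you reach for is mostly taste: the if block is handy when you want to skip a whole chunk of work, while ?. and ?? are handy when you just need a value to carry on with.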

So they're just... warnings?

Yup. Pretty much. At the moment.

You can read the proposal for this feature where it indicates that this is a very obvious breaking change that needs to be introduced over time. So for now, these are just warnings.

However, they're also not just warnings.

Enter System.Text.Json

For a really long time, pretty much everyone that has needed to do anything JSON related in .NET has used Json.NET, aka Newtonsoft.Json. And for good reason too: it's really simple to get up and running, so it became ubiquitous and has even been used by Microsoft themselves for multiple parts of the .NET ecosystem, including ASP.NET.

Then, in .NET Core 3.0, System.Text.Json was announced. For a multitude of reasons, Microsoft decided that they needed to stop using Json.NET, and provide their own implementation of JSON processing.

System.Text.Json has evolved since then, and in .NET 6, Microsoft introduced the ability to use source generators with System.Text.Json. The aim of this feature is to improve performance: usually you have to use runtime reflection to do JSON serialization and deserialization. While reflection is not the performance hog we were all warned about years ago, it does still have some consequences around memory allocation and startup time (at least these are the reasons cited by Microsoft in their post about the feature).
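
For the curious, this is roughly what opting in to the source generator looks like. Treat it as a sketch rather than the code behind this site: Person is borrowed from the earlier examples and PersonJsonContext is a name I made up. The [JsonSerializable] attribute and the JsonSerializerContext base class are the real System.Text.Json pieces.

```csharp
#nullable enable
using System;
using System.Text.Json;
using System.Text.Json.Serialization;

string json = "{\"FirstName\":\"Logan\",\"LastName\":\"Dam\"}";

// Passing the generated type info means the serializer does not need
// runtime reflection to discover Person's properties.
Person? p = JsonSerializer.Deserialize(json, PersonJsonContext.Default.Person);
Console.WriteLine(p?.FirstName); // Logan

public class Person
{
    public string? FirstName { get; set; }
    public string? LastName { get; set; }
}

// Hypothetical context class: the source generator fills in the rest
// of this partial class at compile time.
[JsonSerializable(typeof(Person))]
internal partial class PersonJsonContext : JsonSerializerContext
{
}
```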

I am using this feature in this website: the photos you see on this site are all retrieved from Adobe Creative Cloud's APIs, which are all JSON based, and I access them using this source generator feature. Performance is not really a concern for me, as I'm not doing the millions and billions of requests that would benefit from any performance gains. I opted for it just because I wanted to learn how to use it.

What does this have to do with nullable reference types?

Well, as I discovered today, a lot. Before I explain the link, I need to show you one more feature related to nullable reference types: the required modifier, which was added in C# 11 (the version that shipped with .NET 7). Documentation here.

Let's go back to our original example:

#nullable enable
Person p = new Person();

Console.WriteLine(p.FirstName.ToUpper());

class Person
{
    public string FirstName { get; set; } // Non-nullable property 'FirstName' must contain a non-null value when exiting constructor. Consider declaring the property as nullable.
    public string LastName { get; set; } // Non-nullable property 'LastName' must contain a non-null value when exiting constructor. Consider declaring the property as nullable.
}

There is another way to silence this warning: by marking the properties as required, like so:

#nullable enable
Person p = new Person();

Console.WriteLine(p.FirstName.ToUpper());

class Person
{
    public required string FirstName { get; set; }
    public required string LastName { get; set; }
}

By doing this, we are telling the compiler that for this class to be valid, the property must be initialized. Now we have a red squiggly, oh no! An error! Let's have a look at it:

error CS9035: Required member 'Person.FirstName' must be set in the object initializer or attribute constructor.
error CS9035: Required member 'Person.LastName' must be set in the object initializer or attribute constructor.

The compiler knows now that creating a Person without setting FirstName and LastName would be an error, because we told it that those properties are required. So it won't let us do that! We can fix this either by adding a constructor that sets the properties and is marked with the [SetsRequiredMembers] attribute (a plain constructor is not enough, the compiler would still demand the initializer), or by assigning them with an object initializer, like so:

Person p = new Person() { FirstName = "Logan", LastName = "Dam" };
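
For completeness, here is a sketch of the constructor route. The parameter names are my own choice; the important part is the [SetsRequiredMembers] attribute from System.Diagnostics.CodeAnalysis, which is a promise to the compiler that this constructor assigns every required member.

```csharp
#nullable enable
using System;
using System.Diagnostics.CodeAnalysis;

// The attributed constructor satisfies the required members,
// so no object initializer is needed here.
Person viaConstructor = new Person("Logan", "Dam");

// The object initializer route still works through the parameterless constructor.
Person viaInitializer = new Person { FirstName = "Logan", LastName = "Dam" };

Console.WriteLine(viaConstructor.FirstName.ToUpper()); // LOGAN

class Person
{
    public required string FirstName { get; set; }
    public required string LastName { get; set; }

    public Person() { }

    [SetsRequiredMembers]
    public Person(string firstName, string lastName)
    {
        FirstName = firstName;
        LastName = lastName;
    }
}
```

Bear in mind the compiler takes the attribute on trust: if the constructor forgets to assign one of the properties, nothing at the call site will complain about it.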

Are we there yet?

Finally, getting to the point. It turns out that System.Text.Json supports this too. Before .NET 7, System.Text.Json did not support the concept of required properties at all. Now it does so in a language-integrated way, and my first impression is that I'm quite a fan.
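
Here is a small sketch of that behaviour, assuming .NET 7 or later and reusing the Person class from earlier (my real POCOs look nothing like this). When a required property is missing from the JSON, deserialization throws a JsonException instead of quietly leaving it null:

```csharp
#nullable enable
using System;
using System.Text.Json;

// Both required properties are present, so this deserializes fine.
Person? ok = JsonSerializer.Deserialize<Person>(
    "{\"FirstName\":\"Logan\",\"LastName\":\"Dam\"}");
Console.WriteLine(ok?.FirstName); // Logan

try
{
    // LastName is missing from this payload.
    JsonSerializer.Deserialize<Person>("{\"FirstName\":\"Logan\"}");
}
catch (JsonException ex)
{
    // The message names the type and the missing properties.
    Console.WriteLine(ex.Message);
}

public class Person
{
    public required string FirstName { get; set; }
    public required string LastName { get; set; }
}
```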

When I started enabling nullable reference types in my project, I actually wasn't expecting much, since I am usually pretty good at remembering that nulls can happen, and my code base is small enough that any issues were not likely to be showstoppers. I started out with all my logic, since there wasn't too much of it. All went well.

Then I turned it on in the JSON POCOs.

I expected some dragons there. I'm working with a publicly documented API, but at the time I built the thing, Adobe's API specification was (and is still) invalid. So auto generating a client was out of the question, and so I built one myself.

It would appear that I made a series of mistakes while doing that, and enabling nullable reference types actually shone some light upon the issue.

Although the last time I had touched the code was over 3 years ago, I generally have a pretty good memory of how stuff works, especially stuff I built, and so I delved in and sprinkled in nullable annotations where I thought they made sense, and I also sprinkled in some required keywords where I thought they made sense.

Surely an Id property is required, right? 🤡

Well, as I found out, not always, especially if your code makes wrong assumptions.

(Rightfully deserved) Explosions

Exception: System.Text.Json.JsonException: JSON deserialization for type 'ldam.co.za.lib.Lightroom.Asset' was missing required properties, including the following: id, type, subtype, updated, created, links, payload

First, can we all just take a moment to appreciate how lovely this error message is? It's telling me exactly what is missing (although the use of the word including hints that it might not be that simple), and it's telling me exactly what type it was trying to deserialize. And isn't it awesome that this is enforced at compile time and at runtime?

Second, what the fuck? I've been using this code for 3 years. Have I just never accessed any of these properties? Actually, I'm certain I've used the links one, because I use that property to navigate the API's paging.

And now you're telling me it's missing?

Turns out, I messed up the structure of the API response for getting the assets within an album. But not enough for it to matter, apparently, because my code has been happily running for the past 3 years without issue.

But I wouldn't have known there was an issue there at all had I not gone and experimented with nullable reference types.

Moral of the story

If you've made it this far, congratulations, go treat yourself to a milkshake, because I've been taking a long and winding route to get here and you must be thirsty by now.

I could talk about how good type safety is (naturally I am a big fan of TypeScript) because it's an extra tool in your belt that helps you avoid surprises. And that much is true: it helps me prove ahead of time that my code is "safe".

But that is another topic for another day, and instead, I want to say this:

Give nullables a shot.

You can turn them on and off as you please. Stick a #nullable enable at the top of a file, and boom, you get extra null checks. Don't want to deal with them anymore? #nullable disable. Or you can go the nuclear option like I did and throw <Nullable>enable</Nullable> in your project file and deal with the fallout all at once.
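
If you go nuclear, the project file ends up looking something like the sketch below. The target framework is just an example, and the WarningsAsErrors line is an optional extra of mine for when you want nullable warnings to actually break the build rather than sit there being yellow.

```xml
<Project Sdk="Microsoft.NET.Sdk">
  <PropertyGroup>
    <OutputType>Exe</OutputType>
    <TargetFramework>net7.0</TargetFramework>
    <!-- Turns nullable reference types on for every file in the project -->
    <Nullable>enable</Nullable>
    <!-- Optional: promote all nullable warnings to errors -->
    <WarningsAsErrors>nullable</WarningsAsErrors>
  </PropertyGroup>
</Project>
```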

You might just find yourself exclaiming "how the fuck is this working?" and that in and of itself is a fantastic exercise to a) learn more and b) find sneaky bugs. Also swearing is fun.