Getting started with ML.NET in Jupyter Notebooks

For many years .NET developers have been building classic console, desktop or web applications through a stop-modify-recompile-restart cycle. In machine learning and other data-centric tasks this cycle creates so much overhead an delay that it simply doesn’t make sense. Data scientists involved in data analysis, data preparation, and model training prefer a fully interactive environment that allows mixing content, live source code, and program output in a single (web) page that gives immediate feedback when the data or the code changes.

The Jupyter ecosystem provides such an interactive environment, and there’s good news for .NET developers. The list of language kernels supported by Jupyter -including Python, R, Julia, Matlab and many others- has been extended with .NET Core.

The Jupyter Notebook app enables us today to run on-premise interactive machine learning scenarios with ML.NET using C# or F# in a web browser, without bringing software or hardware costs. In this article we try to get you started with developing and running C# machine learning scenarios in Jupyter notebooks. We’re going to assume that you already know the basics of ML.NET.

Installation

Download

There are many ways to install the environment, but you’ll always need these two ingredients:

  • the Jupiter Notebook (a Python program), and
  • dotnet interactive (formerly known as dotnet try), an version of the .NET Core runtime that allows you to run C# and F# as a scripting language.

Here’s probably the easiest way get operational – although leaner procedures may exist:

  1. Install the latest Anaconda:
    Anaconda
  2. Install the latest .NET Core SDK:
    NetSdk

Configure

On the command line, type the following to install dotnet interactive as a global tool:

dotnet tool install -g dotnet-try

Open the Anaconda Prompt and run the following command to register the .NET kernel in Jupyter:

dotnet try jupyter install

Validate

To verify the installation:

  • open the Anaconda3 menu from the start menu,
  • start the Jupyter Notebook app, and
  • press the ‘new’ button in the top right corner to create a new Notebook.

If .NET C# and .NET F# appear in the list of kernels, then you’re ready to go:

NetKernels

You find the most recent installation guide right here.

First steps

Fire it up

Starting a Notebook spawns a language kernel process with two clients

  • a browser with the IDE to edit and run notebooks, and
  • a terminal window displaying the logs of the associated kernel.

Here’ what to expect on your screen:

NotebookAndServer

Hello World

The browser IDE allows you to maintain a list of so-called cells. Each of these that can host documentation (markup or markdown) or source code (in the language of the kernel you selected).

Jupyter Notebooks code cells accept regular C#. Each cell can be individually ran, while the kernel keeps track of the state.

It’s safe to assume that the canonical “Hello World” would look like this:
HelloWorld1

The C# kernel hosts extra functions to control the output of cells, such as the ‘display()’ function:
HelloWorld2

Display() is more than just syntactic sugar for Console.WriteLine(). It can also render HTML, SVG, and Charts. There’s more info on this function right here.

Jupyter Notebook runs a Read-Eval-Print Loop (a.k.a. REPL). Instructions are executed, and expressions are evaluated against the current state that is maintained in and by the kernel. So instead of an instruction to print a string, you can simply type “Hello World” in a cell (without a semicolon). It will we treated as an expression:
HelloWorld3

Loading NuGet packages

Code cells in Jupyter Notebooks can host instructions, expressions, class definitions and functions, but you can also load dll’s from NuGet packages. The #r instruction loads a NuGet package into the kernel:

NuGet

 

#r and using statements do not have to be written at the top of the document – you can just add them whenever you need them in the notebook.

Doing diagrams

One of the NuGet packages that you definitely want to use is XPlot, a data visualization framework written in F#. It allows you create a huge number of chart types delegating the rendering to well-known open source graphing libraries such as Plotly and Google Charts.

Here’s how easy it is to define and render a chart in a C# Jupyter Notebook:

XPlotSample

You’ll find many more examples here.

Running a Canonical Machine Learning Scenario

Over the last couple of months we have been busy creating an elaborated set of machine learning scenario’s in a UWP sample app. In the last couple of weeks we managed to migrate most of these to the Jupyter Notebook, and added some new ones.

Let’s run through one of the new samples. It predicts the quality of white wine based on 11 physicochemical characteristics. The problem is solved as a regression – in our UWP sample app we solved it as a binary and a multiclass classification.

The sample runs a representative scenario of

  • reading the raw data,
  • preparing the data,
  • training the model,
  • calculating and visualizing the quality metrics,
  • calculating and visualizing the feature contributions, and finally
  • predicting.

In this article we won’t dive into the ML.NET details. Here’s how the full scenario looks like in a Jupyter Notebook. :

Regression1

(some helpers were omitted here)Regression2

Regression3

Regression4

Regression5

Regression6

Here’s how the full web page with source and results rendering looks like.

The Jupyter Notebook provides a much more productive environment to create such a scenario than classic .NET application development does. Let’s dive into some of the reasons and take a deeper look into some Jupyter features that make the ML.NET-Jupyter combo attractive to data scientists.

Let’s focus on Data Analysis

Interactive Diagrams

Data analysis requires many types of diagrams, and Jupyter notebooks makes it easy to define and modify these.

Here’s how to import the XPlot NuGet package and render a simple interactive boxplot chart. The rendered diagram highlights the details of the element under the mouse:

InteractiveDiagram

We created a more elaborated example of boxplot analysis right here. Here’s how the resulting diagram looks like:

ComplexBoxPlot

As part of Principal Component Analysis data scientists may want to draw a heat map with the correlation between different features. Here’s how easy it is to create such a chart:

HeatMap

We created a more elaborated example on the well-know Titanic data set right here. This is the resulting diagram:

TitanicHeatMap

These diagrams and many more come out-of-the-box with XPlot.

Object Formatters

Just as developers, data scientists spend most of their time debugging. They continuously need detailed feedback on the work in progress. We already encountered the new display() function that prints the value of an expression. Jupyter notebooks allow you to override the HTML that is printed for a specific class by registering an ObjectFormatter – something that sits between decorating a class with the DebuggerDisplay attribute and writing a custom debugger visualizer.

Let’s write a small –but very useful- example. Here’s how an instance of a ML.NET ConfusionMatrix is displayed by default:
ConfusionMatrix1

That’s pretty confusing, right? [Yes: pun intended!] Let’s fix this.

The few ObjectFormatter examples that we already encountered, were spawning IHtmlContent instances (from ASP.NET Core) that were created and styled through the so-called PocketView API – for which there is no documentation yet. Here’s how to fetch the list of HTML tags that you can create with it, and a small sample on how to apply a style to them:

var pocketViewTagMethods = typeof(PocketViewTags)
    .GetProperties()
    .Select(m => m.Name);
display(pocketViewTagMethods);

var pocketView = table[style: "width: 100%"](tr(td[style:"border: 1px solid black"]("Hello!")));
display(pocketView);

Here’s the result:

PocketView

Here’s how to register a formatter that nicely displays a table with the look-and-feel that a data analyst expects for a (binary!) confusion matrix:

Formatter<ConfusionMatrix>.Register((df, writer) =>
{
    var rows = new List<IHtmlContent>();

    var cells = new List<IHtmlContent>();
    var n = df.Counts[0][0] + df.Counts[0][1] + df.Counts[1][0] + df.Counts[1][1];
    cells.Add(td[rowspan: 2, colspan: 2, style: "text-align: center; background-color: transparent"]("n = " + n));
    cells.Add(td[colspan: 2, style: "border: 1px solid black; text-align: center; padding: 24px; background-color: lightsteelblue"](b("Predicted")));
    rows.Add(tr[style: "background-color: transparent"](cells));
    
    cells = new List<IHtmlContent>();
    cells.Add(td[style:"border: 1px solid black; padding: 24px; background-color: #E3EAF3"](b("True")));
    cells.Add(td[style:"border: 1px solid black; padding: 24px; background-color: #E3EAF3"](b("False")));
    rows.Add(tr[style: "background-color: transparent"](cells));
    
    cells = new List<IHtmlContent>();
    cells.Add(td[rowspan: 2, style:"border: 1px solid black; text-align: center; padding: 24px;  background-color: lightsteelblue"](b("Actual")));
    cells.Add(td[style:"border: 1px solid black; text-align: center; padding: 24px; background-color: #E3EAF3"](b("True")));    
    cells.Add(td[style:"border: 1px solid black; padding: 24px"](df.Counts[0][0]));
    cells.Add(td[style:"border: 1px solid black; padding: 24px"](df.Counts[0][1]));
    rows.Add(tr[style: "background-color: transparent"](cells));
    
    cells = new List<IHtmlContent>();
    cells.Add(td[style:"border: 1px solid black; text-align: center; padding: 24px; background-color: #E3EAF3"](b("False")));
    cells.Add(td[style:"border: 1px solid black; padding: 24px"](df.Counts[1][0]));
    cells.Add(td[style:"border: 1px solid black; padding: 24px"](df.Counts[1][1]));
    rows.Add(tr(cells));

    var t = table(
        tbody(
            rows));

    writer.Write(t);
}, "text/html");

Here’s how a confusion matrix now looks like – much more intuitive:
ConfusionMatrix2

DataFrame

It’s not always easy to prepare the data for a machine learning pipeline or a diagram, and this is where the new DataFrame API comes in. DataFrame allows you to manipulate tabular in-memory data in a spreadsheet way: you can select, add, and/or filter rows and columns, apply formulas and so on. Here’s how to pull in the NuGet package, and add a custom formatter for the base class. DataFrame is currently only in version 0.2 so you may expect some changes. You may also expect the object formatters to be embedded in future releases:

 

DataFrame1

The DataFrame class knows how to read input data from a CSV:

DataFrame2

In our binary classification sample we use some of the DataFrame methods to replace the “Quality” column holding the taster’s evaluation score (a number from 1 to 10) by a “Label” column with the Boolean indicating whether the wine is good or not (i.e. the score was 6 or higher). Here’s how amazingly easy this is:

var labelCol = trainingData["Quality"].ElementwiseGreaterThanOrEqual(6);
labelCol.SetName("Label");
trainingData.Columns.Add(labelCol);
trainingData.Columns.Remove(trainingData["Quality"]);

Here’s the result (compare the last column with the one in the previous screenshot):

DataFrame3

Since there’s no documentation available yet, we had to dig into the C# source code.

DataFrame is a very promising new .NET API for tabular data manipulation, useful in machine learning and other scenarios.

Sharing is Caring

The source code of our Jupyter notebooks lives here on GitHub (if you like it then you should have put a Star on it). The easiest way to explore the code and the rendered results is via nbViewer.

We also invite you to take a look at these other ML.NET-Jupyter samples from Microsoft and from fellow MVP Alexander Slotte.

Enjoy!

Introducing WinUI ItemsRepeater and Friends

In this article we’ll build a fluent NetFlix-inspired single-page UWP app on top of iTunes Movie Trailers data. The app uses some of the newer WinUI controls such as ItemsRepeater, TeachingTip, CommandBarFlyout, as well as classic XAML elements like MediaPlayerElement, acrylic brushes and animation. 

The app allows you to scroll horizontally and vertically through a list of movies per genre, select a movie, and play its trailer. The app is fully functional in touch, mouse, and keyboard mode. Here’s how it looks like – we decided to call it XamlFlix (the runners-up names were .NET Flix and WinUI tunes):

WinUISample

In a previous blog post we described WinUI as the future of XAML development.  For more details, take a look at this great session on “Windows App Development Roadmap: Making Sense of WinUI, UWP, Win32, .NET” from the Ignite event. For a summary of this session, check Paul Thurrot’s article from which we borrowed the following illustration:

win10-app-platform

Getting the data

Our sample app starts with getting the most recent iTunes Movie Trailers content. It is exposed as a public XML document that looks like this:

iTunesXml

We defined a class to represent a movie, with its title and the whereabouts of its poster image and QuickTime trailer:

public class Movie
{
    public string Title { get; set; }

    public string PosterUrl { get; set; }

    public string TrailerUrl { get; set; }
}

For convenience, movies are grouped by genre. Here’s the corresponding class:

public class Genre
{
    public string Name { get; set; }

    public List<Movie> Movies { get; set; }
}

Using HttpClient.GetStringAsyc() we fetch the XML from the internet. Then we Parse() it into an XDocument. First we get the genre names through a fancy XPATH expression using XPathSelectElements():

using (var client = new HttpClient())
{
    xml = await client.GetStringAsync("http://trailers.apple.com/trailers/home/xml/current.xml");
}

var movies = XDocument.Parse(xml);

var genreNames = movies.XPathSelectElements("//genre/name")
                .Select(m => m.Value)
                .OrderBy(m => m)
                .Distinct()
                .ToList();

Then we use some more of this XPATH and LINQ magic (Eat my shorts, JSON!) to query the movies per genre –a movie may appear in more than one genre- and immediately populate all the element collections:

foreach (var genreName in genreNames)
{
    _genres.Add(new Genre()
    {
        Name = genreName,
        Movies = movies.XPathSelectElements("//genre[name='" + genreName + "']")
            .Ancestors("movieinfo")
            .Select(m => new Movie()
            {
                Title = m.XPathSelectElement("info/title").Value,
                PosterUrl = m.XPathSelectElement("poster/xlarge").Value,
                TrailerUrl = m.XPathSelectElement("preview/large").Value
            })
            //.OrderBy(m => m.Title)
            .ToList()
    });
}

Here’s the DataTemplate that represents a Movie in the app. It’s just an image that we made clickable and focusable by placing it inside a button:

        <DataTemplate x:Key="MovieTemplate"
                      x:DataType="local:Movie">
            <Button Click="Movie_Click"
                    DataContext="{x:Bind}"
                    BorderThickness="0"
                    Padding="0"
                    CornerRadius="0">
                <Image Source="{x:Bind PosterUrl}"
                       PointerEntered="Element_PointerEntered"
                       PointerExited="Element_PointerExited"
                       Height="360"
                       Stretch="UniformToFill"
                       HorizontalAlignment="Stretch"
                       VerticalAlignment="Stretch"
                       ToolTipService.ToolTip="{x:Bind Title}">
                </Image>
            </Button>
        </DataTemplate>

Adding some WinUI components

Let’s bring in some WinUI components to create the visual foundation of the app. These components live in the Microsoft.UI.Xaml NuGet package:

WinUINuGet

After adding the NuGet package, don’t forget to import its styles and other WinUI resources. App.xaml is the best place to do this:

<Application.Resources>
    <ResourceDictionary>
        <ResourceDictionary.MergedDictionaries>
            <XamlControlsResources xmlns="using:Microsoft.UI.Xaml.Controls" />
            <!-- Other merged dictionaries here -->
        </ResourceDictionary.MergedDictionaries>
        <!-- Other app resources here -->
        <x:Double x:Key="ContentDialogMaxWidth">800</x:Double>
    </ResourceDictionary>
</Application.Resources>

Here comes ItemsRepeater

XamlFlix is built around collections: it displays a collection of genres that each display a collection of movies. ItemsRepeater is an ideal host for this: it’s a WinUI element that is designed to be used inside custom controls that display collections. It does not come with a default UI and it provides no policy around focus, selection, or user interaction. It supports virtualization so it allows you to deal with very large collections out of the box.

The main beef of the app’s UI are ItemsRepeater controls: there’s one that repeats the genres vertically, and genre comes a horizontal repeater on the movies.

We started with defining two  reusable StackLayout resources: 

<controls:StackLayout x:Key="HorizontalStackLayout"
                        Orientation="Horizontal" />
<controls:StackLayout x:Key="VerticalStackLayout"
                        Orientation="Vertical"
                        Spacing="0" />

An ItemsRepeater is a data-driven panel that does not come with its own scrolling infrastructure, so you may need to wrap it in a ScrollViewer. Here’s the declaration of the repeater for the Genre instances:

<ScrollViewer VerticalScrollBarVisibility="Auto"
                VerticalScrollMode="Auto"
                HorizontalScrollMode="Disabled"
                Grid.Row="2">
    <controls:ItemsRepeater x:Name="GenreRepeater"
                            ItemTemplate="{StaticResource GenreTemplate}"
                            Layout="{StaticResource VerticalStackLayout}"
                            HorizontalAlignment="Stretch"
                            VerticalAlignment="Stretch" />
</ScrollViewer>

Each genre has its name displayed on top of a horizontal list of Movie instances – again implemented as an ItemsRepeater inside a ScrollViewer:

<DataTemplate x:Key="GenreTemplate"
                x:DataType="local:Genre">
    <Grid>
        <Grid.RowDefinitions>
            <RowDefinition Height="auto" />
            <RowDefinition Height="*" />
        </Grid.RowDefinitions>
        <Image VerticalAlignment="Stretch"
                HorizontalAlignment="Left"
                Margin="6 0 0 0"
                Width="52"
                Source="/Assets/FilmStrip.png"
                Stretch="Fill"
                Grid.RowSpan="2" />
        <TextBlock Foreground="Silver"
                    FontSize="36"
                    Margin="66 0 0 0"
                    Text="{x:Bind Name}" />
        <ScrollViewer HorizontalScrollBarVisibility="Visible"
                        HorizontalScrollMode="Enabled"
                        VerticalScrollMode="Disabled"
                        Margin="66 0 0 0"
                        Grid.Row="1">
            <controls:ItemsRepeater ItemsSource="{x:Bind Movies}"
                                    ItemTemplate="{StaticResource MovieTemplate}"
                                    Layout="{StaticResource HorizontalStackLayout}" />
        </ScrollViewer>
    </Grid>
</DataTemplate>

After the content was loaded, we set its items source programmatically:

GenreRepeater.ItemsSource = Genres

The page then looks like this:

ItemsRepeater

Let’s move our focus to the behavior now.

On a touch screen the page reacts appropriately to horizontal and vertical panning. Mouse scrolling is a bit problematic: the movie repeaters take almost all the screen area and hence they get the mouse input and only trigger horizontal scrolling. So we decided to place a film strip on the left. It creates an area that facilitates vertical scrolling through the genres.

Hello TeachingTip

The mouse behavior is not very intuitive for first-time users, so we added a TeachingTip –a WinUI Flyout with an arrow- to draw the user’s attention to the film strip on the left.

Here’s its XAML declaration:

<controls:TeachingTip x:Name="ScrollTeachingTip"
                        Target="{x:Bind FakeTarget}"
                        Title="I looks like you are trying to scroll."
                        Subtitle="With the mouse on this filmstrip you'll scroll vertically."
                        PreferredPlacement="Right"
                        IsOpen="False">
    <controls:TeachingTip.IconSource>
        <controls:SymbolIconSource Symbol="Sort" />
    </controls:TeachingTip.IconSource>
</controls:TeachingTip>

The TeachingTip needs to point to the film strip as its a Target. While visually the strip looks like a whole, it’s actually composed of multiple images: it’s repeated per genre. That makes it hard to use as target for the teaching tip – at least declaratively.

We provided an alternative Target by placing a UI-less control center left of the page:

<ContentControl x:Name="FakeTarget"
                VerticalAlignment="Center"
                HorizontalAlignment="Left"
                Width="40"
                Grid.Row="2" />

Here’s the result, the teaching tip points right to the middle of the film strip:

TeachingTip

We couldn’t resist creating a version with an image of Clippy as HeroContent:

TeachingTipClippy

It doesn’t make sense to pop-up the tip while the movie images are still loading, so we delayed its appearance:

await Task.Delay(2000);
ScrollTeachingTip.IsOpen = true;

We covered both touch and mouse input for navigation. Let’s now take a look at keyboard input.

To our great surprise we observed that the page already behaved properly on keyboard input. Here’s how navigation with the arrow keys looks like – not bad for a lightweight control that does not support selection. The visual border around the movie is the focus rectangle of the button in the Movie data template:

ArrowNavigation

To highlight the movie under the mouse cursor, our first attempt was a ToolTip with the title. It’s hardly visible and it feels like HTML. Here’s an example (hint: it’s on Spice in disguise):

Tooltip

We went for a more fluent experience. When digging through the awesome XAML Controls Gallery we found this nice sample that animates the size of a button on hover (don’t try it out here, it’s just a screenshot):

XamlControlsGalleryAnimation

We decided to subtly grow and shrink the image under the mouse cursor on hovering. It only required a straight copy/paste from the gallery sample code. The code uses the Compositor to hook a SpringVector3NaturalMotionAnimation to the scale of the image. We increase it by 2% on PointerEntered and then shrink back on PointerExited:

private Compositor _compositor = Window.Current.Compositor;
private SpringVector3NaturalMotionAnimation _springAnimation;

private void CreateOrUpdateSpringAnimation(float finalValue)
{
    if (_springAnimation == null)
    {
        _springAnimation = _compositor.CreateSpringVector3Animation();
        _springAnimation.Target = "Scale";
    }

    _springAnimation.FinalValue = new Vector3(finalValue);
}

private void Element_PointerEntered(object sender, PointerRoutedEventArgs e)
{
    // Scale up a little.
    CreateOrUpdateSpringAnimation(1.02f);

    (sender as UIElement).StartAnimation(_springAnimation);
}

private void Element_PointerExited(object sender, PointerRoutedEventArgs e)
{
    // Scale back down.
    CreateOrUpdateSpringAnimation(1.0f);

    (sender as UIElement).StartAnimation(_springAnimation);
}

Here’s how the result looks like – the real animation is a lot smoother that the animated gif suggests:

ScaleAnimation

That looks decent, no? Let’s add some more functionality now.

Introducing CommandBarFlyout

The CommandBarFlyout is another WinUI control that lives up to its name: it is literally a CommandBar in a Flyout. It groups AppBarButton instances in primary and secondary commands that are applicable to a specific UI element.

XamlFlix uses a CommandBarFlyout to display the possible actions for the selected movie: play, buy, rate, …. For the sake of simplicity we hooked them all to the same event handler: whatever menu you select, you always get to play the movie trailer.

Here’s the declaration of the movie menu:

<controls:CommandBarFlyout x:Name="MovieCommands"
                            Placement="Right">
    <AppBarButton Label="Play"
                    Icon="Play"
                    ToolTipService.ToolTip="Play"
                    Click="Element_Click" />
    <AppBarButton Label="Info"
                    Icon="List"
                    ToolTipService.ToolTip="Info"
                    Click="Element_Click" />
    <AppBarButton Label="Download"
                    Icon="Download"
                    ToolTipService.ToolTip="Download"
                    Click="Element_Click" />
    <controls:CommandBarFlyout.SecondaryCommands>
        <AppBarButton Label="Buy"
                        Click="Element_Click" />
        <AppBarButton Label="Rate"
                        Click="Element_Click" />
    </controls:CommandBarFlyout.SecondaryCommands>
</controls:CommandBarFlyout>

The control comes with several FlyoutShowOptions to configure position, placement, and behavior. Here we define it to appear on top of the image, in an expanded state and grabbing the focus. The ShowAt() method is called when a button in a movie data template is clicked. It opens the menu for the targeted UI element:

private void Movie_Click(object sender, RoutedEventArgs e)
{
    FlyoutShowOptions options = new FlyoutShowOptions();
    options.ShowMode = FlyoutShowMode.Standard;
    options.Placement = FlyoutPlacementMode.Top;

    MovieCommands.ShowAt(sender as FrameworkElement, options);
}

This is how the menu looks like in XamlFlix (on top of Top Gun):

CommandBarFlyout

Here’s MediaPlayerElement

For playing the movie trailer there are not too much options. We went for the classic UWP MediaPlayerElement and placed it in a ContentDialog:

<ContentDialog x:Name="MediaPlayerDialog"
                Closing="MediaPlayerDialog_Closing"
                CloseButtonText="Close">
    <StackPanel>
        <TextBlock x:Name="TitleText"
                    Margin="0 0 0 20" />
        <MediaPlayerElement x:Name="Player"
                            AreTransportControlsEnabled="True"
                            MinWidth="600"
                            AutoPlay="True" />
    </StackPanel>
</ContentDialog>

The dialog needs more space than the maximum of 548 pixels that a popup normally gets in UWP, so you have to override ContentDialogMaxWidth in app.xaml:

<x:Double x:Key="ContentDialogMaxWidth">800</x:Double>

When an element of the command bar flyout is clicked, we open the dialog and create a MediaSource from the trailer’s URL. We need to Hide() the command bar flyout, since it lives in the same layer as the dialog:

private async void Element_Click(object sender, RoutedEventArgs e)
{
    // It stays on top of the dialog.
    MovieCommands.Hide();

    var movie = (sender as FrameworkElement)?.DataContext as Movie;
    var source = MediaSource.CreateFromUri(new Uri(movie.TrailerUrl));

    TitleText.Text = movie.Title;
    Player.Source = source;
    await MediaPlayerDialog.ShowAsync();
}

private void MediaPlayerDialog_Closing(ContentDialog sender, ContentDialogClosingEventArgs args)
{
    // Prevent the player to continue playing.
    Player.Source = null;
}

Here’s the resulting UI:

MediaPlayer

The styles of the current WinUI v2.2 do not apply to ContentDialog yet, that’s why the dialog itself and its buttons have no rounded corners. Don’t worry: this will change in WinUI v2.3.

Let’s sprinkle some Acrylic

These days, fluent apps need a touch of acrylic material. The XamlFlix UI area is almost entirely covered with movie poster images, so we decided to use an AcrylicBrush for the whole background:

<Page.Background>
    <AcrylicBrush BackgroundSource="HostBackdrop"
                    TintColor="{ThemeResource SystemColorBackgroundColor}"
                    TintOpacity="0.9"
                    FallbackColor="{ThemeResource ApplicationPageBackgroundThemeBrush}" />
</Page.Background>

We blended that background into the title bar:

private void ExtendAcrylicIntoTitleBar()
{
    CoreApplication.GetCurrentView().TitleBar.ExtendViewIntoTitleBar = true;
    ApplicationViewTitleBar titleBar = ApplicationView.GetForCurrentView().TitleBar;
    titleBar.ButtonBackgroundColor = Colors.Transparent;
    titleBar.ButtonInactiveBackgroundColor = Colors.Transparent;
}

In this setting Windows only draws the system buttons, and makes you responsible for displaying the app title. Here’s an appropriate (and reusable) window title declaration:

<TextBlock xmlns:appmodel="using:Windows.ApplicationModel"
            Text="{x:Bind appmodel:Package.Current.DisplayName}"
            Style="{StaticResource CaptionTextBlockStyle}"
            IsHitTestVisible="False"
            Margin="12 8 0 0" />

The Code

The XamlFlix sample app lives here on GitHub. For more WinUI samples also check the XAML Controls Gallery and the Windows Community Toolkit Sample App in the Store (sources are also on GitHub).

Building explainable Machine Learning models with ML.NET in UWP

In this article we’ll describe how to design and build explainable machine learning models with ML.NET. We used a UWP app to host these models, and OxyPlot to create the diagrams to visualize the importance of features for a model and/or a prediction.

We will discuss the following three tasks that you can execute with the ML.NET Explainability API:

  • calculate the feature weights for a linear regression model,
  • calculate the feature contributions for a prediction, and
  • calculate Permutation Feature Importance (PFI) for a model.

The two corresponding pages it the UWP sample app look like this:

PredictionFeatureContribution

PfiCalculation

In many Machine Learning scenarios it is not only important to have a model that is accurate enough, but also to have one that is interpretable. In Health Care or Finance, models should be transparent enough to explain why they made a particular prediction. Conclusions like “You are 75% healthy” or “Your loan application was not approved” may require a better excuse than “because the computer says so”. Model explainability –knowing the importance of its features- is not only useful in justifying predictions, but also in refining the model itself – through feature selection. Investigating explainability allows you to remove features that are not significant for a model, so that you probably end up with shorter training times and less resource intensive prediction making.

The code in this article is built around the 11-features white wine quality dataset that we already covered several times in this article series. We solved it as a binary classification problem and as a multiclass classification problem for AutoML. This time we finally approach it the correct way: as a regression problem where the outcome is a (continuous) numerical value – the score– within a range. We’ll use an Stochastic Dual Coordinate Ascent regression trainer as the core of the model pipeline. Let’s first build and train that model.

Here’s the class to store the input data:

public class FeatureContributionData
{
    [LoadColumn(0)]
    public float FixedAcidity;

    [LoadColumn(1)]
    public float VolatileAcidity;

    [LoadColumn(2)]
    public float CitricAcid;

    // More features ...

    [LoadColumn(9)]
    public float Sulphates;

    [LoadColumn(10)]
    public float Alcohol;

    [LoadColumn(11), ColumnName("Label")]
    public float Label;
}

As in any ML.NET scenario we need to instantiate an MLContext:

public MLContext MLContext { get; } = new MLContext(seed: null);

Here’s the code to build and train the model. We’ll refine it later in several ways to enhance its explainability:

private IEnumerable<FeatureContributionData> _trainData;
private IDataView _transformedData;
private ITransformer _transformationModel;
private RegressionPredictionTransformer<LinearRegressionModelParameters> _regressionModel;

public List<float> BuildAndTrain(string trainingDataPath)
{
    IEstimator<ITransformer> pipeline =
        MLContext.Transforms.ReplaceMissingValues(
            outputColumnName: "FixedAcidity",
            replacementMode: MissingValueReplacingEstimator.ReplacementMode.Mean)
        .Append(MLContext.Transforms.Concatenate("Features",
            new[]
            {
                "FixedAcidity",
                "VolatileAcidity",
                "CitricAcid",
                "ResidualSugar",
                "Chlorides",
                "FreeSulfurDioxide",
                "TotalSulfurDioxide",
                "Density",
                "Ph",
                "Sulphates",
                "Alcohol"}))
        .Append(MLContext.Transforms.NormalizeMeanVariance("Features"));

    var trainData = MLContext.Data.LoadFromTextFile<FeatureContributionData>(
            path: trainingDataPath,
            separatorChar: ';',
            hasHeader: true);

    // Keep the data avalailable.
    _trainData = MLContext.Data.CreateEnumerable<FeatureContributionData>(trainData, true);

    // Cache the data view in memory. For an iterative algorithm such as SDCA this makes a huge difference.
    trainData = MLContext.Data.Cache(trainData);

    _transformationModel = pipeline.Fit(trainData);

    // Prepare the data for the algorithm.
    _transformedData = _transformationModel.Transform(trainData);

    // Choose a regression algorithm.
    var algorithm = MLContext.Regression.Trainers.Sdca();

    // Train the model and score it on the transformed data.
    _regressionModel = algorithm.Fit(_transformedData);

    // ...
}

Which features are hot, and which are not

Feature Weights in linear models

Machine Learning has several techniques for calculating how important features are in explaining/justifying the prediction. When your main algorithm is a linear classifier (e.g. linear regression) then it’s relatively easy to calculate feature contributions. The prediction is the linear combination of the features values, weighted by the model coefficients. So at model level there’s already a notion of feature contribution. In ML.NET these Weights are found in the LinearModelParameters class. This is the base class for all linear model parameter classes, like LinearRegressionModelParameters (used by SDCA) and OlsModelParameters (used by the Ordinary Least Squares trainer).

For any linear model, you can fetch the overall feature weights like this:

// Return the weights.
return _regressionModel.Model.Weights.ToList();

Here’s how the weights for the wine quality model look like in a diagram:

ModelFeatureContribution

The alcohol feature seems to dominate this particular model. Its weight is positive: a higher alcohol percentage results in a higher appreciation score. We may have just scientifically proven that alcohol is important in wine quality perception. In other news there are also at least three characteristics that could be ignored in this model – and maybe up to seven.

Feature Contribution Calculation

A second component in ML.NET’s Explainability API is a calculator that computes the list of feature contributions for a specific prediction: the FeatureContributionCalculator. Just like the overall feature weights from the previous section, the calculated feature contributions can be positive or negative. The calculator works for a large number of regression, binary classification, and ranking algorithms. The documentation page on the FeatureContributionCalculatingEstimator class contains a list of all compatible trainers. It includes

  • all linear models – because they inherently come with feature weights,
  • all Generalized Additive Models (GAM) – because their squiggly, wiggly shape functions are created by combining linear models, and
  • all tree based models – because they can calculate feature importance based on the values in the decision paths in the tree(s).

Thanks to ML.NET’s modular approach it’s easy to plug a feature contribution calculator into a model pipeline, even if the model is already trained. Here’s how we did this in the sample app.

First we extended the prediction data structure (which originally had only the Score) with an array to hold the contribution values for each feature:

public class FeatureContributionPrediction : FeatureContributionData
{
    public float Score { get; set; }

    public float[] FeatureContributions { get; set; }
}

Using the CalculateFeatureContribution() method we created and estimator, and trained it on just one input sample to become a transformer. This adds the calculation to the pipeline and the feature contributions to the output schema. The transformer was then appended to the trained model.

Here’s how that looks like in C#:

private PredictionEngine<FeatureContributionData, FeatureContributionPrediction> _predictionEngine;

public void CreatePredictionModel()
{
    // Get one row of sample data.
    var regressionData = _regressionModel.Transform(MLContext.Data.TakeRows(_transformedData, 1));

    // Define a feature contribution calculator for all the features.
    // 'Train' it on the sample row.
    var featureContributionCalculator = MLContext.Transforms
        .CalculateFeatureContribution(_regressionModel, normalize: false)
        .Fit(regressionData);

    // Create the full transformer chain.
    var scoringPipeline = _transformationModel
        .Append(_regressionModel)
        .Append(featureContributionCalculator);

    // Create the prediction engine.
    _predictionEngine = MLContext.Model.CreatePredictionEngine<FeatureContributionData, FeatureContributionPrediction>(scoringPipeline);
}

For testing the model, we didn’t dare to bother you with an input form for 11 features. Instead we added a button that randomly fetches one of the almost 4000 training samples and calculates the score and feature contributions for that sample.

Here’s the code behind this button:

public FeatureContributionPrediction GetRandomPrediction()
{
    return _predictionEngine.Predict(_trainData.ElementAt(new Random().Next(3918)));
}

Here’s how the results look like in a plot. This is an example of a linear model, so we can compare the overall model weights with the feature contributions for the specific prediction. We did not normalize the results in the feature contribution calculator configuration (the second parameter in the call) to keep these in the same range as the model weights:

PredictionFeatureContribution

Feature Contribution Calculation also works for models that don’t come with overall weights. Here’s how the results look like for a LightGBM regression trainer – one of the decision tree based algorithms:

LightGbmFeatureContribution

Check the comments in the sample app source code for its details. Also in comments, is the code for using an Ordinary Least Squares linear regression trainer. This 18th century algorithm is one is even more biased towards alcohol than our initial SDCA trainer.

For yet another example, check this official sample that adds explainability to a model covering the classic Taxi Fare Prediction scenario.

Permutation Feature Importance

The last calculator in the ML.NET Explainability API is the most computationally expensive one. It calculates the Permutation Feature Importance (PFI). Here’s how PFI calculation works:

  1. A baseline model is trained and its main quality metrics (accuracy, R squared, …) are recorded.
  2. The values of one feature are shuffled or partly replaced by random values – to undermine the relationship between the feature and the score.
  3. The modified data set is passed to the model to get new predictions and new values for the quality metrics. The result is expected to be worse than the baseline. If your model got better on random data then there was definitely something wrong with it.
  4. The feature importance is calculated as the degradation of a selected quality metric versus the one in the baseline.
  5. Steps 2, 3, and 4 are repeated for each feature so that the respective degradations can be compared: the more degradation for a feature, the more the model depends on that feature.

In ML.NET you fire up this process with a call to the PermutationFeatureImportance() method. You need to provide the model, the baseline data set, and the number of permutations – i.e. the number of feature values to replace:

// ... (same as the previous sample)

// Prepare the data for the algorithm.
var transformedData = transformationModel.Transform(trainData);

// Choose a regression algorithm.
var algorithm = MLContext.Regression.Trainers.Sdca();

// Train the model and score it on the transformed data.
var regressionModel = algorithm.Fit(transformedData);

// Calculate the PFI metrics.
var permutationMetrics = MLContext.Regression.PermutationFeatureImportance(
    regressionModel, 
    transformedData, 
    permutationCount: 50);

The call returns an array of quality metric statistics for the model type. For a regression model it’s an array of RegressionMetricStatistics instances – each holding summary statistics over multiple observations of RegressionMetrics. In the sample app we decided R Squared to be the most important quality metric. So the decrease in this value determines feature importance.

We defined a data structure to hold the result:

public class FeatureImportance
{
    public string Name { get; set; }
    public double R2Decrease { get; set; }
}

The list of feature importances is created from the result of the call and visualized in a diagram. Since we decided that R Squared is the main quality metric for our model, we used RSquared.Mean (the mean decrease in R Squared during the calculation) as the target value for feature importance:

for (int i = 0; i < permutationMetrics.Length; i++)
{
    result[i].R2Decrease = permutationMetrics[i].RSquared.Mean;
}

Here’s how the plot looks like:

PfiCalculation

 

We used the same regression algorithm –SDCA- as in the previous samples, so the addiction to alcohol (the feature!) should not come as a surprise anymore.

Here’s a detailed view on one of the metric statistics when debugging:

PrivateMetricsStatistics

That is an awful lot of information. Unfortunately the current API keeps most of the details private – well, you can always use Reflection. Access to more details would allow us to create more insightful diagrams like this one (source). It does not only show the mean, but also min and max values during the calculation:

importance-bike-1

For another example of explaining model predictions using Permutation Feature Importance, check this official ML.NET sample that covers house pricing.

The Source

In this article we described three components of the ML.NET Explainability API that may help you to create machine learning models that have the ability clarify their predictions. The UWP sample app hosts many more ML.NET scenario’s. It lives here on GitHub.

Enjoy!

A Lap around the WinUI TeachingTip Control

In this article we will run through a couple of scenarios using the UWP TeachingTip control. This relatively new control is an animated flyout that according to the documentation “draws the user’s attention on new or important updates and features, remind a user of nonessential options that would improve the experience, or teach a user how a task should be completed”.

The high quality of this official documentation made us decide to skip a high-level introduction and to immediately expose the TeachingTip control to some more challenging ‘enterprise-ish’ scenarios. Here are the things you can expect us to cover in this article:

  • programmatically creating a TeachingTip,
  • precision targeting a XAML control,
  • state management,
  • auto-hiding a TeachingTip on time-out and navigation, and
  • building an inherited control.

We also added a sample that expresses our concerns on the light dismiss behavior, and identified an interesting use case for the TeachingTip as ‘Form Field Wizard’. Since all the official samples use the light theme, we decided to go for dark – with a custom highlighted border.

As usual we built a small sample app, this is how it looks like:

HomePage

The TeachingTip control is shipped as a part of the Windows UI Library (a.k.a. WinUI) which is mostly known for its down-level compatibility versions of the official native UWP controls. The Windows UI Library also hosts brand new controls that aren’t shipped as part of the default Windows platform. Here are some of the controls that were shipped with WinUI 2.1 in April 2019:

The latest WinUI 2.2 release from September 2019 introduced a very promising TabView control – and slightly rounded corners for all controls. In the beginning of 2020 also an Edge Chromium based WebView is expected. 

Getting started with the Windows UI Library

The WinUI toolkit is available as NuGet packages that can be added to any existing or new project. Make sure not to forget to add the Windows UI Theme Resources to your App.xaml resources – as explained in the Getting Started guide. Here’s how this is done in the sample app:

<Application.Resources>
    <ResourceDictionary>
        <ResourceDictionary.MergedDictionaries>
            <!-- Win UI Controls -->
            <XamlControlsResources xmlns="using:Microsoft.UI.Xaml.Controls" />
            <!-- Other Dictionaries and Styles -->
            <!-- ... -->
        </ResourceDictionary.MergedDictionaries>
    </ResourceDictionary>
</Application.Resources>

WinUI exposes an extended API and a set of new controls. Currently these controls live in the Microsoft.UI.Xaml namespace to distinct them from the classic (future legacy?) UWP controls in Windows.UI.Xaml. IntelliSense will reveal some of the some doubles:

ControlsList

According to the roadmap WinUI’s intention is to decouple the entire XAML stack from the rest of UWP, to ship it as a NuGet package. UWP developers will observe that existing UWP XAML APIs (currently shipping as part of the OS) will fade out to no longer receive new feature updates – only security updates and critical fixes. Other UWP features such as application and security model, media pipeline, shell integrations, and broad device support will continue to evolve. WPF, Windows Forms and MFC developers will observe that WinUI is going to absorb (or render obsolete?) XAML Islands, a feature that is currently available through the Windows Community Toolkit.

Here’s an illustration of the current and target architecture:

roadmap_winui2

roadmap_winui3

The TeachingTip Control

A teaching tip is a control that provides contextual information, often used for informing, reminding, and teaching users about important and new features. Visually the TeachingTip is a Flyout with a Tail (an arrow pointing to its Target) and smooth opening and closing animations (love these!). The Teaching Tip is truly an Open Source control: it has been on GitHub from its proposal and specifications to its C++ source code. The TeachingTip documentation will get you started on the fly, and the API is simple. It’s safe to infer from the state of the related GitHub Issues that the control is new but stable.

It’s time to make our hands dirty.

Programmatically instantiating a TeachingTip

Most of our sample pages come with a Home page that describes the sample, and a Main page with all the action. One of the TeachingTip’s missions is to get the user started or to point out new features, so we decided to immediately open one on the Home page when the app starts. Here’s the C# code to define a TeachingTip programmatically, and hook it into the visual tree:

_mainMenuTeachingTip = new TeachingTip
{
    Target = glyph,
    Title = "Welcome",
    Content = "The Main page is where all the action is.",
    HeroContent = new Image
    {
        Source = new BitmapImage(new Uri("ms-appx:///Assets/MindFlayer.jpg"))
    },
    PreferredPlacement = TeachingTipPlacementMode.BottomRight,
    IsOpen = true,
    BorderThickness = new Thickness(.5),
    BorderBrush = new SolidColorBrush(Colors.DarkRed)
};
_mainMenuTeachingTip.Closed += MainMenuTeachingTip_Closed;

contentGrid.Children.Add(_mainMenuTeachingTip);

Precision Targeting

A TeachingTip is a Flyout, with an optional arrow that points to its Target. The positioning sample on the Main page displays what the impact is of PreferredPlacement for targeted and non-targeted teaching tips. Here’s a screenshot of it:

Positioning

Keep in mind that the TeachingTip will find its own place when there is not enough room for it. Here’s what happens when we specify LeftBottom when there’s no space there:

PositioningOverride

If you aim for visual perfection, you have to identify the exact Target that you want your TeachingTip to point its arrow to. In our sample app we wanted to target a menu item – a templated list item. Targeting the list item itself produced a rather fuzzy result, because of the default position calculation and the fact that TeachingTip never touches its Target. So we decided to target the icon inside the menu item instead of the menu item itself.

In most samples the Target is declared in XAML referencing a static control on the page. In most real life scenarios you will probably programmatically dig through a complex dynamic XAML structure, relying on things like the VisualTreeHelper, the GetChild() and FindName() methods, and looking up a XAML element that corresponds to an item in a list, with ContainerFromIndex().

Here’s how the sample app looks up the menu icon to be set as Target:

// Find the Main menu item and the Content grid.
var shell = (Window.Current.Content as Frame)?.Content as FrameworkElement;
var contentGrid = shell?.FindName("ContentGrid") as Grid;
var menu = shell?.FindName("Menu") as ListView;
var mainPageMenu = menu?.ContainerFromIndex(1) as ListViewItem;

// Find the Icon.
var itemsPresenter = ((VisualTreeHelper.GetChild(mainPageMenu, 0)) as FrameworkElement);
var stackPanel = ((VisualTreeHelper.GetChild(itemsPresenter, 0)) as FrameworkElement);
var glyph = stackPanel?.FindName("Glyph") as FrameworkElement;

State management

We assume that you liked the animated Teachingtip when opening the app for the first time, and perhaps the second time too. Just be aware that automatically opening TeachingTips becomes boring and annoying very rapidly to the end user. It really makes sense to let your remember that it showed a tip a couple of times, and then hide it forever.

To enhance this type of annoyment we made a second TeachingTip that opens when the first one raises its Closed event:

ReplayTip

We remember whether or not the TeachingTip has been displayed in an entry in the LocalSettings of the current ApplicationData:

public static void DisplayReplayButtonTip()
{
    var localSettings = Windows.Storage.ApplicationData.Current.LocalSettings;

    if (localSettings.Values["replayButtonTeachingTipDisplayed"] != null &&
        localSettings.Values["replayButtonTeachingTipDisplayed"].ToString() == "True")
    {
        return;
    }

    // ...

    _replayButtonTeachingTip = new TeachingTip
    {
        Target = replayButton,
        // ...
    };

    localSettings.Values["replayButtonTeachingTipDisplayed"] = "True";

    (homePage.Content as Grid).Children.Add(_replayButtonTeachingTip);
}

Here’s the code to remove the state again, e.g. in case of a reset-to-factory-settings situation:

private void ResetButton_Click(object sender, Windows.UI.Xaml.RoutedEventArgs e)
{
    var containerSettings = (ApplicationDataContainerSettings)ApplicationData.Current.LocalSettings.Values;
    var keys = containerSettings.Keys;
    foreach (var key in keys)
    {
        ApplicationData.Current.LocalSettings.Values.Remove(key);
    }

    ResetButtonTeachingTip.IsOpen = true;
}

A Word about Light-Dismiss

The IsLightDismissEnabled property makes an open teaching tip dismiss when the user scrolls or interacts with other elements of the application. In this mode, the PopupRoot –the top layer that hosts Flyouts, Dialogs, and TeachingTips- invisibly covers the whole page and swallows all UI events. Except for the system buttons in the top right corner, none of the app’s buttons will respond to a click or tap – including the Back button, the Hamburger menu and all Navigation items in our sample app.

We’re not a big fan of this mode. If you want to see for yourself, there’s one light-dismiss enabled TeachingTip in the sample app:

LightDismiss

Auto-Hiding a TeachingTip Control

It makes sense for a TeachingTip to hide itself when it’s no longer relevant, like when its Target is not displayed anymore (e.g. on Navigation) or when we can assume that the user had enough time to admire it. Let’s write some code to support such scenarios.

Hiding on Navigation

By default an open TeachingTip remains open when the user navigates to another page. When the code behind that TeachingTip (e.g. a Close event handler) refers to its Target or other items on the original page, your app will crash:

PopupRoot

On the left hand side in the above screenshot you see the Visual Studio Live Visual Tree. It reveals how the PopupRoot holding the TeachingTip is completely separated from the RootScrollViewer holding its Frame, Page, and Target. It’s up to you to synchronize these.

It’s easy for a Page to close a TeachingTip in its OnNavigatingFrom event:

protected override void OnNavigatingFrom(NavigatingCancelEventArgs e)
{
    PositioningTip.IsOpen = false;
    base.OnNavigatingFrom(e);
}

A more reusable solution is to enumerate all open Popup elements (Dialogs, Flyouts, ToolTips, TeachingTips) with GetOpenPopups() and then close each of these:

var openPopups = VisualTreeHelper.GetOpenPopups(Window.Current);
foreach (var popup in openPopups)
{
    popup.IsOpen = false;
}

Here’s an alternative for the fans of one-liners:

VisualTreeHelper.GetOpenPopups(Window.Current).ToList().ForEach(p => p.IsOpen = false);

Hiding on Timeout

If you want to close a TeachingTip after a certain time interval then you need to start a DispatcherTimer when you open it:

var timer = new DispatcherTimer();
timer.Tick += MainMenuTimer_Tick;
timer.Interval = TimeSpan.FromSeconds(5);
timer.Start();

The code that you write in its Tick() event executes on the UI Thread, so it can visually impact controls. You can set the IsOpen property of the TeachingTip to false there, and do more cleanup if you wish. Don’t forget to stop the timer:

private static void MainMenuTimer_Tick(object sender, object e)
{
    (sender as DispatcherTimer).Stop();

    if (_mainMenuTeachingTip == null)
    {
        return;
    }

    // Close and cleanup the TeachingTip
    _mainMenuTeachingTip.IsOpen = false;
}

Building an Inherited Control

Some time ago most Microsoft (and Open Source) teams that build UWP controls stopped making these controls sealed. Since then it is relatively easy to create inherited controls – such as a TeachingTip that auto-closes itself when its AutoCloseInterval elapses.

The sample app hosts such a control, here’s how an instance is declared in one of the XAML pages:

<xbcontrols:AutoCloseTeachingTip x:Name="ResetButtonTeachingTip"
                                    AutoCloseInterval="3000"
                                    Title="Reset"
                                    Content="Your app was reset to factory settings."
                                    PreferredPlacement="Right">
    <controls:TeachingTip.IconSource>
        <controls:SymbolIconSource Symbol="Repair" />
    </controls:TeachingTip.IconSource>
</xbcontrols:AutoCloseTeachingTip>

Here are the steps to create the control:

  1. Create a class that inherits from TeachingTip (no shit, Sherlock),
  2. add a property to hold the timeout value -it does not need to be a dependency property– with a decent default value,
  3. call RegisterPropertyChangedCallback() to define a listener for changes in the IsOpen dependency property,
  4. implement the DispatcherTimer based auto-close algorithm, and
  5. make sure to unregister all event handlers to avoid memory leaks.

Here’s the entire class definition:

/// <summary>
/// A teaching tip that closes itself after an interval.
/// </summary>
public class AutoCloseTeachingTip : Microsoft.UI.Xaml.Controls.TeachingTip
{
    private DispatcherTimer _timer;
    private long _token;

    public AutoCloseTeachingTip() : base()
    {
        this.Loaded += AutoCloseTeachingTip_Loaded;
        this.Unloaded += AutoCloseTeachingTip_Unloaded;
    }

    /// <summary>
    /// Gets or sets the auto-close interval, in milliseconds.
    /// </summary>
    public int AutoCloseInterval { get; set; } = 5000;

    private void AutoCloseTeachingTip_Loaded(object sender, RoutedEventArgs e)
    {
        _token = this.RegisterPropertyChangedCallback(IsOpenProperty, IsOpenChanged);
        if (IsOpen)
        {
            Open();
        }
    }

    private void AutoCloseTeachingTip_Unloaded(object sender, RoutedEventArgs e)
    {
        this.UnregisterPropertyChangedCallback(IsOpenProperty, _token);
    }

    private void IsOpenChanged(DependencyObject o, DependencyProperty p)
    {
        var that = o as AutoCloseTeachingTip;
        if (that == null)
        {
            return;
        }

        if (p != IsOpenProperty)
        {
            return;
        }

        if (that.IsOpen)
        {
            that.Open();
        }
        else
        {
            that.Close();
        }
    }

    private void Open()
    {
        _timer = new DispatcherTimer();
        _timer.Tick += Timer_Tick;
        _timer.Interval = TimeSpan.FromMilliseconds(AutoCloseInterval);
        _timer.Start();
    }

    private void Close()
    {
        if (_timer == null)
        {
            return;
        }

        _timer.Stop();
        _timer.Tick -= Timer_Tick;
    }

    private void Timer_Tick(object sender, object e)
    {
        this.IsOpen = false;
    }
}

For the sake of completeness, here’s how an instance of the control is created programmatically:

var _closeTeachingTip = new AutoCloseTeachingTip
{
    Title = "Then what are you still doing there?",
    Content = "We are all waiting for you in the factory.",
    HeroContent = new Image
    {
        Source = new BitmapImage(new Uri("ms-appx:///Assets/BrimbornSteelworks.png"))
    },
    PreferredPlacement = TeachingTipPlacementMode.Right,
    IsOpen = true
};

Using a TeachingTip as Form Field Wizard

Out of the box the TeachingTip supports an optional button: the Action Button. When one is defined, the cross icon in the upper left corner is replaced by a genuine Close Button at the bottom. The corresponding action can be defined in a traditional ActionButtonClick handler or in a more MVVM way through an ActionButtonCommand property. The action button does not close the TeachingTip.

Here’s how a TeachingTip with two buttons is declared in the sample app:

<controls:TeachingTip x:Name="ButtonsTip"
                        Target="{x:Bind ButtonsButton}"
                        Title="Were you already Flayed?"
                        CloseButtonContent="Yes"
                        CloseButtonCommand="{x:Bind CloseCommand}"
                        ActionButtonContent="Not sure"
                        ActionButtonCommand="{x:Bind ActionCommand}" />

And this is how it looks like – the action button was clicked, so the TeachingTip remained open when we opened the untargeted one on the left:

ActionButton

TeachingTip inherits from ContentControl, so it supports rich content. Together with its titles, its buttons, and its tail, this makes it an ideal host for a local wizard that can help the end user filling out a specific input field on a form – as a small dialog box that provides options or a calculation tool. This functionality is typically visually represented by decorating the form field with a tiny button showing an ellipsis or a magic wand or a calculator icon.

Our UWP app contains an example of using the TeachingTip as a form field wizard. The ellipsis button next to the TextBox opens a teaching tip presenting some options. The standard Action and Close buttons allow the user to select one of the options, or ignore the suggestion:

WizardTip

This is a scenario in which the TeachingTip could really shine, and we’re definitely planning to use it like this in some of our apps.

TeachingTip is a useful new control in the UWP ecosystem, with a simple API and smooth animations.

The Code

The sample app lives here of GitHub.

Enjoy!

Machine Learning with ML.NET in UWP: Automated Learning

In this article we take the ML.NET automated machine learning API for a spin to demonstrate how it can be used in a C# UWP app for discovering, training, and fine-tuning the most appropriate prediction model for a specific machine learning use case.

The Automated Machine Learning feature of ML.NET was announced at Build 2019 as a framework that automatically iterates over a set of algorithms and hyperparameters to select and create a prediction model. You only have to provide

  • the problem type (binary classification, multiclass classification, or regression),
  • the quality metric to optimize (accuracy, log loss, area under the curve, …), and
  • a dataset with training data.

The ML.NET automated machine learning functionality is exposed as

This article focuses on the automated ML API, we’ll refer to it with its nickname ‘AutoML’. In our UWP sample app we tried to implement a more or less realistic scenario for this feature. Here’s the corresponding XAML page from that app. It shows the results of a so-called experiment:

AutoML

Stating the problem

In the sample app we reused the white wine dataset from our binary classification sample. Its raw data contains the values for 11 physicochemical  characteristics -the ‘features’- of white wines together with an appreciation score from 0 to 10 – the ‘label’:

Input

We’ll rely on AutoML to build us a model that uses these physicochemical features to predict the score label. We’ll treat it as a multiclass classification problem where each distinct score value is considered a category.

Creating the DataView

AutoML requires you to provide an IDataView instance with the training data, and optionally one with test data. If the latter is not provided, it will split the training data itself. For the training data, a TextLoader on a .csv file would do the job: by default AutoML will use all non-label fields as feature, and create a pipeline with the necessary components to fill out missing values and transform everything to numeric fields. In a real world scenario you would want to programmatically perform some of these tasks yourself – overruling the defaults. That’s what we did in the sample app.

We used the LoadFromTextFile<T>() method to read the data into a new pipeline, so we needed a data structure to describe the incoming data with LoadColumn and ColumnName attributes:

public class AutomationData
{
    [LoadColumn(0), ColumnName("OriginalFixedAcidity")]
    public float FixedAcidity;

    [LoadColumn(1)]
    public float VolatileAcidity;

    [LoadColumn(2)]
    public float CitricAcid;

    [LoadColumn(3)]
    public float ResidualSugar;

    [LoadColumn(4)]
    public float Chlorides;

    [LoadColumn(5)]
    public float FreeSulfurDioxide;

    [LoadColumn(6)]
    public float TotalSulfurDioxide;

    [LoadColumn(7)]
    public float Density;

    [LoadColumn(8)]
    public float Ph;

    [LoadColumn(9)]
    public float Sulphates;

    [LoadColumn(10)]
    public float Alcohol;

    [LoadColumn(11)]
    public float Label;
}

We added a ReplaceMissingValues transformation on the FixedAcidity field to keep control over the ReplacementMode and the column names, and then removed the original column with a DropColumns transformation.

Here’s the pipeline that we used in the sample app to manipulate the raw data:

// Pipeline
IEstimator<ITransformer> pipeline =
    MLContext.Transforms.ReplaceMissingValues(
        outputColumnName: "FixedAcidity",
        inputColumnName: "OriginalFixedAcidity",
        replacementMode: MissingValueReplacingEstimator.ReplacementMode.Mean)
    .Append(MLContext.Transforms.DropColumns("OriginalFixedAcidity"));
                
    // No need to add this, it will be done automatically.
    //.Append(MLContext.Transforms.Concatenate("Features",
    //    new[]
    //    {
    //        "FixedAcidity",
    //        "VolatileAcidity",
    //        "CitricAcid",
    //        "ResidualSugar",
    //        "Chlorides",
    //        "FreeSulfurDioxide",
    //        "TotalSulfurDioxide",
    //        "Density",
    //        "Ph",
    //        "Sulphates",
    //        "Alcohol"}));

A model is created from this pipeline using the Fit() method, and the Transform() call creates the IDataView that provides the training data to the experiment:

// Training data
var trainingData = MLContext.Data.LoadFromTextFile<AutomationData>(
        path: trainingDataPath,
        separatorChar: ';',
        hasHeader: true);
ITransformer model = pipeline.Fit(trainingData);
_trainingDataView = model.Transform(trainingData);
_trainingDataView = MLContext.Data.Cache(_trainingDataView);

// Check the content on a breakpoint:
var sneakPeek = _trainingDataView.Preview();

Here’s the result of the Preview() call that allows to peek at the contents of the data view:

DataViewPreview

Keep in mind that AutoML only sees this resulting data view and has no knowledge of the pipeline that created it. It will for example struggle with data views that have duplicate column names – quite common in ML.NET pipelines.

Round 1: Algorithm Selection

Defining the experiment

In the first round of our scenario, we’ll run an AutoML experiment to find one or two candidate algorithms that we would like to explore further. Every experiment category (binary classification, multiclass classification, and regression) comes with its own ExperimentSettings class where you specify things like

  • a maximum duration for the whole experiment (AutoML will complete the test that’s running at the deadline),
  • the metric to optimize for (metrics depend on the category), and
  • the algorithms to use (by default all algorithms of the category are included in the experiment).

The experiment is then instantiated with a call to one of the Create() methods in the AutoCatalog. In the sample app we decided to optimize on Logarithmic Loss: it gives a more nuanced view into the performance then accuracy, since it punishes uncertainty. We also decided to ignore the two FastTree algorithms that are not yet 100% UWP compliant. Here’s the experiment definition:

var settings = new MulticlassExperimentSettings
{
    MaxExperimentTimeInSeconds = 18,
    OptimizingMetric = MulticlassClassificationMetric.LogLoss,
    CacheDirectory = null
};

// These two trainers yield no metrics in UWP:
settings.Trainers.Remove(MulticlassClassificationTrainer.FastTreeOva);
settings.Trainers.Remove(MulticlassClassificationTrainer.FastForestOva);

_experiment = MLContext.Auto().CreateMulticlassClassificationExperiment(settings);

Running the experiment

To execute the experiment … just call Execute() on the experiment, providing the data view and an optional progress handler to receive the trainer name and quality metrics after each individual test. The winning model is returned in the BestRun property of the experiment’s result:

var result = _experiment.Execute(
    trainData: _trainingDataView,
    labelColumnName: "Label",
    progressHandler: this);

return result.BestRun.TrainerName;

The progress handler must implement the IProgress interface which declares a Report() method that is called each time an individual test in the experiment finishes. In the sample app we let the MVVM Model implement this interface, and pass the algorithm name and the quality metrics to the MVVM ViewModel via an event. Eventually the diagram in the MVVM View -the XAML page- will be updated.

Here’s the code in the Model:

internal class AutomationModel : 
    IProgress<RunDetail<MulticlassClassificationMetrics>>
{
    // ...

    public event EventHandler<ProgressEventArgs> Progressed;

    // ...

    public void Report(RunDetail<MulticlassClassificationMetrics> value)
    {
        Progressed?.Invoke(this, new ProgressEventArgs
        {
            Model = new AutomationExperiment
            {
                Trainer = value.TrainerName,
                LogLoss = value.ValidationMetrics?.LogLoss,
                LogLossReduction = value.ValidationMetrics?.LogLossReduction,
                MicroAccuracy = value.ValidationMetrics?.MicroAccuracy,
                MacroAccuracy = value.ValidationMetrics?.MacroAccuracy
            }
        });
    }
}

The next screenshot shows the result of the algorithm selection phase in the sample app. The proposed model is not super good, but that’s mainly our own fault – the wine scoring problem is more a regression than a multiclass classification. If you consider that these models are unaware of the score order (they don’t realize that 8 is better than 7 is better than 6, etcetera – so they also not realize that a score of 4 may be appropriate if you hesitate between a 3 and a 5), then you will realize that they’re actually pretty accurate.

Here’s a graphical overview of the different models in the experiment. Notice the correlation –positive or negative- between the various quality metrics:

Round1

Some of the individual tests return really bad models. Here’s an example of an instance with a negative value for the Log Loss Reduction quality metric:

WorseThanRandom

This model performs worse than just randomly selecting a score. Make sure to run experiments long enough to eliminate such candidates.

Round 2: Parameter Sweeping

When using AutoML, we propose to first run a set of high level experiments to discover the algorithms that best suit your specific machine learning problem and, and then run a second set of experiments with a limited number of algorithms –just one or two- to fine-tune their appropriate hyperparameters. Data scientists call this parameter sweeping. For developers the source code for both sets is almost identical. In round 1 we start with all algorithms and Remove() some, and in round 2 we first Clear() the ICollection of trainers first and then Add() the few that we want to evaluate.

Here’s the full parameter sweeping code in the sample app:

var settings = new MulticlassExperimentSettings
{
    MaxExperimentTimeInSeconds = 180,
    OptimizingMetric = MulticlassClassificationMetric.LogLoss,
    CacheDirectory = null
};

settings.Trainers.Clear();
settings.Trainers.Add(MulticlassClassificationTrainer.LightGbm);

var experiment = MLContext.Auto().CreateMulticlassClassificationExperiment(settings);

var result = experiment.Execute(
    trainData: _trainingDataView,
    labelColumnName: "Label",
    progressHandler: this);

var model = result.BestRun.Model as TransformerChain<ITransformer>;

Here’s the result of parameter sweeping on the LightGbmMulti algorithm, the winner of the first round in the sample app. If you compare the diagram to the Round 1 values, you’ll observe a general improvement of the quality metrics. The orange Log Loss curve consistently shows lower values:

Round2

Not all parameter swiping experiments are equally useful. Here’s the result that compares different parameters for the LbfgsMaximumEntropy algorithm in the sample. All tests return pretty much the same (bad) model. This experiment just confirms that this is not the right algorithm for the scenario:

Round2Bis

Inspecting the Winner

After running some experiments you probably want to dive into the details of the winning model. Unfortunately this is the place where the API currently falls short: most if not all of the hyperparameters of the generated models are stored in private properties. Your options to drill down to the details are

  • open the model that you saved (it’s just a .zip file with flat texts in it), or
  • rely on Reflection.

We decided to rely on the Reflection features in the Visual Studio debugger. In all multiclass classification experiments that we did, the prediction model was the last transformer in the first step of the generated pipeline. So in the sample app we assigned this to a variable to facilitate inspection via a breakpoint.

Here are the OneVersusAll parameters of the winning model. They’re the bias, the weights, splits and leaf values of the underlying RegressionTreeEnsemble for each possible score:

HyperParameters1

That sounds like a pretty complex structure, so let’s shed some light on it. For starters, LightGbmMulti is a so-called One-Versus-All (OVA) algorithm. OVA is a technique to solve a multiclass classification problem by a group of binary classifiers.

The following diagram illustrates using three binary classifiers to recognize squares, triangles or crosses:

one-vs-all

When the model is asked to create a prediction, it delegates the question to all three classifiers and then deducts the result. If the answers are

  • I think it’s a square,
  • I think it’s not a triangle, and
  • I think it’s not a cross,

then you can be pretty sure that it’s a square, no?

The winning model in the sample app detected 7 values for the label, so it created 7 binary classifiers. This means that not all the scores from 0 to 10 were given. [This observation made us realize that we should have treated this problem as a regression instead of as a classification.] Each of these 7 binary classifiers is a a LightGBM trainer – a gradient boosting framework that uses tree-based learning algorithms. Gradient boosting is –just like OVA- a technique that solves a problem using multiple classifiers. LightGBM builds a strong learner by combining an ensemble of weak learners. These weak learners are typically decision trees. Apparently each of the 7 classifiers in the sample app scenario hosts an ensemble of 100 trees, each with a different weight, bias, and a set of leafs and split values for each branch.

The following screenshot shows a simpler set of hyperparameters. It’s the result of a parameter sweeping round for the LbfgsMaximumEntropy algorithm, also known as multinomial logistic regression. This one is also a One-Versus-All trainer, so there are again 7 submodels. This time the models are simpler. The algorithm created a regression function for each of the score values. The parameters are the weight of each feature in that function:

HyperParameters2

At this point in time the API’s main target is to support the Command Line Tool and the Model Builder, and that’s probably why the model’s details are declared private. All of them already appear in the output of the CLI however, so we assume that full programmatic access to the models (and the source code to generate them!) is just a matter of time.

Wow!

Here’s the overview of a canonical machine learning use case. The phases that are covered by AutoML are colored:

MachineLearningModel_2_3_4

To complete the scenario, all you need to do is

  • make your raw data available in an IDataView-friendly way: a .csv file or an IEnumerable (e.g. from a database query), and
  • consume the generated model in your apps, that’s just three lines of code: load the .zip file, create a prediction engine, call the prediction engine.

AutoML will do the rest. Impressive, no? There’s no excuse for not starting to embed machine learning in your .NET apps…

The Source

The sample app lives here on GitHub.

Enjoy!

Getting started with gRPC in UWP and ASP.NET Core

In this article we describe how to build a UWP app that uses the gRPC protocol -the new WCF- to communicate with a remote host application running on ASP.NET Core 3.0. We’ll discuss:

  • defining the service model,
  • generating client and server code,
  • executing simple RPC calls,
  • executing server-side streaming RPC calls,
  • executing client-side streaming RPC calls, and 
  • executing bidirectional streaming calls.

We built a small sample app that demonstrates the different types of calls. It looks like this:

ClientScreenshot

WHY ?

The announcements regarding .NET Core 3.0 and its successor .NET 5 by Scott Hunter and Richard Lander at the //BUILD 2019 conference confirmed that the classic .NET Framework is being mothballed and the server side of Windows Communication Foundation (WCF) with it. Cross-platform .NET Core is the future, and it will not support WCF because WCF is too tightly coupled to the Windows family of operating systems.

As it goes with every breaking change, the news is causing confusion and anger in some parts of the developer community, while some try saving the furniture by porting the technology to the new platform, and others prepare to move and already started comparing the pros and cons of alternative technologies. We proudly belong to the last category and are checking the alternatives – not only for the basic WCF functionality but also for its advanced features like NetTCP bindings, transactions, stateful sessions, etcetera. If you share our interest in these topics, we strongly recommend to keep an eye on the unWCF blog -great name, great content- by Mark Rendle

WHAT ?

Microsoft is pushing forward Web API and gRPC as the alternatives for WCF in ASP.NET Core. Check this link for a high level comparison between these two. In this article we’ll only focus on gRPC, an open source platform independent HTTP/2-based RPC framework with a solid Google pedigree.

Many RPC systems -including WCF- are based on the idea of defining a service, specifying the methods that can be called remotely, together with their parameters and return types. In gRPC all of these are defined using protocol buffers, a language neutral, platform neutral, human readable but binary serializable way of describing structured data, typically in *.proto files. Here’s an example of a such ‘protobuf’ message:

message Person {
  string name = 1;
  int32 id = 2;  // Unique ID number for this person.
  string email = 3;

  enum PhoneType {
    MOBILE = 0;
    HOME = 1;
    WORK = 2;
  }

  message PhoneNumber {
    string number = 1;
    PhoneType type = 2;
  }

  repeated PhoneNumber phones = 4;

  google.protobuf.Timestamp last_updated = 5;
}

// Our address book file is just one of these.
message AddressBook {
  repeated Person people = 1;
}

Check this tutorial to find out how to generate the corresponding C# classes from this, and to do parsing and serialization of instances.

The message keyword is used to describe the data structures that can be used as parameters and return types of service calls. The service keyword is used to describe … services, with an rpc entry for each method. Here’s the full protocol buffer definition of a canonical remote ‘Hello World’ service:

syntax = "proto3";

package Greet;

// The greeting service definition.
service Greeter {
  // Sends a greeting
  rpc SayHello (HelloRequest) returns (HelloReply) {}
}

// The request message containing the user's name.
message HelloRequest {
  string name = 1;
}

// The response message containing the greetings.
message HelloReply {
  string message = 1;
}

The Grpc.Tools NuGet package contains the protoc compiler and some helpers and integration tools around it, to generate C# (or C++) code from the .proto files on the server and client side. Let’s make our hands dirty and try it out in UWP and ASP.NET Core. Here’s how client and server look like at runtime:
Screenshots

HOW ?

Before we build a client, we need a server somewhere. Make sure to have the prerequisites : .NET Core SDK 3.0 Preview and a Visual Studio 2019 with ASP.NET and web development workload. Last but not least, it should be configured for previews.

ActivatePreview

If all the prerequisites are in place, then the whole gRPC server side setup comes out-of-the-box: you just need to create a new project from the appropriate template. We followed this tutorial to create the server project:

WebProject

The project template as well as the required NuGet packages are evolving rapidly, so make sure to have a complete list of up-to-date packages:

ServerNuGets

For the server part all references, code generation configurations, and the startup code are in the project template.

For the client you have to do everything manually, since there is no ‘add web reference’ for gRPC (yet). Here are the NuGet packages that we used in the UWP client:

ClientNuGets

If you decide generate the client code from the *.proto file(s) in a separate project, then the Grpc.Tools package is not required.

Replacing the HelloWorld service

For the sample app we needed a small and representative service definition. So we imagined a fascinating technology where a Modern UI client console communicates with a server to teleport a living creature (a life form) or a group of living creatures (a party) from a location to the server (beam up) and/or from the server to a location (beam down). The matching hardware for our sample app would look like this, with the ASP.NET Core transporter room server in the back, and the UWP client console in front:

TransporterRoom

Setting up the code generation

When working with gRPC-specific .proto files, you’ll need the Grpc.Tools NuGet package. It hosts the compilers and tooling that transform the .proto files into C# or C++. With its 12MB it is relatively big. Fortunately you only need it for code generation so you don’t need to deploy it as part of your app. In its default configuration, the protobuf compiler generates code

  • for server side, client side, or both,
  • when you change something in the .proto files and when you compile the project,
  • in a folder in the /obj region, and
  • without adding the generated files to the Visual Studio project.

Here’s were the code is generated on the server side:

GeneratedServerCode

On the client side, we wanted to customize the process. Using this extremely helpful documentation on configuring gRPC in .NET, we tried to find a Grpc.Tools configuration within the UWP XAML-C# compilation pipeline that keeps the generated client files up to date and part of the project. We didn’t find one, but that’s not a big deal: we’re pretty sure that this will be fixed. Besides, you will most probably keep the .proto files and their C# counterparts in a separate project anyway.

As an example, here’s a configuration that works for the (extra) console app in the sample solution – it’s inside the .proj file:

<ItemGroup>
  <Protobuf Include="..\XamlBrewer.Uwp.Grpc.Server\Protos\startrek.proto" 
            OutputDir="Generated" 
            Link="Protos\startrek.proto" 
            GrpcServices="Client" 
            CompileOutputs="false" />
</ItemGroup>

It generates only the client side code, and stores it in the Generated folder which is part of the project. The UWP Client project then links to these generated C# files. The console app’s .proto files are also linked: to the server project. So when the service contract changes, the UWP client is immediately updated.

Defining the service

For the Transporter Room data structures we used simple messages with only string fields. The ‘Person’ definition in the beginning of this article shows that you can add a lot more complexity if needed. Here are the declarations of the LifeForm and Location messages:

// It's life, Jim, but not as we know it.
message LifeForm {
  string species = 1;
  string name = 2;
  string rank = 3;
}

// A place in space.
message Location {
  string description = 1;
}

The protoc compiler transforms this declaration into C# classes that allow you to create, compare, clone, serialize, deserialize, and parse instances:

MessageClassDiagram

For messages (data structures) the generated code is the same on the server and the client side. For services that’s obviously not the case. Here’s the declaration of the transporter room service:

// Transporter Room API
service Transporter {
  // Beam up a single life form from a location.
  rpc BeamUp(Location) returns (LifeForm) {}

  // Beam up a party of life forms to a location.
  rpc BeamUpParty(Location) returns (stream LifeForm) {}

  // Beam down a single life form, and return the location.
  rpc BeamDown(LifeForm) returns (Location) {}

  // Beam up a party of life forms, and return the location.
  rpc BeamDownParty(stream LifeForm) returns (Location) {} 

  // Replace a beamed down party of life forms by another.
  rpc ReplaceParty(stream LifeForm) returns (stream LifeForm) {}

  // For the sake of API completeness: lock the beam to a location.
  rpc Lock(Location) returns (Location) {}
}

We’ll dive into the methods later, for now just observe the very straightforward syntax. Here’s what the protoc compiler generates for this on the server side. It’s an abstract base class that defines the public methods that you need to implement in your concrete service:

ServerClassDiagram

The ASP.NET Core template comes with the necessary helpers to startup this service and expose its methods.

On the client side the protoc compiler generates a class that resembles to what Visual Studio does with SOAP service references. It’s a client that inherits from ClientBase<T>. You can hook it to a service entry point to call its methods:

ClientClassDiagram

To instantiate a client, you need to open a Channel to the service endpoint, and pass that to its constructor:

private Channel _channel;
private TransporterClient _client;

private async void MainPage_Loaded(object sender, RoutedEventArgs e)
{
    _channel = new Channel("localhost:50051", ChannelCredentials.Insecure);
    _client = new TransporterClient(_channel);
}

Creating a channel is an expensive operation compared to invoking a remote call so in general you should reuse a single channel for as many calls as possible.

Implementing the contracts

We created two helper classes to easily generate random instances of LifeForm and Location to be used as parameter or return value. Here’s part of the Location helper:

public static class Locations
{
    private static Random _rnd = new Random(DateTime.Now.Millisecond);

    public static string WhereEver()
    {
        return _Locations[_rnd.Next(_Locations.Count)];
    }

    private static List<string> _Locations = new List<string>
    {
        "starship USS Enterprise",
        "starship USS Voyager",
        "starship USS Discovery",
        "starbase Deep Space 9",
        "USS Shenzhou",
        "where no man has gone before",
        "planet Vulcan",
        "Starfleet Academy",
        // ...
    };
}

These classes are hosted in a .NET Standard Class Library project that is referenced by client and server projects.

A simple RPC call

In a simple RPC the client sends a request to the server and waits for a response to come back – pretty much a regular function call. To beam up a LifeForm we need to provide its Location:

rpc BeamUp(Location) returns (LifeForm) {}

On the server side we need to override the corresponding abstract method. It takes a Location and an optional ServerCallContext to transport things like authentication context, headers and time-out, and it returns a Task<LifeForm>:

public override Task<LifeForm> BeamUp(Location request, ServerCallContext context)
{
    var whoEver = Data.LifeForms.WhoEver();
    var result = new LifeForm
    {
        Species = whoEver.Item1,
        Name = whoEver.Item2,
        Rank = whoEver.Item3
    };

    return Task.FromResult(result);
}

Here’s how the client creates a location, calls the method, and gets the result:

var location = new Location
{
    Description = Data.Locations.WhereEver()
};

var lifeForm = _client.BeamUp(location);

Pretty straightforward, no?

For the non-streaming RPC calls the compiler also generates an asynchronous version. Here’s an example of a client call that’s using the optional deadline parameter:

var location = await _client.BeamDownAsync(
	lifeForm, 
	deadline: DateTime.UtcNow.AddSeconds(5));

Server-side streaming

In a server-side streaming RPC the client sends a request to the server and gets a stream of messages back. The client reads from the returned stream until there are no more messages. You specify a server-side streaming method by placing the stream keyword before the response type:

rpc BeamUpParty(Location) returns (stream LifeForm) {}

The method on the server side gets the Location instance together with an IServerStreamWriter, and its main job is to write the results (some LifeForms) to that stream:

public override async Task BeamUpParty(
    Location request, 
    IServerStreamWriter<LifeForm> responseStream, 
    ServerCallContext context)
{
    var rnd = _rnd.Next(2, 5);
    for (int i = 0; i < rnd; i++)
    {
        var whoEver = Data.LifeForms.WhoEver();
        var result = new LifeForm
        {
            Species = whoEver.Item1,
            Name = whoEver.Item2,
            Rank = whoEver.Item3
        };

        await responseStream.WriteAsync(result);
    }
}

The method on the client receives an AsyncServerStreamingCall that allows it to iterate over and get the returned LifeForm instances:

var location = new Location
{
    Description = Data.Locations.WhereEver()
};

using (var lifeForms = _client.BeamUpParty(location))
{
    while (await lifeForms.ResponseStream.MoveNext())
    {
        var lifeForm = lifeForms.ResponseStream.Current;
        WriteLog($"- Beamed up {lifeForm.Rank} {lifeForm.Name} ({lifeForm.Species}).");
    }
}

Client-side streaming

In a client-side streaming RPC the client writes a sequence of messages and sends them to the server. Once the client has finished writing the messages, it waits for the server to read them all and return its response. You specify a client-side streaming method by placing the stream keyword before the request type:

rpc BeamDownParty(stream LifeForm) returns (Location) {}

The method on the server can iterate through the provided LifeForm instances by means of an IAsyncStreamReader before returning the Location:

public override async Task<Location> BeamDownParty(
	IAsyncStreamReader<LifeForm> requestStream, 
	ServerCallContext context)
{
    while (await requestStream.MoveNext())
    {
        // ...
    }

    return new Location
    {
        Description = Data.Locations.WhereEver()
    };
}

On the client side, we call the service method stub to receive an AsyncClientStreamingCall with the stream to populate. We indicate completion with a call to CompleteAsync() and then fetch the result with ResponseAsync():

using (var call = _client.BeamDownParty())
{
    foreach (var lifeForm in lifeForms)
    {
        await call.RequestStream.WriteAsync(lifeForm);
    }

    await call.RequestStream.CompleteAsync();

    var location = await call.ResponseAsync;
    WriteLog($"- Party beamed down to {location.Description}.");
}

Bidirectional streaming

In a bidirectional streaming RPC both sides send a sequence of messages using a read-write stream. The two streams operate independently, so client and server can read and write in whatever order they like. The order of messages in each stream is preserved. You specify this type of method by placing the stream keyword before both the request and the response:

rpc ReplaceParty(stream LifeForm) returns (stream LifeForm) {}

It should not come as a surprise that the bidirectional streaming API combines the patterns for client and server side streaming. The two streams operate independently, so client and server can read and write in whatever order that suits your use case. Of course the call itself needs to be initiated by the client, but that doesn’t imply that the client needs to start streaming first. Here’s our client implementation: for each LifeForm that gets beamed up (or ‘streamed down’ if you want), it writes one to the stream to get beamed down:

public async override Task ReplaceParty(
	IAsyncStreamReader<LifeForm> requestStream, 
	IServerStreamWriter<LifeForm> responseStream, 
	ServerCallContext context)
{
    while (await requestStream.MoveNext())
    {
        // var beamedUp = requestStream.Current;
        var beamDown = Data.LifeForms.WhoEver();
        await responseStream.WriteAsync(new LifeForm
        {
            Species = beamDown.Item1,
            Name = beamDown.Item2,
            Rank = beamDown.Item3
        });
    }
}

The server part first writes the party members to beam up to the request stream, and then starts to asynchronously processing the response stream:

// Creating a party.
var rnd = _rnd.Next(2, 5);
var lifeForms = new List<LifeForm>();
for (int i = 0; i < rnd; i++)
{
    var whoEver = Data.LifeForms.WhoEver();
    var lifeForm = new LifeForm
    {
        Species = whoEver.Item1,
        Name = whoEver.Item2,
        Rank = whoEver.Item3
    };

    lifeForms.Add(lifeForm);
}

WriteLog($"Replacing a party.");
using (var call = _client.ReplaceParty())
{
    var responseReaderTask = Task.Run(async () =>
    {
        while (await call.ResponseStream.MoveNext())
        {
            var beamedDown = call.ResponseStream.Current;
            await Dispatcher.RunAsync(Windows.UI.Core.CoreDispatcherPriority.Normal, () =>
            {
                WriteLog($"- Beamed down {beamedDown.Rank} {beamedDown.Name} ({beamedDown.Species}).");
            });
        }
    });

    foreach (var request in lifeForms)
    {
        await call.RequestStream.WriteAsync(request);
        WriteLog($"- Beamed up {request.Rank} {request.Name} ({request.Species}).");
    };

    await call.RequestStream.CompleteAsync();

    await responseReaderTask;
}

The following screenshot shows how the different actions within a ‘Replace Party’ operation happen in an non-determined sequence:

BidirectionalStreaming

WHEN ?

Half of the used NuGet packages that are required for this solution are still in preview (even .NET Core itself). As long as your UWP app runs on Core CLR (i.e. in Debug mode) everything seems to already work nicely. For Release mode you’ll have to wait for a couple of weeks until the .NET Native Compiler is updated to deal with gRPC. That’s a nice perspective and should not keep you from getting started with using gRPC in your UWP apps.

WHERE ?

The sample app lives here on GitHub.

Enjoy!

Consuming an ML.NET model in UWP

In this article we’ll show step by step how to consume a Machine Learning model from ML.NET in a UWP application. In a small sample app we simulate a feedback form with a rating indicator and a text box. The app uses an existing Sentiment Analysis model to validate the assigned number of stars against the assumed (predicted) sentiment of the comments. Here’s how it looks like:

FourStarsButNegative

A typical Machine Learning scenario involves reading and manipulating training data, building and evaluating the model, and consuming the model. ML.NET supports all of these steps, as we described in our recent articles. Not all of these steps need to be hosted in the same app: you may have a central (or even external) set of applications to create the models, and another set of apps that consume these.

This article describes the latter type of apps, where the model is a ZIP file and nothing but a ZIP file.

MachineLearningModel_5

We started the project with downloading this ZIP file from the ML.NET samples on GitHub. The file contains a serialized version of the Sentiment Analysis model that was demonstrated in this Build 2019 session:

MsBuildSample

It takes an English text as input, and predicts its sentiment -positive or negative- as a probability. Data-science-wise this is a ‘binary classification by regression’ solution, but as a model consumer you don’t need to know all of this (although it definitely helps to have some basic knowledge).

Configuring the project

Here’s the solution setup. We added the ZIP file as Content in the Assets folder, and added the Microsoft.ML NuGet Package:

Configuration

This NuGet package contains most but not all of the algorithms. You may need to add more packages to bring your particular model to life (e.g. when it uses Fast Tree or Matrix Factorization).

NuGets

The model’s learning algorithm determines the output schema and the required NuGet package. The Sentiment Analysis model was built around a SdcaLogisticRegressionBinaryTrainer. Its documentation has all the necessary links:

TrainerDocumentation

Deserializing the model

In the code, we first create an MLContext as context for the other calls, and we retrieve the physical path of the model file. Then we call Model.Load() to inflate the model:

var mlContext = new MLContext(seed: null);

var appInstalledFolder = Windows.ApplicationModel.Package.Current.InstalledLocation;
var assets = await appInstalledFolder.GetFolderAsync("Assets");
var file = await assets.GetFileAsync("sentiment_model.zip");
var filePath = file.Path;

var model = mlContext.Model.Load(
    filePath: filePath,
    inputSchema: out _);

The original model file was saved without schema information, so we used the C# 7 discard (‘_’) to avoid wasting an unused local variable to the inputSchema parameter.

Defining the schema

If the schema is not persisted with the model, then you need to dive in its original source code or documentation. Here are the original input and output classes for the Sentiment Analysis model:

public class SentimentData
{
    public string SentimentText;
    [ColumnName("Label")]
    public bool Sentiment;
}

public class SentimentPrediction : SentimentData
{
    [ColumnName("PredictedLabel")]
    public bool Prediction { get; set; }
    public float Probability { get; set; }
    public float Score { get; set; }
}

Since our app will not be used to train or evaluate the model, we can get away with a simplified version of this schema. There’s no need for the TextLoader attributes or a Label column:

public class SentimentData
{
    public string SentimentText;
}

public class SentimentPrediction
{
    public bool PredictedLabel { get; set; }

    public float Probability { get; set; }

    public float Score { get; set; }

    public string SentimentAsText => PredictedLabel ? "positive" : "negative";
}

We recommend model builders to be developer friendly, and save the input schema together with the model – it’s a parameter in the Model.Save() method. This allows the consumer of the model to inspect it when loading the model. Here’s how this looks like (screenshots from another sample app):

InputSchema

When you have the input schema, you can discover or verify the output schema by creating a strongly typed IDataView and passing it to GetOutputSchema():

// Double check the output schema.
var dataView = mlContext.Data.LoadFromEnumerable<SentimentData>(new List<SentimentData>());
var outputSchema = model.GetOutputSchema(dataView.Schema);

Here’s how that looks like when debugging:

OutputSchema

Inference time

Once the input and output classes are defined, we can turn the deserialized model into a strongly typed prediction engine with a call to CreatePredictionEngine():

private PredictionEngine<SentimentData, SentimentPrediction> _engine;
_engine = mlContext.Model.CreatePredictionEngine<SentimentData, SentimentPrediction>(model);

The Predict() call takes the input (the piece of text) and runs the entire pipeline –which is a black box to the consumer- to return the result:

var result = _engine.Predict(new SentimentData { SentimentText = RatingText.Text });
ResultText.Text = $"With a score of {result.Score} " +
                  $"we are {result.Probability * 100}% " +
                  $"sure that the tone of your comment is {result.SentimentAsText}.";

The sample app compares the sentiment (positive or negative) of the text in the text box to the number of stars in the rating indicator. If these correspond, then we accept the feedback form:

FiveStarsAndPositive

When the sentiment does not correspond to the number of stars, we treat the feedback as suspicious:

OneStarButPositive

A ZIP file, an input and an output class, and five lines of code – that’s all you need for using an existing ML.NET model in your app. So far, so good.

A Word of Warning

We had two reasons to place this code in a separate app instead of in our larger UWP-ML.NET-MVVM sample app. The first reason is to show how simple and easy it is to embed and consume an ML.NET Machine Learning model in your app.

The second reason is that we needed a simple sentinel to keep an eye on the evolution of the whole “UWP-ML.NET-.NET Core-.NET Native” ecosystem. The sample app runs fine in debug mode, but in release mode it crashes. Both the creation of the MLContext and the deserialization of the model seem successful, but the creation of the prediction engine fails:

ReleaseRun

ML.NET does not fully support UWP (yet)

The UWP platform targets several devices, each with very different hardware capabilities. The fast startup for .NET applications, their small memory footprint, as well as their independence from other apps and configurations is achieved by relying on .NET Native. A release build in Visual Studio creates a native app by compiling it together with its dependencies ahead of time and stripping off al the unused code. The runtime does not come with a just-in-time compiler, so you have to be careful (i.e. you have to provide the proper runtime directives) with things like reflection and deserialization (which happen to be two popular techniques within ML.NET). On top of that, there’s the sandbox that prohibits calls to some of the Win32 API’s from within a UWP app.

When compiling the small sample app using the .NET Native Tool-Chain you’ll see that it issues warnings for some of the internal utilities projects:

ReleaseBuild

In practice, these warnings indicate that ‘some things may break at runtime’ and it is not something that we can work around as a developer.

The good news is that these problems are known, an given a pretty high priority:

UwpCompatibility

The bad news is that it’s a multi team effort, involving UWP, ML.NET,.NET Core as well as .NET Native. The teams have come a long way already – a few months ago nothing even worked at debug time (that’s against the CoreCLR). But it’s extremely hard to set a deadline or a target date for full compatibility.

In one of our previous articles, we mentioned WinML as an alternative for consuming Machine Learning models. It still is, except that WinML requires the models to be persisted in the ONNX format and … export to ONNX from ML.NET is currently locked down.

In the meantime the UWP platform itself is heading to a new future with ahead-of-time compilation in .NET Core and a less restrictive sandbox. So eventually all the puzzle pieces will fall together. We just don’t know how and when. Anyway, each time that one of the components of the ecosystem is upgraded, we’ll upgrade the sample app and see what happens…

The Code

All the code for embedding and consuming an ML.NET model in a UWP app, is in the code snippets in this article. If you want to take the sample app for a spin: it lives here on GitHub.

Enjoy!