Insights Report

Statistical Report of Model Training Performance
After training a model, you will want to review how well it performed. The Insights Report provides this information.
Once you have selected your data (for this section we use a stroke prediction dataset from Kaggle) and selected Predict, you are brought to the Insights Report. A button (Insights Report) opens a second version of the report for easier sharing. The two versions are identical.
Once model training has run, you can review its performance in the following sections.

Classification Summary

This section reviews how well the model performed against the reserve data. When training, the model uses 80% of your data for training and reserves 20% to test against. In this case, the model predicted 95% of the reserve data correctly. You can retrain the model with different settings and training speeds to see how that changes the accuracy.
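The 80/20 idea can be illustrated with a minimal sketch. This is not Akkio's implementation; the function names `holdout_split` and `accuracy` are hypothetical, and the shuffle-then-slice approach is an assumption about how a reserve set might be carved out.

```python
import random

def holdout_split(rows, test_fraction=0.2, seed=0):
    """Shuffle the rows and reserve a fraction of them for testing.

    Hypothetical helper: an illustration of an 80/20 holdout split,
    not the procedure the product actually uses.
    """
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    # First n_test shuffled rows become the reserve set; the rest train.
    return shuffled[n_test:], shuffled[:n_test]

def accuracy(actual, predicted):
    """Fraction of reserve rows where the prediction matches the truth."""
    correct = sum(1 for a, p in zip(actual, predicted) if a == p)
    return correct / len(actual)

rows = list(range(100))  # stand-in for 100 dataset rows
train_rows, reserve_rows = holdout_split(rows)
print(len(train_rows), len(reserve_rows))
```

With 100 rows, 80 go to training and 20 are held back. The reported accuracy is then the share of those held-back rows that the trained model predicts correctly.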
You can also select 'See accuracy details' to drill down into specifics:

Predictive Performance

Depending on your use case, false positives and false negatives can carry very different costs. This drill-down shows how the model performed on each of those two groups, which is valuable when deciding how to use the model going forward. In this case we have the common situation where there are significantly more false negatives than false positives. Since this is a health forecast, it may be worth tweaking the decision thresholds to be more conservative and err on the side of false positives; we explore that in the Decision Threshold Graph section.
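To make the two error types concrete, here is a small sketch that counts them for a binary outcome. The function name `confusion_counts` and the toy labels are hypothetical and are not taken from the stroke dataset; `1` stands for the outcome of interest.

```python
def confusion_counts(actual, predicted, positive=1):
    """Count true/false positives and negatives for one positive label."""
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            tp += 1  # predicted the outcome, and it happened
        elif p == positive:
            fp += 1  # predicted the outcome, but it did not happen
        elif a == positive:
            fn += 1  # missed an outcome that did happen
        else:
            tn += 1  # correctly predicted no outcome
    return {"tp": tp, "fp": fp, "tn": tn, "fn": fn}

# Toy example (made-up labels): 3 actual positives, 5 actual negatives.
actual = [1, 1, 1, 0, 0, 0, 0, 0]
predicted = [1, 0, 0, 0, 0, 0, 1, 0]
counts = confusion_counts(actual, predicted)
print(counts)
```

In this toy data the model misses two of the three actual positives (false negatives) and raises one false alarm (false positive), which mirrors the imbalance described above.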

Performance Details

This section shows how well the model performs at predicting each outcome.
  • Accuracy - how often a prediction is correct. It is calculated by dividing the number of correct predictions by the total number of predictions.
  • Precision - the fraction of predicted positives that are true positives. This is useful to consider when the cost of a false positive is high, such as in email spam detection. Higher is better.
  • Recall - the fraction of actual positives that your model captures. This is useful to consider when the cost of a false negative is high, such as in malignant cancer prediction. Higher is better.
  • F1 Score - combines precision and recall into one metric (their harmonic mean) to balance the consideration of false positives and false negatives. This is useful for comparing different ML models that predict the same outcome. Higher is better.
  • Count - the number of times this outcome appears in the validation set.
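The metrics above follow standard definitions, which can be sketched in a few lines. The function name `outcome_metrics` and the toy labels are hypothetical; treating accuracy as an overall figure (rather than per outcome) is an assumption about how the report computes it.

```python
def outcome_metrics(actual, predicted, outcome):
    """Compute the metrics above for one outcome label (illustrative sketch)."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == outcome and p == outcome)
    fp = sum(1 for a, p in pairs if a != outcome and p == outcome)
    fn = sum(1 for a, p in pairs if a == outcome and p != outcome)
    correct = sum(1 for a, p in pairs if a == p)

    # Guard against division by zero when an outcome is never predicted
    # or never occurs.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {
        "accuracy": correct / len(pairs),  # correct / total predictions
        "precision": precision,            # true positives / predicted positives
        "recall": recall,                  # true positives / actual positives
        "f1": f1,                          # harmonic mean of precision and recall
        "count": tp + fn,                  # times the outcome actually appears
    }

# Toy labels (made up): 1 = outcome of interest.
actual = [1, 1, 1, 0, 0, 0, 0, 0]
predicted = [1, 0, 0, 0, 0, 0, 1, 0]
metrics = outcome_metrics(actual, predicted, outcome=1)
print(metrics)
```

Note how accuracy (0.625 here) can look reasonable while recall is low (about 0.33), which is why the report shows these metrics side by side.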

Advanced Model Details

Click 'Show Advanced Model Details'.
Akkio tests several models for each training run and returns only the best-performing model for the data. This section details how that model performed over time and states what type of model it is. In this case we used a Deep Neural Network with Attention.

Top Fields

This section shows which fields contributed the most to determining the likelihood of the requested outcome, in this case a stroke. Here you can see that average glucose level accounted for almost a quarter of the determining factors of whether a stroke occurred. This tracks with existing logic and passes the 'sniff test'. On the right of this section you can also see how that field affected the outcome; in this case, higher average glucose levels led to more strokes.

Top Factors

Similar to Top Fields, Top Factors shows which values within specific fields led to the outcome (stroke) most often. This will look similar to the fields data but now targets the information in each field. As you can see here, age between 65 and 82 has more of an impact than blood glucose between 123.94 and 271.74, even though the blood glucose field as a whole contributes more.

Segments

Segments breaks the data up into similar groupings based on outcomes. Shown here is a grouping of patients with a high risk of stroke, along with the values they have in common. Older patients with high glucose and BMI form this high-risk group.

Decision Threshold Graph

This graph breaks your data up into Unlikely, Likely, and Uncertain results. Move the sliders to see how changing the thresholds changes the following values for each group:
  • Densification - the rate at which the outcome of interest occurs in this group vs. overall.
  • Group Size - the number of rows in the group as a percentage of the whole.
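The two values above can be worked through on a small example. This is a sketch under stated assumptions: the function name `threshold_groups`, the cutoffs `low` and `high` (standing in for the two sliders), and the toy probabilities are all hypothetical, and the exact rule the product uses to assign rows to groups is not documented here.

```python
def threshold_groups(probabilities, outcomes, low=0.3, high=0.7):
    """Split rows into Unlikely / Uncertain / Likely and summarize each group.

    Assumed rule: below `low` is Unlikely, at or above `high` is Likely,
    anything in between is Uncertain.
    """
    overall_rate = sum(outcomes) / len(outcomes)
    groups = {"Unlikely": [], "Uncertain": [], "Likely": []}
    for prob, outcome in zip(probabilities, outcomes):
        if prob < low:
            groups["Unlikely"].append(outcome)
        elif prob >= high:
            groups["Likely"].append(outcome)
        else:
            groups["Uncertain"].append(outcome)

    summary = {}
    for name, members in groups.items():
        group_rate = sum(members) / len(members) if members else 0.0
        summary[name] = {
            # Rows in the group as a percentage of all rows.
            "group_size_pct": 100 * len(members) / len(outcomes),
            # Outcome rate in the group relative to the overall rate.
            "densification": group_rate / overall_rate if overall_rate else 0.0,
        }
    return summary

# Toy data (made up): predicted probabilities and actual outcomes (1 = outcome).
probabilities = [0.05, 0.1, 0.2, 0.25, 0.4, 0.5, 0.6, 0.8, 0.9, 0.95]
outcomes = [0, 0, 0, 0, 0, 1, 0, 1, 1, 1]
summary = threshold_groups(probabilities, outcomes)
for name, stats in summary.items():
    print(name, stats)
```

In this toy data the Likely group holds 30% of the rows and the outcome occurs there at 2.5 times the overall rate. Lowering `high` would grow the group but typically reduce its densification, which is the trade-off the sliders let you explore.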

Sample Rows

This section shows sample rows of your data, sorted by the probability of the outcome of interest. Drag the slider to inspect rows with different likelihoods.