Log binary classification metrics to Neptune

Train your model and run predictions

Let’s train a model on a synthetic problem and run predictions on the test data.

[2]:
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

X, y = make_classification(n_samples=2000)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

model = RandomForestClassifier()
model.fit(X_train, y_train)

y_test_pred = model.predict_proba(X_test)
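
y_test_pred holds class probabilities with shape (n_samples, 2), where column 1 is the probability of the positive class. The logging helpers used later take either the full two-column array or just that positive-class column. The quick check below is purely illustrative and not part of the original notebook:

[ ]:
# Inspect the prediction format the logging helpers expect.
print(y_test_pred.shape)      # (400, 2) with the 2000-sample, 80/20 split above
print(y_test_pred[:5, 1])     # positive-class probabilities for the first few samples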

Instantiate Neptune

[ ]:
import neptune


neptune.init(project_qualified_name='USER_NAME/PROJECT_NAME')
neptune.create_experiment()
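
The call above assumes your credentials are available in the NEPTUNE_API_TOKEN environment variable. If they are not, you can pass the token explicitly — a minimal sketch assuming the same legacy neptune-client API, where the token and the experiment name are placeholders:

[ ]:
import neptune

# api_token is only needed when the NEPTUNE_API_TOKEN environment variable is not set.
neptune.init(
    project_qualified_name='USER_NAME/PROJECT_NAME',
    api_token='YOUR_API_TOKEN',  # placeholder - use your own token
)
neptune.create_experiment(name='binary-classification-metrics')  # illustrative name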

Send all binary classification metrics to Neptune

With just one function call you can log all of the following information.

Class-based metrics:

  • accuracy
  • precision, recall
  • f1_score, f2_score
  • matthews_corrcoef
  • cohen_kappa
  • true_positive_rate, true_negative_rate
  • false_positive_rate, false_negative_rate
  • positive_predictive_value, negative_predictive_value, false_discovery_rate

Threshold-based charts are also logged for each of the class-based metrics above.

Performance charts:

  • Confusion matrix
  • Classification Report Table
  • ROC AUC curve
  • Precision-Recall curve
  • Lift curve
  • Cumulative gain chart
  • Kolmogorov-Smirnov statistic chart

Losses:

  • Log loss
  • Brier loss

Other metrics:

  • ROC AUC score
  • Average precision
  • KS-statistic score
[2]:
from neptunecontrib.monitoring.metrics import log_binary_classification_metrics

log_binary_classification_metrics(y_test, y_test_pred, threshold=0.5)

Everything is now safely logged in Neptune. Check out this experiment.

[Image: binary classification metrics]

Log things separately

You can also choose what to log and do it separately.

[ ]:
from neptunecontrib.monitoring.metrics import *

threshold = 0.5  # decision threshold for turning probabilities into class labels

log_confusion_matrix(y_test, y_test_pred[:, 1] > threshold)
log_classification_report(y_test, y_test_pred[:, 1] > threshold)
log_class_metrics(y_test, y_test_pred[:, 1] > threshold)
log_class_metrics_by_threshold(y_test, y_test_pred[:, 1])
log_roc_auc(y_test, y_test_pred)
log_precision_recall_auc(y_test, y_test_pred)
log_brier_loss(y_test, y_test_pred[:, 1])
log_log_loss(y_test, y_test_pred)
log_ks_statistic(y_test, y_test_pred)
log_cumulative_gain(y_test, y_test_pred)
log_lift_curve(y_test, y_test_pred)
log_prediction_distribution(y_test, y_test_pred[:, 1])
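
When you are done logging, close the experiment so that everything syncs — a minimal sketch, assuming the legacy neptune-client API used above:

[ ]:
# Mark the experiment as finished once all metrics are logged.
neptune.stop()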