Plotting a confusion matrix (https://github.com/perellonieto/PyCalib/blob/master/pycalib/visualisations/__init__.py#L382) is useful for evaluating the performance of a multiclass classification or calibration method, but I am not sure whether this is out of scope for this library.
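
For context, here is a minimal sketch of the counts such a plot would display. It is not the PyCalib implementation linked above; the function name `confusion_matrix`, its `n_classes` parameter, and the example labels are all hypothetical, and it assumes integer class labels in `0..n_classes-1`.

```python
from collections import Counter


def confusion_matrix(y_true, y_pred, n_classes):
    """Count (true, predicted) label pairs.

    Hypothetical helper: rows index the true class, columns the predicted class.
    Assumes labels are integers in the range 0..n_classes-1.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    counts = Counter(zip(y_true, y_pred))
    # Counter returns 0 for pairs that never occur, so every cell is filled.
    return [[counts[(t, p)] for p in range(n_classes)] for t in range(n_classes)]


# Made-up labels for a 3-class problem, for illustration only.
y_true = [0, 0, 1, 1, 2, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0, 2]

cm = confusion_matrix(y_true, y_pred, n_classes=3)
for row in cm:
    print(" ".join(f"{count:3d}" for count in row))
```

The diagonal holds the correctly classified samples and each off-diagonal cell shows which pair of classes gets confused, which is the information a heatmap of this matrix would make visible at a glance.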