[1912.09322v1] PySS3: A Python package implementing a novel text classifier with visualization tools for Explainable AI
This software should be helpful for researchers and practitioners who need to deploy explainable and trustworthy machine learning models for text classification.
A recently introduced text classifier, called SS3, has obtained state-of-the-art performance on the CLEF eRisk tasks. SS3 was designed for risk detection over text streams and therefore not only supports incremental training and classification but can also visually explain its rationale. However, little attention has been paid to SS3's potential as a general-purpose classifier, which we believe may be due to the lack of an open-source implementation. In this work, we introduce PySS3, a package that not only implements SS3 but also comes with visualization tools that allow researchers to deploy robust, explainable, and trustworthy machine learning models for text classification.
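SS3's visual explanations are possible because the model assigns each word a per-category confidence value and aggregates these values to classify a document. The following is only a toy sketch of that general idea, assuming a deliberately simplified value function (a max-normalized relative frequency); the actual SS3 value function, with its smoothness, significance, and sanction hyperparameters, is more elaborate, and the class name `ToySS3` is made up for illustration.

```python
# Toy sketch of SS3's core idea: each word gets a per-category
# confidence value, and a document's category scores are the sums
# of its words' values. This is NOT the real SS3 formula, only a
# simplified stand-in that shows why word-level visual explanations
# fall out of the model naturally.
from collections import Counter

class ToySS3:
    def __init__(self):
        self.word_counts = {}   # category -> Counter of words
        self.totals = {}        # category -> total word count

    def fit(self, docs, labels):
        # Training is incremental: counts can simply be updated with
        # new documents (SS3 itself supports incremental training).
        for doc, cat in zip(docs, labels):
            wc = self.word_counts.setdefault(cat, Counter())
            wc.update(doc.lower().split())
            self.totals[cat] = sum(wc.values())

    def value(self, word, cat):
        # Relative frequency of `word` in `cat`, normalized by its
        # maximum relative frequency across all categories: close to 1
        # when the word is (almost) exclusive to `cat`, and 0 when it
        # never appears. These per-word values are what a tool like
        # Live Test could color word by word.
        freqs = {c: self.word_counts[c][word] / self.totals[c]
                 for c in self.word_counts}
        m = max(freqs.values())
        return freqs[cat] / m if m > 0 else 0.0

    def predict(self, doc):
        # The document's score for each category is the sum of its
        # words' values; the highest-scoring category wins.
        scores = {c: sum(self.value(w, c) for w in doc.lower().split())
                  for c in self.word_counts}
        return max(scores, key=scores.get)
```

Because classification is a sum of per-word contributions, highlighting each word with its value for the winning category directly explains the decision, which is the property the Live Test tool exploits.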
Figure 1: Live Test screenshot. On the left side, the list of test documents, grouped by category, is shown along with the percentage of success. In this screenshot, the “doc 2” document was misclassified and is marked with an exclamation mark (!), easing error analysis. The user has selected “doc 1”, whose classification result is shown above the visual description. Here, the user has chosen to display the visual explanation at the sentence-and-word level, using mixed topics. For instance, the user can confirm that, apparently, the model has learned to recognize important words and has correctly classified the document. At the sentence level, the user can see that the first sentence was considered to belong to multiple topics; the topic then shifts to sports, since the sentence is colored in orange. However, from the sentence beginning with “Meanwhile” onward, the topic is mostly technology, with a little business contributed by the words “investment” and “engage”, colored in green. Note that the user can also edit the document text, or even create a new document, using the two buttons in the upper-right corner.

Figure 2: Evaluation plot screenshot. Each data point represents an evaluation/experiment performed using a particular combination of hyperparameter values. Points are colored according to the obtained performance; the performance measure can be changed interactively using the options panel in the upper-left corner. Additionally, points with the global best performance are marked in pink. As shown in this figure, when the user moves the cursor over a point, information related to that evaluation is displayed, including a small version of the obtained confusion matrix.

Figure 3: Evaluation plot with the “show volume” option enabled.