
TextClassification – Part 2

In API 26 (Oreo) a new TextClassification system was introduced. This has been further refined in API 28 (Pie). In this short series we’ll take a look at what this is, how to use it, and how we can add custom behaviours to it.

Previously we looked at how text classification consists of two distinct steps: first expanding the current selection to something which can be classified as a concrete type, and then performing that classification and determining actions. To write our own text classifier, we’ll need to perform both of these steps by overriding the relevant methods of the TextClassifier interface which we’ll implement. Both suggestSelection() and classifyText() have two forms: one which takes individual arguments, and a second which takes a Request instance containing all of these arguments. It is important to override the Request form of each, because the other form is simply a wrapper which constructs the Request instance from the individual arguments and then calls the Request form of the method. Overriding the Request form means that we don’t need to override the other form:

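import android.content.Context
import android.view.textclassifier.TextClassification
import android.view.textclassifier.TextClassifier
import android.view.textclassifier.TextSelection

// TextClassifierFactory and FrameworkFactory are this project's own classes
class StylingAndroidTextClassifier(
        private val context: Context,
        private val fallback: TextClassifier,
        private val factory: TextClassifierFactory = FrameworkFactory()
) : TextClassifier by fallback {

    private val stylingAndroid = "Styling Android"
    private val stylingAndroidUri = "https://blog.stylingandroid.com"
    // "Styling" and "Android" separated by at most one whitespace character, ignoring case
    private val regex = Regex("Styling\\s?Android", RegexOption.IGNORE_CASE)

    // Placeholder: forwards to the fallback until the custom logic is added
    override fun suggestSelection(request: TextSelection.Request): TextSelection {
        return fallback.suggestSelection(request)
    }

    // Placeholder: forwards to the fallback until the custom logic is added
    override fun classifyText(request: TextClassification.Request): TextClassification {
        return fallback.classifyText(request)
    }
}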
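
The relationship between the two forms of each method can be modelled in plain Kotlin. This is only a sketch: SimpleClassifier, Request and WholeTextClassifier are hypothetical stand-ins rather than framework types, and the real TextClassifier methods take more arguments than shown here.

```kotlin
// Stand-in for TextSelection.Request: bundles the individual arguments.
data class Request(val text: CharSequence, val startIndex: Int, val endIndex: Int)

// Hypothetical, simplified stand-in for the TextClassifier interface.
interface SimpleClassifier {
    // Request form: the one an implementation overrides.
    fun suggestSelection(request: Request): IntRange =
        request.startIndex until request.endIndex

    // Individual-argument form: only builds a Request and forwards it.
    fun suggestSelection(text: CharSequence, startIndex: Int, endIndex: Int): IntRange =
        suggestSelection(Request(text, startIndex, endIndex))
}

// Overrides the Request form only; always selects the whole text.
class WholeTextClassifier : SimpleClassifier {
    override fun suggestSelection(request: Request): IntRange = request.text.indices
}

fun main() {
    val classifier: SimpleClassifier = WholeTextClassifier()
    // Calling the individual-argument form still reaches the override.
    println(classifier.suggestSelection("Styling Android", 2, 5)) // 0..14
}
```

Because the individual-argument form does nothing except forward, a single override covers both entry points in this model.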

Most of the methods will be proxied to a fallback TextClassifier instance which will actually be the default System TextClassifier so that we get all of its classifications if our custom TextClassifier instance fails to detect a match. We’ll override just two methods which will perform our custom classification.
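
The proxying relies on Kotlin's class delegation. As a minimal sketch of that language feature, here is a runnable example using a hypothetical two-method Classifier interface in place of TextClassifier; SystemClassifier and CustomClassifier are likewise invented for illustration and return plain strings rather than framework objects.

```kotlin
// Hypothetical stand-in for TextClassifier with just two methods.
interface Classifier {
    fun suggestSelection(text: String): String
    fun classifyText(text: String): String
}

// Plays the role of the default system classifier.
class SystemClassifier : Classifier {
    override fun suggestSelection(text: String) = "system selection for '$text'"
    override fun classifyText(text: String) = "system classification for '$text'"
}

// `by fallback` generates forwarding implementations for every
// interface member that this class does not override itself.
class CustomClassifier(private val fallback: Classifier) : Classifier by fallback {
    override fun suggestSelection(text: String): String =
        if (text.contains("Styling Android", ignoreCase = true)) {
            "custom selection"
        } else {
            fallback.suggestSelection(text) // no custom match: defer to the fallback
        }
}

fun main() {
    val classifier = CustomClassifier(SystemClassifier())
    println(classifier.suggestSelection("Styling Android")) // custom selection
    println(classifier.suggestSelection("example.com"))     // system selection for 'example.com'
    println(classifier.classifyText("example.com"))         // system classification for 'example.com'
}
```

Only suggestSelection() is written by hand here; classifyText() reaches the fallback through the generated forwarding code.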

Our custom TextClassifier will detect the string “Styling Android” and create a custom action which will open a link to “https://blog.stylingandroid.com” in the browser, with a custom title and icon. Let’s begin by looking at how we’ll override the suggestSelection() method. In the first article we looked at how the current user selection will be expanded to the smallest concrete type that encloses the current selection. The algorithm I’ve opted to use for this is a little crude, and perhaps not the most performant (particularly if the text is very long), but is nonetheless effective. First it will search the entire string for the regex Styling\s?Android (case-insensitive) so that we find “Styling Android”, “styling android”, “StylingAndroid”, and all other case combinations. Then it will compare the range of each match to the current selection, and if the current selection falls entirely inside the range of one of the matches, the selection will be expanded to that range:

override fun suggestSelection(request: TextSelection.Request): TextSelection {
    return findRangeOfMatch(request)
            ?: fallback.suggestSelection(request)
}

private fun findRangeOfMatch(request: TextSelection.Request): TextSelection? {
    return regex.findAll(request.text)
            .firstOrNull { it.range.contains(request.startIndex until request.endIndex) }
            ?.range
            ?.let {
                factory.buildTextSelection(it.start, it.endInclusive + 1, TextClassifier.TYPE_URL, 1.0f)
            }
}

private fun <T : Comparable<T>> ClosedRange<T>.contains(range: ClosedRange<T>) =
        contains(range.start) && contains(range.endInclusive)

The findRangeOfMatch() method performs all of this logic, and returns null if it fails to find a match. In that case the Elvis operator in suggestSelection() results in a call to suggestSelection() on the default system TextClassifier, so that we then try to match the types that it supports.

The factory instance is an object factory that I’ve introduced to keep the code testable, and there are a couple of unit tests to check that this is performing as we expect.
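
To illustrate why a factory helps with testing, here is a rough sketch that runs without the Android framework. Everything in it is an assumption made for illustration: Selection stands in for TextSelection, SelectionFactory is a hypothetical simplification of the project's TextClassifierFactory (whose real shape may differ), and WholeTextSelector is an invented class under test.

```kotlin
// Stand-in for TextSelection so that the sketch runs on a plain JVM.
data class Selection(
    val startIndex: Int,
    val endIndex: Int,
    val entityType: String,
    val confidenceScore: Float
)

// Hypothetical factory abstraction; the real TextClassifierFactory may differ.
interface SelectionFactory {
    fun buildTextSelection(
        startIndex: Int,
        endIndex: Int,
        entityType: String,
        confidenceScore: Float
    ): Selection
}

// Test double: records everything it builds so a test can assert on the arguments.
class RecordingFactory : SelectionFactory {
    val built = mutableListOf<Selection>()

    override fun buildTextSelection(
        startIndex: Int,
        endIndex: Int,
        entityType: String,
        confidenceScore: Float
    ): Selection =
        Selection(startIndex, endIndex, entityType, confidenceScore).also { built += it }
}

// Invented class under test: it only talks to the factory, never to a framework builder.
class WholeTextSelector(private val factory: SelectionFactory) {
    fun select(text: CharSequence): Selection =
        factory.buildTextSelection(0, text.length, "url", 1.0f)
}

fun main() {
    val factory = RecordingFactory()
    val selection = WholeTextSelector(factory).select("Styling Android")
    check(factory.built == listOf(selection))
    check(selection.endIndex == 15)
}
```

In production code the factory implementation would delegate to the framework builder instead, while tests swap in a double such as RecordingFactory.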

The findRangeOfMatch() method first searches for all instances of the regex within the text by calling findAll(). This returns a Sequence containing a MatchResult for each match, and we use firstOrNull to narrow this to either the first match whose range contains the current selection, or null if there is none. The ?.range and ?.let calls then invoke the factory to construct the TextSelection instance, but only if such a match was found. The safe-call operators ensure null safety, so the method as a whole returns null if no valid match is found. Note that the code adds 1 to endInclusive, because the range of a match has an inclusive end whereas the end index of a selection is treated as exclusive. The contains extension function is a convenience function which checks whether one range falls entirely within a larger range, and is there to improve readability.
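
The same search-and-expand logic can be tried out on a plain JVM. The following is a sketch under stated assumptions rather than the article's code: expandSelection and encloses are hypothetical names, and the result is a simple Pair of start (inclusive) and end (exclusive) indices instead of a TextSelection.

```kotlin
// Same pattern as the article: an optional whitespace character between the words.
private val regex = Regex("Styling\\s?Android", RegexOption.IGNORE_CASE)

// True when `inner` lies entirely within the receiver (both ends inclusive).
private fun IntRange.encloses(inner: IntRange) =
    contains(inner.first) && contains(inner.last)

// Returns the expanded selection as (start inclusive, end exclusive),
// or null when no match encloses the current selection.
fun expandSelection(text: CharSequence, startIndex: Int, endIndex: Int): Pair<Int, Int>? =
    regex.findAll(text)
        .map { it.range }
        .firstOrNull { it.encloses(startIndex until endIndex) }
        ?.let { it.first to it.last + 1 }

fun main() {
    val text = "I read StylingAndroid and styling android"
    println(expandSelection(text, 9, 12))  // (7, 21): inside the first match
    println(expandSelection(text, 30, 33)) // (26, 41): inside the second match
    println(expandSelection(text, 2, 6))   // null: "read" is not part of any match
    println(expandSelection(text, 5, 10))  // null: only partly overlaps a match
}
```

The last call shows that a selection which merely overlaps a match is not expanded; it has to fall entirely inside the match.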

When we construct the TextSelection instance, we provide it with the start and end indices of the expanded range and the concrete type that we have identified (in this case we’ll report it as a URL using the TextClassifier.TYPE_URL constant), and give a confidence score of 1.0f as we’re certain that this is a positive match.

The factory method implementation uses TextSelection.Builder to create a TextSelection instance from these arguments:

override fun buildTextSelection(
        startIndex: Int,
        endIndex: Int,
        entityType: String,
        confidenceScore: Float
): TextSelection {
    return TextSelection.Builder(startIndex, endIndex)
            .setEntityType(entityType, confidenceScore)
            .build()
}

Next we’ll need to implement the classifyText() method to perform the classification, and we’ll do that in the concluding article in this series.

Although we don’t yet have a working solution, the work in progress source code for this article is available here.

© 2018, Mark Allison. All rights reserved.

Copyright © 2018 Styling Android. All Rights Reserved.
Information about how to reuse or republish this work may be available at http://blog.stylingandroid.com/license-information.
