Get Started With Natural Language Processing in iOS 11

Machine learning has undoubtedly been one of the hottest topics over the past year, with companies of all kinds trying to make their products more intelligent to improve user experiences and differentiate their offerings. 

Now Apple has entered the race to provide developer-facing machine learning. Core ML makes it easy for developers to add deep machine learning to their apps.

Just by taking a look at your iOS device, you will see machine learning incorporated in almost every system app—the most obvious being Siri. For example, when you send text messages, Apple uses Natural Language Processing (NLP) to either predict your next word or intelligently suggest a correction whilst typing a word. Expect machine learning and NLP to continue to become ever-present and further ingrained in our use of technology, from search to customer service. 

Objectives of This Tutorial

This tutorial will introduce you to a subset of machine learning: Natural Language Processing (NLP). We’ll cover what NLP is and why it’s worth implementing, before looking at the various layers or schemes that make up NLP. These include:

  • language identification
  • tokenization
  • part of speech identification
  • named entity recognition

After going through the theory of NLP, we will put our knowledge to practice by creating a simple Twitter client which analyzes tweets. Go ahead and clone the tutorial’s GitHub repo and take a look.

Assumed Knowledge

This tutorial assumes you are an experienced iOS developer. Although we will be working with machine learning, you don’t need any background in the subject. Additionally, while other components of Core ML require some knowledge of Python, we won’t be working with any Python-related aspects for NLP. 

Introduction to Machine Learning and NLP

The goal of machine learning is for a computer to do tasks without being explicitly programmed to do so—the ability to think or interpret autonomously. A high-profile contemporary use-case is autonomous driving: giving cars the ability to visually interpret their environment and drive unaided. 

Beyond visual recognition, machine learning has also introduced speech recognition, intelligent web searching, and more. With Google, Microsoft, Facebook and IBM at the forefront of popularizing machine learning and making it available to ordinary developers, Apple has also decided to move in that direction and make it easier for machine learning to be incorporated into third-party applications. 

Core ML is new to Apple’s family of SDKs, introduced as part of iOS 11 to allow developers to implement a wide variety of machine learning models and deep learning layer types. 

Core ML technology stack (source: Apple)

Natural Language Processing (NLP) sits on top of Core ML, alongside two other powerful frameworks: Vision and GameplayKit. Vision provides developers with the ability to implement computer vision machine learning to accomplish things such as detecting faces, landmarks, or other objects, while GameplayKit provides game developers with tools for authoring games and specific gameplay features. 

In this tutorial, we will focus on Natural Language Processing. 

Natural Language Processing (NLP)

Natural Language Processing is the science of being able to analyze and comprehend text, breaking down sentences and words to accomplish tasks such as sentiment analysis, relationship extraction, stemming, text or sentence summarization, and more. Or to put it simply, NLP is the ability for computers to understand human language in its naturally spoken or written form.

Diagram showing how Natural Language Processing works

The ability to extract and encapsulate words and sentences contextually allows for improved integration between users and devices, or even between two devices, through meaningful chunks of content. We will explore each of these components in detail shortly, but firstly it is important to understand why you would want to implement NLP.

Why Implement Natural Language Processing? 

With companies continuing to rely on the storing and processing of big data, NLP enables the interpretation of free and unstructured text, making it analyzable. With much information stored in unstructured text files—in medical records, for example—NLP can sift through troves of data and surface information about context, intent, and even sentiment. 

Beyond being able to analyze spoken and written text, NLP has now become the engine behind bots—from ones in Slack that you can almost have a complete human conversation with, to tools for customer service. If you go to Apple’s support website and request to speak to customer service, you will be presented with a web bot that will try to point you in the right direction based on the question you’ve asked. It helps customers feel understood in real time, without actually needing to speak to a human. 

Consider email spam filters: NLP makes it possible to understand the text of emails better, and to classify them with greater certainty about their intent. 

Summarization is an important NLP technique for sentiment analysis, something companies may want to apply to data from their social media accounts in order to track the perception of their products. 

Machine Learning and NLP at work (source: Apple)

The Photos app on iOS 11 is another good example. When you search for photos, machine learning works on multiple levels. Besides using machine learning and vision to recognize the face and type of photo (i.e. beach, location), search terms are filtered through NLP, so a search for the term ‘beaches’ will also return photos with the description ‘beach’. This is called lemmatization, and you will learn more about it below; it’s a good illustration of how powerful machine learning is, and how easy Apple makes it for you to make your apps more intelligent. 

With your app having better understanding of, for example, a search string, it will be able to interact more intelligently with users, understanding the intent behind the search term rather than taking the word in its literal sense. By embracing Apple’s NLP library, developers can support a consistent text processing approach and user experience across the entire Apple ecosystem, from iOS to macOS, tvOS, and watchOS. 

With machine learning performed on-device, users benefit from the device’s CPU and GPU for computational efficiency, instead of relying on external machine learning APIs. This allows user data to stay on the device and reduces latency from network access. Machine learning requires a more intimate knowledge of users in order to infer suggestions and predictions, so by keeping processing on the physical device, and using differential privacy for any network-related activity, you can provide an intelligent yet non-invasive experience for your users. 

Next, we’ll take a look at the makeup of Apple’s Natural Language Processing engine.

Introducing NSLinguisticTagger

The Foundation class NSLinguisticTagger plays a central role in analyzing and tagging text, segmenting content into paragraphs, sentences, and words. It is made up of the following schemes:

NSLinguisticTagger components (source: Apple)

When you initialize NSLinguisticTagger, you pass in the NSLinguisticTagScheme you are interested in analyzing. For example:

let tagger = NSLinguisticTagger(tagSchemes: [.language, .tokenType, ...], options: 0)

You would then set up the various arguments and properties, including passing in the input text, before enumerating through the NSLinguisticTagger instance object, extracting entities and tokens. Let’s dive deeper and see how to implement each of the schemes, step by step, starting with the language identification scheme. 

Language Identification

The first tag scheme type, language identification, attempts to identify the BCP-47 language most prominent at either a document, paragraph, or sentence level. You can retrieve this language by accessing the dominantLanguage property of the NSLinguisticTagger instance object:

...
let tagger = NSLinguisticTagger(tagSchemes: [.language], options: 0)
...
tagger.string = "Some text in a language or two" 
let language = tagger.dominantLanguage // e.g. "en" is returned for English.

Pretty straightforward. Next, we’ll look at classifying text using the tokenization method.

Tokenization

Tokenization is the process of demarcating and possibly classifying sections of a string of input characters. The resulting tokens are then passed on to some other form of processing. (source: Wikipedia)

Taking a block of text, tokenization would logically decompose and classify that text into paragraphs, sentences, and words. We start off by setting the appropriate scheme (.tokenType) for the tagger. Unlike the previous scheme, we are expecting multiple results, and we need to enumerate through the returned tags, as illustrated in the example below:

let tagger = NSLinguisticTagger(tagSchemes: [.tokenType], options: 0)

tagger.string = textString
let range = NSRange(location: 0, length: textString.utf16.count)

// Setting various options, such as omitting whitespace and punctuation
let options: NSLinguisticTagger.Options = [.omitPunctuation, .omitWhitespace]

//We enumerate through the tagger, using the properties set above
tagger.enumerateTags(in: range, unit: .word, scheme: .tokenType, options: options) { tag, tokenRange, stop in
    let token = (textString as NSString).substring(with: tokenRange)
    // Handle each token (e.g. add to an array)
}

Now we have a list of words. But wouldn’t it be interesting to get the origins of those words? So for example, if a user searches for a term like ‘walks’ or ‘walking’, it would be really useful to get the origin word, ‘walk’, and classify all these permutations of ‘walk’ together. This is called lemmatization, and we will cover that next. 

Lemmatization 

Lemmatization groups together the inflected forms of a word so they can be analyzed as a single item, allowing you to infer the intended meaning. Essentially, all you need to remember is that it derives the dictionary form of the word.

Knowing the dictionary form of a word is really powerful and allows your users to search with greater ‘fuzziness’. In the previous example, we considered a user searching for the term ‘walking’. Without lemmatization, you would only be able to return literal mentions of that word, but if you were able to consider other forms of the same word, you would also be able to get results that mention ‘walk’. 

Similarly to the previous example, to perform lemmatization, we would set the scheme in the tagger initialization to .lemma, before enumerating the tags:

...
tagger.enumerateTags(in: range, unit: .word, scheme: .lemma, options: options) { tag, tokenRange, stop in
    if let lemma = tag?.rawValue {
        // Handle each lemma item
    }
}
...

Next up, we’ll look at part of speech tagging, which allows us to classify a block of text as nouns, verbs, adjectives, or other parts. 

Part of Speech (PoS)

Part of Speech tagging aims to associate the part of the speech to each specific word, based on both the word’s definition and context (its relationship to adjacent and related words). As an element of NLP, part of speech tagging allows us to focus on the nouns and verbs, which can help us infer the intent and meaning of text. 

Implementing part of speech tagging involves setting the tagger property to use .lexicalClass, and enumerating in the same manner demonstrated in the previous examples. You will get a decomposition of your sentence into words, with an associative tag for each, classifying the word as a noun, preposition, verb, adjective, or determiner. For more information on what these mean, refer to Apple’s documentation covering the Lexical Types. 
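As a sketch of what this looks like in practice (the sample string here is our own), part of speech tagging follows the same enumeration pattern as tokenization, only with the .lexicalClass scheme:

```swift
import Foundation

let textString = "The quick brown fox jumps over the lazy dog"

// Initialize the tagger with the .lexicalClass scheme for part of speech tagging.
let tagger = NSLinguisticTagger(tagSchemes: [.lexicalClass], options: 0)
tagger.string = textString

let range = NSRange(location: 0, length: textString.utf16.count)
let options: NSLinguisticTagger.Options = [.omitPunctuation, .omitWhitespace]

// Enumerate word by word, printing each word alongside its lexical class,
// e.g. "fox: Noun" or "jumps: Verb".
tagger.enumerateTags(in: range, unit: .word, scheme: .lexicalClass, options: options) { tag, tokenRange, _ in
    if let tag = tag {
        let word = (textString as NSString).substring(with: tokenRange)
        print("\(word): \(tag.rawValue)")
    }
}
```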

Another process within Apple’s NLP stack is Named Entity Recognition, which decomposes blocks of text, extracting specific entity types that we are interested in, such as names, locations, organizations, and people. Let’s look at that next. 

Named Entity Recognition 

Named Entity Recognition is one of the most powerful NLP classification tagging components, allowing you to classify named real-world entities or objects in your sentence (i.e. locations, people, names). As an iPhone user, you have likely already seen this in action when texting your friends: certain keywords, such as phone numbers, names, or dates, are highlighted. 

You can implement Named Entity Recognition in a similar fashion as our other examples, setting the tag scheme to .nameType, and looping through the tagger by a specific range. 
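As a minimal sketch (the sample sentence here is our own), the enumeration filters the returned tags down to the entity types we care about:

```swift
import Foundation

let textString = "Tim Cook spoke at Apple Park in Cupertino"

// We only care about these three entity tag types.
let entityTags: [NSLinguisticTag] = [.personalName, .placeName, .organizationName]

let tagger = NSLinguisticTagger(tagSchemes: [.nameType], options: 0)
tagger.string = textString

let range = NSRange(location: 0, length: textString.utf16.count)
// .joinNames treats multi-word names such as "Tim Cook" as a single token.
let options: NSLinguisticTagger.Options = [.omitPunctuation, .omitWhitespace, .joinNames]

// Enumerate word by word, keeping only tokens tagged as a person, place,
// or organization.
tagger.enumerateTags(in: range, unit: .word, scheme: .nameType, options: options) { tag, tokenRange, _ in
    if let tag = tag, entityTags.contains(tag) {
        let entity = (textString as NSString).substring(with: tokenRange)
        print("\(entity): \(tag.rawValue)")
    }
}
```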

Next, you’ll put what you’ve learned into action with a simple app that takes a predetermined set of tweets and passes each one through the NLP pipeline. 

Implementing Natural Language Processing

To wrap things up, we’ll take a look at a simple Twitter client app that retrieves five tweets in a table view and applies some NLP processing to each one.

In the following screenshot, we used NLP’s Named Entity Recognition to highlight the key entity words (organizations, locations etc.) in red.

Phone screenshot with key words highlighted in red

Go ahead and clone the TwitterNLPExample project from the tutorial GitHub repo and take a quick look at the code. The class we are most interested in is TweetsViewController.swift. Let’s take a look at its tableView(_:cellForRowAt:) method.

override func tableView(_ tableView: UITableView, cellForRowAt indexPath: IndexPath) -> UITableViewCell {
    // Retrieve the Tweet cell.
    let cell = tableView.dequeueReusableCell(withIdentifier: reuseIdentifier, for: indexPath)

    // Retrieve the Tweet model from loaded Tweets.
    let tweet = tweets[indexPath.row]

    cell.textLabel?.text = tweet.text
    cell.detailTextLabel?.text = "By \(tweet.author.screenName)."
    self.range = NSRange(location: 0, length: tweet.text.utf16.count)
    self.detectLanguage(with: cell.textLabel!)
    self.getTokenization(with: cell.textLabel!)
    self.getNamedEntityRecognition(with: cell.textLabel!)
    self.getLemmatization(with: cell.textLabel!)
    // Return the Tweet cell.
    return cell
}

For each cell (tweet), we call four methods which we will define shortly: 

  • detectLanguage()
  • getTokenization()
  • getNamedEntityRecognition()
  • getLemmatization()

For each of those methods, we call the enumerate method, passing in the scheme and text label to extract the text, as we do to identify the language:

func detectLanguage(with textLabel: UILabel) {
    let _ = enumerate(scheme: .language, label: textLabel)
}

Finally, the enumerate function is where all of the NLP action really happens, taking in the properties and arguments based on the type of NLP processing we intend to do, and storing the results in arrays for us to use later on. For the purposes of this example, we simply print the results to the console so we can observe them. 

func enumerate(scheme: NSLinguisticTagScheme, label: UILabel) -> [String]? {
    var keywords = [String]()
    var tokens = [String]()
    var lemmas = [String]()

    let tags: [NSLinguisticTag] = [.personalName, .placeName, .organizationName]

    let tagger = NSLinguisticTagger(tagSchemes: [scheme], options: 0)
    tagger.string = label.text
    tagger.enumerateTags(in: range!, unit: .word, scheme: scheme, options: options) { tag, tokenRange, _ in
        switch scheme {
        case .lemma:
            if let lemma = tag?.rawValue {
                lemmas.append(lemma)
            }
        case .language:
            print("Dominant language: \(tagger.dominantLanguage ?? "Undetermined")")
        case .nameType:
            if let tag = tag, tags.contains(tag) {
                let name = (label.text! as NSString).substring(with: tokenRange)
                print("entity: \(name)")
                keywords.append(name)
            }
        case .tokenType:
            if let tagVal = tag?.rawValue {
                tokens.append(tagVal.lowercased())
            }
        default:
            break
        }
    }

    if scheme == .nameType {
        let keywordAttrString = NSMutableAttributedString(string: tagger.string!, attributes: nil)

        for name in keywords {
            if let indices = label.text?.indicesOf(string: name) {
                for i in indices {
                    let range = NSRange(i..<name.count + i)
                    keywordAttrString.addAttribute(NSAttributedStringKey.foregroundColor, value: UIColor.red, range: range)
                }
                label.attributedText = keywordAttrString
            }
        }
        return keywords
    } else if scheme == .lemma {
        print("lemmas \(lemmas)")
        return lemmas
    } else if scheme == .tokenType {
        print("tokens \(tokens)")
        return tokens
    }
    return nil
}

For the .nameType Named Entity Recognition scheme, we take the entity keywords we extracted and iterate through the label text to highlight the words that match those entities. You could even take it a step further and make those keywords links—maybe to search for tweets matching those keywords. 
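The link idea could be sketched like this, assuming the keywords array, label, and indicesOf helper from the enumerate method above; the search URL format here is our own assumption for illustration:

```swift
import UIKit

// Sketch: turn each extracted entity into a tappable link instead of (or in
// addition to) coloring it red.
let attrString = NSMutableAttributedString(string: label.text!)

for name in keywords {
    // Percent-encode multi-word entities such as "Tim Cook" for use in a URL.
    guard let query = name.addingPercentEncoding(withAllowedCharacters: .urlQueryAllowed),
          let url = URL(string: "https://twitter.com/search?q=\(query)"),
          let indices = label.text?.indicesOf(string: name) else { continue }

    for i in indices {
        let range = NSRange(i..<name.count + i)
        attrString.addAttribute(.link, value: url, range: range)
    }
}

label.attributedText = attrString
```

Note that UILabel doesn’t open links on tap by itself; you would typically display such an attributed string in a UITextView, or attach a tap gesture recognizer to the label.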

Go ahead and build and run the app and take a look at the output, paying particular attention to the lemmas and entities we have extracted. 

Conclusion

From Google leveraging Natural Language Processing in its search engines to Apple’s Siri and Facebook’s messenger bots, there is no doubt that this field is growing exponentially. But NLP and Machine Learning are no longer the exclusive domain of large companies. By introducing the Core ML framework earlier this year, Apple has made it easy for everyday developers without a background in deep learning to be able to add intelligence into their apps.

In this tutorial, you saw how with a few lines of code you can use Core ML to infer context and intent from unstructured sentences and paragraphs, as well as detect the dominant language. We will be seeing further improvements in future iterations of the SDK, but NLP is already promising to be a powerful tool that will be widely used in the App Store.

While you’re here, check out some of our other posts on iOS app development and machine learning!
