Monday, October 27, 2014

Using Stanford NLP to run a sentiment analysis with F#

Stanford NLP is a great tool for text analysis and Sergey Tihon did a great job demonstrating how it can be called from .NET code with C# and F#.

The purpose of this post is to show how Stanford NLP sentiment analysis can be called from an F# application. The code in this example returns a sentiment value, from very negative to very positive, for each sentence of the specified text.

Prerequisites:

    - The NuGet package Stanford.NLP.CoreNLP needs to be installed (this code works with version 3.4.0.0).
    - The Java binaries should be downloaded from http://nlp.stanford.edu/software/stanford-corenlp-full-2014-06-16.zip and unzipped. After that, extract the contents of stanford-corenlp-3.4-models.jar (it is part of the zip file) to a directory of your choice.

Source code:

    F# code is elegant as usual :)

open System
open System.IO
open edu.stanford.nlp.ling
open edu.stanford.nlp.neural.rnn
open edu.stanford.nlp.sentiment
open edu.stanford.nlp.trees
open edu.stanford.nlp.util
open java.util
open edu.stanford.nlp.pipeline

// Converts a .NET type into the java.lang.Class object expected by the CoreNLP API.
let classForType<'t> =
    java.lang.Class.op_Implicit typeof<'t>

type SentimentPrediction =
    | VeryNegative
    | Negative
    | Neutral
    | Positive
    | VeryPositive

// Maps the class index predicted by the model (0 to 4) to a sentiment value.
let classToSentiment = function
    | 0 -> VeryNegative
    | 1 -> Negative
    | 2 -> Neutral
    | 3 -> Positive
    | 4 -> VeryPositive
    | _ -> failwith "unknown class"

let makeSentimentAnalyzer modelsDir =
    let props = Properties()
    props.setProperty("annotators", "tokenize, ssplit, pos, parse, sentiment") |> ignore

    // Switch to the models directory while the pipeline is created, then switch back.
    let currDir = Environment.CurrentDirectory
    Directory.SetCurrentDirectory modelsDir
    let pipeline = StanfordCoreNLP(props)
    Directory.SetCurrentDirectory currDir

    // The returned function reuses the pipeline and yields one sentiment per sentence.
    fun text ->
        (pipeline.``process`` text).get classForType<CoreAnnotations.SentencesAnnotation> :?> ArrayList
        |> Seq.cast<CoreMap>
        |> Seq.map(fun cm -> cm.get classForType<SentimentCoreAnnotations.AnnotatedTree>)
        |> Seq.cast<Tree>
        |> Seq.map (RNNCoreAnnotations.getPredictedClass >> classToSentiment)
        |> Seq.toList
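
    makeSentimentAnalyzer switches the current directory while the pipeline is created and restores it afterwards. If the pipeline constructor throws, the original code would leave the process in the models directory. The helper below is a minimal sketch of one way to guard against that; the name withCurrentDirectory is hypothetical and not part of the original code or of any library.

```fsharp
open System
open System.IO

// Hypothetical helper (not from the original post): runs `action` with `dir`
// as the current directory and restores the previous directory afterwards,
// even when `action` throws.
let withCurrentDirectory (dir: string) (action: unit -> 'a) : 'a =
    let previous = Environment.CurrentDirectory
    Directory.SetCurrentDirectory dir
    try
        action ()
    finally
        Directory.SetCurrentDirectory previous

// Assumed usage inside makeSentimentAnalyzer (needs the CoreNLP references):
//     let pipeline = withCurrentDirectory modelsDir (fun () -> StanfordCoreNLP(props))

// Standalone demonstration using the system temp directory.
let insideDir = withCurrentDirectory (Path.GetTempPath()) (fun () -> Environment.CurrentDirectory)
printfn "Inside the helper: %s" insideDir
printfn "After the helper:  %s" Environment.CurrentDirectory
```

    With this helper, the three directory-handling lines in makeSentimentAnalyzer could collapse into a single call, at the cost of one extra function.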

    To call makeSentimentAnalyzer, use the following code, setting the modelsDir variable to the location of the extracted models:

[<EntryPoint>]
let main argv =
    let text = "awesome great this text is so exciting! this is disgusting sentence number two."
    let modelsDir = @"C:\tmp\stanford-corenlp-full-2014-06-16\models"
    let analyzer = makeSentimentAnalyzer modelsDir
    printfn "%A" (analyzer text)
    0 // return an integer exit code
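
    The analyzer returns one SentimentPrediction per sentence, so a longer text yields a list that usually needs summarizing. The sketch below shows one possible way to do that. The functions sentimentScore, countBySentiment and averageScore are hypothetical helpers, the -2 to 2 scale is an assumption made for illustration, and the sample list is made up rather than actual analyzer output.

```fsharp
// Same union type as in the code above, repeated so the sketch is self-contained.
type SentimentPrediction =
    | VeryNegative
    | Negative
    | Neutral
    | Positive
    | VeryPositive

// Assumed numeric scale for illustration only: -2 (very negative) to 2 (very positive).
let sentimentScore = function
    | VeryNegative -> -2
    | Negative -> -1
    | Neutral -> 0
    | Positive -> 1
    | VeryPositive -> 2

// Counts how many sentences fall into each sentiment class.
let countBySentiment (predictions: SentimentPrediction list) =
    predictions |> List.countBy id

// Averages the scores; returns None for an empty list instead of throwing.
let averageScore (predictions: SentimentPrediction list) =
    match predictions with
    | [] -> None
    | _ -> Some (predictions |> List.averageBy (sentimentScore >> float))

// Made-up per-sentence results standing in for `analyzer text`.
let sample = [ VeryPositive; Negative; Positive; Negative ]
printfn "%A" (countBySentiment sample)
printfn "%A" (averageScore sample)
```

    Averaging treats the five classes as evenly spaced, which the model itself does not promise, so the per-class counts are the safer summary.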

    Enjoy!

3 comments:

  1. With public sentiment, nothing can fail. Without it, nothing can succeed.
    See the link below for more info.

    #sentiment
    www.ufgop.org

  2. Hi, I was testing this and observed that, for each iteration over the same text, the time increases by 3 to 4 seconds.
    Below is our code:

    let analyzer = makeSentimentAnalyzer "ModelDirectoryPath"
    let result =
        [1..10]
        |> List.map (fun x -> analyzer someText)

    and the output is:

    iteration = 1 and its Time : 00:00:04.7943076
    iteration = 2 and its Time : 00:00:07.8334975
    iteration = 3 and its Time : 00:00:11.0907054
    iteration = 4 and its Time : 00:00:14.3769231
    iteration = 5 and its Time : 00:00:17.7311399
    iteration = 6 and its Time : 00:00:20.9423484
    iteration = 7 and its Time : 00:00:24.3025585
    iteration = 8 and its Time : 00:00:27.5257557
    iteration = 9 and its Time : 00:00:30.9089676
    iteration = 10 and its Time : 00:00:34.1181736

    Any suggestions?

  3. The problem got resolved: we were using an older version, and now we are using the 12 December library.
    Regards,
    ABB
