This post shows how to call StanfordNLP sentiment analysis from an F# application. The code in this example returns a sentiment value, from very negative to very positive, for each sentence of the specified text.
Prerequisites:
- The Stanford.NLP.CoreNLP NuGet package needs to be installed (this code works with 3.4.0.0).
- The Java binaries should be downloaded from http://nlp.stanford.edu/software/stanford-corenlp-full-2014-06-16.zip and unzipped. After that, you need to extract the contents of stanford-corenlp-3.4-models.jar (it is part of the zip file) to some directory.
Source code:
F# code is elegant as usual :)
open System
open System.IO
open edu.stanford.nlp.ling
open edu.stanford.nlp.neural.rnn
open edu.stanford.nlp.sentiment
open edu.stanford.nlp.trees
open edu.stanford.nlp.util
open java.util
open edu.stanford.nlp.pipeline

// Get the java.lang.Class for a .NET type, as the CoreNLP annotation getters expect.
let classForType<'t> =
    java.lang.Class.op_Implicit typeof<'t>

type SentimentPrediction =
    | VeryNegative
    | Negative
    | Neutral
    | Positive
    | VeryPositive

// Map the predicted class index (0..4) to a sentiment label.
let classToSentiment = function
    | 0 -> VeryNegative
    | 1 -> Negative
    | 2 -> Neutral
    | 3 -> Positive
    | 4 -> VeryPositive
    | _ -> failwith "unknown class"

let makeSentimentAnalyzer modelsDir =
    let props = Properties()
    props.setProperty("annotators", "tokenize, ssplit, pos, parse, sentiment") |> ignore

    // Switch to modelsDir while the pipeline is constructed, then switch back.
    let currDir = Environment.CurrentDirectory
    Directory.SetCurrentDirectory modelsDir
    let pipeline = StanfordCoreNLP(props)
    Directory.SetCurrentDirectory currDir

    // The returned function yields one sentiment per sentence of the text.
    fun text ->
        (pipeline.``process`` text).get classForType<CoreAnnotations.SentencesAnnotation> :?> ArrayList
        |> Seq.cast<CoreMap>
        |> Seq.map(fun cm -> cm.get classForType<SentimentCoreAnnotations.AnnotatedTree>)
        |> Seq.cast<Tree>
        |> Seq.map (RNNCoreAnnotations.getPredictedClass >> classToSentiment)
        |> Seq.toList
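makeSentimentAnalyzer changes the current directory while the pipeline is created and then changes it back. If pipeline construction throws, the process would be left in modelsDir. Here is a minimal sketch of one way to guard against that, assuming a hypothetical helper named withCurrentDirectory (it is not part of the original code or of the Stanford.NLP package):

```fsharp
open System
open System.IO

/// Runs `action` with the process's current directory set to `dir`,
/// then restores the previous directory even if `action` throws.
/// (Hypothetical helper, introduced here for illustration only.)
let withCurrentDirectory (dir: string) (action: unit -> 'a) : 'a =
    let previous = Environment.CurrentDirectory
    Directory.SetCurrentDirectory dir
    try
        action ()
    finally
        // Runs on both the success path and the exception path.
        Directory.SetCurrentDirectory previous
```

With this helper, the pipeline could be created as withCurrentDirectory modelsDir (fun () -> StanfordCoreNLP(props)), and the two explicit SetCurrentDirectory calls would no longer be needed.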
To call this function, you can use the following code, setting the modelsDir variable to the location of the extracted models:
[<EntryPoint>]
let main argv =
    let text = "awesome great this text is so exciting! this is disgusting sentence number two."
    let modelsDir = @"C:\tmp\stanford-corenlp-full-2014-06-16\models"
    let analyzer = makeSentimentAnalyzer modelsDir
    printfn "%A" (analyzer text)
    0 // return an integer exit code
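The analyzer returns a list with one SentimentPrediction per sentence. If you want an overall picture of a longer text, here is a small sketch of counting the labels. The summarize function and the sample list are hypothetical and only for illustration; the type is repeated so the snippet can run on its own, without the Stanford libraries:

```fsharp
type SentimentPrediction =
    | VeryNegative
    | Negative
    | Neutral
    | Positive
    | VeryPositive

/// Counts how many sentences received each sentiment label,
/// ordered from most negative to most positive.
let summarize (predictions: SentimentPrediction list) =
    predictions
    |> List.countBy id
    |> List.sortBy fst

// Made-up analyzer output for a three-sentence text (not real results).
let sample = [ Positive; VeryNegative; Positive ]
printfn "%A" (summarize sample)
```

In the real program, you would pass the result of analyzer text to summarize instead of the sample list.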
Enjoy!
With public sentiment, nothing can fail. Without it, nothing can succeed.
See the link below for more info.
#sentiment
www.ufgop.org
Hi, I was testing this and I observed that with each iteration over the same text, the time increases by 3 to 4 seconds.
Below is our code:
let analyzer = makeSentimentAnalyzer "ModelDirectoryPath"
let result =
    [1..10]
    |> List.map(fun x -> analyzer someText)
and the output is:
iteration = 1 and its Time : 00:00:04.7943076
iteration = 2 and its Time : 00:00:07.8334975
iteration = 3 and its Time : 00:00:11.0907054
iteration = 4 and its Time : 00:00:14.3769231
iteration = 5 and its Time : 00:00:17.7311399
iteration = 6 and its Time : 00:00:20.9423484
iteration = 7 and its Time : 00:00:24.3025585
iteration = 8 and its Time : 00:00:27.5257557
iteration = 9 and its Time : 00:00:30.9089676
iteration = 10 and its Time : 00:00:34.1181736
Any suggestions?
The problem got resolved: we were using an older version, and now we are using the 12 December library.
Regards,
ABB