More on Minipar – Java interface

As I mentioned in my last post, I have been trying to access the Minipar library from Java. Our current approach, which uses the pdemo program to parse each sentence, has a performance problem: it took more than an hour to parse 28 research articles (about 265 ms per sentence). Most of the time is probably spent creating the pdemo process and loading the data files, both of which happen before every sentence is parsed. So I wrote a Java proxy class that calls a C++ proxy class, which in turn calls the Minipar library. The initialization code is called only once, at the beginning, and there is no need to create a process. The Java code calls the C++ library through the Java Native Interface (JNI). The illustration below shows the call chain for parsing a sentence.

Main Java program -> MiniparProxy.java -> MiniparProxy.cpp -> Minipar library
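
To make the call chain above more concrete, here is a minimal sketch of what the Java side of such a proxy could look like. It is not the actual MiniparProxy.java: the class name MiniparProxySketch, the native method names init and parseSentence, and the data-directory parameter are all assumptions made for illustration. The only point is that the native library is loaded and initialized once, and every later call reuses it.

```java
/**
 * Sketch of a JNI proxy that loads and initializes a native parser once.
 * All names here are hypothetical; the real MiniparProxy.java may differ.
 */
class MiniparProxySketch {
    // Assumed base name of the native library (MiniparProxy.dll on Windows,
    // libMiniparProxy.so on Linux).
    static final String LIBRARY = "MiniparProxy";

    private static boolean initialized = false;

    // Hypothetical native entry points. The C++ proxy would implement them
    // following the JNI naming conventions and forward to the Minipar library.
    private static native void init(String dataDirectory);
    private static native String parseSentence(String sentence);

    /** Platform-specific file name the JVM will look for. */
    static String libraryFileName() {
        return System.mapLibraryName(LIBRARY);
    }

    static synchronized boolean isInitialized() {
        return initialized;
    }

    /** Parses one sentence; the expensive setup runs only on the first call. */
    static synchronized String parse(String dataDirectory, String sentence) {
        if (!initialized) {
            // Throws UnsatisfiedLinkError if the library is not on
            // java.library.path or was built for a different architecture.
            System.loadLibrary(LIBRARY);
            init(dataDirectory);
            initialized = true;
        }
        return parseSentence(sentence);
    }

    public static void main(String[] args) {
        // Safe to run without the native library: nothing is loaded here.
        System.out.println("Native library file expected: " + libraryFileName());
        System.out.println("JVM architecture: " + System.getProperty("os.arch"));
    }
}
```

Running main without the native library only prints the file name the JVM would search for and the JVM architecture, which can help when diagnosing an UnsatisfiedLinkError.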

The performance improvement is significant. The table below compares the two approaches on the same 28 research articles (17,558 sentences, 341,980 word tokens). The new method is about 20 times faster than the original one.

                        New Minipar2.java   Original Minipar.java
Total time (min)        3.90                77.77
Time per document (s)   8.37                166.66
Time per sentence (ms)  13.34               265.77
Time per token (ms)     0.68                13.65
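
The per-unit figures in the table follow from the total times and the corpus size. The sketch below recomputes them, assuming each row is simply the total time divided by the number of documents, sentences, or tokens. The class and method names are made up for this example, and small differences from the table are expected because the totals are rounded to two decimals.

```java
/** Recomputes the per-unit timings from the totals (illustrative names only). */
class MiniparTimings {
    static final int DOCUMENTS = 28;
    static final int SENTENCES = 17558;
    static final int TOKENS = 341980;

    static double perDocumentSeconds(double totalMinutes) {
        return totalMinutes * 60.0 / DOCUMENTS;
    }

    static double perSentenceMillis(double totalMinutes) {
        return totalMinutes * 60_000.0 / SENTENCES;
    }

    static double perTokenMillis(double totalMinutes) {
        return totalMinutes * 60_000.0 / TOKENS;
    }

    static double speedup(double originalMinutes, double newMinutes) {
        return originalMinutes / newMinutes;
    }

    public static void main(String[] args) {
        double newMinutes = 3.90;       // New Minipar2.java
        double originalMinutes = 77.77; // Original Minipar.java
        for (double total : new double[] {newMinutes, originalMinutes}) {
            System.out.printf("total %.2f min: %.2f s/doc, %.2f ms/sentence, %.2f ms/token%n",
                    total, perDocumentSeconds(total),
                    perSentenceMillis(total), perTokenMillis(total));
        }
        System.out.printf("speedup: %.1fx%n", speedup(originalMinutes, newMinutes));
    }
}
```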

Download source code here.

26 Responses to More on Minipar – Java interface

  1. Hi there,

    I am currently looking at the Stanford Parser, but it is very slow. I would like to test your work using the Java proxy and Minipar. Is there a way in which I can obtain the proxy stuff you have written?

    Best regards,
    Jethro

  2. Jesús García says:

    Hi:

    I am currently working on a project with Minipar, with the intention of comparing it with the Stanford Parser. The project also includes a clustering method called k-means, also in Java.

    I would love to see how to implement the Java code to work with Minipar.

    My email contact is garciatjm@gmail.com

    Greetings.

    Jesus.

  3. Jesús García says:

    HI:

    I was looking at how your code works with Java, but there is something I do not understand: how can I use print_triples from Java? I see it in MiniparProxy.cpp, but it does not seem to be used, or maybe I have not noticed where. Unlike pdemo, it cannot be passed parameters (-h, -t, -p). I hope you can help if you have used this function. I am trying to get the relationships in this form rather than in the tree form that is normally output.

    Jesus.

  4. Jesús García says:

    hi:

    In your code, MiniparProxy.java has a function called print_triples, but it is not used. I tried to use it because I need that output, but I get errors when compiling.

    Jesus.

  5. Sure says:

    Hi,

    I am now trying to use Minipar, but I am not able to understand some of the labels used in the results, like U, fin, …

    Could you please point me to where I can get more information on Minipar, or explain it to me?

    Thank you

  6. salma says:

    Hi, I want Java code for k-means clustering. In my project I need to cluster the sentences in a document. I mean, I give a document as input, and then all the sentences in the document should be clustered based on similarity. I hope I will get a reply soon.

  7. salma says:

    Hi, I need Java source code for k-means clustering, for clustering sentences.

  8. Ishrar says:

    Hi there,

    I would appreciate it if you could send me your Java source for accessing the Minipar parser, especially for extracting the dependency relationships, for a university project that I am currently working on. I’m using the Java MiniparWrapper from Gate, and I would like to check if your source provides better performance. I’ll be referring to this page for proper citation. Looking forward to your response.

    Regards.

  9. Ingrid says:

    Hi. I’m currently also working on a semantic tagging project in java for a class and I was wondering if I could try your interface to minipar. It would be a great help! Thanks!

  10. Vivek says:

    Hi,
    We are students currently working on a project that makes use of the Stanford Parser to parse sentences. We would like to test our project with the Minipar parser. Could you please share how to access Minipar from Java?
    Thank you

    my email id is: v_vek911@yahoo.com

  11. Calinutza says:

    Hello! I’m trying to create a web service for Minipar and I find your idea very useful. Could you please send your source code to my e-mail, “calinadorofte@yahoo.com”? I was thinking of making the web service in C++ with Ajax, but I think that Java is better. Thank you.

    Calina

  12. saatvi says:

    Hi, your work on Minipar is very useful. I’m trying out Minipar for my research right now. I used the Java program, which calls the C++ library, but I get this error:
    java.lang.UnsatisfiedLinkError: E:\MiniparProxy\MiniparCpp\libMiniparProxy.so: Can’t load this .dll (machine code=0x101) on a IA 32-bit platform

    I’m using a Windows 64-bit machine. Can you help me with this?

    Thanks

    saatvi

    • Hao says:

      Hi saatvi,

      I guess the first problem is that you need to load the DLL version of the Minipar library, MiniparProxy.dll, because you are using a Windows machine. I suspect this will solve your problem. I am not sure whether there will be any problem with MiniparProxy.dll on a 64-bit machine, as it was compiled and tested on 32-bit Win XP SP2. Anyway, let me know.

      Hao

      • Rushdi says:

        Hi Hao! I am facing the same problem. I loaded the DLL, but it still shows me the same message.

        By the way, this is a really commendable effort… keep up the good work.

  13. changxiaolong says:

    Hello, I am a Java programmer. I want to use Minipar for parsing, but I have the same problem as you: the speed is low. Can you send your code to my email, because your code cannot be downloaded? Thank you very much!

  14. sure says:

    Hi Hao,

    I am trying to use Minipar and I get the following message every time I run pdemo. I have tried to google this problem but couldn’t find anything.

    unknown function link-file
    db all.hdr

    The machine has Ubuntu 10.04 LTS installed. Could you please help me solve this problem?

    Thank you

  16. Sethu says:

    Hi Hao,
    I would like to try your Java wrapper for Minipar. Requesting you to send your source code zip to my email id below.

    Email : sethus@outlook.com

    Thanks in advance!

  17. Vinod says:

    Hi Hao,

    I have tried Minipar, and it is true that it takes too much time. I would like to try your Java wrapper for Minipar. Could you please send your source code zip to my email address below?

    Email : vinodthebest@gmail.com

    Thank you.

  19. ran says:

    Broken Links, can’t download :(

  20. Giovanni says:

    Hey,
    I have some questions about Minipar.
    In your project, did you use the same data folder provided in the Minipar zip file?
    Did you notice any problems with the output of Minipar on some sentences?
    For example, I have the following problem: if the input sentence is “this screen is amazing”, the output of Minipar doesn’t consider the word “is”, and it doesn’t specify the relation between ‘amazing’ and ‘is’.

  21. Darius says:

    Hi,
    We are a couple of students trying to do some sentiment analysis, and we need Minipar. The links are broken. Could you please send us your version of Minipar at the email address below?

    Email: darius.suciu04@gmail.com

    Thank you
