Convert open office document to another open office format

Many applications need to convert documents from one office format to another, e.g. doc to PDF or doc to HTML. OpenOffice can import and export documents easily, but only by hand. In a standalone or web-based application this has to be automated. JODConverter, the Java OpenDocument Converter, converts documents between different office formats using OpenOffice.

JODConverter supports every conversion that OpenOffice provides. More information about the supported formats is available here.

JODConverter is a Java library, so it can be used from a Java application, from the command line, or as a web service, which lets any other language (Ruby, .NET, PHP, Python) use the conversion directly.
Convert Office is a Ruby wrapper around JODConverter that converts one office format to another.
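
For illustration, this is roughly what calling the JODConverter command-line tool from Ruby could look like. It is only a sketch and not the code convert_office actually runs: the helper name and the jar path are placeholders, and the -p port option is assumed from the JODConverter 2.x command-line usage, so check it against the version you have.

```ruby
# Sketch only: build the argument list for the JODConverter command-line tool.
# The jar path is a placeholder, and -p (OpenOffice port) is an assumption
# based on the JODConverter 2.x CLI; verify both for your installation.
def jodconverter_command(src, dest, jar: "lib/jodconverter-cli.jar", java_bin: "java", port: 8100)
  [java_bin, "-jar", jar, "-p", port.to_s, src, dest]
end

cmd = jodconverter_command("test.doc", "converted.pdf")
puts cmd.join(" ")
# Once OpenOffice is listening in headless mode, run it with: system(*cmd)
```

Passing the command as an array to system avoids shell quoting problems with file names that contain spaces.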

Requirements

  • Java must be installed on the system.
  • OpenOffice 2, 3 or later is required.
  • OpenOffice must be running in headless mode (the command is given under Installation).

Installation

  • Start OpenOffice in headless mode using the following command.

soffice -headless -accept="socket,host=127.0.0.1,port=8100;urp;" -nofirststartwizard

  • Install the plugin.

ruby script/plugin install git://github.com/amardaxini/convert_office.git
  • Configuration

Create a configuration file named convert_office.rb at config/initializers/convert_office.rb.

ConvertOffice::ConvertOfficeConfig.options = {
   :java_bin     => "java",   # java binary path
   :nailgun      => false,    # for nailgun support
   :soffice_port => 8100      # OpenOffice port number
}
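
Since conversions depend on OpenOffice listening on the configured port, it can help to check that port before converting. The following is a minimal sketch using only Ruby's standard library. The port_open? helper is hypothetical and not part of convert_office, and it only tells you that something accepts connections on the port, not that it is OpenOffice.

```ruby
require "socket"

# Hypothetical helper (not part of convert_office): returns true if
# something accepts TCP connections on host:port within the timeout.
def port_open?(host, port, timeout = 2)
  Socket.tcp(host, port, connect_timeout: timeout) { true }
rescue SystemCallError, SocketError
  # Connection refused, timed out, host unreachable or unresolvable.
  false
end

if port_open?("127.0.0.1", 8100)
  puts "Something is listening on port 8100"
else
  puts "Nothing is listening on port 8100; start soffice in headless mode first"
end
```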

Document is converted using a destination file name

   ConvertOffice::ConvertOfficeFormat.new.convert(src_path, dest_path)

This method takes src_path, the source file, and dest_path, the path of the converted document.
dest_path must contain a file name with a valid format extension. How to list the valid formats is discussed later in this article.

e.g. converting test.doc into converted.pdf:

   ConvertOffice::ConvertOfficeFormat.new.convert("test.doc","converted.pdf")

Document is converted using a format

If the destination file name should be the same as the source file name except for the format, pass the source path, an empty string as the destination, and a valid format:

   ConvertOffice::ConvertOfficeFormat.new.convert(src_file,"",format)

e.g. converting test.doc to test.html:

   ConvertOffice::ConvertOfficeFormat.new.convert("test.doc","","html")
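
The naming rule described above (same file name, new extension) can be sketched in plain Ruby. The derived_dest_path helper below is hypothetical and not part of convert_office, and it assumes the converted file ends up in the same directory as the source, which the plugin may or may not do.

```ruby
# Hypothetical helper, not part of convert_office: illustrates the
# "same name, different format" rule only.
# Assumption: the converted file is placed next to the source file.
def derived_dest_path(src_path, format)
  dir  = File.dirname(src_path)
  base = File.basename(src_path, ".*")  # file name without its extension
  File.join(dir, "#{base}.#{format}")
end

puts derived_dest_path("test.doc", "html")        # prints ./test.html
puts derived_dest_path("docs/report.odt", "pdf")  # prints docs/report.pdf
```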

Valid Format

The following method displays the valid formats:

  • Called with no argument, it displays all available formats.
  • Called with a file name, it displays the formats that file can be converted to.
  • Called with a format, it displays the formats that format can be converted to.

  ConvertOffice::ConvertOfficeFormat.display_valid_format
  ConvertOffice::ConvertOfficeFormat.display_valid_format(input_file_name)
  ConvertOffice::ConvertOfficeFormat.display_valid_format(format)

convert_office also supports nailgun.

To speed up conversion we can use nailgun. In a previous article I showed how to get started with nailgun and its basics.

After setting up nailgun, run the following steps.

 rake convert_office
 script/convert_office_nailgun

Nailgun is now ready to use.

If you have any issues, you can mail me at amardaxini[at]gmail[dot]com


3 comments On Convert open office document to another open office format

  • Hi Amar
    I have installed jodconverter before and used it from the command line, and I ran into resource problems. Each time a document was converted, the memory usage went through the roof and jodconverter or openoffice crashed and had to be restarted manually. And the rails app got really slow of course.
    My solution was to host jodconverter on a different server and use it as a web service. At least it got the resources it needed and did not interfere with the calling rails app.

    I will definitely try your plugin as I’m installing jod converter for another project.
    Hopefully the resource problem will not happen with your solution.

    Anyway thanks for your article.
    Alex
    @alexip

    • We are using it in staging; we haven't tested it in production. But in staging we are continuously converting documents into various formats.
      We are using nailgun, which helps free resources to some extent.
      If you are converting heavily, then one way is what you suggested, or you could put the conversion in a background job and start multiple workers as required; that would let you work on the same machine without creating a web service on another server.

      Thanks for replying.

  • Pingback: Convert office documents with jodconverter and open office | Alexis Perrier
