Typesafe Java/COM-integration with JACOB using jcom-based XML wrapper generation 

 

Miika Nurminen 

Faculty of Information Technology 

University of Jyväskylä, Finland 

minurmin@cc.jyu.fi

 

Abstract 

An approach to Java-COM integration using typesafe COM interface wrappers is presented with source code and examples using Microsoft Office applications. The technique is based on the jcom, java2com and JACOB libraries and is compared with other open source Java-COM integration packages. The XML wrapper and code generator could be bundled with future JACOB releases as an alternative to the Jacobgen wrapper generator.

1 Introduction

COM (Component Object Model) interfaces provide a standard way to interact with many native Windows applications, most notably the applications in Microsoft Office suite. COM is supposed to be language-independent, to the extent that even non-object-oriented languages like C or old versions of Visual Basic can interact with COM components. However, it is not straightforward to call COM objects from Java, because it requires interacting with native code running outside the JVM using JNI. It was possible to directly call COM objects using Microsoft's own JVM in Windows (see [Adler04]), but since Microsoft dropped its Java support, alternative techniques are needed.

There are various reasons to use direct Java/COM-integration rather than a looser approach, such as web services.

 

We restrict our discussion to calling Dispatch-based COM components from Java code, but not vice versa (with the exception of events, since some Java/COM bridges allow COM event handlers to be callbacks in Java code). In addition to Dispatch-based COM components there are Vtable-based components (i.e. non-scriptable components that do not implement the IDispatch interface; see [Jawin]), but they are not discussed here. We also assume that the COM components reside on the same machine as the Java code. We do not discuss the more elaborate problem of embedding ActiveX controls physically into Java applications, but some guidelines can be found in [IECanvas] or [Srinivas00]. There are several commercial Java/COM integration tools available, but we focus on open source approaches. Currently there does not seem to be an open source Java/COM component that both supports wrappers and works with all our test cases without the need to modify the wrappers manually or to use complex code to call the components.

This report is structured as follows: section 2 introduces JACOB, the primary library we use in Java/COM-integration. Other open source integration libraries are reviewed in section  3. Section 4 describes our approach and required source code modifications in detail. The test cases are presented in section 5. The report is concluded in section 6.

2 JACOB – Java-COM Bridge

The primary library we use in Java/COM-integration is JACOB (see [JACOB]), a Java-COM bridge that allows calling COM automation components from Java. It uses JNI to make native calls into the COM and Win32 libraries. JACOB was originally created by [Adler04] and is based on Microsoft's Java SDK. JACOB supports Dispatch-based COM interfaces as well as COM events. Vtable-based interfaces are not supported. The JACOB project has been in active development since 1999; the latest version, 1.10.1, was released in April 2006. Because of its age, JACOB can be considered a stable and mature package.

Our goal is to call Dispatch COM components with the same ease and convenience as from Visual Basic. JACOB already allows calling components using a wrapper class based on IDispatch, but this requires a lot of browsing through documentation and extensive typecasts whenever we wish to get or set a value. If you get the datatypes and method names right it works, but is inherently unsafe and tedious for any but the simplest projects. As an illustrative example, see a few lines from a sample file from JACOB's source distribution:

 

ActiveXComponent xl = new ActiveXComponent("Excel.Application");      

Dispatch.put(xl, "Visible", new Variant(true));

Dispatch workbooks = xl.getProperty("Workbooks").toDispatch();

Dispatch workbook = Dispatch.get(workbooks,"Add").toDispatch();

Dispatch sheet = Dispatch.get(workbook,"ActiveSheet").toDispatch();

Dispatch a1 = Dispatch.invoke(sheet, "Range", Dispatch.Get,

                                  new Object[] {"A1"},

                                  new int[1]).toDispatch();

Dispatch a2 = Dispatch.invoke(sheet, "Range", Dispatch.Get,

                                  new Object[] {"A2"},

                                  new int[1]).toDispatch();

Dispatch.put(a1, "Value", "123");

Dispatch.put(a2, "Value", "=A1*2");

System.out.println("a2 from excel:"+Dispatch.get(a2, "Value"));

The solution is to use wrapper classes generated from a COM type library. Type libraries (such as excel.exe) describe all methods, datatypes and events related to a COM server. Type libraries can be browsed with many Windows-based development tools such as Microsoft Visual Studio or Borland Delphi. Even some of the Java/COM tools described in this report provide a type library browser. JACOB is accompanied by a wrapper generator called Jacobgen, but it is still at an early stage of development and did not work adequately with our test cases (for example, we could not manage to parse the type library from Microsoft Excel). Fortunately, there is another wrapper generator component (we call it java2com in this discussion) created in 1998 by [Lewis98]. With slight modifications to both JACOB and java2com, described in section 4, we are eventually able to script Excel (and basically any Dispatch-based COM component) with a much simpler, typesafe notation:

 

Application  xl = new Application();

System.out.println(xl.getVersion());

xl.setVisible(true);

Workbooks workbooks = xl.getWorkbooks();

Workbook workbook = workbooks.Add();

Worksheet sheet = new Worksheet(workbook.getActiveSheet());

Range a1 = sheet.getRange("A1");

Range a2 = sheet.getRange("A2");

a1.setValue("123");

a2.setFormula("=A1*2");

System.out.println("a2 from excel:"+a2.getValue());

Note that even with the generated wrappers we cannot infer all the datatypes. Getter  workbook.getActiveSheet() returns actually a Dispatch pointer that is subsequently handed to a Worksheet constructor – we are effectively doing a typecast here. See section 5 for more detailed examples of using wrappers with Microsoft Office products.
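The wrapper idea itself can be illustrated without any COM plumbing. The following is a minimal, self-contained sketch and not code produced by Codegen: UntypedObject is a hypothetical stand-in for an untyped Dispatch pointer (here simply a property map), and RangeWrapper is a hypothetical typed wrapper around it. It shows why handing a raw object to a wrapper constructor is effectively a typecast, and how property names and casts end up in one place instead of at every call site.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for an untyped Dispatch pointer: properties are looked up by name. */
class UntypedObject {
    private final Map<String, Object> properties = new HashMap<>();

    Object get(String name) {
        if (!properties.containsKey(name)) {
            throw new IllegalArgumentException("Unknown property: " + name);
        }
        return properties.get(name);
    }

    void put(String name, Object value) {
        properties.put(name, value);
    }
}

/** Hypothetical typed wrapper: property names and casts are written once, here. */
class RangeWrapper {
    private final UntypedObject target;

    /** Wrapping an untyped object plays the role of the typecast described above. */
    RangeWrapper(UntypedObject target) {
        this.target = target;
    }

    void setValue(String value) {
        target.put("Value", value);
    }

    String getValue() {
        return (String) target.get("Value");
    }
}

class WrapperSketch {
    public static void main(String[] args) {
        UntypedObject raw = new UntypedObject();
        RangeWrapper a1 = new RangeWrapper(raw);
        a1.setValue("123");
        System.out.println("a1 via wrapper: " + a1.getValue());
        System.out.println("a1 via untyped access: " + raw.get("Value"));
    }
}
```

A misspelled property name in the untyped variant only fails at run time, whereas a misspelled wrapper method fails at compile time, which is the main benefit the generated wrappers aim for.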

3 Alternatives to JACOB

In this section we review some of the open source Java/COM integration tools. Overall, it seems that there is a clear need to integrate Java code with legacy COM code, since there are so many projects under development (not to mention hundreds of posts in the projects' discussion forums). Unfortunately, none of the integration tools seemed to work “out of the box” without problems related to wrapper generation, datatype handling or simply the ease of use. The documentation was also somewhat scarce. Wrapper classes generated by different projects were similar in general, but had some variations in detail. Thus, after choosing one integration platform it is not straightforward to move to another. It is crucial to choose the right integration tool with respect to requirements relatively early in the implementation. We evaluated JACOB, Jawin, jSegue and com4j based on our test cases (see section 5) and some general characteristics of the libraries. Our results are summarized in Table 1.

 

 

Feature              JACOB(+jcom)      Jawin             jSegue      com4j
Dispatch interfaces  Yes               Yes               Verbose     Yes
Vtable interfaces    No                Yes               Yes         In progress
COM events           Needs tinkering   No                Yes         In progress
Wrapper generator    Yes               Needs tinkering   Needs C++   In progress
COM browser          No                Yes               No          Yes

Table 1: Comparison of Java/COM integration tools

 

Note that the actual support for different datatypes (especially arrays and dates) embedded in COM Variants varies between the projects, but we did not test those systematically.

3.1 jSegue

jSegue is a toolset for making Java bindings to native code (see [jSegue]). The toolset includes a wrapper generator tlb2java that generates Java and JNI code to call COM Automation servers. Of the integration packages we tested, jSegue is the only tool that also works as a COM-Java bridge, i.e. it allows implementing in-process COM servers in Java. The project has been in active development at least since 2004 (and earlier as a commercial package from Moebius Solutions, Inc). The latest version, 2.0.0.394, was released in March 2006.

jSegue is a mature and robust package, supporting both Dispatch and Vtable-based COM interfaces as well as COM events. We managed to get all our test cases to work with jSegue, so it seems to be a feasible integration solution. The main disadvantage of jSegue is that it is somewhat awkward to use. Its wrapper generator produces both Java wrappers and C++ stubs that must be separately compiled to make full use of the library. This works if you have Visual Studio or an equivalent installed, but it is still an extra step from a purely Java point of view. A more serious inconvenience with jSegue is its convention of calling Dispatch interfaces (even with wrapper-generated code). As an illustrative example, let's return to the Excel example presented in section 2 and focus on the line that creates a new Worksheet:

 

Worksheet sheet = new Worksheet(workbook.getActiveSheet());

In jSegue, the equivalent code must be represented as follows: 

 

IDispatch sh = workbook.getActiveSheet();

_Worksheet[] out_sheet = { null };

_Worksheet sheet;

sh.QueryInterface(out_sheet);

sheet=out_sheet[0];

 

Yes, it works, if you are willing to type the extra plumbing code (in this case, though, it seems almost simpler to use a purely Dispatch-based approach with typecasts and no wrappers). Regarding our goal of making COM as easy to use from Java as from Visual Basic, it is just a bit too verbose and C++-like. However, if you are implementing or enhancing a system in immediate production use, we would definitely recommend jSegue because of its maturity.

3.2 Jawin

The Java/Win32 integration project (Jawin) is a free, open source architecture for interoperation between Java and components exposed through Microsoft's Component Object Model (COM) or through Win32 Dynamic Link Libraries (DLLs) [Jawin]. The project was initiated by Stuart Halloway and Justin Gehtland and it has been in development since 2003. Currently, the development seems to have slowed down: the latest version 2.0 alpha1 was released in March 2005.

Jawin has many promising features: it supports both Dispatch- and Vtable-based COM components and includes an easy-to-use COM type library browser as well as an XML-based wrapper generator (unfortunately incompatible with wrappers generated with java2com). However, COM events are not supported. Jawin seems to work in principle, but in practice our test cases required numerous minor changes both in the initial XSL transform and in the generated Java code to make it even compile, let alone actually work. We are confident that someone with extra time and energy could fix the bugs associated with wrapper generation with moderate effort, but with our limited knowledge of COM, the effort seemed overwhelming. Eventually we managed to generate the wrappers, only to find out that the Variant type handling was obscure to say the least, at least compared to the conventions used in JACOB or even jSegue. We succeeded in running only the Word- and Visio-related test cases.

Jawin has potential, especially on the user interface side. It would be an ideal tool for a beginner developer, especially because of the COM browser. Unfortunately, in its current state it cannot be recommended, because the generated code must be modified manually and because of the issues with datatype handling.

3.3 com4j

Com4j is a new project to develop a Java library that allows Java applications to seamlessly interoperate with Microsoft Component Object Model and to produce a Java tool that imports a COM type library and generates the Java definitions of that library (see [com4j]). Com4j takes advantage of Java 1.5 features to improve usability (i.e. producing significantly cleaner wrapper code compared to other packages discussed here). Com4j is developed by Kohsuke Kawaguchi and has been in active development since 2004. The latest version was signed 22.03.2006.

Com4j is a very ambitious and promising project. The idea of using Java 1.5 metadata to describe wrappers seems elegant, and when finished, the package would provide all the essential tools for COM interoperability: interfaces for both Dispatch- and Vtable-based components, support for events, and a wrapper generator. Unfortunately, the project was still at alpha stage when we tested it. We could not even create the wrappers, so it is far too early to put it in production use. This might be an optimal solution in the future, so it is worth keeping an eye on. Because of the frequent updates, some of the problems mentioned here might already have been fixed.

4 Enhancing JACOB with java2com

Based on our initial tests we concluded that JACOB itself is stable and mature enough to be used for Java/COM integration, but the wrapper generation component should be replaced with a more robust one. We found Steven Lewis' java2com component [Lewis98], which uses a wrapper generator based on Yoshinori Watanabe's jcom (see [jcom]) together with a modified JACOB DLL library.

We first tried running our tests on the old java2com package. Wrapper generation worked fine (although the enum keyword was used in generated class names, since the package was deployed years before the release of Java 1.5), but we had problems running some of the test cases (for example, changing font style in Microsoft Visio did not work, perhaps because of something related to COM enum handling). However, since the equivalent, purely Dispatch-based code worked in the current JACOB distribution, we concluded that the invoke error was caused by the old JACOB DLL, not the java2com component itself.

The logical next step was to integrate the java2com code from 1998 with current JACOB distribution. Unfortunately, the current version of JACOB DLL is not directly compatible with Lewis' modifications (there were modifications both in C++ and Java code, along with some renamed methods and enhanced functionality). In order to take advantage of the bug fixes of the last 8 years, we had to manually modify the current JACOB package to make it again compatible with java2com (and Java 1.5, for that matter). These modifications are described in detail in the following subsections.

4.1 Wrapper generation

Java2com includes the subprojects Xmlgen and Codegen, for generating XML descriptors from COM type libraries and then generating Java stubs from the XML descriptors. In order to generate the descriptors, you need to find the type library files. There are numerous tools for this, including Visual Studio or Borland Delphi. Type libraries can be referred to by an absolute file name (e.g. C:\Program Files\Microsoft Office\OFFICE11\EXCEL.EXE), a class identifier (a 128-bit number also known as a GUID) or a human-readable program identifier (e.g. excel.application). Type libraries can include other type libraries (for example, Excel is dependent on the Office object library, Standard OLE and the Visual Basic library), but Xmlgen consolidates all required data without the need to request it explicitly.
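To make the three ways of referring to a type library concrete, here is a small, self-contained sketch that classifies a reference string as a file path, a class identifier or a program identifier. TypeLibReference and its rules are hypothetical and are not part of Xmlgen or jcom; the sketch only assumes that a class identifier is written in the usual brace-enclosed textual GUID form and that a file path starts with a drive letter or a UNC prefix.

```java
import java.util.regex.Pattern;

/** Hypothetical helper, not part of Xmlgen: guesses which kind of reference a string is. */
class TypeLibReference {
    enum Kind { FILE_PATH, CLASS_ID, PROG_ID }

    // Textual GUID form: {8-4-4-4-12 hexadecimal digits}
    private static final Pattern GUID = Pattern.compile(
            "\\{[0-9A-Fa-f]{8}(-[0-9A-Fa-f]{4}){3}-[0-9A-Fa-f]{12}\\}");

    // Drive-letter path such as C:\... or a UNC path such as \\server\share
    private static final Pattern WINDOWS_PATH = Pattern.compile(
            "([A-Za-z]:[\\\\/]|\\\\\\\\).*");

    static Kind classify(String reference) {
        String ref = reference.trim();
        if (GUID.matcher(ref).matches()) {
            return Kind.CLASS_ID;
        }
        if (WINDOWS_PATH.matcher(ref).matches()) {
            return Kind.FILE_PATH;
        }
        // Assumption: anything else is treated as a program identifier.
        return Kind.PROG_ID;
    }

    public static void main(String[] args) {
        String[] samples = {
            "C:\\Program Files\\Microsoft Office\\OFFICE11\\EXCEL.EXE",
            "{00024500-0000-0000-C000-000000000046}", // a GUID-shaped sample string
            "excel.application"
        };
        for (String sample : samples) {
            System.out.println(classify(sample) + "  " + sample);
        }
    }
}
```

A real tool would additionally have to check that the file exists or that the identifier is registered; the sketch only looks at the shape of the string.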

We incorporated Xmlgen in the new integration package unaltered. The Codegen package was slightly modified to make the generated code compatible with Java 1.5: enum literals were renamed and some additional exception handling was added. Overall, the following classes were modified.
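The keyword clash can be sketched as follows. This is only an illustration of the problem, not the renaming scheme actually used in the modified Codegen: IdentifierSanitizer is a hypothetical helper, and the trailing-underscore rule is an assumption chosen for brevity. It relies on the standard javax.lang.model.SourceVersion.isKeyword method to detect names that a type library may legally use but Java may not.

```java
import javax.lang.model.SourceVersion;

/** Hypothetical helper: makes a type library name usable as a Java identifier. */
class IdentifierSanitizer {

    /** Appends an underscore to reserved words; this rule is an assumption, not Codegen's. */
    static String sanitize(String typeLibName) {
        return SourceVersion.isKeyword(typeLibName) ? typeLibName + "_" : typeLibName;
    }

    public static void main(String[] args) {
        for (String name : new String[] {"enum", "assert", "Visible"}) {
            System.out.println(name + " -> " + sanitize(name));
        }
    }
}
```

Whatever rule is chosen, it has to be applied consistently in both declarations and references, otherwise the generated classes will not compile against each other.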

 

Xmlgen and Codegen were combined into the same binary distribution (see the javacom_bin package), along with the XML parser xerces that is required by Codegen.

4.2 Modifications to JACOB

This is the most complicated part of the integration. Using the contemporary JACOB package with java2com-generated wrappers does not work (or even compile). We have to dig out the modifications made by Lewis in 1998 and match them with the current JACOB source code, preferably without introducing any new bugs or breaking anything (see also: Modifications to JACOB code for Generated Code in [Lewis98]).

The following classes can be directly replaced by the classes in the java2com package. They are either new or there has not been any significant progress on them in recent years:

 

The following classes should be kept in the JACOB package as is. They are either the same as in java2com or contain fixes compared to the code in java2com.

 

The Dispatch and Variant classes require more extensive changes. You must copy the Dispatch constructor with a threading model argument to JACOB, but note that the native call createInstance has been renamed to createInstanceNative (a new stub createInstanceNative with a threading model argument, as well as a doSetDefaultThreadingModel stub, must also be added). Java2com has a more elaborate constructor Dispatch(int) than the one in JACOB (it also requires adding a setDispatch native stub and a getDispatch method), so we use it. The method obj2variant differs slightly between the packages, but the code in JACOB is newer, so keep it. Variant in java2com has a few methods that have either been renamed or removed and can be ignored: save, load, and getObjectRef. The methods getIID, getObject, buildSafeArray, buildVariantArray, and ForceObjectValariant, as well as the Variant(String) and Variant(SafeArray, boolean) constructors and the EMPTY_ARRAY constant, should be added from java2com. Finally, the constructor Variant(Object, boolean) should be augmented with Wdispatch and Array handling.

 

The last step is altering the Dispatch C++ class in JACOB DLL directly. The following methods have been renamed in newer JACOB: 

 

An implementation of createInstance with a threadingModel parameter should be copied from java2com and renamed. The other createInstanceNative method must be augmented with a CoInitialize call at the beginning and handling for progids starting with '{' at the end of the method. Also, implementations for setDispatch and doSetDefaultThreadingModel should be added.

 

The convention of automatically initializing COM threads and specifying threading model explicitly has some effect on existing JACOB sample code. Java2com uses multithreaded model as default, but current JACOB samples assume single-threaded apartments (for example, a single-threaded apartment thread is created in the beginning of the Excel example). The threading model should be explicitly specified in component constructor and match the running thread (another approach would be to simply change the default threading model in JACOB DLL). For example the Excel component using single-threaded apartment would be created as follows: 

 

ActiveXComponent xl = new ActiveXComponent("Excel.Application",

com.jacob.activeX.ThreadingModelEnum.ApartmentThreaded_ENUM);

However, if wrapper classes descended from Wdispatch are to be used (which is the whole point of this exercise), JACOB automatically creates a multi-threaded apartment, and therefore you cannot make a call like ComThread.InitSTA() when wrappers are used (see the example ExcelWrapperTest and compare it to ExcelDispatchTest in the com.jacob.samples.office package). In both cases the thread should be explicitly released with ComThread.Release() to prevent potential memory leaks.
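The release requirement is easiest to honour with a try/finally block. The following self-contained sketch shows only the control flow and does not call JACOB: ApartmentScope is a hypothetical helper, and the three Runnable arguments are stand-ins. In real code the initialization step would be a call such as ComThread.InitSTA() (or nothing at all when wrappers are used, since the apartment is then created automatically as described above), and the release step would be ComThread.Release().

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: guarantees that the release step runs even if the COM work fails. */
class ApartmentScope {

    static void run(Runnable init, Runnable body, Runnable release) {
        init.run();
        try {
            body.run();
        } finally {
            // Stand-in for ComThread.Release(): executed on success and on failure alike.
            release.run();
        }
    }

    public static void main(String[] args) {
        List<String> trace = new ArrayList<>();
        try {
            run(() -> trace.add("init"),
                () -> {
                    trace.add("work");
                    throw new IllegalStateException("simulated COM failure");
                },
                () -> trace.add("release"));
        } catch (IllegalStateException e) {
            trace.add("caught");
        }
        System.out.println(trace); // prints [init, work, release, caught]
    }
}
```

Because initialization and release must happen on the same thread as the COM calls, the whole run call should be made from the thread that does the work.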

4.3 Limitations

Apart from the wrapper code, the java2com-enhanced JACOB package has the same general limitations as JACOB. There is no support for Vtable-based COM components. Also, there is no inherent support for graphical ActiveX controls embedded in Java UIs (however, see the Visio example in com.jacob.samples.visio). JACOB supports both single- and multithreaded apartments, but the thread-related code adapted from java2com is not thoroughly tested, so there may be some problems (for intricacies related to COM apartments, see [Liong05]. Also note that the JACOB default threading model was altered since version 1.7, as reported in [Adler04]).

Another essential limitation is event handling. JACOB allows events using the DispatchEvents class, but the type information related to them is lost: all event handlers must have a Variant[] type signature. Event handlers are described in type libraries, so Codegen generates wrapper classes for them, but they cannot be directly used. There was no event handling mechanism in java2com, and the generated event classes are not compatible with JACOB's DispatchEvents. Indirectly, the wrapper classes (e.g. com.microsoft.excel.gen._WorkbookEvents) could be used to look up the correct method names and expected types for event handlers, but you must still use your own handler classes, handed to DispatchEvents, with Variant[] arguments. A potential workaround would be to change the InvocationProxy class to call event handlers with correct signatures using reflection, but our initial tests were not successful. As with Jawin, this could probably be fixed with moderate effort (and deeper knowledge of COM), since the basic event handling mechanism is already implemented.
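The reflection-based workaround can be sketched in plain Java. The code below is not JACOB's InvocationProxy and was not part of our tests; it is a minimal illustration under stated assumptions. An Object[] stands in for the Variant[] argument list, the raw arguments are assumed to be already converted to the Java types the handler expects, and RecordingHandler with its method names is hypothetical.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical handler with typed signatures; the event names are illustrative only. */
class RecordingHandler {
    final List<String> log = new ArrayList<>();

    public void DocumentOpened(String title) {
        log.add("opened " + title);
    }

    public void Quit() {
        log.add("quit");
    }
}

/** Sketch of a dispatcher that maps an event name and raw arguments to a typed method. */
class TypedEventDispatcher {
    private final Object handler;

    TypedEventDispatcher(Object handler) {
        this.handler = handler;
    }

    /** Returns true if a public method with a matching name and arity was found and invoked. */
    boolean dispatch(String eventName, Object[] rawArgs) {
        for (Method m : handler.getClass().getMethods()) {
            if (m.getName().equals(eventName) && m.getParameterCount() == rawArgs.length) {
                try {
                    // Throws IllegalArgumentException if the argument types do not fit.
                    m.invoke(handler, rawArgs);
                } catch (ReflectiveOperationException e) {
                    throw new IllegalStateException("Handler " + eventName + " failed", e);
                }
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        RecordingHandler handler = new RecordingHandler();
        TypedEventDispatcher dispatcher = new TypedEventDispatcher(handler);
        dispatcher.dispatch("DocumentOpened", new Object[] {"report.doc"});
        dispatcher.dispatch("Quit", new Object[0]);
        System.out.println(handler.log);
    }
}
```

The hard part that the sketch leaves out is the conversion from Variant values (including by-reference arguments such as cancel flags) to the parameter types declared in the generated event classes.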

5 Using modified JACOB distribution and test cases

We repackaged the JACOB 1.10.1 and java2com packages into new source and binary distributions. The source distribution is divided into the wrapper generators Xmlgen and Codegen, the modified JACOB code jacob_1.10.1_mod, and the sample code com_samples containing standard JACOB samples, our test cases and wrapper classes for Microsoft Word 2003, Excel 2003 and Visio 2003. The projects are organized using NetBeans 5 Ant build scripts. The binary distribution contains both the JACOB and jcom DLLs, as well as jar packages of all the projects, along with examples. The binary distribution can be used to generate both XML descriptors and wrapper classes and for running the examples.

For rebuilding JACOB or the examples with NetBeans 5, you should first open Codegen, then JACOB and finally com_samples projects, because of the dependencies. Codegen needs XML parser xerces contained in the lib directory. Xmlgen is separated from other projects and shouldn't need to be recompiled. You may have to increase heap size, because the  wrappers constitute over 1000 classes, requiring extensive memory. If this is the case, start NetBeans with the following command that instructs JVM to increase heap size to 400MB:

 

nb.exe -J-Xmx400M

To generate XML descriptors, call jp.jcomgenerator.XMLGenerator class in XMLGenerator.jar (jcom.dll must be in path). To generate Java wrappers, run com.lordjoe.lib.xml.TypeLib in codegen.jar (xerces.jar must be in classpath as well). Using modified JACOB package requires jacob.jar and codegen.jar in classpath and jacob.dll in path.

5.1 Visio integration

The Visio example starts Visio, creates boxes, changes text style, waits for 5 seconds and exits the application. Alongside the example, there is some commented-out code used in our tests with jSegue and Jawin (the Jawin code does not actually work, which was one reason we abandoned Jawin). Java2com-generated COM enums are also demonstrated by changing font size. Note also that changing font style did not work with the original java2com package.

For a more elaborate example using Visio, see the example contained in the original JACOB distribution in the com.jacob.samples.visio package. This example demonstrates embedding a Visio chart in a Java JFrame as well as event handling, but uses only the Dispatch interface without wrappers. Converting it to wrappers would be an interesting and useful exercise.

5.2 Creating a chart with Excel

The Excel example is based on the original Talking to COM example by [Lewis98]. The application starts Excel, creates a workbook with some cells, creates a chart and exits. The methods to be called have slightly more complex parameters compared to the Visio example, and we could not make it run with Jawin.

There is also another, slightly simpler Excel example in com.jacob.samples.office package that demonstrates using similar functionality with only Dispatch interface (ExcelDispatchTest) and wrappers (ExcelWrapperTest). Note that the  ExcelDispatchTest is slightly altered to make it work with explicit threading model.

5.3 Events with Word

The Word example demonstrates event handling. An event handler class is created and registered with the Word application object. Then two documents are created. The Java application gets notifications when documents are opened, before they are closed and when the application exits. As discussed in section 4.3, we are restricted to event handlers with coarse Variant[] arguments. However, method names can be checked from com.microsoft.word._ApplicationEvents2 (note that simply subclassing _ApplicationEvents2 with our event handler does not work with JACOB's current event handling, even if you manage to call the handler with the correct parameter list. The method calls generated in _ApplicationEvents2 result in a runtime error).

com.jacob.samples.office.WordDocumentProperties is another example adapted from standard JACOB distribution. It fetches document author metadata from Word document and exits.

6 Conclusion

We have presented an approach and working examples of Java-COM integration using typesafe COM Dispatch interface wrappers. The approach was originally presented by Steven Lewis in 1998 and is now adapted to work with current release of JACOB library with detailed explanation of what must be modified in future releases. We propose that the XML wrapper and code generator could be bundled with future JACOB releases as an alternative to Jacobgen wrapper generator. XML descriptors could be also used to integrate various Java-COM integration packages, thus making it easier to shift between packages. 

The repackaged JACOB seems to work with the examples presented in this report, but further testing is needed, especially with threading. The event handling mechanism in JACOB should be enhanced to be compatible with the event handler classes generated by Codegen. This would provide better type safety and reduce the need to create event handler classes manually. We hope that this package provides a feasible alternative for Java/COM-integration using open source software and is a robust enough basis for further development.

 

7 References               

[Adhikari01]

Richard Adhikari. Java & .NET can live together. Application Development Trends, December 2001. http://www.adtmag.com/article.aspx?id=5750

[Adler04]

Dan Adler. The JACOB Project: A Java-COM Bridge. 2004. http://danadler.com/jacob/

[com4j]

com4j project web site. https://com4j.dev.java.net/

[Crosby04]

Neil Crosby. Using the iTunes COM Interface with Java and Swing. workingwith.me.uk, 2004. http://www.workingwith.me.uk/articles/java/itunes-com-with-java-and-swing

[IECanvas]

Torgeir Veimo. IECanvas project. http://nothome.com/IECanvas/

[JACOB]

JACOB project web site. http://sourceforge.net/projects/jacob-project/

[Jawin]

Jawin project web site. http://jawinproject.sourceforge.net/

[jcom]

jcom project web site. http://www.nexb.org/open-source-it-asset-management/Wiki.jsp?page=Jcom

[jSegue]

jSegue project web site. http://jsegue.sourceforge.net/

[Lewis98]

Steven Lewis. Talking To COM. SeaJUG, 1998. http://www.lordjoe.com/Java2Com/index.html

[Liong05]

Lim Bio Liong. Understanding The COM Single-Threaded Apartment Part 1. The Code Project, 2005. http://www.codeproject.com/com/CCOMThread.asp

[Srinivas00]

Davanum Srinivas. Embed ActiveX controls inside Java GUI. The Code Project, 2000. http://www.codeproject.com/java/javacom.asp