Java/Excel (and other Office) Interoperability
Update January 2018
We’re curious: this blog post has consistently been a top performer for many years now, and we’d like to know why.
I’ve been working with a customer on a project to show how Java code can access Microsoft Excel files through the .NET interfaces provided in Visual Studio Tools for Office (VSTO). Although the example uses Excel, the principles here will work with the VSTO interfaces for any other Office application, including Word, Outlook, and PowerPoint.
The customer is using .NET Framework 1.1, so we concentrated on VSTO 2003 and the 1.x-targeted version of JNBridgePro 3.0, but the same ideas will work with .NET 2.0 and VSTO 2005.
We started with the following C# program:
using System; using Excel = Microsoft.Office.Interop.Excel; namespace ExcelProject { public class ExcelInterop { static void sheetChangeHandler(object ah, Excel.Range target) { Console.WriteLine(“the worksheet changed!”); Console.WriteLine(“target = [" + target.Row + ", " + target.Column + "]“); } public static void Main(string[] args) { Excel.WorkbookEvents_SheetChangeEventHandler sheetChangeEvent; Excel.Application theApp = new Excel.Application(); Excel.Workbooks wbs = theApp.Workbooks; Excel.Workbook wb = wbs.Open(@”C:\Data\JNBridge\ExcelProject\Book1.xls”, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing); sheetChangeEvent = new Excel.WorkbookEvents_SheetChangeEventHandler( sheetChangeHandler); wb.SheetChange += sheetChangeEvent; Excel.Worksheet ws = (Excel.Worksheet)wb.Sheets[1]; ((Excel.Range)ws.Cells[1,1]).Value2 = “hello world”; wb.Save(); wb.Close(false, Type.Missing, Type.Missing); theApp.Quit(); } } }
The program takes an existing Excel file Book1.xls, opens it, assigns the string “hello world” to cell A1, then saves it away and quits. We wanted to write a Java program that would do the same thing.
As a first pass, we’ll start by generating proxies for all the VSTO classes used above: Application, Workbooks, Workbook, Worksheet, Range, WorkbookEvents_SheetChange, and Type (plus supporting classes). However, if you just do a straightforward one-to-one translation from C# to Java, you’ll run into a couple of issues.
The first thing you’ll discover is that Application is an interface and you can’t instantiate it:
Application theApp = new Application();
So how come we can instantiate Application in C#? It turns out that the Application interface is annotated with a CoClassAttribute that points to Excel.ApplicationClass. The C# compiler has hooks that understand CoClassAttribute, and the compiled code actually instantiates ApplicationClass, which implements the Application interface This metadata isn’t proxied to the Java side, and Java wouldn’t know what to do with it anyway. The solution is to proxy Excel.ApplicationClass and instantiate that in Java:
Application theApp = new ApplicationClass();
The Excel objects that are returned from the API are actually instances of System.__ComObject, which is a special .NET wrapper for the underlying COM objects returned by the various Office APIs. Now, System.__ComObjects don’t actually implement the interfaces they stand in for (try applying the reflection method Type.GetInterfaces() to a System.__ComObject to verify this), although if a System.__ComObject o is supposed to represent an object with an interface I, the test “o is I” will succeed. In any case, the built-in .NET-COM interop seems to take care of this, but the metadata needed to make this happen isn’t proxied across to the Java side and this leads to ClassCastExceptions when the proxied System.__ComObjects are assigned to variables of interface type.
To address this issue, we need to make sure that all interop with COM objects happens on the .NET side. To do this, we create wrapper classes for each of the Excel classes we use. The wrappers are what gets proxied, and the underlying wrapper objects are simply pass-throughs to the COM objects — since the calls to the COM objects are done on the .NET side, the interop automatically happens.
We create wrapper classes for each interface that we used. As an example of how this can be done, consider the wrapper that we write for the Excel.Workbook interface:
using Excel = Microsoft.Office.Interop.Excel; namespace com.jnbridge.office.interop.excel { public class WorkbookImpl: Excel.Workbook { private Excel.Workbook wb; public WorkbookImpl(Excel.Workbook theWb) { this.wb = theWb; } public void Dummy16() { wb.Dummy16(); } public bool IsInplace { get { return wb.IsInplace; } } public Microsoft.Office.Interop.Excel.Sheets Sheets { get { return new SheetsImpl(wb.Sheets); } } // etc } }
Note that the wrapper methods are passthroughs. Also look at the last property, Sheets: it returns an object that implements the Sheets interface. Since that object will itself be a System.__ComObject, we need to wrap it with the wrapper for Sheets (SheetsImpl) that we’ve written before it can be returned. In general, we need to do this with all members that return an object of an Office API type, since they will all be System.__ComObjects. Objects of other types, including primitives and strings, don’t have to be wrapped.
Some of the Excel API methods take arguments of Type.Missing to indicate that the value isn’t being supplied. Unfortunately, the .NET reflection API’s dynamic dispatch mechanism gets confused when it sees Type.Missing, since it’s used as a special indicator in vararg calls. Since JNBridgePro uses this reflection mechanism, it gets confused, too. To address this problem, we have the Java side pass null for those arguments that aren’t used, and the wrapper will translate them to Type.Missing before calling the COM object, as in this wrapper call to WorkbookImpl.Close():
public void Close(object SaveChanges, object Filename, object RouteWorkbook) { wb.Close(SaveChanges, Filename != null ? Filename : Type.Missing, RouteWorkbook != null ? RouteWorkbook : Type.Missing); }
Finally, we need to address Java-side event handlers. When an event is fired, if an Excel object is being passed to the event handler, what really gets passed is a COM object — so we need to make sure these COM object arguments are wrapped before they get to the Java side. The way to do this is to wrap the event handler so that it intervenes in the event call and wraps the COM objects before passing them to the Java side. Here’s how we did SheetChange Event in the WorkbookImpl class:
// this is a small nested class inside WorkbookImpl // use this wrapper so that the call to the ComObject is done on the // .NET side, and the Java side always sees the WorkbookImpl and // RangeImpl objects private class SheetChangeEventHandlerWrapper { Excel.WorkbookEvents_SheetChangeEventHandler sceh; public SheetChangeEventHandlerWrapper( Excel.WorkbookEvents_SheetChangeEventHandler theSceh) { this.sceh = theSceh; // this is the Java side handler } public void wrapped_sceh(object sender, Excel.Range target) { // call the Java side handler with the RangeImpl object sceh(new WorksheetImpl((Excel.Worksheet) sender), new RangeImpl(target)); } } public event Microsoft.Office.Interop.Excel.WorkbookEvents_SheetChangeEventHandlerSheetChange { add { // create a new SheetChangeEventHandler // that will call this one // value is the Java-side handler SheetChangeEventHandlerWrapper mySceh = new SheetChangeEventHandlerWrapper(value); // add the new one to SheetChangeEventHandler // – the handler we register is the wrapper handler wb.SheetChange += new Excel.WorkbookEvents_SheetChangeEventHandler( mySceh.wrapped_sceh); } remove { // create a new SheetChangeEventHandler // that will call this one // value is the Java-side handler SheetChangeEventHandlerWrapper mySceh = new SheetChangeEventHandlerWrapper(value); // remove the new one to SheetChangeEventHandler // – the handler we register is the wrapper handler wb.SheetChange -= new Excel.WorkbookEvents_SheetChangeEventHandler( mySceh.wrapped_sceh); } }
That should give you enough information to implement your own wrapper classes. If you only implement wrapper classes for the VSTO classes you need, it should be pretty straightforward. If you want to implement wrappers for the entire VSTO API, you might want to contact us. 🙂
Now generate the proxy jar file as follows:
Start with a new Java-to-.NET project. Add the VSTO Excel assembly (from the GAC) and the new wrapper DLL to the assembly list. Load the following classes (using “Add classes from assembly list…”) from the primary interop assemblies (plus all supporting classes):
- Microsoft.Office.Interop.Excel.Application
- Microsoft.Office.Interop.Excel.Workbooks
- Microsoft.Office.Interop.Excel.Workbook
- Microsoft.Office.Interop.Excel.Sheets
- Microsoft.Office.Interop.Excel.Worksheet
- Microsoft.Office.Interop.Excel.Range
Load all the classes (using “Add classes from assembly file…”) from the wrapper DLL.
Generate proxies for all the classes that you’ve loaded.
Make sure you’ve got the following on your machine:
- NET Framework 1.1
- VSTO 2003
- JNBridgePro 3.0
- place the 1.x-targeted versions of jnbshare.dll, jnbsharedmem.dll, and jnbjavaentry2.dll in the GAC.
Place the following code in a file MainClass.java so that the paths to jnbjavaentry.dll and ExcelWrappers.dll are correct on your machine, and compile it. Make sure that jnbcore.jar, bcel-5.1-jnbridge.jar, and the jar file with the proxies for the Excel VSTO interfaces and the wrapper classes are in the compilation classpath. Note that we’re configuring JNBridgePro programmatically so you won’t need to worry about a configuration file. It’s set up for shared memory.
Note the long strings below have been wrapped. Make sure they’re correct when you try to compile this code.
import Microsoft.Office.Interop.Excel.*; import System.Type; import System.BoxedBoolean; import System.DotNetString; import System.BoxedInt; import System.Object; import com.jnbridge.jnbcore.DotNetSide; import java.util.Properties; import com.jnbridge.office.interop.excel.*; public class MainClass { public static class SheetChangedEventHandler implements WorkbookEvents_SheetChangeEventHandler { public void Invoke(Object sender, Range target) { System.out.println(“the worksheet changed!”); System.out.println(“target = [" + target.Get_Row() + ", " + target.Get_Column() + "]“); } } public static void main(String[] args) { // set up the properties Properties props = new Properties(); props.put(“dotNetSide.serverType”, “sharedmem”); props.put(“dotNetSide.assemblyList.1″, “C:\\Data\\JNBridge\\ExcelWrappers\\bin\\Debug\\ExcelWrappers.dll”); props.put(“dotNetSide.assemblyList.2″, “Microsoft.Office.Interop.Excel, Version=11.0.0.0,Culture=neutral, PublicKeyToken=71e9bce111e9429c”); props.put(“dotNetSide.assemblyList.3″, “office, Version=11.0.0.0, Culture=neutral, PublicKeyToken=71e9bce111e9429c”); props.put(“dotNetSide.assemblyList.4″, “Microsoft.Vbe.Interop.Forms, Version=11.0.0.0, Culture=neutral, PublicKeyToken=71e9bce111e9429c”); props.put(“dotNetSide.assemblyList.5″, “System, Version=1.0.5000.0, Culture=neutral, PublicKeyToken=b77a5c561934e089″); props.put(“dotNetSide.assemblyList.6″, “System.Data, Version=1.0.5000.0, Culture=neutral, PublicKeyToken=b77a5c561934e089″); props.put(“dotNetSide.assemblyList.7″, “System.Windows.Forms, Version=1.0.5000.0, Culture=neutral, PublicKeyToken=b77a5c561934e089″); props.put(“dotNetSide.assemblyList.8″, “System.XML, Version=1.0.5000.0, Culture=neutral, PublicKeyToken=b77a5c561934e089″); props.put(“dotNetSide.assemblyList.9″, “Microsoft.Vbe.Interop, Version=11.0.0.0, Culture=neutral, PublicKeyToken=71e9bce111e9429c”); props.put(“dotNetSide.javaEntry”, “C:/Program Files/JNBridge/JNBridgePro v3.0/1.x-targeted/JNBJavaEntry.dll”); DotNetSide.init(props); Application theApp = new com.jnbridge.office.interop.excel.ApplicationClass( new Microsoft.Office.Interop.Excel.ApplicationClass()); Workbooks wbs = theApp.Get_Workbooks(); Microsoft.Office.Interop.Excel.Workbook wb = wbs.Open(“C:\\Data\\JNBridge\\ExcelProject2\\Book1.xls”, null, null, null, null, null, null, null, null, null, null, null, null, null, null); // use null instead of Type.Missing WorkbookEvents_SheetChangeEventHandler sceh = new SheetChangedEventHandler(); wb.add_SheetChange(sceh); Microsoft.Office.Interop.Excel.Sheets wss = (Microsoft.Office.Interop.Excel.Sheets)wb.Get_Sheets(); Microsoft.Office.Interop.Excel.Worksheet ws = (Worksheet)wss.Get_Item(new BoxedInt(1)); Range r = ws.Get_Cells(); Range r2 = (Range)r.Get_Item(new BoxedInt(1), new BoxedInt(1)); r2.Set_Value2(new DotNetString(“hello world”)); wb.Save(); wb.Close(new BoxedBoolean(false), null, null); theApp.Quit(); } }
Note how we only explicitly refer a wrapper proxy once, when we’re wrapping the VSTO ApplicationClass. In all other cases, we continue to use the VSTO interfaces, but if we’ve constructed the wrapper classes correctly, wrappers will be returned for each call and we address them through the interfaces, so they’re invisible.
To run this application, just make sure that jnbcore.jar, bcel-5.1-jnbridge.jar, and the proxy jar file are in your classpath, and run
java MainClass
If you’ve done this properly, the Excel spreadsheet will now have “hello world” in cell A1.
Once you get past the need for wrappers, setting up this interop is pretty simple, and it’ll work for all the Office APIs, not just Excel.
A question for all of you out there. Would you be interested in having JNBridge supply an assembly with wrappers for the Office APIs, or are you happy to do it yourselves? Let me know.