There have been debates about ActiveX vs Java from the user's point of view, but which is really easier for the programmer?
Over the past few months, I've read numerous articles and books on programming for the Internet but I had not got round to doing much more than playing... until now, that is. I've got a passing interest in genealogy and chose to make it my mission to display family trees on the Internet. After a bit of deliberation, I decided that it would be implemented as either a Java applet or a C++ ActiveX control and this article addresses the question of which is 'better' suited to this particular job. I'll describe the trade-offs between the two approaches and include solutions to some of the problems encountered along the way.
A problem with displaying genealogical information as the sort of tree that people expect to see is fitting everything in. There are two aspects to this: the first is the details about each individual or relationship depicted - the person's date of birth, or photograph, or just general miscellaneous notes. Putting all of this on a picture clutters it up a lot, so I chose to just place people's names on the chart and use a button click to get at the rest of the information. The second problem relates to the structure of the tree: a typical genealogical database is more complex than a tree - there may be many links up and down the generations. I realised that when I look at such a database, at any one time I'm most interested in the branches extending outwards from one individual (the 'root'). The user can then click on anyone else featured on the tree to make that person the new centre of attention, and in this way explore the whole database.
There are a number of development alternatives (see later) but I decided that I would write either an ActiveX control in C++ using the Microsoft Foundation Classes (MFC) or a Java applet and for me, the most significant trade-offs were speed vs portability:
Individuals and a list of Familys (the two data structures making up the tree) in a form which can easily be read by the applet/control - there's little point in making the downloaded component do more work than is necessary. The only thing worth noting about this pre-processing stage is byte order. The ActiveX control runs as native 80x86 code, and therefore is little-endian; Java's external data representation is big-endian. Rather than force an inefficient data stream structure on to one or other implementation, I just produced two forms of pre-processed data file.
The bulk of the component's code to be written deals with laying out the tree rooted in a single individual. This occurs in two passes: the first calculates the widths required for each individual based on the width of its name string and the width required by its children or parents; having calculated the widths, the name strings are positioned. This has to be carried out in two passes because the width of any individual depends on all the individuals further from the root along that subtree. Pseudo-code for positioning children is shown in Figure 1, and parent positioning is very similar.
void positionEverybody( Individual* root )
{
    int overallWidth = calcWidthDown( root )
    setPositionDown( root, 0, 0 )
}

int calcWidthDown( Individual* individual )
{
    if individual is already done
        return its width
    else
        width = width of the name in the current fount
        childWidth = 0
        for each family with this as a parent
            for each child in each family
                childWidth += calcWidthDown( child ) + spacing
        if childWidth > width
            width = childWidth
        individual's width = width
        mark as done
        return width
}

int setPositionDown( Individual* individual, int x, int y )
{
    individual's x position = x + individual's width / 2 - name's width / 2
    individual's y position = y
    current x = x
    for each family with this as a parent
        for each child in each family
            current x += setPositionDown( child, current x, y + spacing )
    return individual's width
}
The result of applying this to the data read in is that the location of each individual is known, and only needs to be recalculated if the root individual is changed.
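As an illustration only, the two passes of Figure 1 might look like this in Java. This is a sketch under stated assumptions rather than the code actually used: the `Person` class collapses the Individual/Family pair into a plain list of children, the name width is faked with a fixed `CHAR_WIDTH` instead of real fount metrics, and all the constants are invented. It also differs from Figure 1 in one labelled respect: the second pass steps along by the child's width plus the spacing, so that the positions use the same gap that the first pass allowed for.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical node type: the article's Individual and Family structures
// are collapsed into one class with a list of children.
class Person {
    final String name;
    final List<Person> children = new ArrayList<>();
    int width = -1;   // -1 means "not yet calculated"
    int x, y;

    Person(String name) { this.name = name; }

    Person add(Person child) { children.add(child); return this; }
}

class TreeLayout {
    static final int CHAR_WIDTH = 8;   // assumed fixed-width fount, in pixels
    static final int H_SPACING = 10;   // assumed horizontal gap per child
    static final int V_SPACING = 40;   // assumed gap between generations

    static int nameWidth(Person p) { return p.name.length() * CHAR_WIDTH; }

    // Pass 1: a person is as wide as the name or the children, whichever is larger.
    static int calcWidthDown(Person p) {
        if (p.width >= 0) return p.width;          // already done
        int childWidth = 0;
        for (Person c : p.children) childWidth += calcWidthDown(c) + H_SPACING;
        p.width = Math.max(nameWidth(p), childWidth);
        return p.width;
    }

    // Pass 2: centre the name in the person's slot, then place the children left to right.
    static int setPositionDown(Person p, int x, int y) {
        p.x = x + p.width / 2 - nameWidth(p) / 2;
        p.y = y;
        int currentX = x;
        for (Person c : p.children) {
            // Differs from Figure 1: the spacing is added here as well.
            currentX += setPositionDown(c, currentX, y + V_SPACING) + H_SPACING;
        }
        return p.width;
    }

    static void positionEverybody(Person root) {
        calcWidthDown(root);
        setPositionDown(root, 0, 0);
    }
}
```

Changing the root would mean resetting every `width` to -1 and calling `positionEverybody()` again with the new root.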
After that, the control has to manage a scrollable area into which all the individuals accessible from the root are drawn along with lines between them and their parents/children, and when the mouse button is clicked on an individual, a dialog box pops up with any text about that individual that was available in the GEDCOM source - all fairly straightforward code.
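The click handling described above comes down to a hit test against the positions produced by the layout. The following Java sketch shows one way it could be done; `NameBox`, `HitTester` and the idea of passing the scroll offsets in explicitly are all assumptions made for the example, not details taken from either implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical bounding box of one drawn name, in tree coordinates.
class NameBox {
    final int id, x, y, width, height;

    NameBox(int id, int x, int y, int width, int height) {
        this.id = id; this.x = x; this.y = y; this.width = width; this.height = height;
    }

    boolean contains(int px, int py) {
        return px >= x && px < x + width && py >= y && py < y + height;
    }
}

class HitTester {
    private final List<NameBox> boxes = new ArrayList<>();

    void add(NameBox box) { boxes.add(box); }

    // Converts a click in window coordinates to tree coordinates using the
    // scroll offsets, then returns the id of the individual under the click,
    // or -1 if the click landed on empty space.
    int individualAt(int clickX, int clickY, int scrollX, int scrollY) {
        int treeX = clickX + scrollX;
        int treeY = clickY + scrollY;
        for (NameBox box : boxes) {
            if (box.contains(treeX, treeY)) return box.id;
        }
        return -1;
    }
}
```

A linear search is likely to be adequate for a tree that fits in a scrollable window; the returned id would then select the text to show in the dialog box.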
getLineColor() and setLineColor() pair of access functions (sorry about the American spelling, but I was being consistent with the ambient colour properties). The wizards also let you delete properties and class members, but don't do the complete job - you still have to chop text out of the source files.
Since ActiveX controls are geared to more than the Internet, the development process gave me much more than I really wanted here. For example, it was my intention that the control would read its properties (such as line and text colours) once at the beginning and they would not change for the control's lifetime, as would be the case when initialising from HTML page parameters. However, ActiveX controls' ability to interact with any of a number of containers implied that I would be ill-mannered to ignore changes - in fact, this ability is almost essential to be able to use Visual Basic as a container since the VB development environment permits you to change the properties of controls embedded in a VB project - not allowing changes would make using the control in VB rather awkward.
The actions the control has to perform map well on to MFC functionality, and particular points to note are:
<PARAM> tags just means defining ActiveX properties (creating get and set functions, such as the colour ones mentioned earlier) - everything else happens behind the scenes;
const DWORD Context = 1; // Not really used, but needed by MFC Internet fns
CFile* f = NULL;
try
{
    CInternetSession Sess;
    // Fetch the pre-processed data file as a raw binary stream
    f = Sess.OpenURL( Source, Context,
                      INTERNET_FLAG_TRANSFER_BINARY |
                      INTERNET_FLAG_RAW_DATA |
                      INTERNET_FLAG_EXISTING_CONNECT );
    if( !f )
        AfxThrowInternetException( Context );
    // Check the header before trusting the rest of the file
    int Magic, Version;
    f->Read( &Magic, sizeof( int ) );
    f->Read( &Version, sizeof( int ) );
    if( Magic != 0xAB875433 || Version != 0x00010003 )
        AfxThrowFileException( CFileException::invalidFile );
    f->Read( &NumIndividuals, sizeof( int ) );
    ...
}
catch( CException* e )
{
    e->Delete(); // MFC exceptions are thrown by pointer and must be deleted
    AfxMessageBox( "Could not read source", MB_ICONEXCLAMATION );
    delete f;
    Destroy();
}
SetScrollInfo() and ShowScrollBar(), and OnHScroll() and OnVScroll() are called when the scrollbar is used.
<OBJECT ID="Viewer" WIDTH=500 HEIGHT=250
    CLASSID="CLSID:26F313E6-424B-11D1-A8C5-444553540000"
    CODEBASE="http://localhost/genviewer.ocx">
    <PARAM NAME="Source" VALUE="http://localhost/test.bin">
    <PARAM NAME="Root" VALUE="70">
    <PARAM NAME="LineColor" VALUE="255">
</OBJECT>
The CODEBASE line indicates where to pick up the control. However, this has a couple of undesirable features: first, since the control uses MFC, what happens if the MFC DLLs are not present on a user's machine, or what if they're an old version? Second, the control is rather large (see later), so is there any way to make it smaller? Microsoft is promoting cabinet file technology - these are a bit like ZIP files, and Internet Explorer understands what to do with them. Cabinet file creation tools can be found either in the cabinet SDK or as part of Microsoft's Java SDK, and instructions may be found along with either of those. To define a cabinet, I needed to create an INF file containing my control and references to the correct versions of MFC and C++ runtime library DLLs, as shown in Figure 4. You can make Visual C++ generate the cabinet file for you: for ActiveX controls, the Visual C++ application wizard inserts a post-build command to register the OCX - add the following command and the cabinet will also be created automatically:
<cab SDK path>\bin\cabarc N $(OutDir)\$(TargetName).cab $(TargetPath) $(ProjDir)\$(TargetName).inf
[version]
signature="$CHICAGO$"
AdvancedINF=2.0
[Add.Code]
GenViewer.ocx=GenViewer.ocx
msvcrt.dll=msvcrt.dll
mfc42.dll=mfc42.dll
olepro32.dll=olepro32.dll
[GenViewer.ocx]
file-win32-x86=thiscab
clsid={26F313E6-424B-11D1-A8C5-444553540000}
FileVersion=1,0,0,2
RegisterServer=yes
[msvcrt.dll]
FileVersion=4,20,0,6144
hook=mfc42installer
[mfc42.dll]
FileVersion=4,2,0,6256
hook=mfc42installer
[olepro32.dll]
FileVersion=4,2,0,6068
hook=mfc42installer
[mfc42installer]
file-win32-x86=http://activex.microsoft.com/controls/vc/mfc42.cab
run=%EXTRACT_DIR%\mfc42.exe
The CAB file can be used in a slightly modified form of the web page - replace the CODEBASE line with:
CODEBASE="http://localhost/genviewer.cab">
Now, when a user accesses that page, if the ActiveX control is not registered locally, or is an earlier version than the one specified, the browser will grab the cabinet file from the CODEBASE location and unpack it; the OCX control will be registered and the other DLLs' version numbers checked; if they're not present or are earlier versions, the browser will pop along to the Microsoft site and grab them. (I hope that most users will already have the MFC DLLs from other programs' disc-based installations since the download is rather large!)
Microsoft's cabinet system also allows some form of authentication: I could generate a signature and embed that in the cabinet, so that a user can be confident that someone (the certificate publisher) trusts me and that the cabinet file has not been tampered with since I last touched it. However, I would have to pay the certificate publisher a small sum for this service, and I'm not particularly interested in spending money in this way, so my control will remain unauthenticated. Without a signature, anyone downloading the cabinet will get an irritating warning message: however, it does appear only once since the message is generated only on download, not on subsequent use of the installed control.
Visual C++ development also made it trivial to add About boxes, property pages and icons, and I built a few different variants to get an idea of the size of the final control. With only the minimum of changeable properties, the control was about 26KB and the cabinet file about 12KB; adding a few more bells and whistles such as property pages, it grew to 30KB/14KB.
Figure 5 shows the control in an HTML page (I think, in hindsight, that I should have added a border to the control). What's more, I can embed it in a Word document, or a Visual Basic application (Figure 6): the latter made it easy to debug. To debug an ActiveX control, you need to embed it in something: the obvious application is Internet Explorer, but this is large and takes quite a while to load. I threw together a tiny VB application to host the control and used this instead - it was much quicker to load. (The same could be done in C++, but it's a bit more work.)
As I do not have any serious form of Java development environment (Visual J++ 1 is very basic, giving not much more than edit-compile-point to error in this application - I found its debugging abilities unusable here), this was more or less developed using command line tools. The reason for the lack of useful debugging facilities is the Java sandbox, as I'll explain shortly. Having said that, coding the applet in Java was much easier than C++ would have been without its wizards.
Particularly noteworthy points are:
<PARAM> tags just requires code like the following few lines, with some error trapping added.
param = getParameter( "Root" );
if( param != null )
    root = Integer.parseInt( param );
In this case, the applet's properties are definitely read only once, and cannot be changed as with the ActiveX variant, implying that I did not have to add code to cope with their changes.
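The "error trapping" mentioned for the parameter code could be as small as the helper below. It is a sketch, not the applet's actual code: the class name `ParamParser`, the method `intOrDefault()` and the idea of falling back to a default value are assumptions made for illustration. It relies only on the documented behaviour that `getParameter()` returns null for a missing parameter and that `Integer.parseInt()` throws `NumberFormatException` for malformed text.

```java
class ParamParser {
    // Returns the integer held in 'value', or 'fallback' when the
    // parameter is missing (null) or is not a valid decimal integer.
    static int intOrDefault(String value, int fallback) {
        if (value == null) {
            return fallback;
        }
        try {
            return Integer.parseInt(value.trim());
        } catch (NumberFormatException e) {
            return fallback;
        }
    }
}
```

Inside the applet this would be used along the lines of `root = ParamParser.intOrDefault( getParameter( "Root" ), 0 );`, where the default of 0 is again only an assumption.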
try
{
    URL url = new URL( m_Source );
    DataInput is = new DataInputStream(
        new BufferedInputStream(
            url.openStream() ) );
    int magic = is.readInt();
    int version = is.readInt();
    int numIndividuals = is.readInt();
    ...
}
catch( Exception e )
...
http://localhost/... and it all ran quite happily, albeit somewhat slowly on my clunky and underpowered PC.
The end result of the compilation process is a number of class files, totalling about 10KB for the most basic form of the control - I was surprised at how much smaller than the ActiveX control this was. Adding extra features such as typeface and colour selection bumped that up to about 11KB. It's a bit of a nuisance to have multiple files since downloading each is a separate HTTP transaction - could I create anything like cabinet files? Yes: Internet Explorer can handle cabinet files for Java exactly as it does for ActiveX controls but unfortunately no other browser recognises them. Sun has specified an approximately equivalent system to cabinets, the JAR file. However, this is rather new and not many browsers support JAR files - Netscape Navigator 4 and IE 4 do (but unfortunately IE 3 does not, and guess what my power-challenged PC has...). Creating a JAR file is as easy as creating a cabinet:
jar cvf genviewer2.jar genealogy/*.class
Microsoft claims that cabinets have better performance - the files are combined and then compressed whereas with JAR files, the individual elements are compressed and then combined. It does look like Microsoft is right in this particular instance: my JAR file was 6KB while the cabinet was only 5KB. JAR files can also have authentication signatures but as far as I am aware, this is not supported by any commercial browsers. So if I want to use Java it looks like I had better just leave the files as individual class files. Unfortunately, some web server platforms have restrictions on file names - Java 1.1 inner classes result in class files with dollar symbols in their names and it may not be possible to host them outside JAR files or cabinets (this was the case for my ISP at the time of writing this).
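The effect of packing order can be tried out in a few lines of Java. This is only a rough sketch: Deflate from `java.util.zip` stands in for whatever compression each format really uses, container overheads such as directories and headers are ignored, and the four "files" are made-up strings rather than real class files.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.DeflaterOutputStream;

class PackingOrderDemo {
    // Size in bytes of 'data' after Deflate compression.
    static int deflatedSize(byte[] data) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (DeflaterOutputStream deflater = new DeflaterOutputStream(out)) {
            deflater.write(data);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.size();
    }

    // JAR-like order: compress each file separately, then add up the sizes.
    static int compressThenCombine(byte[][] files) {
        int total = 0;
        for (byte[] file : files) total += deflatedSize(file);
        return total;
    }

    // Cabinet-like order: join the files first, then compress once.
    static int combineThenCompress(byte[][] files) {
        ByteArrayOutputStream joined = new ByteArrayOutputStream();
        for (byte[] file : files) joined.write(file, 0, file.length);
        return deflatedSize(joined.toByteArray());
    }

    // Made-up stand-ins for class files: similar text with small differences.
    static byte[][] sampleFiles() {
        byte[][] files = new byte[4][];
        for (int i = 0; i < files.length; i++) {
            String text = "class Part" + i
                + " extends java.awt.Panel implements java.lang.Runnable";
            files[i] = text.getBytes(StandardCharsets.US_ASCII);
        }
        return files;
    }

    public static void main(String[] args) {
        byte[][] files = sampleFiles();
        System.out.println("compress then combine: " + compressThenCombine(files));
        System.out.println("combine then compress: " + combineThenCompress(files));
    }
}
```

Compressing the joined data lets the compressor reuse strings it has already seen in earlier files, which is the plausible reason small, similar files pack better that way; how large the gain is for real class files would need measuring.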
The feature I found most annoying when debugging the applet was that browsers tend not to reload Java class files once they're in the cache: I had to keep quitting and restarting the browser to see the effect of any changes.
The Java applet is shown in Figure 8. It is rather slow - the applet took about four times longer than the ActiveX control to load and produce a tree display. I mentioned earlier that the Java 1.0 GUI style is rather basic - the dialog box is distinctly clumsy looking: Microsoft has developed the Application Foundation Classes (AFC), a set of classes to give a much more sophisticated user interface style and, of course, there is a Javasoft equivalent, the Java Foundation Classes, or JFC (this contains more than the AFC, but I'm only interested in the user interface bits). Microsoft have not gone out of their way to make AFC available to any browser, though it is pure Java and will run anywhere if you repackage it; and the JFC is still in beta but the user interface parts, known as Swing, are available now if you care to download them. Both the AFC and JFC look much better but are also both even slower, so I'll stick with the AWT for the time being.
The Java applet compiled to a smaller binary, but there's not much difference between the ActiveX cabinet and the set of Java class files and, with separate HTTP transactions for each class file, the latter takes longer to download - of course, if you need to download the MFC DLLs too, matters change somewhat! Associated with binary size is the amount of memory taken up when the component is running - this will be a combination of executable size, already present binaries, such as the Java virtual machine or MFC DLLs, and data space. One crude way to measure this is to have a look at Windows' memory usage statistics: with IE addressing a blank page, Windows 95 reported 55MB of memory allocated; loading a page with the ActiveX control reported 60.5MB allocated; and repeating with the applet resulted in 64.5MB allocated. That is an increase of about 5.5MB for the ActiveX control against about 9.5MB for the applet, which implies that the Java applet and engine take up a lot more memory than the ActiveX control.
Another issue related to download times is that Java class files are merely cached when used: when the cache expires, the files will be downloaded again. Of course, a user could download my class files (or a JAR file, if a suitable browser is being used) and store them somewhere on the CLASSPATH. On the other hand, once the ActiveX control is downloaded, it is installed on to the destination system, and will not be downloaded again unless the version number indicates that the stored one is old.
Although the C++ code is considerably more complex, there was little difference in development effort thanks to the Visual C++ wizards.
I started off this evaluation of ActiveX vs Java expecting the C++ with MFC route to win and was surprised how easy it was to produce a reasonable Java applet. In this particular application, I think that Java is the better solution, despite slower speed and greater memory requirements, and I'll be completing the development in that language. Another useful feature of Java I have yet to investigate is compressed data streams: I did not mention the size of the pre-processed data file above since it was the same for both implementations, but it is quite significant and compressing it is something I wanted to investigate - java.util.zip.GZIPInputStream and serialization could let me handle that with minimal effort.
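To give an idea of how little code the compressed data stream might need, here is a sketch that writes a three-int header through `GZIPOutputStream` and reads it back through `GZIPInputStream`. It is untested against the real data file: the class and method names are invented, the header layout is assumed to match the one read in the earlier listings, and the pre-processing step would have to be changed to produce gzip output.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

class GzipHeaderDemo {
    // What the pre-processor would do: write the header through a gzip stream.
    static byte[] writeCompressedHeader(int magic, int version, int numIndividuals) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out =
                 new DataOutputStream(new GZIPOutputStream(bytes))) {
            out.writeInt(magic);
            out.writeInt(version);
            out.writeInt(numIndividuals);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // What the applet would do: the same readInt() calls as before, with a
    // GZIPInputStream slotted into the chain of streams.
    static int[] readCompressedHeader(byte[] compressed) {
        try (DataInputStream in = new DataInputStream(
                 new BufferedInputStream(
                     new GZIPInputStream(
                         new ByteArrayInputStream(compressed))))) {
            return new int[] { in.readInt(), in.readInt(), in.readInt() };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In the applet, the `ByteArrayInputStream` would be replaced by `url.openStream()`; the rest of the reading code would stay as it is.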
Of course, this doesn't mean I'm giving up C++ development in favour of Java - each language has its place. In this application, with only a single server transaction, negligible host operating system interaction, and simple user interface, Java provides no worse a solution than C++, apart from start-up time, and is smaller and more portable. (For a different application, see Tom Guinther's column in January's EXE.)
I could have written a plug-in instead of an ActiveX control: the development process would have been very similar, and the unit could run in any Windows 32 browser, and it's easy to port the result to different operating systems, unlike ActiveX technology at the moment. The user would have to perform a separate download and install step, but that would not have been too onerous. I would have had to invent a new MIME type, so that the browser would know when to invoke the plug-in, but once again, that's not arduous. The major difficulty with a plug-in is debugging it - a small and quick to load VB or C++ application can host an ActiveX control, but the host for a plug-in really has to be a browser.
Instead of the MFC, I could have relied on the Active Template Library (ATL). Interestingly, I found that the component produced with this was much larger than the MFC one - the reason for this is that the ATL component relies on very little else being present at the destination whereas the MFC component requires the MFC DLLs to be there.
Another ActiveX implementation language is Visual Basic of course. This would produce a small component (though reliant on the large VB runtime DLL). However, I did not want to write recursive tree scanning code in VB.
Finally, it's worth remembering that the code which runs within the browser is not Java - it's the compiled Java bytecode, sometimes called J-code. There's no reason why Java should be the only source language. A number of companies have produced Ada compilers which emit J-code: I'm quite fond of Ada ('Ada better than C++?', EXE May 1997) and, although it'll almost certainly result in no more compact an applet than Java, I'm interested in pursuing this avenue further. Aonix have made a restricted version of their ObjectAda compiler, which includes Java 1.0 support, available for download.
Last modified on 14th January 2002