An overview of the first step for rekindling perljvm

Bradley M. Kuhn bkuhn@ebb.org
Fri, 1 Dec 2000 23:53:26 -0500

[The time has finally come---I shall endeavor to get perljvm off the ground again. Here's the first step in that process.]

I have scoured around a lot trying to find software that will give some short cuts for developing perljvm. In doing so, I have slowly built an overall design in my head of how to best approach the problem. This message serves as a bit of a core dump of that plan.

But, before I get there, let me review what's happened thus far. At the Perl Conference this past summer, Raymond McCrae and I compared notes on our two approaches to the problem of porting Perl to the JVM. Ray was able to move a bit faster than I was, because he took a straightforward approach and decided not to get bogged down in the details of the current perl5 implementation. He wrote a compiler, from scratch, to compile Perl code into Java source, and got it working so that some basic Perl programs ran on the JVM.

Meanwhile, I was working on a solution inspired by Brian Jepson's original B::JVM::Jasmin prototype. This required a deep knowledge of perl's internal data structures, which took some time to acquire, and thus, I didn't get as far as I would have liked.

But, my B::JVM::Jasmin implementation did a good job of leveraging the B:: modules to take care of all the front-end work, and thus showed that the B:: modules can fill that role well. The B:: modules are undoubtedly complex (due to the underlying complexity of perl5's internals), but the complexity isn't insurmountable.

The real problem that I found was that there was no real code generation module available. B:: hands you the parse tree, and gives you the tools to navigate it, but code generation has to be done painstakingly by hand. This was exacerbated by another problem, which proved fatal for B::JVM::Jasmin in the end. It turns out to be very hard to emit verifiable JVM assembler code. There are far too many "gotchas" to simply treat the JVM as "just another assembler"; more care must be taken.

Ray indicated to me that he believed parsing was a pretty big job, and wasn't convinced that continuing to write a parser from scratch in Java could get us to full Perl support on the JVM. This rings true with what I discovered by spending some time in perl parser code, and from the conventional wisdom in the perl development community. So, leveraging an existing parser (which, in the case of perl5, means leveraging B::) appears to be the right approach.

So, that leaves us with the question of a back-end. How do we fill in the gap so that we don't end up doing all the code generation by hand?

I believe that GNU Kawa, Per Bothner's Java-based Scheme system, is the right solution. Kawa provides an intermediate representation (IR) for JVM bytecode that is much more palatable than JVM assembler or pure bytecode. Per worked extensively on GCC, and Kawa's IR is inspired by GCC's IR.

I spoke to Per at the conference in August, and he's interested in the idea of using Kawa to facilitate a Perl port to the JVM. He is willing to offer us some support in making Kawa useful for other languages like Perl. (In fact, it's always been a goal of Kawa to support more than just Scheme-like languages.)

Kawa won't work out of the box, of course. We'll have to put some effort into Kawa itself as well. But, with Per's help, I think we can make it happen. Plus, our contributions will make Kawa better software, which can only help the community at large (and perhaps make Kawa suitable as a framework for perl6's JVM support).

Another requirement is that we'll have to make a library of Perl-ish data structures in Java. Both Ray and I discovered independently that there appears to be no way to port Perl to the JVM without Perl's data types like "scalar" and "array" being available directly in Java.

As it turns out, this requirement fits well with the design of the perl6 internals. The perl6-internals team are planning to come up with a clear API for how these data structures should work. As that API develops, we can strive to stay on the "bleeding edge" of it. We can give feedback to the perl6-internals team, while simultaneously ensuring that they don't do anything that drastically complicates the JVM port of perl6. ;)
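
To make the idea of Perl-ish data types in Java a bit more concrete, here is a minimal sketch of what a "scalar" might look like. Everything in it is an assumption made for illustration: the class name PerlScalar and its methods are hypothetical, and they come from neither Kawa, JPL, nor any perl6-internals API. It only shows the kind of string/number coercion a compiled Perl program would need the runtime library to supply, and it skips plenty of real Perl behavior (hex strings, "inf"/"nan", Perl's exact number formatting, references, and so on).

```java
// Hypothetical sketch only: PerlScalar is an illustrative name, not an
// existing perljvm, Kawa, or JPL class.
final class PerlScalar {
    // Leading numeric prefix of a string, e.g. "3abc" -> "3", " 1e3x" -> "1e3".
    private static final java.util.regex.Pattern LEADING_NUMBER =
        java.util.regex.Pattern.compile(
            "^\\s*([+-]?(?:\\d+\\.?\\d*|\\.\\d+)(?:[eE][+-]?\\d+)?)");

    // Holds null (undef), a String, or a Double.
    private final Object value;

    private PerlScalar(Object value) { this.value = value; }

    static PerlScalar undef() { return new PerlScalar(null); }
    static PerlScalar of(String s) { return new PerlScalar(s); }
    static PerlScalar of(double d) { return new PerlScalar(d); }

    // Numeric context: undef is 0, strings use their leading numeric prefix.
    double numericValue() {
        if (value == null) return 0.0;
        if (value instanceof Double) return (Double) value;
        java.util.regex.Matcher m = LEADING_NUMBER.matcher((String) value);
        return m.find() ? Double.parseDouble(m.group(1)) : 0.0;
    }

    // String context: undef is "", integral numbers print without ".0".
    // (Simplification: non-integral values use Java's formatting, not Perl's.)
    String stringValue() {
        if (value == null) return "";
        if (value instanceof String) return (String) value;
        double d = (Double) value;
        if (d == Math.rint(d) && Math.abs(d) < 1e15) {
            return Long.toString((long) d);
        }
        return Double.toString(d);
    }

    // Boolean context: undef, "", "0", and numeric zero are false.
    boolean isTrue() {
        String s = stringValue();
        return !(s.isEmpty() || s.equals("0"));
    }

    // Perl's "+" operator: both operands are taken in numeric context.
    PerlScalar add(PerlScalar other) {
        return of(numericValue() + other.numericValue());
    }

    // Perl's "." operator: both operands are taken in string context.
    PerlScalar concat(PerlScalar other) {
        return of(stringValue() + other.stringValue());
    }
}
```

With something like this in place, generated code for `"3abc" + 4` would boil down to `PerlScalar.of("3abc").add(PerlScalar.of(4))`, and the coercion rules would live in one library rather than being re-emitted by the code generator each time.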

So, I hope that gives an idea of the basic approach that I'd like to take with the next iteration of perljvm. Larry said in his recent ALS speech that he always knew to throw away a prototype or two, and that he sees perl5 as a prototype for perl6. Well, given that, I have no qualms about throwing away the three prototypes for perljvm we now have floating around. Starting anew with tools to make the job easier seems like a good way to go. Plus, I think that there are many useful contributions we can make to the perl6 effort, while still working on a perl5 port.

BTW, the final piece of this puzzle, surprisingly enough, is JPL. We'll need a way to have Kawa talking to B::, and vice-versa. Given that Kawa is completely in Java, and B:: requires perl, JPL is the best way to let the two talk to each other. Larry always intended for JPL to eventually "evolve" into a Perl port to the JVM. I suppose this is a bit of a surprising way for such an evolution to take place, but, good evolution should do surprising things. Especially in the Perl community. (For example, I know a few Marriott coffee mugs that were sure surprised last August. ;)

So, my plan for the coming weeks is the following:

  * Get Kawa, JPL, B::, and the free software Java implementations (i.e., Kaffe/Classpath/jikes) to play together nicely.
  * Set up an infrastructure for software development collaboration. (We already have CVS services hosted at gnu.org, but I need to wrestle with autoconf/automake to have them configure all the items properly so we can hack easily.)
  * Avoid running out of tuits and disappearing from perljvm again, at least anytime soon. ;)

I hope you join me in this effort; I think it'll be fun. I look forward to your comments and feedback.

Ready to hack,
bkuhn