...thanks to Kshtij Sobti
It gave a throttle to the legacy SpiderMonkey (SM) engine for sure, but TM isn't a standalone engine itself, it is merely a performance booster. So, for some parts of JS compiling, TM has to take help from the legacy engine, making the overall performance suffer. To understand the combination better, let us put it like- trying to go to Kanyakumari, by taking a flight to Chennai to save time, and then walking on from there! The tracing accomplished by TraceMonkey can give incredible boosts in performance, but when it doesn't work, Firefox falls back to the much slower older SpiderMonkey interpreter engine!
That is what JaegerMonkey (alt. JägerMonkey) is all about.
The implementation of JaegerMonkey (JM) will be as a replacement for SpiderMonkey, and integrated with TraceMonkey. So that, in those cases where TraceMonkey used to leave the job in the hands of SpideMonkey, it will instead fall back to JaegerMonkey. As of now, JM and TM are equally fast, but JM excels in compiling those codes that TM fails to work upon. So that when these two are integrated, the combined system will hopefully have no Achilles Heel.
|TraceMonkey||The tracing JIT compiler added with SpiderMonkey in Firefox 3.5, to boost its performance.|
With this much said, most of you may consider it the end of the story - but for geek pleasure, we will go further ahead dissecting the innards of those monkeys. So, click 'Next' with caution; if you have a tendency for heart-attacks or allergy to techno-babble, advance no more!
For any given code being executed by Firefox the execution will be happening in the old SpiderMonkey interpreter engine. The tracing engine will keep a look out for repeated code, and in such cases try to optimize it using the tracing JIT. If this works, great, if it doesn't we've just wasted some time. Also in cases where the code uses an eval() function, TM fails (as it requires the variables and types to be predefined), and the interpreter has to do the job.
So, as it is prominent, these events cause two-fold delays; one for the trials with TM which eventually ends up fruitless, and the other is for the default interpreter, which takes more time than a polished sophisticated JS engine.
Faced with this huge performance gap between their two engine, it was clear that simply making the faster engine even faster wont cut it. As long as it is falling back to the old interpreter the overall speed won't improve appreciably. Mozilla's Tracing JIT has already gained a respectable standard when executing the code it works well with, and with such codes it manages to beat other browsers even with its current engine. Now something needed to be done about the slower engine.
The concept of JaegerMoneky (JM) is to create an inline threading capability of SpiderMonkey (SM) with method JIT, and to increase the baseline performance, so that it achieves as much operational speed as possible. It also has to be co-operative with tracing, and switchable too, to jump back and forth from inline thread to trace monitor, as the computation requires. It eventually speeds up the baseline performance, while having all the goodness of TM as is.
On the way to develop JaegerMonkey, the very first step to take was to select the assembler for it, which will be fast and do justice to all these optimizations in every possible way. The existing assembler for TM i.e. nanojit, performs compilation optimizations and generates very fast native codes, but it has overhead lags to perform these operations, resulting in delays. Enhancing nanojit wasn't impossible, but it includes too much of optimization (eventually making it more complex). The developers had to decide whether to stick with it or to move on and find an alternative.
The true beauty of Open Source is to share the progress cumulatively, and never to solve a problem twice. As the Nitro engine was performing efficiently for Apple's WebKit browser, it was already a stable platform to consider. It's open source in nature and being designed with C , theoretically it should give a perceivable performance boost. All that needed to do is a few customizations, to fit it run in Firefox builds.
The interpreter then hacked to maintain better contingency for memory management. The procedure of using stack values has been made simpler, as well as more efficient for memory addressing and accessing; even though the use of registers to hold the values of variables will be encouraged, as that will lead to speedier operations. Polymorphic inline caching (PIC) is being implemented, which takes advantages of property caching compared to C , but achieves it with even more faster native JIT codes.
The method JIT compiler will have richer control over the processor register, to increase the 'hit' and reduce data fetching from memory. Also, to reduce calling C functions, predefined 'fast paths' are being added to help speeding up more and more operations in native JIT code, without being explicitly evaluated. Other than these, loads of improvement will follow in managing variables and types; this includes optimizing global variables, managing strings with ropes.
The float variables will be taken to an entire new horizon of 128-bit including 64-bit payload, from the current procedure of addressing multiple 32-bit heaps and bit-operating them together to get the float value (which is 64-bit, typically). This would drastically reduce the effort needed to merge 2-3 heaps of 32-bit data and then use the value for the operation; rather it can directly store, fetch and purge the 64-bit float value in one single step.
Moreover, as a statement is compiled only once in method JIT, it doesn't matter how branchy the code is; it doesn't require computer for all brunches of the tree. This gives a huge advantage over tracing, where each new branch has to be traced in order to get compiled.
The caveat was, at the primary stage of development, the implementation of 'optional' type specialization in JM made the performance suffer in those cases which can be done better with the help of tracing. But even then the overall performance was comparable. And in course of reaching the final stage of development, JaegerMonkey with method JIT performs at par the same level as of TraceMonkey (on 27th July, it overtook TM in both SunSpider and V8 benchmark).
|This site no longer holds the benchmark data for the legacy engines (from 30th July), so what has been told herein may seem a little vague. Actually, if there were, the baseline engines' (without tracer or method JIT) performance graphs would have been three times higher than the shown ones.|
Modern browsing deals a lot with strings; so more the optimizations in this field, the better. For a better control over strings, Ropes will be used which deals with string operations more efficiently. Also, the implementation of x64 architectural capability is still due - report says, the procedure is in progress. It is prominent, that x64 will be an enormous advantage over the current 'x86 only' framework, as it has many times more processor registers than a 32-bit system, and as most of the development is primarily designed to be register based, instead of the prior stack based paradigm.
Not only the legacy stack has been dumped for variables, but in case of functions too. The new enhanced stack frames will help reduce redundancy as well as maintenance complexity, enriching the function call times. Developers are working hard to make it more and more effective; lots of perfection possibilities lie in there. And besides all these, working with a debugger is still slow. The process of training the debugger to be able to debug a JIT code is in progress, once it's completed, logically there should not be much of a lag.
Mozilla planed to employ JaegerMonkey in such a way, that when TM TraceMonkey finds it tough to operate on, JM takes it away from it and uses its optimization methods to produce the solution. Pretty much the same concept as of now; just, instead of using the baseline interpreter, the JaegerMonkey will be used. Although increasing the base performance a lot, this system will still have a lag when TM tries and retries and then hand-overs the job to JM. And by the time we speak of this, the integration of TM with JM has been done, but leaving this perfection in optimization undone.
As the basic solution, the switch between tracer to method JIT has to be faster. The analysis of more optimal JIT between them has to be determined without too many trial and error (and that's better to be done in the ideal period). It is in the development pipeline of the current ongoing progress, along with the implementation of more and more method JIT 'fast paths' and strong possession over the global and closure variables. Once it is done, we can expect huge performance upgrade since either of the engines. As the tracer and method JIT, each multiplies the baseline performance, while in action being integrated together, the result may seem radical.
With these much said, you may want to taste the wildness of JaegerMoneky, but the problem is JM is still not in the Mozilla's build trunk (not even in nightlies, I mean). It is not even likely to be integrated with a Firefox branding before reaching release candidate of version 4.0. So in the meantime - if you don't mind getting your hands a little dirty - follow Mozilla's guide to build with SpiderMonkey (the method doesn't require passing --enable-methodjit to configure anymore). As of now, only shell builds are supported - you will find the source code for JaegerMoneky in here.