Magic behind the JavaScript engines Part-2 V8 Engine

Manthan M Kulakarni
5 min read · Aug 7, 2020


Hey guys, welcome to the second article of the Magic behind the JavaScript engines series. In the last article we discussed the basics of language processors and JIT compilation. If you missed that article, you can read it from the link below.

While I was working on this article, an example came to mind. I think it is a good example of where a JIT can have an advantage over a regular ahead-of-time compiler.

switch statements
Simple conditional statement

I think most of you have guessed what I am about to say about these snippets.

A regular ahead-of-time compiler translates every case of a switch statement into machine code, even though only one of them is going to be executed for a given value of the variable x. This might not matter when the number of cases is small, but with thousands of cases it becomes an overhead. While converting these conditional branches to machine code, the compiler has to resolve the branch target for every case: each case label is replaced by an intra-segment branch address with the proper jump instruction, and the instructions inside each case are translated as well. A JIT can be smarter about this, because it compiles at run time and can see which code actually executes. Based on the observed values of x, it can spend its translation and optimization effort on the cases that are really taken, which saves the time that would otherwise go into translating branches that never run.
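As a rough illustration of the two forms being compared, here is a minimal sketch. The function names and case values are hypothetical and are not the original snippets.

```javascript
// Hypothetical example: the same dispatch on x written two ways.
function describeWithSwitch(x) {
  switch (x) {
    case 1:
      return "one";
    case 2:
      return "two";
    case 3:
      return "three";
    default:
      return "other";
  }
}

function describeWithIf(x) {
  if (x === 1) return "one";
  if (x === 2) return "two";
  if (x === 3) return "three";
  return "other";
}

// Whichever form is used, only one branch actually runs per call.
console.log(describeWithSwitch(2), describeWithIf(2));
```

If a program only ever calls these functions with x equal to 2, the other branches are never executed at all.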

Now let's dive into today's topic: the V8 engine.

The first version of the V8 engine was released on 2nd September 2008, at the same time as the first version of the Chrome browser.

Basic structure of V8

Parser and Syntax Tree generation

This is what the V8 engine looks like. The JavaScript code we write is just a series of characters to the compiler. This string is fed to the scanner, which skips whitespace and comments and breaks the rest down into smaller units called tokens. Each token is a pair of a token name and a token value (the matched text, also called the lexeme). The tokens are fed to the parser, which builds a parse tree based on the context-free grammar defined for the language. You can refer to the example below, which gives a brief idea of how a string is parsed in compilers based on the grammar.
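To make the scanning step a bit more concrete, here is a toy tokenizer. It is only a sketch: the token categories and the `tokenize` function are my own assumptions for illustration and are not V8's real scanner or token set.

```javascript
// Toy scanner: turns a source string into { type, value } tokens.
// The categories below are illustrative, not V8's real token set.
const TOKEN_PATTERNS = [
  ["whitespace", /^\s+/],
  ["comment", /^\/\/[^\n]*/],
  ["number", /^\d+/],
  ["identifier", /^[A-Za-z_$][A-Za-z0-9_$]*/],
  ["operator", /^[+\-*\/=]/],
  ["punctuation", /^[();]/],
];

function tokenize(source) {
  const tokens = [];
  let rest = source;
  while (rest.length > 0) {
    let matched = false;
    for (const [type, pattern] of TOKEN_PATTERNS) {
      const match = pattern.exec(rest);
      if (match === null) continue;
      // Whitespace and comments are skipped rather than emitted.
      if (type !== "whitespace" && type !== "comment") {
        tokens.push({ type, value: match[0] });
      }
      rest = rest.slice(match[0].length);
      matched = true;
      break;
    }
    if (!matched) throw new SyntaxError(`Unexpected character: ${rest[0]}`);
  }
  return tokens;
}

console.log(tokenize("a = b + 42; // sum"));
```

Note that the comment pattern is tried before the operator pattern, so `//` is treated as a comment and not as two division operators.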

Generation of parse tree for the expression “id+id*id”

This parse tree shows how the string can be derived from the start symbol of the grammar (automata theory). The parse tree is then converted into an abstract form called the abstract syntax tree (AST). The AST is a compact representation of our source code.

AST from Parse tree
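For the expression “id+id*id”, the idea can be sketched with a tiny recursive-descent parser. This is a simplified illustration under my own assumptions: the grammar, the `parseExpression` function and the node shapes are hypothetical, and the parser builds the AST directly instead of first materializing a full parse tree.

```javascript
// Toy recursive-descent parser for the grammar
//   Expr   -> Term ('+' Term)*
//   Term   -> Factor ('*' Factor)*
//   Factor -> id
// It takes an array of token strings and returns an AST.
function parseExpression(tokens) {
  let pos = 0;

  function parseFactor() {
    const token = tokens[pos];
    if (token !== "id") throw new SyntaxError(`Expected id at position ${pos}`);
    pos += 1;
    return { type: "Identifier", name: token };
  }

  function parseTerm() {
    let node = parseFactor();
    while (tokens[pos] === "*") {
      pos += 1;
      node = { type: "BinaryExpression", operator: "*", left: node, right: parseFactor() };
    }
    return node;
  }

  function parseExpr() {
    let node = parseTerm();
    while (tokens[pos] === "+") {
      pos += 1;
      node = { type: "BinaryExpression", operator: "+", left: node, right: parseTerm() };
    }
    return node;
  }

  const root = parseExpr();
  if (pos !== tokens.length) throw new SyntaxError(`Unexpected token at position ${pos}`);
  return root;
}

const ast = parseExpression(["id", "+", "id", "*", "id"]);
console.log(JSON.stringify(ast, null, 2));
```

Because `*` is handled one level deeper than `+`, the multiplication ends up lower in the tree, so the AST encodes operator precedence without needing any of the grammar's helper symbols.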

Firing up the Ignition Interpreter

This part takes the AST and converts it into bytecode, then executes that bytecode. A major concern in developing the Ignition interpreter was that it has to run even on mobile phones with as little as 512 MB of RAM. So Google engineers designed it around compact bytecode and a register machine: temporary values live in virtual registers (plus an accumulator) instead of on an operand stack, which avoids a lot of push and pop operations. While the bytecode is executed, profiling data is collected and later used by TurboFan. Even though the interpreter looks like a separate entity in the block diagram of V8, Ignition uses some parts of the TurboFan pipeline.
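Here is a toy register machine with an accumulator, just to show the flavour of this design. The opcode names imitate the style of Ignition's bytecode, but the instruction set, the encoding and the `run` function are simplified assumptions and not the real interpreter.

```javascript
// Toy register machine with an accumulator. Each instruction is [opcode, operand].
// This is a simplified illustration, not Ignition's real bytecode format.
function run(bytecode, registerCount) {
  const registers = new Array(registerCount).fill(0);
  let accumulator = 0;
  for (const [op, operand] of bytecode) {
    switch (op) {
      case "LdaSmi": // load a small integer into the accumulator
        accumulator = operand;
        break;
      case "Star": // store the accumulator into registers[operand]
        registers[operand] = accumulator;
        break;
      case "Add": // accumulator = registers[operand] + accumulator
        accumulator = registers[operand] + accumulator;
        break;
      case "Mul": // accumulator = registers[operand] * accumulator
        accumulator = registers[operand] * accumulator;
        break;
      case "Return":
        return accumulator;
      default:
        throw new Error(`Unknown opcode: ${op}`);
    }
  }
  return accumulator;
}

// Computes 2 + 3 * 4 without any push or pop operations.
const program = [
  ["LdaSmi", 3], ["Star", 0], // r0 = 3
  ["LdaSmi", 4], ["Mul", 0],  // acc = r0 * 4 = 12
  ["Star", 1],                // r1 = 12
  ["LdaSmi", 2], ["Add", 1],  // acc = r1 + 2 = 14
  ["Return"],
];
console.log(run(program, 2)); // prints 14
```

Most instructions implicitly read or write the accumulator, so they need at most one explicit operand, which helps keep the bytecode short.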

TurboFan

TurboFan was introduced in 2015. When V8 was released in 2008 it had no optimizing compiler, yet it was faster than other JavaScript engines of the time thanks to techniques like inline caching. Then in 2010 an optimizing compiler called Crankshaft was introduced. Crankshaft used profiling data to optimize hot functions, but a major overhead was its deoptimization step.

Finally TurboFan was introduced. A major reason behind the success of TurboFan over Crankshaft is its use of a Sea of Nodes representation rather than a classical control flow graph (CFG); both are intermediate representations used in compilers. A CFG groups instructions into basic blocks in a fixed order, which makes it harder to move or merge redundant computations. In a Sea of Nodes, each node represents an operation, edges represent its dependencies, and most nodes are not pinned to a position in the control flow. It is just a graph in which redundancy is easier to eliminate. If you remember the discussion of hot spots in our previous article, these are the pieces of code that get sent to TurboFan for optimization.
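To give a feel for how a graph of operation nodes makes redundancy easy to remove, here is a small sketch that shares structurally identical nodes (a simple form of value numbering). The `Graph` class and the node names are hypothetical and only illustrate the idea; this is not TurboFan's data structure, and sharing like this is only safe for operations without side effects.

```javascript
// Toy graph IR: each node is an operation whose inputs are ids of other nodes.
// Asking for the same (op, inputs) twice returns the same node.
class Graph {
  constructor() {
    this.nodes = [];
    this.cache = new Map();
  }

  // Returns the id of the node for (op, inputs), reusing an existing one if present.
  node(op, ...inputs) {
    const key = `${op}(${inputs.join(",")})`;
    if (this.cache.has(key)) return this.cache.get(key);
    const id = this.nodes.length;
    this.nodes.push({ id, op, inputs });
    this.cache.set(key, id);
    return id;
  }
}

// Build (a * b) + (a * b): the multiplication appears twice in the source.
const graph = new Graph();
const a = graph.node("Parameter:a");
const b = graph.node("Parameter:b");
const left = graph.node("Mul", a, b);
const right = graph.node("Mul", a, b);
const sum = graph.node("Add", left, right);

console.log(left === right);     // prints true
console.log(graph.nodes.length); // prints 4
```

The graph ends up with four nodes instead of five, because the second multiplication is recognized as the same computation and reused.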

Liftoff

Although TurboFan is excellent at code optimization, it might not be suitable for simple things. TurboFan optimizes code with the help of several intermediate representations (IR), which takes time and would slow down the start-up of a web application. Liftoff is V8's baseline compiler for WebAssembly, and its goal is to reduce that start-up time. Liftoff skips these IRs and produces machine code directly in a single pass over the WebAssembly bytecode. In this tier, speed of code generation is given more importance than the quality of the generated code. Once the web application has started, TurboFan handles the further optimization.

TurboFan vs Liftoff in code generation
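The single-pass idea can be sketched as a loop that walks the input once and emits an instruction for each opcode right away, without building any intermediate representation. This is a toy under stated assumptions: the opcode names follow WebAssembly's text format, but the `compileSinglePass` function, the pseudo-assembly output and the unlimited supply of registers are made up for illustration and do not reflect how Liftoff is actually implemented.

```javascript
// Toy single-pass code generator for a tiny stack-machine bytecode.
// It emits pseudo-assembly immediately and never revisits earlier output.
function compileSinglePass(bytecode) {
  const output = [];
  const stack = []; // compile-time stack of register names holding operands
  let nextRegister = 0;
  const fresh = () => `r${nextRegister++}`; // assumes unlimited registers

  for (const [op, operand] of bytecode) {
    switch (op) {
      case "i32.const": {
        const reg = fresh();
        output.push(`mov ${reg}, ${operand}`);
        stack.push(reg);
        break;
      }
      case "local.get": {
        const reg = fresh();
        output.push(`load ${reg}, local[${operand}]`);
        stack.push(reg);
        break;
      }
      case "i32.add":
      case "i32.mul": {
        const rhs = stack.pop();
        const lhs = stack.pop();
        if (lhs === undefined || rhs === undefined) throw new Error("Stack underflow");
        output.push(`${op === "i32.add" ? "add" : "mul"} ${lhs}, ${rhs}`);
        stack.push(lhs); // the result is left in the lhs register
        break;
      }
      default:
        throw new Error(`Unsupported opcode: ${op}`);
    }
  }
  return output;
}

const code = compileSinglePass([
  ["local.get", 0],
  ["i32.const", 5],
  ["i32.add"],
]);
console.log(code.join("\n"));
```

The output is quick to produce but naive, for example the constant 5 gets its own register instead of being folded into the add, which is the kind of trade-off between compile speed and code quality described above.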

Thank you guys for reading this article. The next article of this series will cover other topics like the Orinoco garbage collector and the handling of dynamic data types in JavaScript.

With regards, Manthan M Kulakarni
