Compiled, Interpreted and Hybrid (Portable Compiled) Programming Languages
Compiled languages are translated into machine code before runtime, resulting in faster execution, while interpreted languages are translated and executed line by line during runtime, offering more flexibility but potentially slower performance. Compiled languages are typically more efficient and provide better control over hardware, but interpreted languages can be easier to learn and more adaptable.
Compiled Languages:
Process: Code is translated into machine code by a compiler before execution.
Execution: Machine code is directly executed by the computer’s processor.
Advantages: Faster execution speed, more efficient resource usage, and often more control over hardware.
Examples: C, C++, Go, Rust.
Interpreted Languages:
Process: Code is translated and executed line by line by an interpreter during runtime.
Execution: The interpreter reads the source code and executes the corresponding machine instructions.
Advantages: More flexible, easier to learn and debug, can adapt to runtime conditions, and often offer dynamic typing.
Examples: Python, JavaScript, Ruby, PHP.
Key Differences Summarized:
Feature | Compiled Languages | Interpreted Languages | Hybrid Approach (compiled, then interpreted) |
---|---|---|---|
Translation | Before runtime | During runtime | Code is first compiled into an intermediate form (like bytecode) and then interpreted by a virtual machine |
Execution Speed | Generally faster | Generally slower | |
Flexibility | Less flexible | More flexible | |
Control | More control over hardware | Less control over hardware | |
Example Languages | C, C++, Go, Rust, Pascal | Python, Ruby, PHP | Java |
The Hybrid Approach combines some of the benefits of both compiled and interpreted languages.
https://www.reddit.com/r/learnjava/comments/vgow91/in_actual_practice_what_does_it_mean_to_say_that/
Most compiled languages read a source code file, process it into native machine code (called object code), and may then link the object code with system libraries to create machine-dependent native executable program files. C, C++, and many other languages work this way. Interpreted languages do not have directly executable files, but have a language interpreter that loads the source file and executes the statements therein. Python is an example of an interpreted language. JavaScript is a scripting language, and is closer to an interpreted language, but isn’t quite, as it is not designed to be used on its own, but from within another environment equipped with a JavaScript scripting engine, such as a web browser.
Java is a little different yet. It is not interpreted, it is a compiled language. However, it is not compiled to native executable code like C or C++. Instead, it is compiled to a bytecode file that is used by the Java Virtual Machine to convert the bytecode into native executable code. It does this so that the Java program is platform agnostic and can run on any system that supports a JVM without needing modifications. You can compare this with running a game console emulator on your computer. The emulator creates a simulated gaming console that runs files containing compiled images of the games (ROM files) and converts that code to run on its host. The important takeaway is that the JVM is not an interpreter or translator; it creates an emulated environment that is always the same, regardless of its host machine or host operating system, so that it can be seemingly platform independent.
From a programmer’s perspective, there is little difference between working with an interpreted language, a compiled language, or a scripting language. The process is the same. You write human-readable code, and it gets executed on a machine. It is only the parts in between those two that differ. Historically, languages that compile to native code allow the programmer the most flexibility because they can take advantage of every aspect of the machine they are programming. Interpreted languages strip away some of that flexibility in order to have a “least common denominator” approach to running the programs, and scripting languages are more restrictive still, usually removing any ability whatsoever to directly access the host machine or its operating system and/or working with a limited set of features.
There are several nuances to the definitions of interpreted and compiled, and neither of the statements above is quite right.
- Compilation is the act of translating one language into another. Some compilers actually do this conversion through multiple language stages (from source, to an intermediate representation language, to a final language). The final output language may or may not be native-code.
- A unit of ‘native’ code is not necessarily executed in its entirety (e.g. some code paths may not be used) and not all of the code executed is necessarily in a single file (the code may be one of several libraries dynamically loaded).
- Executing a ‘scripting language’ isn’t really ‘line-by-line’.
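The last point is easy to check in Python. The sketch below is only an illustration, assuming CPython's built-in `compile()` and `exec()`; the helper name `run_script` and the sample script are made up for the demo. If execution were truly line by line, the first line of the script would print before the broken second line was reached. Instead, nothing runs at all, because the whole unit is compiled first.

```python
import contextlib
import io

# Hypothetical two-line script: line 1 is valid, line 2 is a syntax error.
BROKEN_SOURCE = "print('first line ran')\nthis is not valid python\n"
VALID_SOURCE = "print('first line ran')\n"


def run_script(source):
    """Compile and run `source`; return (captured stdout, outcome string)."""
    captured = io.StringIO()
    try:
        # The whole unit is compiled to a code object before anything executes.
        code = compile(source, "<demo>", "exec")
    except SyntaxError as err:
        return captured.getvalue(), f"SyntaxError on line {err.lineno}"
    with contextlib.redirect_stdout(captured):
        exec(code, {})
    return captured.getvalue(), "ok"


if __name__ == "__main__":
    print(run_script(BROKEN_SOURCE))  # no output captured from line 1
    print(run_script(VALID_SOURCE))
```

The broken script produces no output from its first line, which is what you would expect from a compile-then-execute pipeline rather than a line-by-line one.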
The interpreted/compiled distinction doesn’t hold as well as it used to.
Let’s call one group of languages ‘native code languages’. These use a compiler whose final output stage is native code (machine object code). C, C++, Pascal, Rust and many others fall into this group. Most often, these produce a single unit of execution (an executable file).
Another group of languages are ‘scripting languages’. These languages use an ‘interpreter’ that reads source at runtime. However, it is not true that all (or even many) of these execute the source directly; many first run one or more tokenisation and compilation steps, producing either an intermediate language or native code to execute (but usually without outputting an executable file). Python, JavaScript and many others fall into this group.
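For Python specifically, the intermediate step is visible from inside the language. The following is a minimal sketch assuming CPython and its standard `dis` module; the variable names and the helper `bytecode_ops` are made up for illustration, and the exact opcode names printed vary between CPython versions.

```python
import dis

SOURCE = "total = price * quantity"

# Step 1: compile the source text into a code object holding bytecode.
code = compile(SOURCE, "<demo>", "exec")

# Step 2: the interpreter loop executes that bytecode, not the source text.
namespace = {"price": 3, "quantity": 4}
exec(code, namespace)


def bytecode_ops(code_obj):
    """Return the opcode names of a code object, in order."""
    return [ins.opname for ins in dis.get_instructions(code_obj)]


if __name__ == "__main__":
    print(type(code).__name__)            # the compiled unit is a 'code' object
    print(len(code.co_code), "bytes of bytecode")
    print(bytecode_ops(code))             # opcode names depend on the CPython version
    print(namespace["total"])
```

No executable file is written here; the bytecode lives in memory (CPython may also cache it in `.pyc` files for imported modules).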
It should be noted that JavaScript, where the interpreter is built into the browser, is often compiled ‘just in time’ to native-code before execution. Other scripting languages do this too - the trade-off is the execution delay (while compilation happens) vs the performance improvements of native-code (where low-level optimisation can take place).
Another group of languages are ‘portable compiled languages’. These are compiled languages whose compiler output is an intermediate language. They require a ‘runtime’ on the target platform to execute the application, but execution may or may not execute the intermediate language directly - several include one or more ‘just in time’ compilers that selectively compile the intermediate language to native code.
Java falls into this last group.
- Java source is compiled by the Java compiler to byte-code
- Byte-code is loaded by the Java Virtual Machine (JVM) - one unit per class/type but often packaged into a file per archive or module.
- The JVM:
  - May use an interpreter to execute byte-code.
  - May use one or more Just-in-time compilers to compile byte-code to native code with varying degrees of optimisation.
  - May choose to recompile code if it looks like different optimisations should apply.
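The "interpret first, compile what turns out to be hot" idea in the list above can be sketched as a toy in Python. This is not how the JVM is implemented: the "compiled" tier here is just a Python lambda built from generated source rather than native code, there is no de-optimisation, and every name (`TieredFunction`, `hot_threshold`, `interpret`, `compile_tree`) is hypothetical and chosen for the demo.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

# Expression trees are tuples: ("const", 2), ("var", "x"), or (op, left, right).


def interpret(node, env):
    """Slow tier: walk the expression tree on every call."""
    if node[0] == "const":
        return node[1]
    if node[0] == "var":
        return env[node[1]]
    op, left, right = node
    return OPS[op](interpret(left, env), interpret(right, env))


def to_source(node):
    """Translate the tree into a Python expression string."""
    if node[0] == "const":
        return repr(node[1])
    if node[0] == "var":
        return f"env[{node[1]!r}]"
    op, left, right = node
    return f"({to_source(left)} {op} {to_source(right)})"


def compile_tree(node):
    """Stand-in for a JIT: translate once, then reuse the compiled callable."""
    return eval(compile(f"lambda env: {to_source(node)}", "<jit>", "eval"))


class TieredFunction:
    """Interpret until the call count reaches hot_threshold, then compile."""

    def __init__(self, tree, hot_threshold=3):
        self.tree = tree
        self.hot_threshold = hot_threshold
        self.calls = 0          # interpreted calls seen so far
        self.compiled = None    # filled in once the function is "hot"

    def __call__(self, env):
        if self.compiled is not None:
            return self.compiled(env)
        self.calls += 1
        if self.calls >= self.hot_threshold:
            self.compiled = compile_tree(self.tree)
        return interpret(self.tree, env)


# x * 2 + 1
TREE = ("+", ("*", ("var", "x"), ("const", 2)), ("const", 1))

if __name__ == "__main__":
    f = TieredFunction(TREE)
    print([f({"x": i}) for i in range(5)])
    print("compiled:", f.compiled is not None)
```

The call counter plays the role of the execution statistics: code that is rarely run never pays the compilation cost, while frequently run code does.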
Java uses statistics about how the code is run to guide the native-code optimisation. This means that the JVM, running the same application with two very different data sets, might compile it to native-code very differently. Languages like C/C++/Rust can do a similar thing with something called ‘profile-guided optimisation’ - but they have to choose a representative data set, which may not match the data actually used.
And newer releases of the Java Virtual Machine can improve the optimisation and execution of older byte-code.
On the other hand, Java optimisation has to be fast, and exist within the resource constraints of the execution environment. It has a much more limited time and space budget than a C++/Rust optimising compiler. Additionally, the ‘runtime’ itself, just by its presence, takes up resources that the application could otherwise use to solve its computational problem.
In practice, most of the code you run in Java, for any long-running process, will be native-code produced by the fast C1 HotSpot Just In Time compiler, or native-code produced by the more aggressively optimising C2 JIT compiler. The interpreter will be used until C1 has compiled the code in question, or if code is de-optimised (to allow C2 to re-optimise it to respond to changing execution statistics).
Computers don’t read your source code, they read machine instructions in the form of bytes.
C++ converts your program into a file containing those machine instructions, which makes said program run very fast, but comes with the downside that those instructions differ based on OS and hardware.
An interpreted language, on the other hand, lets you run your source code directly, turning it into machine instructions while the program is running. This makes it work on all systems that support the language, but it usually means the program is going to be slower, since it does that translation at runtime.
So what if you could have the benefits of both? Maybe we turn our source code into a code that resembles machine instructions (before running the program) but works on all systems. That way we can convert it to machine code on the fly much more quickly without losing portability between systems. That’s how Java works - that special code is known as byte code.
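To make the byte code idea concrete, here is a toy sketch in Python: a tiny "compiler" turns an arithmetic expression into instructions for an imaginary stack machine, and a tiny "virtual machine" runs them. The instruction set (`PUSH`, `ADD`, `SUB`, `MUL`) and the function names `compile_expr` and `run` are invented for this example and have nothing to do with real JVM byte code, which is far richer. The point is only that the instruction list is the same on every system, and each system just needs its own `run`.

```python
import ast

# Map Python AST operator types to the toy machine's opcodes.
BINOPS = {ast.Add: "ADD", ast.Sub: "SUB", ast.Mult: "MUL"}


def compile_expr(source):
    """Ahead of time: translate an expression into portable stack instructions."""
    program = []

    def emit(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            program.append(("PUSH", node.value))
        elif isinstance(node, ast.BinOp) and type(node.op) in BINOPS:
            emit(node.left)    # operands first ...
            emit(node.right)
            program.append((BINOPS[type(node.op)], None))  # ... then the operator
        else:
            raise ValueError(f"unsupported syntax: {ast.dump(node)}")

    emit(ast.parse(source, mode="eval").body)
    return program


def run(program):
    """At runtime: a minimal virtual machine that executes the instructions."""
    stack = []
    for opcode, arg in program:
        if opcode == "PUSH":
            stack.append(arg)
            continue
        right = stack.pop()
        left = stack.pop()
        if opcode == "ADD":
            stack.append(left + right)
        elif opcode == "SUB":
            stack.append(left - right)
        elif opcode == "MUL":
            stack.append(left * right)
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return stack.pop()


if __name__ == "__main__":
    program = compile_expr("2 + 3 * 4")
    print(program)
    print(run(program))
```

Parsing and checking the source happens once in `compile_expr`; `run` only has to step through simple instructions, which is why the translation at runtime is cheaper than working from source text.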