🧩 Bytecode

📜 Definition

Bytecode is a platform‑independent, low‑level instruction set generated from high‑level source code. It’s designed to be executed by a process virtual machine (VM) rather than directly by a CPU.


🔄 Where It Fits in the Execution Flow

Source Code → Compiler/Translator → Bytecode → VM (interpret or JIT) → Machine Code → CPU
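
As a concrete illustration of this flow, the sketch below uses CPython's built-in `compile()` and the standard `dis` module to turn a one-line source string into bytecode and then run it. The source string and the variable names are made up for the example, and the exact opcodes printed by `dis` differ between CPython versions.

```python
import dis

source = "x = 2 + 3 * y"

# Compile step: source text -> code object holding CPython bytecode.
code = compile(source, "<example>", "exec")

# code.co_code is the raw bytecode: a bytes object of opcodes and operands.
raw = code.co_code
print(type(raw).__name__, len(raw), "bytes")

# Human-readable disassembly of the same bytes (output varies by version).
dis.dis(code)

# Execution step: the CPython VM runs the code object.
namespace = {"y": 4}
exec(code, namespace)
print(namespace["x"])  # 14
```

The CPU never sees `raw` directly; the interpreter loop inside CPython reads it and performs the work.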


🧠 Key Characteristics

  • Intermediate Representation (IR) — Sits between human‑readable code and CPU‑specific machine code.
  • Standardized Instruction Set — Often uses 1‑byte opcodes (values 0–255, so at most 256 distinct instructions) followed by zero or more operands.
  • Portable — Same bytecode runs on any platform with the right VM.
  • Efficient for VMs — Already parsed and semantically analyzed, so faster to interpret than raw source.
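
To make the "1‑byte opcode plus operands" layout concrete, here is a minimal sketch of an encoder and decoder for a toy instruction set. The opcode values and names (`PUSH_I32`, `ADD`, `HALT`) are hypothetical and do not correspond to any real VM.

```python
import struct

# Hypothetical opcode table for a toy VM (not any real instruction set).
PUSH_I32 = 0x01  # followed by a 4-byte little-endian signed operand
ADD = 0x02       # no operand
HALT = 0xFF      # no operand

def encode(program):
    """Encode a list of (opcode, operand-or-None) pairs into bytes."""
    out = bytearray()
    for opcode, operand in program:
        out.append(opcode)  # the opcode always occupies exactly one byte
        if opcode == PUSH_I32:
            out += struct.pack("<i", operand)
    return bytes(out)

def decode(blob):
    """Decode bytes back into (opcode, operand-or-None) pairs."""
    pc, result = 0, []
    while pc < len(blob):
        opcode = blob[pc]
        pc += 1
        operand = None
        if opcode == PUSH_I32:
            (operand,) = struct.unpack_from("<i", blob, pc)
            pc += 4
        result.append((opcode, operand))
    return result

program = [(PUSH_I32, 7), (PUSH_I32, -2), (ADD, None), (HALT, None)]
blob = encode(program)
print(blob.hex(" "))
print(decode(blob) == program)
```

Because the decoder only needs the opcode to know how many operand bytes follow, it never has to tokenize or parse text, which is part of why bytecode is cheaper to interpret than source.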

🛠 How It’s Executed

  1. Interpretation — VM reads and executes bytecode instructions one by one.
  2. JIT Compilation — VM compiles frequently used (“hot”) bytecode paths into native machine code at runtime for speed.
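
The two steps above can be sketched with a toy stack machine. The instruction set, the `HOT_THRESHOLD` value, and the function names are assumptions made for illustration; the profiling wrapper only flags hot code, whereas a real JIT would generate native machine code at that point.

```python
PUSH, ADD, MUL, HALT = 0, 1, 2, 3  # hypothetical opcodes

def run(code):
    """Interpret a list of (opcode, operand) instructions on an operand stack."""
    stack = []
    pc = 0
    while True:
        op, arg = code[pc]
        pc += 1
        if op == PUSH:
            stack.append(arg)
        elif op == ADD:
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == MUL:
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        elif op == HALT:
            return stack.pop()
        else:
            raise ValueError(f"unknown opcode {op}")

HOT_THRESHOLD = 3  # arbitrary value chosen for the example
call_counts = {}

def run_with_profiling(name, code):
    """Count executions and report when a code unit becomes 'hot'."""
    call_counts[name] = call_counts.get(name, 0) + 1
    is_hot = call_counts[name] >= HOT_THRESHOLD
    # A real JIT would hand hot code to a native-code compiler here;
    # this sketch only flags it and keeps interpreting.
    return run(code), is_hot

# Computes 2 + 3 * 4.
program = [(PUSH, 2), (PUSH, 3), (PUSH, 4), (MUL, None), (ADD, None), (HALT, None)]
results = [run_with_profiling("expr", program) for _ in range(3)]
print(results)
```

The third run crosses the threshold, which is the moment a JIT‑enabled VM would typically switch from interpreting to compiled code.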

🔍 Examples

  • Java → .java → .class bytecode → JVM
  • .NET → C# → CIL (Common Intermediate Language) → CLR
  • Python → .py → .pyc bytecode → CPython VM
  • Lua.lua → Lua bytecode → Lua VM
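
The Python entry can be tried directly with the standard `py_compile` module. The sketch below writes a throwaway `hello.py` (a file name chosen for the example), compiles it to a `.pyc`, and checks that the file begins with the magic number of the running interpreter, which is how CPython ties a `.pyc` to a specific bytecode version.

```python
import importlib.util
import os
import py_compile
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "hello.py")
    with open(src, "w", encoding="utf-8") as f:
        f.write("print('hello')\n")

    # Compile hello.py to a .pyc file at an explicit output path.
    pyc = py_compile.compile(src, cfile=os.path.join(tmp, "hello.pyc"), doraise=True)

    # Read the fixed-size header that precedes the serialized code object.
    with open(pyc, "rb") as f:
        header = f.read(16)

# The first 4 bytes identify the bytecode format of this interpreter version.
magic = header[:4]
print(magic == importlib.util.MAGIC_NUMBER)
```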

🛡 Advantages

  • Portability — “Write once, run anywhere” (with the right VM).
  • Security — VM can verify and sandbox code before execution.
  • Optimization — JIT can adapt code to the current hardware.
  • Multi‑language Support — Different languages can target the same VM.
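
To give a flavor of what "verify before execution" can mean, here is a minimal sketch of a static check for a toy stack machine. It walks the instructions without running them and reports unknown opcodes and stack underflow. The opcodes and the `STACK_EFFECT` table are hypothetical, the check handles straight‑line code only, and real verifiers (such as the JVM's) do far more, including type checking and branch analysis.

```python
PUSH, ADD, MUL, HALT = 0, 1, 2, 3  # hypothetical opcodes

# (values popped, values pushed) for each opcode
STACK_EFFECT = {PUSH: (0, 1), ADD: (2, 1), MUL: (2, 1), HALT: (1, 0)}

def verify(code):
    """Return a list of problems found without executing anything."""
    problems = []
    depth = 0  # simulated operand-stack depth
    for index, (op, _arg) in enumerate(code):
        if op not in STACK_EFFECT:
            problems.append(f"{index}: unknown opcode {op}")
            continue
        pops, pushes = STACK_EFFECT[op]
        if depth < pops:
            problems.append(f"{index}: stack underflow")
        depth = max(depth - pops, 0) + pushes
    if not code or code[-1][0] != HALT:
        problems.append("program does not end with HALT")
    return problems

good = [(PUSH, 1), (PUSH, 2), (ADD, None), (HALT, None)]
bad = [(PUSH, 1), (ADD, None), (HALT, None)]  # ADD needs two values

print(verify(good))
print(verify(bad))
```

A VM that runs such a check at load time can reject `bad` before any instruction executes, instead of failing partway through.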

⚖️ Bytecode vs Machine Code

| Feature | Bytecode | Machine Code |
|---|---|---|
| Target | Virtual Machine | Physical CPU |
| Portability | High (VM‑dependent) | Low (CPU/OS‑specific) |
| Execution | Interpreted or JIT‑compiled by VM | Directly executed by CPU |
| Optimization | Runtime (JIT) | Compile‑time |
| Security | VM sandbox + verification | Relies on OS/hardware protections |

📌 Related Notes

  • Virtual Machines — System vs Process
  • Just‑In‑Time Compilation
  • Machine Code

🔤 Bytecode / Disassembly Suffix Legend

When viewing bytecode in a disassembler, suffix letters often indicate the data type or width of an immediate value or operand. These suffixes are not stored in the bytecode itself — they are for human‑readable notation.

| Suffix | Type / Width | Bits | Example | Meaning in context |
|---|---|---|---|---|
| b | byte (signed/unsigned) | 8 | 10b | Literal fits in one byte |
| s | short | 16 | 300s | 16‑bit integer |
| i | int | 32 | 1024i | 32‑bit integer |
| l | long | 64 | 999999l | 64‑bit integer |
| f | float | 32 | 3.14f | 32‑bit IEEE‑754 floating point |
| d | double | 64 | 2.71828d | 64‑bit IEEE‑754 floating point |
| z / bool | boolean | 8* | 1z | Boolean (true/false), encoded as 0/1 byte |
| c | char (UTF‑16 code unit) | 16 | 65c | Character literal (‘A’) |

Note: Exact suffix sets vary by VM, language, or disassembler. For example, javap shows some typed constants with suffixes, while others rely on context from the constant pool.
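
As a small aid for reading such notation, the sketch below parses a suffixed literal into its type name, bit width, and value, following the single‑letter suffixes in the legend above. The `parse_literal` helper and its regular expression are hypothetical and cover only simple decimal literals; a given disassembler may format constants differently.

```python
import re

# Suffix -> (type name, bit width), following the legend above.
SUFFIXES = {
    "b": ("byte", 8), "s": ("short", 16), "i": ("int", 32), "l": ("long", 64),
    "f": ("float", 32), "d": ("double", 64), "z": ("boolean", 8), "c": ("char", 16),
}

# Decimal digits (optionally signed, optionally fractional) plus one suffix letter.
LITERAL = re.compile(r"^(-?\d+(?:\.\d+)?)([bsilfdzc])$")

def parse_literal(text):
    """Split a suffixed literal such as '300s' into (type, bits, value)."""
    match = LITERAL.match(text)
    if match is None:
        raise ValueError(f"not a suffixed literal: {text!r}")
    digits, suffix = match.groups()
    type_name, bits = SUFFIXES[suffix]
    if suffix in "fd":
        value = float(digits)
    elif suffix == "z":
        value = bool(int(digits))   # 0 -> False, anything else -> True
    elif suffix == "c":
        value = chr(int(digits))    # code unit -> character
    else:
        value = int(digits)
    return type_name, bits, value

for example in ["10b", "300s", "3.14f", "1z", "65c"]:
    print(example, "->", parse_literal(example))
```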


📌 How to Use This Legend

  • Speeds up reverse‑engineering: quickly infer an operand’s size and type without consulting the spec.
  • Helpful for cross‑VM study: not all VMs use the same suffix convention.
  • Useful in debug notes alongside opcode meaning.