
Fundamentals

Python is an interpreted, object-oriented language that supports modular code through modules and packages. The standard (reference) implementation is CPython.

Internal Working of Python

Steps Involved in Python Execution

  1. Source code is written in an editor and saved as a .py file
  2. Running the file starts the interpreter, which reads the source code
  3. Compilation converts the source code to bytecode and reports syntax errors before any code runs; for imported modules the bytecode is cached as a .pyc file (the main script's bytecode stays in memory)
  4. Bytecode is sent to the PVM (Python Virtual Machine), an interpreter loop that executes it one instruction at a time and halts on an unhandled error
  5. By default, CPython does not translate the program's bytecode into machine code; each bytecode instruction is carried out by compiled C routines that are already part of the interpreter
  6. The CPU executes the interpreter's machine code, which performs the program's operations and produces its output
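
The compile and execute steps above can be reproduced by hand with the standard library. This is a minimal sketch, not what the interpreter does internally verbatim: the source string, the "<example>" filename, and the example_mod file names are assumptions made for illustration.

```python
import dis
import py_compile
import tempfile
from pathlib import Path

SOURCE = "total = 2 + 3\nprint(total)\n"

# Step 3: compile source text into a code object that holds the bytecode.
code_obj = compile(SOURCE, "<example>", "exec")
print(type(code_obj.co_code))  # the raw bytecode is a bytes object

# Human-readable listing of the bytecode instructions.
dis.dis(code_obj)

# Step 4: the interpreter loop executes the bytecode.
namespace = {}
exec(code_obj, namespace)

# Write a .pyc file explicitly, similar to what the import system
# does automatically for imported modules.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "example_mod.py"
    src.write_text(SOURCE)
    pyc_path = py_compile.compile(
        str(src), cfile=str(Path(tmp) / "example_mod.pyc"), doraise=True
    )
    pyc_exists = Path(pyc_path).exists()
```

The exact instructions printed by dis.dis differ between Python versions, so the listing is best read as an illustration rather than a stable format.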

Global Interpreter Lock (GIL)

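In CPython, the GIL is a mutex that allows only one thread to execute Python bytecode at a time, which keeps reference counting (the basis of CPython's memory management) thread-safe. Work to remove the GIL for better multi-core performance is ongoing: PEP 703 makes the GIL optional, and CPython 3.13 added an experimental free-threaded build.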

  • Benefits: Simplifies memory management and C extension integration, and keeps the interpreter itself simpler.
  • Drawbacks: Limits CPU-bound multi-threaded parallelism on multi-core machines; I/O-bound tasks are less affected because the GIL is released during waits. Use multiprocessing for CPU-bound parallelism.
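
The CPU-bound drawback can be observed with a small timing experiment. This is a rough sketch, assuming a standard GIL build of CPython; count_down, timed, and the workload size N are hypothetical names chosen for the demo, and the timings vary by machine.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def count_down(n):
    """CPU-bound busy loop; returns the number of iterations performed."""
    steps = 0
    while n > 0:
        n -= 1
        steps += 1
    return steps


def timed(fn):
    """Call fn() and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn()
    return result, time.perf_counter() - start


N = 200_000  # hypothetical workload size


def run_sequential():
    return [count_down(N), count_down(N)]


def run_threaded():
    # Two threads, but with the GIL only one runs bytecode at any moment.
    with ThreadPoolExecutor(max_workers=2) as pool:
        return list(pool.map(count_down, [N, N]))


sequential, t_seq = timed(run_sequential)
threaded, t_thr = timed(run_threaded)
print(f"sequential: {t_seq:.3f}s  two threads: {t_thr:.3f}s")
```

On a GIL build the two timings are usually close, since the threads take turns instead of running in parallel. Replacing ThreadPoolExecutor with concurrent.futures.ProcessPoolExecutor (run under an `if __name__ == "__main__":` guard) is the usual way to spread such work across cores.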

Compiler vs. Interpreter

Compiler                                     | Interpreter
Faster; conversion before execution          | Slower; simultaneous execution
Errors detected before execution             | Errors at runtime
Needs recompilation for different machines   | Portable with interpreter
Requires more memory for full translation    | Requires less memory
Debugging complex due to batch processing    | Debugging easier with line-by-line execution
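
Python shows both kinds of error detection from the table, because it compiles to bytecode first and then interprets it. The sketch below uses two made-up source strings to illustrate this; the events list is only a recording device for the demo.

```python
events = []

# A syntax error is caught when the source is compiled, before any line runs.
bad_syntax = "events.append('ran')\nif True print('oops')\n"
try:
    compile(bad_syntax, "<bad_syntax>", "exec")
except SyntaxError:
    events.append("syntax error at compile time")

# A runtime error surfaces only when execution reaches the faulty line.
bad_runtime = "events.append('first line ran')\nundefined_name + 1\n"
try:
    exec(compile(bad_runtime, "<bad_runtime>", "exec"), {"events": events})
except NameError:
    events.append("name error at run time")

print(events)
```

In the first case nothing from the source runs at all; in the second, the first line executes before the NameError stops the program.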

Garbage Collection

Python's memory management relies on two automatic mechanisms: reference counting and garbage collection. Reference counting tracks how many references point to each object and deallocates the object as soon as its count reaches zero. However, it cannot reclaim cyclic references (objects referencing each other), because their counts never drop to zero.
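
Reference counting can be observed directly in CPython. The following is a small sketch, assuming CPython's behavior (other implementations may free objects later); Node is a hypothetical placeholder class.

```python
import sys
import weakref


class Node:
    """Hypothetical placeholder class for the demo."""


obj = Node()
alias = obj  # a second strong reference to the same object

# getrefcount includes one temporary reference for its own argument,
# so only the difference between two readings is meaningful.
count_with_alias = sys.getrefcount(obj)
del alias
count_without_alias = sys.getrefcount(obj)

# A weak reference observes the object without keeping it alive.
probe = weakref.ref(obj)
del obj  # last strong reference removed: the count reaches zero
freed_immediately = probe() is None
```

Here the weak reference returns None right after the last strong reference is deleted, which shows that no separate collection pass was needed.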

To handle cycles, Python uses a generational garbage collector that groups objects by age into three generations, collecting younger ones more frequently for efficiency. It runs automatically based on allocation thresholds but can be manually triggered via the gc module.
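
A reference cycle and a manual collection can be sketched with the gc module. This example assumes CPython; Node and its partner attribute are hypothetical, and automatic collection is disabled only to make the demo deterministic.

```python
import gc
import weakref


class Node:
    def __init__(self):
        self.partner = None


gc.disable()  # keep the automatic collector out of the way for the demo
a, b = Node(), Node()
a.partner, b.partner = b, a  # a and b now reference each other
probe = weakref.ref(a)

del a, b  # the cycle keeps both reference counts above zero
alive_after_del = probe() is not None

unreachable = gc.collect()  # manually trigger a full collection
alive_after_collect = probe() is not None
gc.enable()

# Allocation thresholds that control when automatic collections run.
thresholds = gc.get_threshold()
print(unreachable, thresholds)
```

The objects survive the del statement but disappear after gc.collect(), which is the case reference counting alone cannot handle.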

This dual approach ensures efficient, automatic memory handling, minimizing manual overhead.