Fundamentals

Python is an interpreted, object-oriented language that encourages modular code. The standard (reference) implementation is CPython.

Internal working of Python

Steps Involved in Python Execution

Source code is saved as a .py file
The interpreter reads the source file when the program is started
The compiler stage parses the source, reports syntax errors, and translates it to bytecode; bytecode for imported modules is cached as .pyc files
The bytecode is handed to the PVM (Python Virtual Machine), the interpreter loop inside CPython
The PVM executes the bytecode one instruction at a time and halts when a runtime error is not handled; standard CPython interprets bytecode rather than translating it into a native executable
The CPU runs the interpreter's own machine code, which carries out each bytecode instruction and produces the program's output
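The compile-then-interpret flow above can be observed from inside CPython. What follows is a minimal sketch using the built-in compile() and the standard dis module; the function name add and the "<example>" filename are placeholders chosen for illustration, and the exact opcode names differ between CPython versions.

```python
import dis


def add(a, b):
    return a + b


# compile() performs the source-to-bytecode step explicitly.
code_object = compile("total = 1 + 2", "<example>", "exec")
raw_bytecode = code_object.co_code  # the bytes the interpreter loop consumes

# exec() hands the code object to the interpreter loop for execution.
namespace = {}
exec(code_object, namespace)

# List the opcode names the interpreter runs for add().
opnames = [instruction.opname for instruction in dis.get_instructions(add)]
print(opnames)
```

Calling dis.dis(add) prints the same instructions in a human-readable table.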

Global Interpreter Lock (GIL)

In CPython, the GIL is a mutex that allows only one thread to execute Python bytecode at a time, which keeps reference-counting memory management thread-safe. Work to remove it is ongoing: PEP 703 adds an optional free-threaded build, available experimentally since CPython 3.13.

Benefits: Simplifies memory management, C extension integration, and interpreter complexity.
Drawbacks: Limits CPU-bound multi-threaded parallelism on multi-cores; I/O-bound tasks less affected (GIL released during waits). Use multiprocessing for parallelism.
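The claim that I/O-bound work is less affected can be checked with a small timing experiment. This is a sketch under assumptions: fake_io and run_threaded are hypothetical helper names, and time.sleep stands in for a real network or disk wait (it releases the GIL while sleeping).

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_io(delay):
    """Stand-in for a blocking I/O call; the GIL is released while sleeping."""
    time.sleep(delay)
    return delay


def run_threaded(delays):
    """Run all waits in a thread pool; return (results, elapsed seconds)."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=len(delays)) as pool:
        results = list(pool.map(fake_io, delays))
    return results, time.perf_counter() - start


results, elapsed = run_threaded([0.2, 0.2, 0.2, 0.2])
print(f"4 waits of 0.2s each finished in {elapsed:.2f}s")
```

The waits overlap, so the total should be close to one wait rather than the sum of four. For CPU-bound functions, concurrent.futures.ProcessPoolExecutor offers the same map-style API across separate processes, each with its own interpreter and GIL.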

Compiler vs. Interpreter

Compiler	Interpreter
Faster; conversion before execution	Slower; simultaneous execution
Errors detected before execution	Errors at runtime
Needs recompilation for different machines	Portable with interpreter
Requires more memory for full translation	Requires less memory
Debugging complex due to batch processing	Debugging easier with line-by-line execution
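Python sits between the two columns: it compiles to bytecode first, so syntax errors are reported before anything runs, while other errors appear only when the offending line executes. The sketch below illustrates this with the built-in compile() and exec(); the source strings and variable names are made up for the demonstration.

```python
# A syntax error is caught when the source is compiled, before any line runs.
bad_syntax = "print('start')\nx = = 1\n"
try:
    compile(bad_syntax, "<demo>", "exec")
    syntax_error_line = None
except SyntaxError as exc:
    syntax_error_line = exc.lineno  # line 2 holds the malformed assignment

# A name error only surfaces at runtime, after earlier lines have already run.
steps = []
bad_name = "steps.append('start')\nundefined_name\n"
code_object = compile(bad_name, "<demo>", "exec")  # compiles without complaint
try:
    exec(code_object, {"steps": steps})
    runtime_error = None
except NameError as exc:
    runtime_error = exc

print(syntax_error_line, steps, type(runtime_error).__name__)
# => 2 ['start'] NameError
```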

Garbage Collection

Python's memory management relies on automatic mechanisms: reference counting and garbage collection. Reference counting tracks object references and deallocates memory when a count reaches zero. However, it fails with cyclic references (objects referencing each other).

To handle cycles, Python uses a generational garbage collector that groups objects by age into three generations, collecting younger ones more frequently for efficiency. It runs automatically based on allocation thresholds but can be manually triggered via the gc module.

This dual approach ensures efficient, automatic memory handling, minimizing manual overhead.
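Both mechanisms can be observed with the standard sys, gc, and weakref modules. This is a CPython-specific sketch: the Node class is a hypothetical example type, and automatic collection is switched off temporarily so that the cycle is only reclaimed by the explicit gc.collect() call.

```python
import gc
import sys
import weakref


class Node:
    """Tiny object that can point at another Node."""

    def __init__(self):
        self.other = None


# Reference counting: binding a second name adds one reference.
data = []
count_before = sys.getrefcount(data)
alias = data
count_after = sys.getrefcount(data)  # count_before + 1

# Cyclic references: two objects that point at each other.
gc.disable()             # keep the automatic collector out of the demo
a, b = Node(), Node()
a.other, b.other = b, a
probe = weakref.ref(a)   # lets us check liveness without keeping a alive
del a, b                 # counts never reach zero because of the cycle
alive_before = probe() is not None   # True: reference counting alone is stuck
unreachable = gc.collect()           # the cycle collector finds and frees both
alive_after = probe() is not None    # False
gc.enable()

print(count_after - count_before, alive_before, alive_after)
# => 1 True False
```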

Python packages are a way to organize related Python modules (files with reusable code) in one directory, making code easier to reuse and distribute.

A package is a directory that contains a special __init__.py file together with one or more modules (.py files) and, optionally, sub-packages (nested directories). Without an __init__.py file, a directory is not treated as a regular package (namespace packages, available since Python 3.3, are the exception).

my_package/
    __init__.py
    module_one.py
    module_two.py

Feature	Module (`.py file`)	Package (directory)
Structure	Flat structure; single file	Hierarchical structure; can contain sub-packages and multiple modules
Purpose	Code reuse, single topic	Organizing large projects
Namespace	Provides a single namespace for its contents	Provides a separate namespace for its modules, preventing name conflicts
Importing	Imported directly using the module name (e.g., `import module_one`)	Imported using the package name followed by the module name (e.g., `from my_package import module_one`, `import my_package.module_one`)
Initialization	No special initialization required	The `__init__.py` file can contain initialization code for the package
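The import forms in the table can be tried without touching a real project. The sketch below builds the my_package layout in a temporary directory and imports from it; the GREETING constant and the double function are hypothetical contents invented for the example.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Create my_package/ with an __init__.py and one module.
    pkg = Path(tmp) / "my_package"
    pkg.mkdir()
    (pkg / "__init__.py").write_text("GREETING = 'hello from my_package'\n")
    (pkg / "module_one.py").write_text("def double(x):\n    return 2 * x\n")

    sys.path.insert(0, tmp)          # make the parent directory importable
    try:
        importlib.invalidate_caches()
        import my_package                      # runs __init__.py
        from my_package import module_one      # loads the submodule
        greeting = my_package.GREETING
        doubled = module_one.double(21)
    finally:
        sys.path.remove(tmp)

print(greeting, doubled)  # => hello from my_package 42
```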

Python Package Ecosystem

The Python package ecosystem enables developers to publish, share, discover, and install reusable code libraries. Its foundation is PyPI, the primary repository for open-source packages. Community standards and formal proposals (PEPs) govern package formats, distribution, and installation, promoting compatibility across tools and platforms.

Package Management Tools

Aspect	`pip`	`conda`	`Poetry`	`Pipenv`	`uv`
Environments	venv, virtualenv	native	built-in/env	built-in/env	built-in/env, venv
Dependency Management	requirements.txt, direct install	environment.yml, conda install	pyproject.toml, poetry.lock	Pipfile, Pipfile.lock	pyproject.toml, uv.lock
Speed	standard	slower	moderate	moderate	fastest
Non-Python Support	❌	✅	❌	❌	❌
Lockfile Support	partial (via pip-tools)	✅	✅	✅	✅ (uv.lock)
Typical Usage	basic projects, scripts	data science, cross-lang	modern Python projects	intermediate projects	large/modern projects
Pros	simple, direct PyPI access	handles non-Python packages	modern, organized, isolated	integrates pip+env, easy to use	fast, automatic Python version
Cons	no environment isolation, manual lockfile	heavyweight, slower, not pure Python	no non-Python deps, native lockfile only	fewer updates, sometimes buggy	newer, less mature, Python-only
Best For	simple/small scripts	data/data science	robust apps/libraries	basic to moderate apps	fast prototyping, deployment

Naming Conventions

name must start with a letter or the underscore character
name cannot start with a number
name can only contain alpha-numeric characters and underscores [A-Za-z0-9_]
names are case-sensitive (firstname, Firstname, FirstName and FIRSTNAME are different variables)
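These rules can be checked programmatically with str.isidentifier() and the standard keyword module. In this small sketch, is_valid_name is a hypothetical helper and the candidate names are made up. Note that str.isidentifier() also accepts non-ASCII letters, which Python 3 permits in identifiers.

```python
import keyword


def is_valid_name(name):
    """True if name is a legal identifier and not a reserved keyword."""
    return name.isidentifier() and not keyword.iskeyword(name)


candidates = ["first_name", "_private", "2fast", "user-id", "class"]
checks = {name: is_valid_name(name) for name in candidates}
print(checks)
# => {'first_name': True, '_private': True, '2fast': False,
#     'user-id': False, 'class': False}
```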

Convention	Use Cases
`snake_case`	variable names: `first_name`; attribute names: `self.user_id`; function names: `def get_user_info()`; method names: `def calculate_total(self)`; parameter names: `def send_email(to_address)`; module names: `user_profile.py`; decorator names: `@login_required`; package names: `my_package`; file names: `data_processor.py`
`PascalCase`	class names: `class UserProfile:`; exception names: `class UserNotFoundError(Exception):`; interface names: `class IUser:`; enum names: `class UserRole(Enum):`
`UPPER_CASE`	constant names: `MAX_RETRIES = 5`; global variables: `API_URL = "https://api.example.com"`; environment variables: `DATABASE_URL = "postgres://user:pass@localhost/dbname"`; configuration settings: `DEBUG = True`

Comments

# Single line comments start with a number symbol.

""" Multiline strings can be written
    using three "s, and are often used
    as documentation.
"""

Primitive Datatypes and Operators

# You have numbers
3  # => 3

# Math is what you would expect
1 + 1   # => 2
8 - 1   # => 7
10 * 2  # => 20
35 / 5  # => 7.0

# Floor division rounds towards negative infinity
5 // 3       # => 1
-5 // 3      # => -2
5.0 // 3.0   # => 1.0  # works on floats too
-5.0 // 3.0  # => -2.0

# The result of division is always a float
10.0 / 3  # => 3.3333333333333335

# Modulo operation
7 % 3   # => 1
# i % j has the same sign as j, unlike C
-7 % 3  # => 2

# Exponentiation (x**y, x to the yth power)
2**3  # => 8

# Enforce precedence with parentheses
1 + 3 * 2    # => 7
(1 + 3) * 2  # => 8

# Boolean values are primitives (Note: the capitalization)
True   # => True
False  # => False

# negate with not
not True   # => False
not False  # => True

# Boolean Operators
# Note "and" and "or" are case-sensitive
True and False  # => False
False or True   # => True

# True and False are actually 1 and 0 but with different keywords
True + True  # => 2
True * 8     # => 8
False - 5    # => -5

# Comparison operators look at the numerical value of True and False
0 == False   # => True
2 > True     # => True
2 == True    # => False
-5 != False  # => True

# None, 0, and empty strings/lists/dicts/tuples/sets all evaluate to False.
# All other values are True
bool(0)      # => False
bool("")     # => False
bool([])     # => False
bool({})     # => False
bool(())     # => False
bool(set())  # => False
bool(4)      # => True
bool(-6)     # => True

# Using boolean logical operators on ints casts them to booleans for evaluation,
# but their non-cast value is returned. Don't mix up with bool(ints) and bitwise
# and/or (&,|)
bool(0)   # => False
bool(2)   # => True
0 and 2   # => 0
bool(-5)  # => True
bool(2)   # => True
-5 or 0   # => -5

# Equality is ==
1 == 1  # => True
2 == 1  # => False

# Inequality is !=
1 != 1  # => False
2 != 1  # => True

# More comparisons
1 < 10  # => True
1 > 10  # => False
2 <= 2  # => True
2 >= 2  # => True

# Seeing whether a value is in a range
1 < 2 and 2 < 3  # => True
2 < 3 and 3 < 2  # => False
# Chaining makes this look nicer
1 < 2 < 3  # => True
2 < 3 < 2  # => False

# (is vs. ==) is checks if two variables refer to the same object, but == checks
# if the objects pointed to have the same values.
a = [1, 2, 3, 4]  # Point a at a new list, [1, 2, 3, 4]
b = a             # Point b at what a is pointing to
b is a            # => True, a and b refer to the same object
b == a            # => True, a's and b's objects are equal
b = [1, 2, 3, 4]  # Point b at a new list, [1, 2, 3, 4]
b is a            # => False, a and b do not refer to the same object
b == a            # => True, a's and b's objects are equal

# Strings are created with " or '
"This is a string."
'This is also a string.'

# Strings can be added too
"Hello " + "world!"  # => "Hello world!"
# String literals (but not variables) can be concatenated without using '+'
"Hello " "world!"    # => "Hello world!"

# A string can be treated like a list of characters
"Hello world!"[0]  # => 'H'

# You can find the length of a string
len("This is a string")  # => 16

# Since Python 3.6, you can use f-strings or formatted string literals.
name = "Joe"
f"He said his name is {name}."  # => "He said his name is Joe."
# Any valid Python expression inside these braces is returned to the string.
f"{name} is {len(name)} characters long."  # => "Joe is 3 characters long."

num = 1000000
print(f"{num:,}")  # => "1,000,000"
print(f"{num:_}")  # => "1_000_000"

s = "name"  # avoid naming a variable "str": it would shadow the built-in str type
print(f"{s:^20}")   # => "        name        "
print(f"{s:|^20}")  # => "||||||||name||||||||"
print(f"{s:>20}")   # => "                name"
print(f"{s:<20}:")  # => "name                :"
print(f"{s:20}:")   # => "name                :"

from datetime import datetime
date = datetime.now()
print(f"{date:%Y-%m-%d}")  # => e.g. "2025-01-01"
print(f"{date:%c}")        # locale's date and time => e.g. "Wed Jan  1 01:00:00 2025"

float_num = 3.141
print(f"{float_num:.2f}")  # => "3.14"

a = 5
b = 10
print(f"a + b = {a + b}") # => "a + b = 15"
print(f"{a + b = }")      # => "a + b = 15" (the "=" specifier needs Python 3.8+)

# None is an object
None  # => None

# Don't use the equality "==" symbol to compare objects to None
# Use "is" instead. This checks for equality of object identity.
"etc" == None  # => False
None is None   # => True

Variables and Collections

# Python has a print function
print("I'm Python")  # => I'm Python

# By default the print function also prints out a newline at the end.
# Use the optional argument end to change the end string.
print("Hello, World", end="!")  # => Hello, World!

# Simple way to get input data from console
input_string_var = input("Enter some data: ")  # Returns the data as a string

# There are no declarations, only assignments.
some_var = 5
some_var  # => 5

# Accessing a previously unassigned variable is an exception.
some_unknown_var  # Raises a NameError

# if can be used as an expression
# Equivalent of C's '?:' ternary operator
"yes" if 0 > 1 else "no"  # => "no"

# Lists store sequences
li = []
# You can start with a prefilled list
other_li = [4, 5, 6]

# Add stuff to the end of a list with append
li.append(1)    # li is now [1]
li.append(2)    # li is now [1, 2]
li.append(3)    # li is now [1, 2, 3]
li.append(4)    # li is now [1, 2, 3, 4]
# Remove from the end with pop
li.pop()        # => 4 and li is now [1, 2, 3]
# Let's put it back
li.append(4)    # li is now [1, 2, 3, 4] again.

# Access a list like you would any array
li[0]   # => 1
# Look at the last element
li[-1]  # => 4

# Looking out of bounds is an IndexError
li[4]  # Raises an IndexError

# You can look at ranges with slice syntax.
# (It's a closed/open range for you mathy types.)
li[1:3]   # Return list from index 1 to 3 => [2, 3]
li[2:]    # Return list starting from index 2 => [3, 4]
li[:3]    # Return list from beginning until index 3  => [1, 2, 3]
li[::2]   # Return list selecting elements with a step size of 2 => [1, 3]
li[::-1]  # Return list in reverse order => [4, 3, 2, 1]
li[-3:-1] # Negative indices work too => [2, 3]
li[1:-1]  # => [2, 3]
li[-3:3]  # => [2, 3]
li[-3:]   # => [2, 3, 4]
li[:-1]   # => [1, 2, 3]
# Use any combination of these to make advanced slices
# li[start:end:step] # start (inclusive), end (exclusive), step

# Make a one layer deep copy using slices
li[::]       # Return a copy of the whole list => [1, 2, 3, 4]
li2 = li[:]  # => li2 = [1, 2, 3, 4] but (li2 is li) will result in False.

# Remove arbitrary elements from a list with "del"
del li[3]  # li is now [1, 2, 3]

# Remove first occurrence of a value
li.remove(2)  # li is now [1, 3]
li.remove(2)  # Raises a ValueError as 2 is not in the list

# Insert an element at a specific index
li.insert(1, 2)  # li is now [1, 2, 3] again

# Get the index of the first item found matching the argument
li.index(2)  # => 1
li.index(4)  # Raises a ValueError as 4 is not in the list

# You can add lists
# Note: values for li and for other_li are not modified.
li + other_li  # => [1, 2, 3, 4, 5, 6]

# Concatenate lists with "extend()"
li.extend(other_li)  # Now li is [1, 2, 3, 4, 5, 6]

# Check for existence in a list with "in"
1 in li  # => True

# Examine the length with "len()"
len(li)  # => 6


# Tuples are like lists but are immutable.
tup = (1, 2, 3)
tup[0]      # => 1
tup[0] = 3  # Raises a TypeError

# Note that a tuple of length one has to have a comma after the last element but
# tuples of other lengths, even zero, do not.
type((1))   # => <class 'int'>
type((1,))  # => <class 'tuple'>
type(())    # => <class 'tuple'>

# You can do most of the list operations on tuples too
len(tup)         # => 3
tup + (4, 5, 6)  # => (1, 2, 3, 4, 5, 6)
tup[:2]          # => (1, 2)
2 in tup         # => True

# You can unpack tuples (or lists) into variables
a, b, c = (1, 2, 3)  # a is now 1, b is now 2 and c is now 3
# You can also do extended unpacking
*a, b = (1, 2, 3, 4)  # a is now [1, 2, 3] and b is now 4
a, *b = (1, 2, 3, 4)  # a is now 1 and b is now [2, 3, 4]
a, *b, c = (1, 2, 3, 4)  # a is now 1, b is now [2, 3] and c is now 4
# Tuples are created by default if you leave out the parentheses
d, e, f = 4, 5, 6  # tuple 4, 5, 6 is unpacked into variables d, e and f
# respectively such that d = 4, e = 5 and f = 6
# Now look how easy it is to swap two values
e, d = d, e  # d is now 5 and e is now 4


# Dictionaries store mappings from keys to values
empty_dict = {}
# Here is a prefilled dictionary
filled_dict = {"one": 1, "two": 2, "three": 3}

# Note keys for dictionaries have to be immutable types. This is to ensure that
# the key can be converted to a constant hash value for quick look-ups.
# Immutable types include ints, floats, strings, tuples.
invalid_dict = {[1,2,3]: "123"}  # => Raises a TypeError: unhashable type: 'list'
valid_dict = {(1,2,3):[1,2,3]}   # Values can be of any type, however.

# Look up values with []
filled_dict["one"]  # => 1

# Get all keys as an iterable with "keys()". We need to wrap the call in list()
# to turn it into a list. We'll talk about those later.  Note - for Python
# versions <3.7, dictionary key ordering is not guaranteed. Your results might
# not match the example below exactly. However, as of Python 3.7, dictionary
# items maintain the order at which they are inserted into the dictionary.
list(filled_dict.keys())  # => ["three", "two", "one"] in Python <3.7
list(filled_dict.keys())  # => ["one", "two", "three"] in Python 3.7+


# Get all values as an iterable with "values()". Once again we need to wrap it
# in list() to get it out of the iterable. Note - Same as above regarding key
# ordering.
list(filled_dict.values())  # => [3, 2, 1]  in Python <3.7
list(filled_dict.values())  # => [1, 2, 3] in Python 3.7+

# Check for existence of keys in a dictionary with "in"
"one" in filled_dict  # => True
1 in filled_dict      # => False

# Looking up a non-existing key is a KeyError
filled_dict["four"]  # KeyError

# Use "get()" method to avoid the KeyError
filled_dict.get("one")      # => 1
filled_dict.get("four")     # => None
# The get method supports a default argument when the value is missing
filled_dict.get("one", 4)   # => 1
filled_dict.get("four", 4)  # => 4

# "setdefault()" inserts into a dictionary only if the given key isn't present
filled_dict.setdefault("five", 5)  # filled_dict["five"] is set to 5
filled_dict.setdefault("five", 6)  # filled_dict["five"] is still 5

# Adding to a dictionary
filled_dict.update({"four":4})  # => {"one": 1, "two": 2, "three": 3, "five": 5, "four": 4}
filled_dict["four"] = 4         # another way to add to dict

# Remove keys from a dictionary with del
del filled_dict["one"]  # Removes the key "one" from filled dict

# From Python 3.5 you can also use the additional unpacking options
{"a": 1, **{"b": 2}}  # => {'a': 1, 'b': 2}
{"a": 1, **{"a": 2}}  # => {'a': 2}


# Sets store ... well sets
empty_set = set()
# Initialize a set with a bunch of values.
some_set = {1, 1, 2, 2, 3, 4}  # some_set is now {1, 2, 3, 4}

# Similar to keys of a dictionary, elements of a set have to be immutable.
invalid_set = {[1], 1}  # => Raises a TypeError: unhashable type: 'list'
valid_set = {(1,), 1}

# Add one more item to the set
filled_set = some_set
filled_set.add(5)  # filled_set is now {1, 2, 3, 4, 5}
# Sets do not have duplicate elements
filled_set.add(5)  # it remains as before {1, 2, 3, 4, 5}

# Do set intersection with &
other_set = {3, 4, 5, 6}
filled_set & other_set  # => {3, 4, 5}

# Do set union with |
filled_set | other_set  # => {1, 2, 3, 4, 5, 6}

# Do set difference with -
{1, 2, 3, 4} - {2, 3, 5}  # => {1, 4}

# Do set symmetric difference with ^
{1, 2, 3, 4} ^ {2, 3, 5}  # => {1, 4, 5}

# Check if set on the left is a superset of set on the right
{1, 2} >= {1, 2, 3}  # => False

# Check if set on the left is a subset of set on the right
{1, 2} <= {1, 2, 3}  # => True

# Check for existence in a set with in
2 in filled_set   # => True
10 in filled_set  # => False

# Make a one layer deep copy
filled_set = some_set.copy()  # filled_set is {1, 2, 3, 4, 5}
filled_set is some_set        # => False

# Type Conversion Functions
bool(0)                     # converts a value to a boolean: => False
bytes('hello', 'utf-8')     # encodes a string to bytes: => b'hello'
int(3.5)                    # converts a number or numeric string to an integer (truncates toward zero): => 3
float(3)                    # converts a number or numeric string to a float: => 3.0
hex(255)                    # converts an integer to a hexadecimal string: => '0xff'
oct(8)                      # converts an integer to an octal string: => '0o10'
str(123)                    # converts an object to a string: => '123'
ord('a')                    # converts a character to its Unicode code point: => 97
chr(97)                     # converts a Unicode code point to a character: => 'a'
complex(3,4)                # builds a complex number from real and imaginary parts: => (3+4j)
list((1,2,3))               # converts an iterable to a list: => [1, 2, 3]
tuple([1,2,3])              # converts an iterable to a tuple: => (1, 2, 3)
set([1,2,2,3])              # converts an iterable to a set, dropping duplicates: => {1, 2, 3}
dict([('a',1),('b',2)])     # converts an iterable of (key, value) pairs to a dictionary: => {'a': 1, 'b': 2}

Binary Operations

# Binary literals start with 0b
0b1010  # => 10
# Hex literals start with 0x
0x1A  # => 26
# Octal literals start with 0o
0o12  # => 10

a = 5       # Binary 0101
b = 3       # Binary 0011
# Bitwise AND
a & b       # => 1 (Binary 0001)
# Bitwise OR
a | b       # => 7 (Binary 0111)
# Bitwise XOR
a ^ b       # => 6 (Binary 0110)
# Bitwise NOT
~a          # => -6 (Binary ...11111010)
# Left Shift
a << 1      # => 10 (Binary 1010)
# Right Shift
a >> 1      # => 2 (Binary 0010)
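The results above are easier to verify when printed as fixed-width binary strings. In the sketch below, bits is a hypothetical helper that masks a value to a chosen width (8 bits here, an arbitrary choice) so that negative results such as ~a show their two's-complement pattern.

```python
a = 5  # Binary 0101
b = 3  # Binary 0011


def bits(value, width=8):
    """Format value as a fixed-width binary string, masked to width bits."""
    return format(value & ((1 << width) - 1), f"0{width}b")


and_bits = bits(a & b)  # => '00000001'
or_bits = bits(a | b)   # => '00000111'
xor_bits = bits(a ^ b)  # => '00000110'
not_bits = bits(~a)     # => '11111010'

# Bitwise NOT on Python's unbounded ints always equals -x - 1.
not_identity = (~a == -a - 1)  # => True
print(and_bits, or_bits, xor_bits, not_bits, not_identity)
```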

Regular Expressions

import re

# Match a pattern.
# Raw string literal. The `r` prefix tells Python
# to treat backslashes as literal characters and not as escape characters.
pattern = r"\d+"
text = "There are 123 apples"
match = re.search(pattern, text)
if match:
    print("Found:", match.group()) # => Found: 123

# Find all matches
all_matches = re.findall(pattern, text)
print("All matches:", all_matches) # => All matches: ['123']

# Replace text
new_text = re.sub(pattern, "456", text)
print("Replaced text:", new_text) # => Replaced text: There are 456 apples

# Split text
split_text = re.split(r"\s+", text)
print("Split text:", split_text) # => Split text: ['There', 'are', '123', 'apples']

# Compile a regex for repeated use
compiled_pattern = re.compile(r"\w+")
words = compiled_pattern.findall(text)
print("Words:", words) # => Words: ['There', 'are', '123', 'apples']

# Regex flags
case_insensitive_pattern = re.compile(r"apples", re.IGNORECASE)
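The compiled case-insensitive pattern above is defined but not applied. The sketch below puts it to use and adds named groups, a standard re feature; the group names count and item and the upper-case sample string are choices made for this example.

```python
import re

text = "There are 123 apples"

# re.IGNORECASE lets a lower-case pattern match upper-case input.
case_insensitive_pattern = re.compile(r"apples", re.IGNORECASE)
flag_match = case_insensitive_pattern.search("THERE ARE 123 APPLES")
matched_word = flag_match.group()  # => 'APPLES'

# Named groups give each captured part a label instead of a position.
count_pattern = re.compile(r"(?P<count>\d+)\s+(?P<item>\w+)")
group_match = count_pattern.search(text)
count = int(group_match.group("count"))  # => 123
item = group_match.group("item")         # => 'apples'
print(matched_word, count, item)
```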

Control Flow and Iterables

# Let's just make a variable
some_var = 5

# Here is an if statement. Indentation is significant in Python!
# Convention is to use four spaces, not tabs.
# This prints "some_var is smaller than 10"
if some_var > 10:
    print("some_var is totally bigger than 10.")
elif some_var < 10:    # This elif clause is optional.
    print("some_var is smaller than 10.")
else:                  # This is optional too.
    print("some_var is indeed 10.")

# Match/Case  -  Introduced in Python 3.10
# It compares a value against multiple patterns and executes the matching case block.

command = "run"

match command:
    case "run":
        print("The robot started to run 🏃‍♂️")
    case "speak" | "say_hi":  # multiple options (OR pattern)
        print("The robot said hi 🗣️")
    case code if command.isdigit():  # conditional
        print(f"The robot executed code: {code}")
    case _:  # _ is a wildcard that never fails (like default/else)
        print("Invalid command ❌")

# Output: "The robot started to run 🏃‍♂️"

"""
For loops iterate over lists
prints:
    dog is a mammal
    cat is a mammal
    mouse is a mammal
"""
for animal in ["dog", "cat", "mouse"]:
    # You can use format() to interpolate formatted strings
    print("{} is a mammal".format(animal))

"""
"range(number)" returns an iterable of numbers
from zero up to (but excluding) the given number
prints:
    0
    1
    2
    3
"""
for i in range(4):
    print(i)

"""
"range(lower, upper)" returns an iterable of numbers
from the lower number to the upper number
prints:
    4
    5
    6
    7
"""
for i in range(4, 8):
    print(i)

"""
"range(lower, upper, step)" returns an iterable of numbers
from the lower number to the upper number, while incrementing
by step. If step is not indicated, the default value is 1.
prints:
    4
    6
"""
for i in range(4, 8, 2):
    print(i)

"""
Loop over a list to retrieve both the index and the value of each list item:
    0 dog
    1 cat
    2 mouse
"""
animals = ["dog", "cat", "mouse"]
for i, value in enumerate(animals):
    print(i, value)

"""
While loops go until a condition is no longer met.
prints:
    0
    1
    2
    3
"""
x = 0
while x < 4:
    print(x)
    x += 1  # Shorthand for x = x + 1

# Handle exceptions with a try/except block
try:

    # Use "raise" to raise an error
    raise IndexError("This is an index error")
except IndexError as e:
    pass                 # Refrain from this, provide a recovery (next example).
except (TypeError, NameError):
    pass                 # Multiple exceptions can be processed jointly.
else:                    # Optional clause to the try/except block. Must follow
                        # all except blocks.
    print("All good!")   # Runs only if the code in try raises no exceptions
finally:                 # Execute under all circumstances
    print("We can clean up resources here")

# Instead of try/finally to cleanup resources you can use a with statement
with open("myfile.txt") as f:
    for line in f:
        print(line)

# Writing to a file
contents = {"aa": 12, "bb": 21}
# Context Managers set up and automatically clean up resources for code blocks, commonly used with the `with` statement
with open("myfile1.txt", "w") as file:
    file.write(str(contents))        # writes a string to a file

import json
with open("myfile2.txt", "w") as file:
    file.write(json.dumps(contents))  # writes an object to a file

# Reading from a file
with open("myfile1.txt") as file:
    contents = file.read()           # reads a string from a file
print(contents)
# prints: {'aa': 12, 'bb': 21}

with open("myfile2.txt", "r") as file:
    contents = json.load(file)       # reads a json object from a file
print(contents)
# prints: {'aa': 12, 'bb': 21}


# Python offers a fundamental abstraction called the Iterable.
# An iterable is an object that can be treated as a sequence.
# The object returned by the range function, is an iterable.

filled_dict = {"one": 1, "two": 2, "three": 3}
our_iterable = filled_dict.keys()
print(our_iterable)  # => dict_keys(['one', 'two', 'three']). This is an object
                    # that implements our Iterable interface.

# We can loop over it.
for i in our_iterable:
    print(i)  # Prints one, two, three

# However we cannot address elements by index.
our_iterable[1]  # Raises a TypeError

# An iterable is an object that knows how to create an iterator.
our_iterator = iter(our_iterable)

# Our iterator is an object that can remember the state as we traverse through
# it. We get the next object with "next()".
next(our_iterator)  # => "one"

# It maintains state as we iterate.
next(our_iterator)  # => "two"
next(our_iterator)  # => "three"

# After the iterator has returned all of its data, it raises a
# StopIteration exception
next(our_iterator)  # Raises StopIteration

# We can also loop over it, in fact, "for" does this implicitly!
our_iterator = iter(our_iterable)
for i in our_iterator:
    print(i)  # Prints one, two, three

# You can grab all the elements of an iterable or iterator by call of list().
list(our_iterable)  # => Returns ["one", "two", "three"]
list(our_iterator)  # => Returns [] because state is saved
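The iterable/iterator split described above can be reproduced in a user-defined type by implementing __iter__ and __next__. This is a minimal sketch: Countdown and CountdownIterator are hypothetical names, not part of the standard library.

```python
class Countdown:
    """Iterable: knows how to create a fresh iterator each time."""

    def __init__(self, start):
        self.start = start

    def __iter__(self):
        return CountdownIterator(self.start)


class CountdownIterator:
    """Iterator: remembers its position between calls to next()."""

    def __init__(self, current):
        self.current = current

    def __iter__(self):
        return self  # iterators are themselves iterable

    def __next__(self):
        if self.current <= 0:
            raise StopIteration  # signals that the data is exhausted
        value = self.current
        self.current -= 1
        return value


countdown = Countdown(3)
first_pass = list(countdown)   # => [3, 2, 1]
second_pass = list(countdown)  # => [3, 2, 1], a new iterator is created

iterator = iter(countdown)
first_value = next(iterator)   # => 3
remaining = list(iterator)     # => [2, 1], state was kept
exhausted = list(iterator)     # => []
```

In everyday code a generator function with yield usually replaces the hand-written iterator class; the explicit version is shown only to expose the protocol.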

Functions

# Use "def" to create new functions
def add(x, y):
    print("x is {} and y is {}".format(x, y))
    return x + y  # Return values with a return statement

# Calling functions with parameters
add(5, 6)  # => prints out "x is 5 and y is 6" and returns 11

# Another way to call functions is with keyword arguments
add(y=6, x=5)  # Keyword arguments can arrive in any order.

# You can define functions that take a variable number of
# positional arguments
def varargs(*args):
    return args

varargs(1, 2, 3)  # => (1, 2, 3)

# You can define functions that take a variable number of
# keyword arguments, as well
def keyword_args(**kwargs):
    return kwargs

# Let's call it to see what happens
keyword_args(big="foot", loch="ness")  # => {"big": "foot", "loch": "ness"}


# You can do both at once, if you like
def all_the_args(*args, **kwargs):
    print(args)
    print(kwargs)
"""
all_the_args(1, 2, a=3, b=4) prints:
    (1, 2)
    {"a": 3, "b": 4}
"""

# When calling functions, you can do the opposite of args/kwargs!
# Use * to expand args (tuples) and use ** to expand kwargs (dictionaries).
args = (1, 2, 3, 4)
kwargs = {"a": 3, "b": 4}
all_the_args(*args)            # equivalent: all_the_args(1, 2, 3, 4)
all_the_args(**kwargs)         # equivalent: all_the_args(a=3, b=4)
all_the_args(*args, **kwargs)  # equivalent: all_the_args(1, 2, 3, 4, a=3, b=4)

# Returning multiple values (with tuple assignments)
def swap(x, y):
    return y, x  # Return multiple values as a tuple without the parenthesis.
                # (Note: parenthesis have been excluded but can be included)

x = 1
y = 2
x, y = swap(x, y)     # => x = 2, y = 1
# (x, y) = swap(x,y)  # Again the use of parenthesis is optional.

# global scope
x = 5

def set_x(num):
    # local scope begins here
    # local var x not the same as global var x
    x = num    # => 43
    print(x)   # => 43

def set_global_x(num):
    # global indicates that particular var lives in the global scope
    global x
    print(x)   # => 5
    x = num    # global var x is now set to 6
    print(x)   # => 6

set_x(43)
set_global_x(6)
"""
prints:
    43
    5
    6
"""


# Python has first class functions
def create_adder(x):
    def adder(y):
        return x + y
    return adder

add_10 = create_adder(10)
add_10(3)   # => 13

# Closures in nested functions:
# The nonlocal keyword lets an inner function rebind variables that belong to
# the enclosing function's scope instead of creating new local ones.
def create_avg():
    total = 0
    count = 0
    def avg(n):
        nonlocal total, count
        total += n
        count += 1
        return total/count
    return avg
avg = create_avg()
avg(3)  # => 3.0
avg(5)  # (3+5)/2 => 4.0
avg(7)  # (8+7)/3 => 5.0

# There are also anonymous functions
(lambda x: x > 2)(3)                  # => True
(lambda x, y: x ** 2 + y ** 2)(2, 1)  # => 5

# There are built-in higher order functions
list(map(add_10, [1, 2, 3]))          # => [11, 12, 13]
list(map(max, [1, 2, 3], [4, 2, 1]))  # => [4, 2, 3]

list(filter(lambda x: x > 5, [3, 4, 5, 6, 7]))  # => [6, 7]

# We can use list comprehensions for nice maps and filters
# List comprehension stores the output as a list (which itself may be nested).
[add_10(i) for i in [1, 2, 3]]         # => [11, 12, 13]
[x for x in [3, 4, 5, 6, 7] if x > 5]  # => [6, 7]

# You can construct set and dict comprehensions as well.
{x for x in "abcddeef" if x not in "abc"}  # => {'d', 'e', 'f'}
{x: x**2 for x in range(5)}  # => {0: 0, 1: 1, 2: 4, 3: 9, 4: 16}

Modules

# You can import modules
import math
print(math.sqrt(16))  # => 4.0

# You can get specific functions from a module
from math import ceil, floor
print(ceil(3.7))   # => 4
print(floor(3.7))  # => 3

# You can import all functions from a module.
# Warning: this is not recommended
from math import *

# You can shorten module names
import math as m
math.sqrt(16) == m.sqrt(16)  # => True

# Python modules are just ordinary Python files. You
# can write your own, and import them. The name of the
# module is the same as the name of the file.

# You can find out which functions and attributes
# are defined in a module.
import math
dir(math)

# If you have a Python script named math.py in the same
# folder as your current script, the file math.py will
# be loaded instead of the built-in Python module.
# This happens because the local folder has priority
# over Python's built-in libraries.
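When a name clash like this is suspected, the import system can report where a module would be loaded from. The sketch below uses importlib.util.find_spec from the standard library; json is inspected rather than math because, on some builds, math may be compiled into the interpreter and report no file path.

```python
import importlib.util
import sys

# Ask the import system where "json" would come from, without relying on
# whatever may already be imported.
spec = importlib.util.find_spec("json")
json_origin = spec.origin  # path of the file that would be loaded

# The first entries of sys.path are searched first; when running a script,
# its own directory normally comes first.
search_order = sys.path[:3]

print(json_origin)
print(search_order)
```

If the printed origin points into the project folder instead of the Python installation, a local file is shadowing the library module.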

Classes

# We use the "class" statement to create a class
class Human:

    # A class attribute. It is shared by all instances of this class
    species = "H. sapiens"

    # Basic initializer, this is called when this class is instantiated.
    # Note that the double leading and trailing underscores denote objects
    # or attributes that are used by Python but that live in user-controlled
    # namespaces. Methods(or objects or attributes) like: __init__, __str__,
    # __repr__ etc. are called special methods (or sometimes called dunder
    # methods). You should not invent such names on your own.
    def __init__(self, name):
        # Assign the argument to the instance's name attribute
        self.name = name

        # Initialize property
        self._age = 0   # the leading underscore indicates the "age" property is
                        # intended to be used internally
                        # do not rely on this to be enforced: it's a hint to other devs

    # An instance method. All methods take "self" as the first argument
    def say(self, msg):
        print("{name}: {message}".format(name=self.name, message=msg))

    # Another instance method
    def sing(self):
        return "yo... yo... microphone check... one two... one two..."

    # A class method is shared among all instances
    # They are called with the calling class as the first argument
    @classmethod
    def get_species(cls):
        return cls.species

    # A static method is called without a class or instance reference
    @staticmethod
    def grunt():
        return "*grunt*"

    # A property is just like a getter.
    # It turns the method age() into a read-only attribute of the same name.
    # There's no need to write trivial getters and setters in Python, though.
    @property
    def age(self):
        return self._age

    # This allows the property to be set
    @age.setter
    def age(self, age):
        self._age = age

    # This allows the property to be deleted
    @age.deleter
    def age(self):
        del self._age


# When a Python interpreter reads a source file it executes all its code.
# This __name__ check makes sure this code block is only executed when this
# module is the main program.
if __name__ == "__main__":
    # Instantiate a class
    i = Human(name="Ian")
    i.say("hi")                     # => "Ian: hi"
    j = Human("Joel")
    j.say("hello")                  # => "Joel: hello"
    # i and j are instances of type Human; i.e., they are Human objects.

    # Call our class method
    i.say(i.get_species())          # => "Ian: H. sapiens"
    # Change the shared attribute
    Human.species = "H. neanderthalensis"
    i.say(i.get_species())          # => "Ian: H. neanderthalensis"
    j.say(j.get_species())          # => "Joel: H. neanderthalensis"

    # Call the static method
    print(Human.grunt())            # => "*grunt*"

    # Static methods can be called by instances too
    print(i.grunt())                # => "*grunt*"

    # Update the property for this instance
    i.age = 42
    # Get the property
    i.say(i.age)                    # => "Ian: 42"
    j.say(j.age)                    # => "Joel: 0"
    # Delete the property
    del i.age
    # i.age                         # => this would raise an AttributeError

Inheritance

# Inheritance allows new child classes to be defined that inherit methods and
# variables from their parent class.

# Using the Human class defined above as the base or parent class, we can
# define a child class, Superhero, which inherits variables like "species",
# "name", and "age", as well as methods, like "sing" and "grunt"
# from the Human class, but can also have its own unique properties.

# To take advantage of modularization by file you could place the classes above
# in their own files, say, human.py

# To import functions from other files use the following format
# from "filename-without-extension" import "function-or-class"

from human import Human


# Specify the parent class(es) as parameters to the class definition
class Superhero(Human):

    # If the child class should inherit all of the parent's definitions without
    # any modifications, you can just use the "pass" keyword (and nothing else)
    # but in this case it is commented out to allow for a unique child class:
    # pass

    # Child classes can override their parents' attributes
    species = "Superhuman"

    # Children automatically inherit their parent class's constructor including
    # its arguments, but can also define additional arguments or definitions
    # and override its methods such as the class constructor.
    # This constructor keeps the "name" argument from the "Human" class and
    # adds the "movie" and "superpowers" arguments:
    def __init__(self, name, movie=False,
                superpowers=["super strength", "bulletproofing"]):

        # add additional instance attributes:
        self.fictional = True
        self.movie = movie
        # be aware of mutable default values, since defaults are shared
        self.superpowers = superpowers

        # The "super" function lets you access the parent class's methods
        # that are overridden by the child, in this case, the __init__ method.
        # This calls the parent class constructor:
        super().__init__(name)

    # override the sing method
    def sing(self):
        return "Dun, dun, DUN!"

    # add an additional instance method
    def boast(self):
        for power in self.superpowers:
            print("I wield the power of {pow}!".format(pow=power))


if __name__ == "__main__":
    sup = Superhero(name="Tick")

    # Instance type checks
    if isinstance(sup, Human):
        print("I am human")
    if type(sup) is Superhero:
        print("I am a superhero")

    # Get the "Method Resolution Order" used by both getattr() and super()
    # (the order in which classes are searched for an attribute or method)
    # This attribute is read-only; Python recomputes it if the class's bases change
    print(Superhero.__mro__)    # => (<class '__main__.Superhero'>,
                                # => <class 'human.Human'>, <class 'object'>)

    # Calls parent method but uses its own class attribute
    print(sup.get_species())    # => Superhuman

    # Calls overridden method
    print(sup.sing())           # => Dun, dun, DUN!

    # Calls method from Human
    sup.say("Spoon")            # => Tick: Spoon

    # Call method that exists only in Superhero
    sup.boast()                 # => I wield the power of super strength!
                                # => I wield the power of bulletproofing!

    # Inherited property
    sup.age = 31
    print(sup.age)              # => 31

    # Attribute that only exists within Superhero
    print("Am I Oscar eligible? " + str(sup.movie))
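
The comment in Superhero.__init__ warns that mutable default values are shared between calls. A minimal sketch of that pitfall and of the usual None-sentinel workaround follows; the function names add_power_shared and add_power_fresh are hypothetical and not part of the classes above.

```python
def add_power_shared(power, powers=[]):
    # The default list is created once, when the function is defined,
    # so every call that omits `powers` appends to the same list.
    powers.append(power)
    return powers


def add_power_fresh(power, powers=None):
    # None acts as a sentinel; a new list is built on each call.
    if powers is None:
        powers = []
    powers.append(power)
    return powers


first = add_power_shared("flight")
second = add_power_shared("speed")
print(second)                      # => ['flight', 'speed']
print(first is second)             # => True

print(add_power_fresh("flight"))   # => ['flight']
print(add_power_fresh("speed"))    # => ['speed']
```

The same pattern could be applied to the superpowers parameter if each Superhero should get its own list.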

Multiple Inheritance

# Another class definition
# bat.py
class Bat:

    species = "Baty"

    def __init__(self, can_fly=True):
        self.fly = can_fly

    # This class also has a say method
    def say(self, msg):
        msg = "... ... ..."
        return msg

    # And its own method as well
    def sonar(self):
        return "sonar"


if __name__ == "__main__":
    b = Bat()
    print(b.say("hello"))
    print(b.fly)


# And yet another class definition that inherits from Superhero and Bat
# batman.py
from superhero import Superhero
from bat import Bat

# Define Batman as a child that inherits from both Superhero and Bat
class Batman(Superhero, Bat):

    def __init__(self, *args, **kwargs):
        # Typically to inherit attributes you have to call super:
        # super(Batman, self).__init__(*args, **kwargs)
        # However we are dealing with multiple inheritance here, and super()
        # only works with the next base class in the MRO list.
        # So instead we explicitly call __init__ for all ancestors.
        # The use of *args and **kwargs allows for a clean way to pass
        # arguments, with each parent "peeling a layer of the onion".
        Superhero.__init__(self, "anonymous", movie=True,
                        superpowers=["Wealthy"], *args, **kwargs)
        Bat.__init__(self, *args, can_fly=False, **kwargs)
        # override the value for the name attribute
        self.name = "Bat Man"

    def sing(self):
        return "batman!"


if __name__ == "__main__":
    sup = Batman()

    # The Method Resolution Order
    print(Batman.__mro__)     # => (<class '__main__.Batman'>,
                            # => <class 'superhero.Superhero'>,
                            # => <class 'human.Human'>,
                            # => <class 'bat.Bat'>, <class 'object'>)

    # Calls parent method but uses its own class attribute
    print(sup.get_species())  # => Superhuman

    # Calls overridden method
    print(sup.sing())         # => batman!

    # Calls method from Human, because inheritance order matters
    sup.say("I agree")        # => Bat Man: I agree

    # Call method that exists only in 2nd ancestor
    print(sup.sonar())        # => sonar

    # Inherited property
    sup.age = 100
    print(sup.age)            # => 100

    # Inherited attribute from 2nd ancestor whose default value was overridden.
    print("Can I fly? " + str(sup.fly))  # => Can I fly? False
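
Batman calls each parent's __init__ explicitly because Superhero and Bat were not written to cooperate. When every class in the hierarchy accepts **kwargs and forwards them with super(), a single super().__init__() call can walk the whole MRO instead. The sketch below illustrates that alternative with hypothetical classes (Walker, Flyer, FlyingWalker); it is not a drop-in change for the classes above.

```python
class Walker:
    def __init__(self, **kwargs):
        # Forward whatever is left to the next class in the MRO
        super().__init__(**kwargs)
        self.legs = 2


class Flyer:
    def __init__(self, can_fly=True, **kwargs):
        super().__init__(**kwargs)
        self.can_fly = can_fly


class FlyingWalker(Walker, Flyer):
    def __init__(self, name, **kwargs):
        # One call reaches Walker, then Flyer, then object
        super().__init__(**kwargs)
        self.name = name


fw = FlyingWalker("Robin", can_fly=False)
print([cls.__name__ for cls in FlyingWalker.__mro__])
# => ['FlyingWalker', 'Walker', 'Flyer', 'object']
print(fw.name, fw.legs, fw.can_fly)  # => Robin 2 False
```

Each class consumes the keyword arguments it knows about and passes the rest along, so object.__init__ receives none.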

Advanced

# Generators help you make lazy code.
def double_numbers(iterable):
    for i in iterable:
        yield i + i

# Generators are memory-efficient because they only load the data needed to
# process the next value in the iterable. This allows them to perform
# operations on otherwise prohibitively large value ranges.
# NOTE: `range` replaces `xrange` in Python 3.
for i in double_numbers(range(1, 900000000)):  # `range` is a lazy sequence, not a generator.
    print(i)
    if i >= 30:
        break

# Just as you can create a list comprehension, you can create generator
# expressions as well.
values = (-x for x in [1,2,3,4,5])
for x in values:
    print(x)  # prints -1 -2 -3 -4 -5 to console/terminal

# You can also convert a generator expression directly to a list.
values = (-x for x in [1,2,3,4,5])
gen_to_list = list(values)
print(gen_to_list)  # => [-1, -2, -3, -4, -5]
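
To make the laziness concrete, here is a small sketch with an infinite generator; naturals is a hypothetical helper, and itertools.islice is used to take only a few values. It also shows that a generator can be consumed only once.

```python
from itertools import islice


def naturals():
    """Yield 1, 2, 3, ... forever; values are produced only on demand."""
    n = 1
    while True:
        yield n
        n += 1


gen = naturals()
print(next(gen))  # => 1
print(next(gen))  # => 2

# Take the first five even numbers without ever building the full sequence
first_evens = list(islice((n for n in naturals() if n % 2 == 0), 5))
print(first_evens)  # => [2, 4, 6, 8, 10]

# A generator is exhausted after one pass
finite = (x for x in [1, 2])
first_pass = list(finite)
second_pass = list(finite)
print(first_pass)   # => [1, 2]
print(second_pass)  # => []
```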


# Decorators are a form of syntactic sugar.
# They make code easier to read by replacing an otherwise clunky reassignment.

# Wrappers are one type of decorator.
# They're really useful for adding logging to existing functions without needing to modify them.

def log_function(func):
    def wrapper(*args, **kwargs):
        print("Entering function", func.__name__)
        result = func(*args, **kwargs)
        print("Exiting function", func.__name__)
        return result
    return wrapper

@log_function               # equivalent:
def my_function(x,y):       # def my_function(x,y):
    """Adds two numbers together."""
    return x+y              #   return x+y
                            # my_function = log_function(my_function)
# The decorator @log_function tells us as we begin reading the function definition
# for my_function that this function will be wrapped with log_function.
# When function definitions are long, it can be hard to parse the non-decorated
# assignment at the end of the definition.

my_function(1,2)  # prints "Entering function my_function"
                # prints "Exiting function my_function"
                # => 3 (the return value)

# But there's a problem.
# What happens if we try to get some information about my_function?

print(my_function.__name__)  # => 'wrapper'
print(my_function.__doc__)  # => None (wrapper function has no docstring)

# Because our decorator is equivalent to my_function = log_function(my_function)
# we've replaced information about my_function with information from wrapper

# Fix this using functools

from functools import wraps

def log_function(func):
    @wraps(func)  # copies func's name, docstring, and other metadata onto wrapper,
                # so they are not replaced with wrapper's own
    def wrapper(*args, **kwargs):
        print("Entering function", func.__name__)
        result = func(*args, **kwargs)
        print("Exiting function", func.__name__)
        return result
    return wrapper

@log_function
def my_function(x,y):
    """Adds two numbers together."""
    return x+y

my_function(1,2)  # prints "Entering function my_function"
                # prints "Exiting function my_function"
                # => 3 (the return value)

print(my_function.__name__)  # => 'my_function'
print(my_function.__doc__)  # => 'Adds two numbers together.'
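
Besides copying metadata, functools.wraps also stores the original function on the wrapper as __wrapped__. The sketch below uses a hypothetical count_calls decorator (not part of the text above) to show a wrapper that keeps state while still exposing the undecorated function.

```python
from functools import wraps


def count_calls(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
        wrapper.calls += 1          # state lives on the wrapper function object
        return func(*args, **kwargs)
    wrapper.calls = 0
    return wrapper


@count_calls
def add(x, y):
    """Adds two numbers together."""
    return x + y


print(add(1, 2))              # => 3
print(add(3, 4))              # => 7
print(add.calls)              # => 2
print(add.__name__)           # => add

# __wrapped__ bypasses the decorator, so the counter is not incremented
print(add.__wrapped__(5, 6))  # => 11
print(add.calls)              # => 2
```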

Built-in data types

Category	Type	Immutable	Description	Example
Text	`str`	✅	String of Unicode characters	`"Hello, World"`
Numeric	`int`	✅	Integer (whole number)	`42`
	`float`	✅	Floating-point number (decimal)	`3.14`
	`complex`	✅	Complex number (real and imaginary parts)	`1 + 2j`
Sequence	`list`	❌	Ordered, mutable collection of items	`[1, 2, 3]`
	`tuple`	✅	Ordered, immutable collection of items	`(1, 2, 3)`
	`range`	✅	Immutable sequence of numbers (often used in loops)	`range(0, 10)`
Mapping	`dict`	❌	Collection of key-value pairs (preserves insertion order since Python 3.7)	`{"key": "value"}`
Set	`set`	❌	Unordered collection of unique items	`{1, 2, 3}`
Set	`frozenset`	✅	Immutable version of a set	`frozenset([1, 2, 3])`
Boolean	`bool`	✅	Represents truth values	`True` or `False`
Binary	`bytes`	✅	Immutable sequence of bytes	`b"Hello"`
	`bytearray`	❌	Mutable sequence of bytes	`bytearray(b"Hello")`
	`memoryview`	❌	Memory view object (exposes another object's buffer without copying; writable only if that object is mutable)	`memoryview(b"Hello")`
None Type	`NoneType`	✅	Represents the absence (NULL) of a value	`None`

List, Tuple, Set, Dictionary

Feature	List	Tuple	Set	Dictionary
Syntax	`[ ]`	`( )`	`{1, 2}` (empty: `set()`)	`{key: value}` (empty: `{}`)
Mutability	Mutable	Immutable	Mutable	Mutable
Order	Ordered	Ordered	Unordered	Ordered (Python 3.7+)
Duplicate elements	Allowed	Allowed	Not allowed	Keys unique, values allowed
Indexing	Supports integer indexing	Supports integer indexing	No indexing	Key-based indexing
Addition of elements	Yes, using append()/insert()	No	Yes, using add()	Yes, by key assignment
Deletion of elements	Yes, using pop()/del	No	Yes, using pop()/remove()	Yes, by key
Heterogeneous elements (store elements of different data types)	Allowed	Allowed	Allowed	Allowed (keys and values)
Nesting	Allowed	Allowed	Hashable elements only (e.g. tuples, frozensets)	Allowed for values; keys must be hashable
Typical Use Case	Ordered collection, modifiable	Fixed collection, constant data	Unique collection, unordered	Key-value pairs, fast lookup
Performance Notes	Slower than tuple for iteration	Faster than list, space efficient	Fast for membership (`in` / `not in`) checks	Fast lookups via hashing
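
A few of the rows above are easy to get wrong in practice. The short sketch below checks them directly: what empty braces create, which elements a set accepts, tuple immutability, and dict ordering. The variable names are illustrative only.

```python
# Empty braces create a dict; an empty set needs set()
empty_braces = {}
empty_set = set()
print(type(empty_braces).__name__)  # => dict
print(type(empty_set).__name__)     # => set

# Set elements (and dict keys) must be hashable: tuples work, lists do not
points = {(0, 0), (1, 2)}
try:
    bad = {[0, 0]}
    list_in_set_ok = True
except TypeError:
    list_in_set_ok = False
print(list_in_set_ok)               # => False

# Tuples do not support item assignment
t = (1, 2, 3)
try:
    t[0] = 99
    tuple_mutated = True
except TypeError:
    tuple_mutated = False
print(tuple_mutated)                # => False

# Dicts keep insertion order (Python 3.7+)
d = {}
d["b"] = 1
d["a"] = 2
print(list(d))                      # => ['b', 'a']
```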

Shallow vs. Deep Copy

Aspect	Shallow Copy	Deep Copy
Definition	Creates a new object, but inserts references into it to the objects found in the original	Creates a new object and recursively adds copies of nested objects found in the original
Use Case	When you want a new collection but are okay with shared references to nested objects	When you need a completely independent copy of an object and all its nested objects
Implementation	Using the `copy` module's `copy()` function or slicing for lists	Using the `copy` module's `deepcopy()` function
Example	`from copy import copy; original = [[1, 2, 3], [4, 5, 6]]; shallow_copied = copy(original); shallow_copied[0][0] = 'X'; print(original)  # => [['X', 2, 3], [4, 5, 6]]`	`from copy import deepcopy; original = [[1, 2, 3], [4, 5, 6]]; deep_copied = deepcopy(original); deep_copied[0][0] = 'X'; print(original)  # => [[1, 2, 3], [4, 5, 6]]`
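
The difference is easiest to see side by side. This sketch makes both kinds of copy from the same nested list and then mutates each copy; the variable names are illustrative.

```python
import copy

original = [[1, 2, 3], [4, 5, 6]]
shallow = copy.copy(original)      # new outer list, same inner lists
deep = copy.deepcopy(original)     # new outer list and new inner lists

shallow.append([7, 8, 9])          # only the copy's outer list changes
shallow[0][0] = "X"                # the inner list is shared with original
deep[1][0] = "Y"                   # fully independent of original

print(original)  # => [['X', 2, 3], [4, 5, 6]]
print(shallow)   # => [['X', 2, 3], [4, 5, 6], [7, 8, 9]]
print(deep)      # => [[1, 2, 3], ['Y', 5, 6]]

print(shallow[0] is original[0])   # => True
print(deep[0] is original[0])      # => False
```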

Monkey Patching

# Monkey patching is a technique to modify or extend code at runtime.
# It is often used to change or extend the behavior of libraries or classes
# without modifying their source code.
class A:
    def method(self):
        return "original method"
a = A()
print(a.method())  # => "original method"
# Now we will monkey patch the method
def new_method(self):
    return "patched method"
A.method = new_method
print(a.method())  # => "patched method"
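
A patch applied by plain assignment stays in place for the rest of the program. When the change should be temporary (for example in a test), one option is unittest.mock.patch.object from the standard library, which restores the original attribute when the with block exits. A minimal sketch, using a hypothetical Greeter class:

```python
from unittest.mock import patch


class Greeter:
    def greet(self):
        return "original greeting"


def patched_greet(self):
    return "patched greeting"


g = Greeter()

# The replacement is active only inside the with block
with patch.object(Greeter, "greet", patched_greet):
    inside = g.greet()

after = g.greet()

print(inside)  # => patched greeting
print(after)   # => original greeting
```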

Regular vs. Metaclasses

Aspect	Regular Classes	Metaclasses
Definition	Blueprints for creating instances (objects)	Classes that define how other classes are created (blueprints of classes)
Purpose	Define attributes and behaviors of instances	Define attributes and behaviors of classes themselves
Instantiation	Creating instances (objects) of the class	Creating classes (which in turn create instances)
Default Metaclass	N/A (they are instances of metaclasses)	The default metaclass is `type` in Python, which creates all regular classes
How created	Defined using the `class` keyword	Usually created by subclassing `type` and overriding methods
Use Cases	Modeling real-world entities and data structures	Customizing or controlling class creation, enforcing constraints, injecting methods, creating frameworks
Relationship	Classes are instances of metaclasses	Metaclasses are classes whose instances are classes
Complexity	Basic OOP concept, broadly used	Advanced topic, used mainly for metaprogramming or framework development
Example	`class Dog: def bark(self): return "Woof!" my_dog = Dog() print(my_dog.bark()) # => "Woof!"`	`class Meta(type): def __new__(cls, name, bases, attrs): attrs['greet'] = lambda self: "Hello from " + name return super().__new__(cls, name, bases, attrs) class Cat(metaclass=Meta): pass my_cat = Cat() print(my_cat.greet()) # => "Hello from Cat"`
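
The metaclass example in the table is flattened onto one line. Laid out as runnable code, with an extra subclass (Kitten, added here for illustration) to show that the metaclass also applies to subclasses, it could look like this:

```python
class Meta(type):
    def __new__(mcls, name, bases, attrs):
        # Inject a greet() method into every class created by this metaclass
        attrs["greet"] = lambda self: "Hello from " + name
        return super().__new__(mcls, name, bases, attrs)


class Cat(metaclass=Meta):
    pass


class Kitten(Cat):   # inherits the metaclass, so Meta.__new__ runs again
    pass


print(Cat().greet())       # => Hello from Cat
print(Kitten().greet())    # => Hello from Kitten

# Classes are instances of their metaclass; the default metaclass is type
print(type(Cat) is Meta)   # => True
print(type(Meta) is type)  # => True
print(type(int) is type)   # => True
```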

Access Specifier

Aspect	Public	Protected	Private
Definition	Members can be accessed from any part of the program without restriction	Members are meant to be accessed only within the class and its subclasses; this is a convention and is not enforced	Members are name-mangled to `_ClassName__name`, which discourages (but does not prevent) access from outside the class or from subclasses
Naming Convention	No underscore prefix	Single underscore prefix (`_name`)	Double underscore prefix (`__name`)
Visibility	Accessible from anywhere (inside and outside the class)	Technically accessible from anywhere; by convention used only within the class and its subclasses	Accessible by its plain name only inside the defining class; still reachable elsewhere through the mangled name
Example	`class MyClass: def __init__(self): self.public_var = "I am public" obj = MyClass() print(obj.public_var) # => "I am public"`	`class MyClass: def __init__(self): self._protected_var = "I am protected" obj = MyClass() print(obj._protected_var) # => "I am protected" (but should be treated as non-public)`	`class MyClass: def __init__(self): self.__private_var = "I am private" def get_private(self): return self.__private_var obj = MyClass() print(obj.get_private()) # => "I am private" # print(obj.__private_var) # => AttributeError`
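
Python does not enforce any of these levels; the double underscore only triggers name mangling. The sketch below, built around a hypothetical Account class, shows what is and is not reachable from outside the class.

```python
class Account:
    def __init__(self):
        self.owner = "Ada"               # public
        self._note = "internal detail"   # protected by convention only
        self.__pin = "1234"              # stored as _Account__pin

    def check_pin(self, pin):
        # Inside the class body, __pin is rewritten to _Account__pin
        return pin == self.__pin


acct = Account()
print(acct.owner)              # => Ada
print(acct._note)              # => internal detail

try:
    acct.__pin                 # no mangling happens outside the class body
    plain_name_found = True
except AttributeError:
    plain_name_found = False
print(plain_name_found)        # => False

print(acct._Account__pin)      # => 1234
print(acct.check_pin("1234"))  # => True
```

Because the mangled name is still accessible, a double underscore is best read as protection against accidental name clashes in subclasses rather than as access control.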

Walrus Operator

# The walrus operator (:=) allows assignment and return of a value in the same
# expression. It is useful in situations where you want to both assign a value
# to a variable and use that value in a condition or expression.
numbers = [1, 2, 3, 4, 5]

# Example 1: Using the walrus operator in an if statement
if (n := len(numbers)) > 3:
    print(f"List is too long ({n} elements, expected <= 3)")

# Example 2: Using the walrus operator in a while loop
while (n := len(numbers)) > 0:
    print(numbers.pop())
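
The walrus operator can also be used inside a comprehension, where it avoids computing the same value twice (once for the filter and once for the result). A minimal sketch; slow_square is a hypothetical stand-in for an expensive function.

```python
def slow_square(x):
    # Stand-in for a costly computation
    return x * x


values = [1, 2, 3, 4, 5]

# slow_square runs once per element; the result is both tested and kept
big_squares = [sq for v in values if (sq := slow_square(v)) > 5]
print(big_squares)  # => [9, 16, 25]

# The name bound by := lives in the enclosing scope, not in the comprehension
print(sq)           # => 25
```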

*args vs. **kwargs

Aspect	`*args`	`**kwargs`
Syntax	A single asterisk `*` before a parameter name	Double asterisks `**` before a parameter name
Parameter type	Collects extra positional arguments	Collects extra keyword (named) arguments
Data type inside function	Tuple of positional arguments	Dictionary of keyword arguments (key-value pairs)
Usage scenario	When number of positional arguments is variable	When number of keyword arguments is variable
Access	Iterate over tuple or access by index	Iterate over dictionary items (key-value)
Example of function definition	`def func(*args):`	`def func(**kwargs):`
Example input	`func(1, 2, 3)`	`func(name='John', age=25)`
Use with keyword args	Does not handle keyword arguments	Handles only keyword arguments
Ordering in function definition	Must come before `**kwargs`	Must come after `*args`
Common usage	When expecting multiple non-keyword parameters	When expecting multiple named parameters
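
The rows above can be checked with one small function that accepts both kinds of extra arguments. The function name describe and its arguments are illustrative only.

```python
def describe(title, *args, **kwargs):
    # args is a tuple of extra positional arguments,
    # kwargs is a dict of extra keyword arguments
    return title, args, kwargs


print(describe("order", 1, 2, 3, rush=True, gift=False))
# => ('order', (1, 2, 3), {'rush': True, 'gift': False})

# The same symbols unpack a sequence or a dict at the call site
nums = [1, 2]
opts = {"rush": True}
print(describe("order", *nums, **opts))
# => ('order', (1, 2), {'rush': True})

# With no extras, both collections are simply empty
print(describe("empty"))
# => ('empty', (), {})
```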

Method Resolution Order (MRO)

# MRO is the order in which base classes are searched when executing a method.
# It is especially important in the context of multiple inheritance.
# Python uses the C3 linearization algorithm to determine the MRO.
class A:
    def method(self):
        return "Method from A"
class B(A):
    def method(self):
        return "Method from B"
class C(A):
    def method(self):
        return "Method from C"
class D(B, C):
    pass
d = D()
print(d.method())  # => "Method from B"
print(D.__mro__)  # => (<class '__main__.D'>, <class '__main__.B'>,
                  # => <class '__main__.C'>, <class '__main__.A'>, <class 'object'>)
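
C3 linearization also rejects base-class orders that cannot be made consistent: a class may not be listed before one of its own subclasses. The sketch below, using hypothetical classes, shows one order that works and one that raises TypeError at class-definition time.

```python
class Base:
    pass


class Child(Base):
    pass


# Subclass first, then its parent: a consistent order exists
class Fine(Child, Base):
    pass


print([c.__name__ for c in Fine.__mro__])
# => ['Fine', 'Child', 'Base', 'object']

# Parent first, then its subclass: Base would have to come both
# before and after Child, so no linearization exists
mro_error = None
try:
    class Broken(Base, Child):
        pass
except TypeError as exc:
    mro_error = str(exc)

print(mro_error is not None)  # => True
```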

Dunder (magic methods)

Dunder ("Double Under (Underscores)") methods (or magic methods) in Python are special methods with double underscores used for internal operations. They allow custom classes to mimic built-in types, enabling operator overloading (like + via __add__) and integration with functions (like len() via __len__). Python invokes them automatically for interactions with operators or built-ins.

Common dunder methods:

__init__(self, ...): Initializes a new instance (the initializer, often loosely called the constructor)
__str__(self): Returns a human-readable string representation
__repr__(self): Returns an official string representation
__len__(self): Enables the use of len() on the object
__add__(self, other): Defines behavior for the + operator
__eq__(self, other): Defines behavior for the == operator
__getitem__(self, key): Enables indexing using square brackets
__setitem__(self, key, value): Enables item assignment

How does it work?

Python doesn't require direct calls to these methods - they're invoked automatically by operations, operators, or built-in functions when appropriate. For instance, creating a new object from a class calls __init__, and printing an object calls __str__.

Benefits

Operator Overloading: Customize how operators work with your objects
Integration with Built-ins: Make your objects compatible with built-in functions
Custom Behavior: Define how your objects behave in various contexts
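
As an illustration of the methods listed above, here is a minimal sketch of a hypothetical Vector class that implements each of them. It is one possible design, not a reference implementation.

```python
class Vector:
    def __init__(self, *components):
        self.components = list(components)

    def __repr__(self):
        return f"Vector({', '.join(map(repr, self.components))})"

    def __str__(self):
        return "<" + ", ".join(map(str, self.components)) + ">"

    def __len__(self):
        return len(self.components)

    def __add__(self, other):
        # Returning NotImplemented lets Python try other.__radd__ or raise TypeError
        if not isinstance(other, Vector) or len(other) != len(self):
            return NotImplemented
        return Vector(*(a + b for a, b in zip(self.components, other.components)))

    def __eq__(self, other):
        if not isinstance(other, Vector):
            return NotImplemented
        return self.components == other.components

    def __getitem__(self, index):
        return self.components[index]

    def __setitem__(self, index, value):
        self.components[index] = value


v = Vector(1, 2) + Vector(3, 4)   # calls __add__
print(v)                          # __str__  => <4, 6>
print(repr(v))                    # __repr__ => Vector(4, 6)
print(len(v))                     # __len__  => 2
print(v == Vector(4, 6))          # __eq__   => True
v[0] = 10                         # __setitem__
print(v[0])                       # __getitem__ => 10
```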

NumPy
Pandas
Matplotlib
Scikit-learn

Aspect	Python Lists	NumPy Arrays
Storage Mechanism	General-purpose; can hold mixed data types; the list is a contiguous array of references to objects stored elsewhere	Homogeneous (single dtype) data stored directly in a contiguous block of memory; more memory-efficient, faster access
Underlying Optimizations	Not specialized for numerical operations; slower for math; dynamic size	Optimized for numerical computations; vectorized operations; fixed size
Performance Considerations	Memory efficiency: NumPy arrays are more efficient for large datasets (no per-element type info). Element-wise operations: NumPy is faster (no explicit Python loops). Size flexibility: lists are dynamic but carry overhead; NumPy arrays are fixed-size and memory-friendly
Use in Machine Learning	General data-handling, before converting to arrays	Foundational for numerical data, used by TensorFlow, scikit-learn

# Essential for numerical computing, providing powerful array objects and
# tools for mathematical operations, especially in scientific and
# technical computing
# Use Cases: Numerical simulations, data analysis, and mathematical modeling

import numpy as np

arr1d = np.array([1, 2, 3, 4, 5])               # Create a 1D NumPy array
arr2d = np.array([[1, 2, 3], [4, 5, 6]])        # Create a 2D NumPy array (matrix)
arr_asarray = np.asarray([1, 2, 3, 4, 5])       # Convert a list to a NumPy array: array([1, 2, 3, 4, 5])
arr_iter = np.fromiter(range(5), dtype=int)     # Create a NumPy array from an iterable: array([0, 1, 2, 3, 4])

np.save('data.npy', arr1d)                      # Save an array to a .npy file
np_file = np.load('data.npy')                   # Load data back from the .npy file

# Array operations

arr_add = arr1d + 10                    # Add 10 to each element: [11, 12, 13, 14, 15]
arr_sum = np.sum(arr1d)                 # Sum of all elements: 15
arr_dot = np.dot(arr1d, arr1d)          # Dot product of arr1d with itself: 55
arr_mean = np.mean(arr1d)               # Mean of the array: 3.0
arr_reshaped = arr2d.reshape(3, 2)      # Reshape 2D array to 3x2: [[1, 2], [3, 4], [5, 6]]
arr_transposed = arr2d.transpose()      # Transpose the 2D array: [[1, 4], [2, 5], [3, 6]]
arr_filtered = arr1d[arr1d > 2]         # Filter elements greater than 2: [3, 4, 5]
arr_sorted = np.sort(arr1d)             # Sort the array: [1, 2, 3, 4, 5]
arr_unique = np.unique(arr1d)           # Unique elements in the array: [1, 2, 3, 4, 5]
arr_random = np.random.rand(3, 3)       # Create a 3x3 array of random floats in [0, 1); values differ on every run
arr_zeros = np.zeros((2, 3))            # Create a 2x3 array of zeros (float64 by default): [[0., 0., 0.], [0., 0., 0.]]
arr_ones = np.ones((2, 4))              # Create a 2x4 array of ones (float64 by default): [[1., 1., 1., 1.], [1., 1., 1., 1.]]
arr_empty = np.empty((3, 2))            # Create a 3x2 empty array: uninitialized values
arr_range = np.arange(0, 10, 2)         # Create an array with values from 0 up to (but not including) 10 with step 2: [0, 2, 4, 6, 8]
arr_linspace = np.linspace(0, 1, 5)     # Create an array with 5 evenly spaced values between 0 and 1: [0.0, 0.25, 0.5, 0.75, 1.0]
arr_sqrt = np.sqrt(arr1d)               # Square root of each element: [1.0, 1.414, 1.732, 2.0, 2.236]
arr_power = np.power(arr1d, 3)          # Raise each element to the power of 3: [1, 8, 27, 64, 125]
arr_variance = np.var(arr1d)            # Variance of the array: 2.0
arr_std = np.std(arr1d)                 # Standard deviation of the array: 1.414
arr_min = np.min(arr1d)                 # Minimum element in the array: 1
arr_max = np.max(arr1d)                 # Maximum element in the array: 5

# Powerful data manipulation and analysis library built on top of NumPy,
# offering data structures like DataFrames for handling structured data.
# Use Cases: Data cleaning, transformation, and analysis in data science and machine
# learning workflows.
import pandas as pd

df_file = pd.read_csv('data.csv')                       # Read data from a CSV file
df_file.to_csv('output.csv', index=False)                    # Write DataFrame to a CSV file

df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie'], 'Age': [25, 30, 35], 'City': ['New York', 'Los Angeles', 'Chicago']})
df2 = pd.DataFrame({'Name': ['David', 'Eva'], 'Age': [28, 22], 'City': ['Miami', 'Seattle']})

print(df.head())                                        # Display the first few rows of the DataFrame
#       Name  Age         City
# 0    Alice   25     New York
# 1      Bob   30  Los Angeles
# 2  Charlie   35      Chicago

print(df.tail())                                        # Display the last few rows of the DataFrame
#       Name  Age         City
# 0    Alice   25     New York
# 1      Bob   30  Los Angeles
# 2  Charlie   35      Chicago

df.info()                                               # Print a summary of the DataFrame (info() prints directly and returns None)
# <class 'pandas.core.frame.DataFrame'>
# RangeIndex: 3 entries, 0 to 2
# Data columns (total 3 columns):
#  #   Column  Non-Null Count  Dtype
# ---  ------  --------------  -----
#  0   Name    3 non-null      object
#  1   Age     3 non-null      int64
#  2   City    3 non-null      object
# dtypes: int64(1), object(2)
# memory usage: 120.0+ bytes

print(df.describe())                                    # Get descriptive statistics for numerical columns
#              Age
# count   3.000000
# mean   30.000000
# std     5.000000
# min    25.000000
# 25%    27.500000
# 50%    30.000000
# 75%    32.500000
# max    35.000000

print(df.shape)                                         # Get the dimensions of the DataFrame
# (3, 3)

print(df.dtypes)                                        # Get the data types of each column
# Name     object
# Age      int64
# City     object
# dtype: object

print(df['Name'].unique())                              # Get unique values in the 'Name' column
# ['Alice' 'Bob' 'Charlie']

print(df.isnull().any())                                # Check whether each column contains any missing values
# Name    False
# Age     False
# City    False
# dtype: bool

df.fillna(0, inplace=True)                              # Fill missing values with 0
#       Name  Age         City
# 0    Alice   25     New York
# 1      Bob   30  Los Angeles
# 2  Charlie   35      Chicago

df.dropna(inplace=True)                                 # Drop rows with any missing values
#      Name  Age         City
# 0    Alice   25     New York
# 1      Bob   30  Los Angeles
# 2  Charlie   35      Chicago

print(df.iloc[0])                                       # Access the first row by index
# Name      Alice
# Age          25
# City    New York
# Name: 0, dtype: object

df.sort_values('Age', inplace=True)                     # Sort DataFrame by 'Age' column
#       Name  Age         City
# 0    Alice   25     New York
# 1      Bob   30  Los Angeles
# 2  Charlie   35      Chicago

print(df['City'].value_counts())                        # Count occurrences of each unique value in 'City' column
# New York       1
# Los Angeles    1
# Chicago        1
# Name: City, dtype: int64

grouped = df.groupby('City').mean(numeric_only=True)    # Group by 'City' and calculate mean of numerical columns
#                   Age
# City
# Chicago          35.0
# Los Angeles      30.0
# New York         25.0

df['Age'] = df['Age'].apply(lambda x: x * 2)            # Apply a function to double the 'Age' values
#       Name  Age         City
# 0    Alice   50     New York
# 1      Bob   60  Los Angeles
# 2  Charlie   70      Chicago

merged_df = pd.merge(df, df2, on='Name', how='outer')   # Outer-merge two DataFrames on 'Name'; other shared columns get _x/_y suffixes
#       Name  Age_x       City_x  Age_y   City_y
# 0    Alice   50.0     New York    NaN      NaN
# 1      Bob   60.0  Los Angeles    NaN      NaN
# 2  Charlie   70.0      Chicago    NaN      NaN
# 3    David    NaN          NaN   28.0    Miami
# 4      Eva    NaN          NaN   22.0  Seattle

concatenated_df = pd.concat([df, df2])                  # Concatenate two DataFrames
#       Name  Age         City
# 0    Alice   50     New York
# 1      Bob   60  Los Angeles
# 2  Charlie   70      Chicago
# 0    David   28        Miami
# 1      Eva   22      Seattle

df.rename(columns={'Name': 'Full Name'}, inplace=True)  # Rename 'Name' column to 'Full Name'
#    Full Name  Age         City
# 0      Alice   50     New York
# 1        Bob   60  Los Angeles
# 2    Charlie   70      Chicago

df.drop('City', axis=1, inplace=True)                   # Drop the 'City' column
#    Full Name  Age
# 0      Alice   50
# 1        Bob   60
# 2    Charlie   70

# A fundamental library for data visualization, enabling the creation of
# static, animated, and interactive plots and charts.
# Use Cases: Visualizing data trends, distributions, and relationships in scientific
# computing and data analysis.

import matplotlib.pyplot as plt

x = [1, 2, 3, 4, 5]
y = [2, 3, 5, 7, 11]

plt.plot(x, y)                                                              # Plots a line graph between two variables
plt.scatter(x, y)                                                           # Creates a scatter plot from two variables
plt.bar(['A', 'B', 'C'], [5, 7, 3])                                         # Makes a bar chart for categorical data
plt.hist([1, 2, 2, 3, 3, 3, 4, 4, 4, 4], bins=4)                            # Creates a histogram for numerical data distribution
plt.pie([10, 20, 30], labels=['X', 'Y', 'Z'])                               # Generates a pie chart representing category proportions
plt.boxplot([[1, 2, 3], [2, 3, 4], [3, 4, 5]])                              # Draws a box plot to show data spread and outliers
plt.imshow([[1, 2], [3, 4]])                                                # Displays image data as a plot
plt.contour([[1, 2], [3, 4]])                                               # Makes contour plots for 3D surface-like data
plt.errorbar(x, y, yerr=[0.5, 0.4, 0.3, 0.2, 0.1])                          # Plots error bars for observations with their uncertainties
plt.stem(x, y)                                                              # Produces a stem plot for discrete sequences
plt.fill(x, y)                                                              # Fills the polygon outlined by the (x, y) points
plt.plot(np.array(['2023-01-01', '2023-01-02', '2023-01-03'], dtype='datetime64[D]'), y[:3])  # Creates a line plot for time series data with dates (np is NumPy, imported above; x and y lengths must match)
plt.table(cellText=[[1, 2], [3, 4]], colLabels=['A', 'B'], loc='bottom')    # Adds a table to a plot beneath graphs for detailed data checks
plt.text(2, 5, "Sample Text")                                               # Inserts text annotations into a plot
plt.xlabel('X axis')                                                        # Sets label text for x axis
plt.ylabel('Y axis')                                                        # Sets label text for y axis
plt.title('Sample Plot')                                                    # Adds a title to the plot
plt.legend(['Line', 'Scatter'])                                             # Displays legend to label elements on the plot
plt.savefig('plot.png')                                                     # Saves the plot as an image file (call before show(), which may leave an empty figure)
plt.show()                                                                  # Displays the final rendered plot

# A powerful and widely-used machine learning library for Python,
# providing simple and efficient tools for data mining and data analysis.
# Use Cases: Classification, regression, clustering, and dimensionality reduction tasks.

from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
from sklearn.datasets import load_iris

# Load dataset
data = load_iris()
# Splits dataset into training and test sets for model validation
trX, X_test, trY, y_test = train_test_split(data.data, data.target, test_size=0.2, random_state=42)

# Standardizes features by removing the mean and scaling to unit variance
scaler = StandardScaler()
# Fits the scaler on training data and transforms it
trX = scaler.fit_transform(trX)
# Transforms test data using the fitted scaler
X_test = scaler.transform(X_test)


# An ensemble method to train random forest models, increasing accuracy over single trees
model = RandomForestClassifier(n_estimators=100, random_state=42)
# Trains/learns a model or transformer using training data
model.fit(trX, trY)
# Uses a trained model to predict target values for new or test data
y_pred = model.predict(X_test)
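# Evaluates a model using cross-validation, returning one score per fold
from sklearn.model_selection import cross_val_score
cross_val_score(model, data.data, data.target, cv=5)
# Incrementally trains models that implement partial_fit (useful for large datasets).
# RandomForestClassifier does not, so an SGDClassifier is used here instead.
from sklearn.linear_model import SGDClassifier
sgd = SGDClassifier(random_state=42)
sgd.partial_fit(trX, trY, classes=[0, 1, 2])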


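# Converts lists of feature dictionaries to NumPy/SciPy arrays for feature extraction
from sklearn.feature_extraction import DictVectorizer
vec = DictVectorizer()
# Fits and transforms a list of dicts (illustrative records, not part of the iris data)
vec.fit_transform([{'city': 'Miami', 'temp': 30.0}, {'city': 'Seattle', 'temp': 18.0}])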

# An ensemble method to train random forest models, increasing accuracy over single trees
clf = RandomForestClassifier(n_estimators=100)
# Trains/learns a model or transformer using training data
clf.fit(trX, trY)

Naming Conventions