Free Chapters

 

List – The magic of Python

 

Lists and tuples are among the most widely used data structures in Python. A list is a versatile way to organize data. The primary difference between a list and a tuple is that a tuple is immutable: once constructed, we cannot add or remove elements from it.

 

Python does not have built-in arrays like other programming languages, such as C/C++ or Java; lists and tuples fill that role. Both are known as sequence data types, and operators like “in” and the slice operator [:] work with both.

 

Python uses the [] operator to select one or more elements from a list or a tuple. The basic syntax is [start: end], where end is exclusive (not included).

 

The absence of a start or an end value denotes the boundary in that direction. For example, [1:] will fetch all elements from index 1 to the end.

 

 

Python indexing starts at zero, not one, so index 1 is the second element in the list/tuple; it should not be confused with the first element because of its face value. Negative indices count from the end of the sequence (-1 denotes the last element of the list/tuple).

 

The absence of a start or end value can be tricky. Negative indices can also be confusing, but they are the most convenient way to access the last element of a list/tuple of unknown length.
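A minimal sketch of negative indexing and open-ended slices (the list values here are illustrative):

```python
a = ['jen', 'hello', 100, 1234]

print(a[-1])   # last element: 1234
print(a[-2])   # second to last: 100
print(a[1:])   # index 1 to the end: ['hello', 100, 1234]
print(a[:-1])  # everything except the last element: ['jen', 'hello', 100]
```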

 

 

        Lists can be heterogeneous:

        a = ['jen', 'hello', 100, 1234, 2*2]

        Lists can be indexed and sliced:

        a[0]   → 'jen'

        a[:2]  → ['jen', 'hello']

        a[0:2] → ['jen', 'hello']

        a[0:0] → []

        Lists can be manipulated:

        a[2] = a[2] + 23   # a[2] is now 123

        The function len() returns the number of elements in a list or tuple:

        len(a) → 5

 

A for loop is often used to iterate over the elements of a list or tuple: either directly over the elements with the “in” operator, or by index with the range() function.
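Both iteration styles look like this (the list here is illustrative):

```python
fruits = ['apple', 'banana', 'cherry']

# Iterate directly over the elements with "in"
for fruit in fruits:
    print(fruit)

# Iterate by index using range() and len()
for i in range(len(fruits)):
    print(i, fruits[i])
```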

 



Python has a built-in range() function to create a sequence of numbers on the fly. It is convenient for counting loops or for generating large amounts of test data.

 

 

range() accepts up to three arguments:

·        Start – the starting value (optional; defaults to 0).

·        End – the ending value, which is exclusive: the sequence stops one step before it, so the ending value is never part of the result.

·        Increment – the step added to each value to derive the next one. It is optional and defaults to 1; a negative increment counts backward.
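A few illustrative calls show how the three arguments interact:

```python
print(list(range(5)))          # [0, 1, 2, 3, 4]       start defaults to 0, increment to 1
print(list(range(2, 8)))       # [2, 3, 4, 5, 6, 7]    end (8) is exclusive
print(list(range(0, 10, 2)))   # [0, 2, 4, 6, 8]       increment of 2
print(list(range(10, 0, -2)))  # [10, 8, 6, 4, 2]      negative increment counts backward
```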

 

 


 


 

The list has member functions for sorting, appending, removing, and clearing. List members can be updated/overridden using the assignment operator (=) with the left-hand side referring to a particular element with the [] operator.

 

The append method adds an element to the end.

The remove method deletes the first occurrence of a given value, not necessarily the last element.

Insert can be used to insert an element at any location. The first argument for insert is the index at which the new element should be inserted. To insert an element at the beginning, use the index 0. Append is the same as inserting with the length of the list as the index.

The reverse method reverses the order of the elements in a list.

del (a statement, not a method) is used to remove a member or a range of members from a list.

Pop is used to remove and return an element from the end, like a stack; it also accepts an index.
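A minimal sketch tying these operations together (the values are illustrative):

```python
a = [3, 1, 2]
a.append(4)     # [3, 1, 2, 4]       add to the end
a.insert(0, 5)  # [5, 3, 1, 2, 4]    insert at the beginning
a.remove(3)     # [5, 1, 2, 4]       delete the first occurrence of the value 3
a.reverse()     # [4, 2, 1, 5]       reverse in place
last = a.pop()  # returns 5; a is now [4, 2, 1]
del a[0]        # [2, 1]             delete by index
```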

 

 


 

 

 

We can mix different data types in the same list in Python. A Boolean literal can only be True or False, without quotes.
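A short illustrative example of a heterogeneous list:

```python
mixed = ['jen', 100, 3.14, True, [1, 2]]  # str, int, float, bool, nested list
for item in mixed:
    print(item, type(item).__name__)
```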


The sort function can be used to sort a list. It sorts in ascending order by default.

 

To sort in descending order, we can use sort with argument “reverse=True”.
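For example:

```python
nums = [3, 1, 4, 1, 5]
nums.sort()              # ascending, in place: [1, 1, 3, 4, 5]
nums.sort(reverse=True)  # descending: [5, 4, 3, 1, 1]
```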

 

 


 


 

pop() is a useful function that combines two operations: it returns the element at a given index (the last element by default) and removes it from the list.
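A minimal sketch:

```python
stack = [10, 20, 30]
top = stack.pop()     # removes and returns 30; stack is now [10, 20]
first = stack.pop(0)  # pop also accepts an index: removes and returns 10
```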

 


 


  

Here are the rules for deleting an element from a list.

·        Do not modify a list while iterating over its elements; deletions shift the remaining items, so elements can be skipped or the wrong ones removed.

·        Remove by index rather than by value when duplicates are possible, since remove() only deletes the first matching value.

·        Remove slices from the same list (rather than creating a new list).
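A sketch of the safe pattern: remove negative numbers by index, walking backward so the indices of the elements not yet examined stay valid.

```python
nums = [1, -2, 3, -4, 5]

# Walk backward so deleting an element does not shift the
# indices of the elements we have not examined yet.
for i in range(len(nums) - 1, -1, -1):
    if nums[i] < 0:
        del nums[i]

print(nums)  # [1, 3, 5]
```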

 


 

Coding Tips

 

Sometimes knowing a programming language is not enough to write effective programs; we also need to know which technique to apply when. Sometimes the goal is readability, keeping the program concise and elegant; the code structure can also affect performance or reduce the risk of bugs. We learn these habits over time as we fix code smells and defects, and reading a lot of code written by experienced programmers also helps. There are several static code review tools, such as SonarQube, FindBugs, and Coverity, as well as AI-based coding agents that help write better code. Most of these are also available as plugins for IDEs like PyCharm or VS Code. Here are a few coding tips for effective Python programming. This is a quick cheat sheet, so please research each item to fully understand why it is a best practice. We do not have to use them all the time. Please note that some of these features are only available in specific versions of Python.

 

CORE PYTHON

 

Use f-strings for all formatting

print(f"Hello, {user}")

 

Prefer pathlib over os.path

Path("data") / "file.txt"

 

Use “with” for file operations

with open("data.txt") as f: ...

 

Use list comprehensions for transformations

[x*x for x in nums]

 

Use “dict” comprehensions for mapping

{id: name for id, name in rows}

 

Use “enumerate” instead of manual counters

for i, item in enumerate(items): ...

 

Use zip to iterate in parallel

for a, b in zip(A, B): ...

 

Use _ for throwaway variables

for _, value in items: ...

 

 

Use join() instead of string concatenation

",".join(names)

 

Use tuple unpacking for swaps

x, y = y, x

 

Use sorted(..., key=...) for custom ordering

sorted(users, key=lambda u: u.age)

 

Use any() and all() for boolean logic

if any(x < 0 for x in nums): ...

 

Use min/max with key=

oldest = min(users, key=lambda u: u.created)

 

Use sum() with comprehensions

total = sum(f.size for f in files)

 

Use for/else for search loops

for u in users:

    if u.id == target:

        break

else:

    raise LookupError()

 

DATA STRUCTURES & COLLECTIONS

 

Use “set” for fast membership checks

if item in allowed: ...

 

Use defaultdict(list) for grouping

groups[key].append(value)

 

Use Counter for frequency analysis

Counter(words)

 

Use deque for fast queue operations

deque().popleft()

 

Use dict.get() for safe lookups

role = user.get("role", "guest")

 

Use dict.setdefault() for grouping

groups.setdefault(k, []).append(v)

 

Use dictionary unpacking to merge

merged = {**a, **b}

 

Use zip(*iterables) to transpose

cols = list(zip(*rows))

 

Use sorted(set(...)) to dedupe

unique = sorted(set(values))

 

Use itertools.chain to flatten

chain.from_iterable(matrix)

 

Use itertools.islice to slice generators

islice(stream(), 10)

 

Use pairwise (3.10+) for sliding windows

for a, b in pairwise(nums): ...

 

Use batched (3.12+) for chunking

for chunk in batched(data, 100): ...

 

Use “bisect” for binary search

bisect_left(sorted_list, x)

 

Use “heapq” for priority queues

heappush(heap, (priority, task))

 

 

FUNCTIONS, CLASSES & DESIGN

 

Use type hints everywhere

def load(path: str) -> dict: ...

 

Use “dataclass” for simple models

@dataclass

class User:

    id: int

    name: str

 

Use “NamedTuple” for immutable records

class Point(NamedTuple): ...

 

Use “Protocol” for structural typing

class Writer(Protocol):

    def write(self, text: str) -> None: ...

 

Use “property” for computed attributes

@property

def area(self):

    return self.w * self.h

 

Use “classmethod” constructors

@classmethod

def from_json(cls, d): ...

 

Use __repr__ for debugging

def __repr__(self): return f"User(id={self.id})"

 

Use __slots__ for memory-heavy classes

__slots__ = ("x", "y")

 

Use “lru_cache” for caching

@lru_cache

def compute(x): ...

 

Use “partial” to pre-fill arguments

handler = partial(process, debug=True)

 

Use “singledispatch” for generic functions

@singledispatch

def process(x): ...

 

Use “raise … from” to preserve exception context

raise RuntimeError("bad") from e

 

  • Use try/except/else/finally properly. Catch specific exceptions before broader ones. Use finally to release resources such as files and network connections.
  • Avoid using exceptions for flow control; use if/else for that. Relying on exceptions to execute business logic is a bad idea: write code as if exceptions are never expected to happen, and use them to prevent harm when one is triggered in production. If a bad value, state, or condition can be checked upfront, preventing the exception is the preferred approach. Provide meaningful messages when catching exceptions to help with debugging and understanding the error. Most importantly, do not copy and paste the same generic message for every catch in the hierarchy; otherwise we will need to open the code to guess where the exception was raised.

 

try:
    x = int(input("Enter a denominator value: "))
    result = 56 / x
except ZeroDivisionError:
    print("Divide by zero occurred!")
else:
    print("Success:", result)
finally:
    print("Execution complete. Cleanup as needed")

 

Use the “contextmanager” decorator for simple contexts

@contextmanager
def temp():
    resource = setup()      # hypothetical setup step
    try:
        yield resource
    finally:
        cleanup(resource)   # hypothetical cleanup step

   

 

 

CONCURRENCY & PERFORMANCE

 

Use “asyncio” for concurrent I/O

async def fetch(): ...

 

Use “asyncio.gather” for parallel tasks

await asyncio.gather(*tasks)

 

Use “asyncio.to_thread” for blocking work

await asyncio.to_thread(fn)

 

Use “ThreadPoolExecutor” for I/O

ThreadPoolExecutor()

 

Use “ProcessPoolExecutor” for CPU

ProcessPoolExecutor()

 

Use time.perf_counter() for timing

start = time.perf_counter()

           

IPython and Google Colab have special constructs to profile Python code. These are some of the magic commands:

·        %time: Time the execution of a single statement

·        %timeit: Time repeated execution of a single statement for more accuracy

·        %prun: Run code with the profiler

·        %lprun: Run code with the line-by-line profiler

·        %memit: Measure the memory use of a single statement

·        %mprun: Run code with the line-by-line memory profiler

To use these, we need to install the line_profiler and memory_profiler extensions. Please refer to this notebook for examples:

https://colab.research.google.com/github/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/01.07-Timing-and-Profiling.ipynb

 

 

Use “cProfile” for profiling

cProfile.run("main()")

 

Use “line_profiler” for line-level timing

@profile
def fn(): ...

 

Use “multiprocessing.shared_memory” for large arrays (Python 3.8+)

from multiprocessing import shared_memory

shm_a = shared_memory.SharedMemory(create=True, size=10)

type(shm_a.buf)

 

buffer = shm_a.buf

len(buffer)

 

buffer[:4] = bytearray([22, 33, 44, 55])  # Modify multiple at once

buffer[4] = 100                           # Modify single byte at a time

 

# Attach to an existing shared memory block

shm_b = shared_memory.SharedMemory(shm_a.name)

import array

array.array('b', shm_b.buf[:5])  # Copy the data into a new array.array

 

shm_b.buf[:5] = b'howdy'  # Modify via shm_b using bytes

bytes(shm_a.buf[:5])      # Access via shm_a

 

shm_b.close()   # Close each SharedMemory instance

shm_a.close()

shm_a.unlink()  # Call unlink only once to release the shared memory

 

Use “shutil” for file operations

shutil.copy("a", "b")

 

 

TESTING, DEBUGGING & TOOLING

 

Use pytest for clean tests

def test_add(): ...

 

 

 

Use fixtures for reusable setup

@pytest.fixture

def client(): ...

 

Use pytest.raises for error tests

with pytest.raises(ValueError): ...

 

Use mock.patch to isolate dependencies

with patch("module.api"): ...

 

Use tempfile in tests

TemporaryDirectory()

 

Use logging instead of print

logging.info("Started")

 

Use “redirect_stdout” for capturing output

with redirect_stdout(buf): ...

 

Use pprint for debugging structures

pprint(data)

 

Use warnings to flag deprecated behavior

warnings.warn("deprecated")

 

Use asserts only for internal invariants. Assertions should test conditions that can never happen in a correct program and crash it early; they are stripped when Python runs with the -O flag, so we should never rely on them in production. Recoverable or user-facing errors are better handled with explicit checks and exceptions.
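A sketch of the distinction (set_ratio and parse_ratio are hypothetical names):

```python
def set_ratio(value):
    # Internal invariant: callers are expected to pass a normalized value.
    # asserts vanish under "python -O", so never guard real input with them.
    assert 0 <= value <= 1, "ratio must be between 0 and 1"
    return value

def parse_ratio(text):
    # External input gets an explicit check and a real exception instead.
    value = float(text)
    if not 0 <= value <= 1:
        raise ValueError(f"ratio out of range: {value}")
    return value
```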

 

NETWORKING, APIS & SERIALIZATION

 

Use requests with timeout=

requests.get(url, timeout=3)

 

Use raise_for_status()

r.raise_for_status()

 

 

Use json.loads/json.dumps for API data

json.loads(body)

 

Use “shlex.split” for safe command-line parsing

shlex.split(cmd)

 

Use subprocess.run instead of os.system

subprocess.run([...], check=True)

 

Use environment variables for config

os.getenv("DB_URL")

 

Use sqlite3 for lightweight persistence

sqlite3.connect("app.db")

 

Use “csv.DictReader” for structured CSVs

DictReader(open("data.csv"))

 

Use “atexit” for cleanup

atexit.register(cleanup)

 

Use if __name__ == "__main__" guards in modules that can be both imported and run directly. The guard distinguishes running the file as a script from importing it as part of a package.

if __name__ == "__main__":

    main()

 

 

In addition to the above coding tips, we should be familiar with Python coding style. The Python style guide is available at https://peps.python.org/pep-0008/.

 

Programmers should adhere to the organization's coding conventions, which are usually derived from Python’s standard conventions.

 

 


One Liner Cheat Sheet

Python has some nifty tricks to make our statements compact. These one-liners are concise equivalents of multiple lines of loops and conditional statements. However, we may want to use a more verbose version if the code is intended as a teaching example, to make it understandable to beginner or intermediate programmers.

 

1. List & Sequence Operations

 

flat = [x for row in matrix for x in row] # Flatten a list of lists

rev = lst[::-1] #Reverse a list

uniq = list(set(items)) #Unique elements

transposed = list(zip(*matrix)) #Transpose a matrix

chunks = [lst[i:i+n] for i in range(0, len(lst), n)] # Chunk a list

cleaned = list(filter(None, lst)) #remove falsy values (None, 0, "", ...)

from collections import Counter; freq = Counter(lst) #count occurrences

dupes = [x for x, c in Counter(lst).items() if c > 1] #find duplicates

flatten = lambda lst: sum(([x] if not isinstance(x, list) else flatten(x) for x in lst), []) #Flatten nested list recursively

s = ",".join(map(str, lst)) #List to CSV

 

2. Comprehensions & Transformations

 

squares = [x*x for x in nums] # square numbers

evens = [x for x in nums if x % 2 == 0] # filter even number

d = {x: x*x for x in nums} #dictionary comprehension

swapped = {v: k for k, v in d.items()} # swap key and values

merged = {**a, **b} #merge two dictionaries

filtered = {k: v for k, v in d.items() if v > 10} #filter dictionary

inv = {v: k for k, vals in d.items() for v in vals} #invert dictionary of lists

merged = {k: v for d in dicts for k, v in d.items()} #merge list of dictionaries

 

3. Searching & Filtering

 

first = next(x for x in lst if x > 10) #find first match

idxs = [i for i, x in enumerate(lst) if x == target] #find all indices of a value

has_even = any(x % 2 == 0 for x in nums) # check for any match

all_positive = all(x > 0 for x in nums) #check if all match

idx = nums.index(max(nums)) #index of the maximum value

from itertools import groupby; groups = {k: list(v) for k, v in groupby(sorted(items, key=keyfunc), key=keyfunc)} #group items by key (input must be sorted by the same key)

 

4. Math & Algorithms

 

s = sum(map(int, str(n))) #sum of digits

dist = sum((a-b)**2 for a,b in zip(p,q))**0.5 #Euclidean distance

is_prime = n > 1 and all(n % i for i in range(2, int(n**0.5)+1)) #check for prime

import itertools; cumsum = list(itertools.accumulate(nums)) #find cumulative sum

norm = [(x-min(lst))/(max(lst)-min(lst)) for x in lst] #normalize a list

result = [[sum(a*b for a,b in zip(r,c)) for c in zip(*B)] for r in A] #matrix multiplication or dot product

rotated = list(zip(*matrix[::-1])) #rotate matrix by 90 degrees

import heapq; topk = heapq.nlargest(k, nums) #top k largest

 

5. Files & OS

 

lines = open("file.txt").read().splitlines() #read file into a list

open("out.txt","w").write("\n".join(lines)) #write list to a file

count = sum(1 for _ in open("file.txt")) #count file lines

import glob; pyfiles = glob.glob("*.py") #list file with extension .py

import os; size = os.path.getsize("file.txt") #find the file size

 

6. Networking & Web

 

import requests; r = requests.get(url).text #GET request, response body as text

open("out","wb").write(requests.get(url).content) #download a file from a url

data = requests.get(url).json() #parse JSON from a URL

 

7. Concurrency & Async

 

#ThreadPoolExecutor in one line

from concurrent.futures import ThreadPoolExecutor; out=list(ThreadPoolExecutor().map(func, items))

#ProcessPoolExecutor in one line

from concurrent.futures import ProcessPoolExecutor; out=list(ProcessPoolExecutor().map(cpu_task, items))

#asyncio gather (gather must be awaited inside a coroutine)

async def main(): return await asyncio.gather(*(task(x) for x in items))

results = asyncio.run(main())

#Timeout asyncio

result = await asyncio.wait_for(coro(), timeout=2)

#run blocking code in thread using asyncio

result = await asyncio.to_thread(func, arg)

 

8. Utilities & Python Internals

 

a, b = b, a #Swap two variables

import json; print(json.dumps(obj, indent=2)) #pretty print JSON

import random; x = random.choice(lst) #random choice from a list

random.shuffle(lst) #shuffle a list

import uuid; uid = uuid.uuid4() #Generate UUID

import time; t=time.time(); func(); print(time.time()-t) #Find function runtime

import sys; size = sys.getsizeof(obj) #find size of an object in memory

python -m http.server #run an HTTP server (shell command, not Python)

from functools import lru_cache

@lru_cache(None)
def f(x): ... #memoize any function (the decorator must sit on its own line)

compose = lambda f, g: lambda x: f(g(x)) #compose two function

C = type("C", (object,), {"x": 42}) #create dynamic class

if (n := len(lst)) > 10: print(n) #walrus operator: assign and test in one expression

exec(code, {"__builtins__": {}}, {}) #restrict builtins for exec (not a real security sandbox)

#Deep get with default

from functools import reduce; deepget=lambda d,*k,default=None: reduce(lambda c,i:c.get(i) if isinstance(c,dict) else default,k,d)

#partition a list by predicate

true, false = (list(filter(p, lst)), list(filter(lambda x: not p(x), lst)))

 

9. Generators & Itertools Magic

 

fib = (s := [0, 1]) and iter(lambda: (s.append(s[0] + s[1]), s.pop(0))[1], None) #fibonacci generator (yields 0, 1, 1, 2, 3, ...)

import itertools; nth = lambda it, n: next(itertools.islice(it, n, None)) #nth item generator

import itertools; flat = list(itertools.chain.from_iterable(nested)) #flatten one level with itertool

import itertools; prod = list(itertools.product(a, b)) #cartesian product

import itertools; perms = list(itertools.permutations(items)) #find all permutations

import itertools; combs = list(itertools.combinations(items, 2)) #find all combinations

import itertools; windows = list(zip(*(lst[i:] for i in range(n)))) #sliding windows of size n

 

10.    One-liners with lambda and for loops

 

# Square each number

squares = list(map(lambda x: x*x, [x for x in nums]))

 

#filter even number

evens = list(filter(lambda x: x % 2 == 0, [x for x in nums]))

 

#pair each item with its index in a list

indexed = list(map(lambda i: (i, items[i]), [i for i in range(len(items))]))

 

#convert a list of strings into ints

ints = list(map(lambda s: int(s), [s for s in strings]))

 

#Apply function to only positive numbers in a list

processed = [f(x) for x in nums if (lambda y: y > 0)(x)]

 

#tag each item as even or odd

tags = list(map(lambda x: (x, "even" if x%2==0 else "odd"), [x for x in nums]))

 

#compute string lengths

lengths = list(map(lambda s: len(s), [s for s in words]))

 

#normalize values

norm = list(map(lambda x: x/max(vals), [x for x in vals]))

 

#Multiply each number with index

scaled = list(map(lambda t: t[0]*t[1], [(i, nums[i]) for i in range(len(nums))]))

 

#extract object attribute

names = list(map(lambda o: o.name, [o for o in objs]))

 

#Apply two functions to each element

results = [(lambda x: (f(x), g(x)))(x) for x in nums]

 

#Sort list of tuples by second element

sorted_items = sorted(items, key=lambda t: t[1])

 

#convert a list of objects into dictionary keyed by its attribute

fields = list(map(lambda d: d["id"], [d for d in records])) #extract the 'id' field from each record

 

#Flatten matrix using lambda

processed = list(map(lambda x: x*2, [x for row in matrix for x in row]))

 

#filter string using substring

matches = list(filter(lambda s: "py" in s, [s for s in words]))

 

#generate dictionary from a list

d = {x: (lambda y: y*y)(x) for x in nums}

 

#apply lambda to zipped pairs

products = list(map(lambda t: t[0]*t[1], [(a, b) for a, b in zip(xs, ys)]))

 

11.    Miscellaneous

 

#numpy histogram bins

import numpy as np; hist = np.histogram(data, bins=10)[0]

 

#test if a list is sorted

is_sorted = all(a <= b for a, b in zip(lst, lst[1:]))

 

#create a dictionary of frequency in a list

from collections import Counter; print(dict(Counter(lst)))

 

#Time an expression

import time; (lambda t: (time.sleep(1), print(time.time()-t)))(time.time())

 

#convert list of dicts into csv rows

csv = "\n".join(",".join(str(d[k]) for k in d) for d in rows)

 

#convert a dict of list into list of dicts

rev = [dict(zip(d.keys(), vals)) for vals in zip(*d.values())]

 

#frequency sorted list

sorted_by_freq = sorted(lst, key=lambda x: (-lst.count(x), x))

 

#remove duplicate while preserving the order

seen=set(); out=[x for x in lst if not (x in seen or seen.add(x))]

 

#convert snake case to camel case

camel = lambda s: s.split("_")[0] + "".join(w.title() for w in s.split("_")[1:])

 

#convert camel case to snake case

import re; snake = re.sub(r'(?<!^)(?=[A-Z])', '_', s).lower()

 

#deep flatten a dict

flat = {f"{k}.{ik}": iv for k,v in d.items() for ik,iv in v.items()} #flatten a dict of dicts one level

 

#merge list of list into dict of counts

from collections import Counter; merged = Counter(x for sub in lst for x in sub)

 

#convert list of boolean to bit string

bits = "".join("1" if x else "0" for x in flags)

 

#create a dict from two lists

d = {k: v for k, v in zip(keys, values)}

 

#read csv and show the first 5 rows using pandas

import pandas as pd; df = pd.read_csv("data.csv").head()

 

#filter rows by condition using pandas

filtered = df[df["value"] > 10]

 

#group by column and compute mean

means = df.groupby("category")["amount"].mean()

 

#add a new column using a lambda on a pandas DataFrame

df["ratio"] = df.apply(lambda r: r["a"] / r["b"], axis=1)

 

#select multiple columns

subset = df[["name", "score", "rank"]]