List – The magic of Python
Lists and tuples are among the most widely used data structures in Python. Lists
are a versatile way to organize data. The primary difference
between a list and a tuple is that a tuple is immutable. Once constructed, we cannot
add, remove, or replace elements in a tuple.
Python does not have arrays in the style of C/C++ or Java as a core language
feature. Lists and tuples fill that role, and both are known as sequence data types. Operators
like “in” and [:] work with both lists and tuples.
Python uses the [] operator to select one or more elements
from a list or a tuple. The basic syntax is [start: end], where end is
exclusive (not included).
The absence of a start or an end value denotes the boundary
in that direction. For example, [1:] will fetch all elements from index 1 to
the end.
Omitting the start or end value can be tricky at first. Negative indices can also be confusing, but they are the most convenient way to access the last element of a list or tuple of unknown length: index -1 refers to the last element, -2 to the one before it, and so on.
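The slicing rules above can be tried out with a small sketch. The tuple name and its contents are hypothetical sample data chosen only for illustration.

```python
# Hypothetical sample data; a list would behave the same way.
letters = ("a", "b", "c", "d", "e")

first_two = letters[:2]    # start omitted: begin at index 0
from_one = letters[1:]     # end omitted: run through the last element
last_item = letters[-1]    # negative index counts from the end
last_two = letters[-2:]    # slice holding the final two elements
middle = letters[1:-1]     # everything except the first and last elements

print(first_two, from_one, last_item, last_two, middle)
```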
• Lists can be heterogeneous.
    a = ['jen', 'hello', 100, 1234, 2*2]
• Lists can be indexed and sliced:
    a[0] → 'jen'
    a[:2] → ['jen', 'hello']
    a[0:2] → ['jen', 'hello']
    a[0:0] → []
• Lists can be manipulated.
    a[2] = a[2] + 23 → a[2] is now 123
• The function “len()” is used to find the number of elements in a list or tuple.
    len(a) → 5
A for loop is often used to iterate over the elements of a
list or tuple. Elements are accessed through a loop variable, using either the “in”
operator directly on the sequence or the “range” function to generate indices.
The range function takes up to three arguments:
· Start – Starting value. It is optional and defaults to 0.
· End – The ending value is exclusive. The sequence stops at the last value before the ending value, so the ending value is never part of the result.
· Increment – The increment is added to each value to derive the next one. It can be negative to count backward. The increment is optional and defaults to 1.
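As a brief illustration of the three range arguments and the two looping styles, here is a minimal sketch. The list of colors is hypothetical sample data.

```python
forward = list(range(5))           # start defaults to 0, increment to 1
stepped = list(range(2, 10, 3))    # start 2, stop before 10, step 3
backward = list(range(5, 0, -1))   # negative increment counts down

colors = ["red", "green", "blue"]  # hypothetical sample data

# Style 1: iterate over the elements directly with "in".
seen = []
for color in colors:
    seen.append(color)

# Style 2: iterate over indices produced by range().
indexed = []
for i in range(len(colors)):
    indexed.append((i, colors[i]))

print(forward, stepped, backward, seen, indexed)
```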
The list has member functions for sorting, appending, removing, and clearing.
List members can be updated using the assignment operator (=), with the
left-hand side referring to a particular element through the [] operator.
The append method adds an element to the end.
The remove method deletes the first element that matches a given value. It raises a ValueError if the value is not present.
The insert method inserts an element at any
location. The first argument for insert is the index at which the new element should be inserted. To insert an element at the beginning, use the index 0. Appending is the same as inserting at the index equal to the length of the list.
The reverse method reverses the order of the elements
in a list in place.
The del statement removes a member or a range of members
from a list.
The pop method removes and returns an element. With no argument it takes the element from the end, like a stack. With an index it removes the element at that position, combining get by index and remove in one call.
We can mix different data types in the same list in Python.
A Boolean literal is either True or False, written without quotes.
The sort method sorts a list in place. It sorts in
ascending order by default.
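The list operations described above can be sketched in one short sequence. The numbers are hypothetical sample data, and each comment shows the list contents after that step.

```python
values = [30, 10, 20]       # hypothetical sample data

values.append(50)           # [30, 10, 20, 50]
values.insert(0, 40)        # [40, 30, 10, 20, 50]
values.remove(10)           # first matching value removed: [40, 30, 20, 50]
last = values.pop()         # returns 50, leaving [40, 30, 20]
first = values.pop(0)       # returns 40, leaving [30, 20]
values.append(10)           # [30, 20, 10]
values.sort()               # ascending, in place: [10, 20, 30]
values.reverse()            # in place: [30, 20, 10]
del values[0]               # [20, 10]
snapshot = list(values)     # copy taken before clearing
values.clear()              # []

print(last, first, snapshot, values)
```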
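Here are the rules for deleting an element from a list.
• Do not modify the list being iterated over. Python does not raise an error for a list, but elements get skipped silently, which is harder to debug. (Dictionaries and sets do raise a RuntimeError when their size changes during iteration.)
• Prefer removing by index rather than by value when duplicates are possible, because remove deletes only the first matching value.
• Remove slices from the same list (rather than creating a new list).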
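A small sketch of why these rules matter, using hypothetical sample numbers. The first loop shows the silent skipping, and the two alternatives that follow are common ways around it, not the only ones.

```python
nums = [1, 2, 2, 3, 4, 4, 5]   # hypothetical sample data with duplicates

# Risky: removing while iterating shifts later elements left,
# so the element right after each removal is never visited.
buggy = list(nums)
for n in buggy:
    if n % 2 == 0:
        buggy.remove(n)
# buggy still contains even numbers at this point.

# Alternative 1: delete by index, walking backward so that
# earlier indices stay valid after each deletion.
by_index = list(nums)
for i in range(len(by_index) - 1, -1, -1):
    if by_index[i] % 2 == 0:
        del by_index[i]

# Alternative 2: rebuild the contents and assign to a full slice,
# which updates the same list object in place.
in_place = list(nums)
in_place[:] = [n for n in in_place if n % 2 != 0]

print(buggy, by_index, in_place)
```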
Coding Tips
Sometimes knowing a programming language is not enough to write
effective programs. We also need to know which technique to apply and when.
Sometimes the choice is about readability and keeping the program concise and elegant.
The code structure can also affect performance or reduce the risk of bugs. We
learn these techniques over time as we fix code smells and defects. Reading a lot of code
written by experienced programmers also helps. There are several static code
analysis tools, such as SonarQube, FindBugs, and Coverity, as well as AI-based
coding agents that help write better code. Most of these are also available as
plugins for IDEs like PyCharm or VS Code. Here are a few coding tips for
effective Python programming. This is a quick cheat sheet. Please research each tip to
fully understand why it is a best practice. We do not have to use them all
the time. Please note that some of these features are only available in specific
versions of Python.
CORE PYTHON
Use f-strings for all formatting
    print(f"Hello, {user}")
Prefer pathlib over os.path
    Path("data") / "file.txt"
Use “with” for file operations
    with open("data.txt") as f: ...
Use list comprehensions for transformations
    [x*x for x in nums]
Use “dict” comprehensions for mapping
    {id: name for id, name in rows}
Use “enumerate” instead of manual counters
    for i, item in enumerate(items): ...
Use zip to iterate in parallel
    for a, b in zip(A, B): ...
Use _ for throwaway variables
    for _, value in items: ...
Use join() instead of string concatenation
    ",".join(names)
Use tuple unpacking for swaps
    x, y = y, x
Use sorted(..., key=...) for custom ordering
    sorted(users, key=lambda u: u.age)
Use any() and all() for boolean logic
    if any(x < 0 for x in nums): ...
Use min/max with key=
    oldest = min(users, key=lambda u: u.created)
Use sum() with comprehensions
    total = sum(f.size for f in files)
Use for/else for search loops
    for u in users:
        if u.id == target:
            break
    else:
        raise LookupError()
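The for/else tip is the least familiar item in this group, so here is a minimal runnable sketch of it. The users list and the find_user helper are hypothetical names introduced only for illustration.

```python
# Hypothetical records standing in for real user objects.
users = [{"id": 1, "name": "Ann"}, {"id": 2, "name": "Raj"}]


def find_user(target):
    """Return the user whose id matches target, or raise LookupError."""
    for user in users:
        if user["id"] == target:
            break  # found: skip the else branch
    else:
        # Runs only when the loop finished without hitting break.
        raise LookupError(f"no user with id {target}")
    return user


print(find_user(2)["name"])
```

The else branch belongs to the loop, not to the if statement, which is why it runs only when no element triggered the break.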
DATA STRUCTURES & COLLECTIONS
Use “set” for fast membership checks
    if item in allowed: ...
Use defaultdict(list) for grouping
    groups[key].append(value)
Use Counter for frequency analysis
    Counter(words)
Use deque for fast queue operations
    queue = deque(items); queue.popleft()
Use dict.get() for safe lookups
    role = user.get("role", "guest")
Use dict.setdefault() for grouping
    groups.setdefault(k, []).append(v)
Use dictionary unpacking to merge
    merged = {**a, **b}
Use zip(*iterables) to transpose
    cols = list(zip(*rows))
Use sorted(set(...)) to dedupe
    unique = sorted(set(values))
Use itertools.chain to flatten
    chain.from_iterable(matrix)
Use itertools.islice to slice generators
    islice(stream(), 10)
Use pairwise (3.10+) for sliding windows
    for a, b in pairwise(nums): ...
Use batched (3.12+) for chunking
    for chunk in batched(data, 100): ...
Use “bisect” for binary search
    bisect_left(sorted_list, x)
Use “heapq” for priority queues
    heappush(heap, (priority, task))
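Several of the collection tips above can be combined in one short sketch. The word list, task names, and priorities are hypothetical sample data, and the snippet uses only standard library calls.

```python
from bisect import bisect_left
from collections import Counter, defaultdict, deque
import heapq

# Hypothetical sample data.
words = ["apple", "avocado", "banana", "blueberry", "cherry", "apple"]

# defaultdict(list): group words by their first letter.
groups = defaultdict(list)
for word in words:
    groups[word[0]].append(word)

# Counter: frequency of each word.
counts = Counter(words)

# deque: pop from the left in constant time.
queue = deque(words)
first = queue.popleft()

# heapq: the smallest priority value comes out first.
heap = []
heapq.heappush(heap, (2, "write docs"))
heapq.heappush(heap, (1, "fix bug"))
next_task = heapq.heappop(heap)

# bisect: binary search for a position in a sorted list.
sorted_list = [10, 20, 30, 40]
pos = bisect_left(sorted_list, 30)

print(dict(groups), counts["apple"], first, next_task, pos)
```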
FUNCTIONS, CLASSES & DESIGN
Use type hints everywhere
    def load(path: str) -> dict: ...
Use “dataclass” for simple models
    @dataclass
    class User:
        id: int
        name: str
Use “NamedTuple” for immutable records
    class Point(NamedTuple): ...
Use “Protocol” for structural typing
    class Writer(Protocol):
        def write(self, text: str) -> None: ...
Use “property” for computed attributes
    @property
    def area(self):
        return self.w * self.h
Use “classmethod” constructors
    @classmethod
    def from_json(cls, d): ...
Use __repr__ for debugging
    def __repr__(self): return f"User(id={self.id})"
Use __slots__ for memory-heavy classes
    __slots__ = ("x", "y")
Use “lru_cache” for caching
    @lru_cache
    def compute(x): ...
Use “partial” to pre-fill arguments
    handler = partial(process, debug=True)
Use “singledispatch” for generic functions
    @singledispatch
    def process(x): ...
Use “raise ... from” to preserve context
    raise RuntimeError("bad") from e
- Use try/except/else/finally properly. Order the except clauses from the most
specific exception to the broadest. Use finally to release resources such as files and
network connections.
- Avoid using exceptions for flow control. Use if/else for that.
Relying on exceptions to execute business logic is not a good idea. Exceptions
should be treated as events we never expect to happen, and the handler should limit the harm
if one is triggered in production. If you can prevent an exception
by checking for the bad value, state, or condition upfront, that is
the preferred approach. Provide meaningful messages when catching
exceptions to help in debugging and understanding the error. Most importantly,
do not copy and paste the same generic message into every except clause in the
hierarchy; otherwise, we will need to open the code to guess where the
exception was raised.
    try:
        x = int(input("Enter a denominator value: "))
        result = 56 / x
    except ZeroDivisionError:
        print("Divide by zero occurred!")
    else:
        print("Success:", result)
    finally:
        print("Execution complete. Cleanup as needed")
Use “contextmanager” decorator for simple contexts
    @contextmanager
    def temp():
        …..
CONCURRENCY & PERFORMANCE
Use “asyncio” for concurrent I/O
    async def fetch(): ...
Use “asyncio.gather” for parallel tasks
    await asyncio.gather(*tasks)
Use “asyncio.to_thread” for blocking work
    await asyncio.to_thread(fn)
Use “ThreadPoolExecutor” for I/O
    ThreadPoolExecutor()
Use “ProcessPoolExecutor” for CPU
    ProcessPoolExecutor()
Use time.perf_counter() for timing
    start = perf_counter()
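A minimal sketch of the perf_counter tip follows. The workload being timed is a hypothetical stand-in, and the measured value will differ from run to run.

```python
from time import perf_counter

start = perf_counter()

# Hypothetical workload standing in for the code being measured.
total = sum(i * i for i in range(100_000))

elapsed = perf_counter() - start  # seconds as a float
print(f"computed {total} in {elapsed:.4f} s")
```

perf_counter is designed for measuring short intervals, whereas time.time() reports wall-clock time that can jump if the system clock is adjusted.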
IPython and Google Colab have
special constructs to profile Python code. These are some of the magic
commands:
· %time: Time the execution of a single statement
· %timeit: Time repeated execution of a single statement for more accuracy
· %prun: Run code with the profiler
· %lprun: Run code with the line-by-line profiler
· %memit: Measure the memory use of a single statement
· %mprun: Run code with the line-by-line memory profiler
The first three are built in. To use %lprun, %memit, and %mprun, we need to install
and load the line_profiler and memory_profiler extensions. Please
refer to this notebook for examples:
Use “cProfile” for profiling
    cProfile.run("main()")
Use “line_profiler” for line-level timing
    @profile
    def fn(): ...
Use “multiprocessing.shared_memory” for large arrays (Python 3.8+)
    from multiprocessing import shared_memory
    shm_a = shared_memory.SharedMemory(create=True, size=10)
    type(shm_a.buf)
    buffer = shm_a.buf
    len(buffer)
    buffer[:4] = bytearray([22, 33, 44, 55])  # Modify multiple at once
    buffer[4] = 100  # Modify single byte at a time
    # Attach to an existing shared memory block
    shm_b = shared_memory.SharedMemory(shm_a.name)
    import array
    array.array('b', shm_b.buf[:5])  # Copy the data into a new array.array
    shm_b.buf[:5] = b'howdy'  # Modify via shm_b using bytes
    bytes(shm_a.buf[:5])  # Access via shm_a
    shm_b.close()  # Close each SharedMemory instance
    shm_a.close()
    shm_a.unlink()  # Call unlink only once to release the shared memory
Use “shutil” for file operations
    shutil.copy("a", "b")
TESTING, DEBUGGING & TOOLING
Use pytest for clean tests
    def test_add():
        ...
Use fixtures for reusable setup
    @pytest.fixture
    def client():
        ...
Use pytest.raises for error tests
    with pytest.raises(ValueError):
        ...
Use mock.patch to isolate dependencies
    with patch("module.api"): ...
Use tempfile in tests
    TemporaryDirectory()
Use logging instead of print
    logging.info("Started")
Use “redirect_stdout” for capturing output
    with redirect_stdout(buf): ...
Use pprint for debugging structures
    pprint(data)
Use warnings to flag deprecated behavior
    warnings.warn("deprecated")
Use asserts only for internal
invariants. Assertions check conditions that should never be false and stop the
program early when they are. Do not rely on them in production: Python removes
assert statements when it runs with the -O flag, so validate inputs and handle
fatal errors with explicit exceptions instead.
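To illustrate the distinction, here is a minimal sketch in which input validation uses explicit exceptions and an assert guards only an internal invariant. The withdraw function and its rules are hypothetical, introduced only for this example.

```python
def withdraw(balance, amount):
    """Return the new balance after withdrawing amount (hypothetical helper)."""
    # Input validation: must survive even when asserts are stripped by -O.
    if amount <= 0:
        raise ValueError("amount must be positive")
    if amount > balance:
        raise ValueError("insufficient funds")

    new_balance = balance - amount

    # Internal invariant: if this fails, the code above has a bug.
    assert new_balance >= 0, "balance went negative despite the checks"
    return new_balance


print(withdraw(100, 30))
```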
NETWORKING, APIS & SERIALIZATION
Use requests with timeout=
    requests.get(url, timeout=3)
Use raise_for_status()
    r.raise_for_status()
Use json.loads/json.dumps for API data
    json.loads(body)
Use “shlex.split” for safe command-line parsing
    shlex.split(cmd)
Use subprocess.run instead of os.system
    subprocess.run([...], check=True)
Use environment variables for config
    os.getenv("DB_URL")
Use sqlite3 for lightweight persistence
    sqlite3.connect("app.db")
Use “csv.DictReader” for structured CSVs
    DictReader(open("data.csv"))
Use “atexit” for cleanup
    atexit.register(cleanup)
Use if __name__ == "__main__" guards in files that can be both imported as a
module and run directly. The guarded code runs only when the file is executed
as a script, not when another file imports it.
    if __name__ == "__main__":
        main()
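A minimal sketch of a file that uses this guard is shown below. The function names and the greeting are hypothetical.

```python
def greet(name):
    """Reusable function that other files can import (hypothetical)."""
    return f"Hello, {name}"


def main():
    # Script entry point: runs only through the guard below.
    print(greet("world"))
    return 0


if __name__ == "__main__":
    # True when this file is executed directly, for example with
    # "python thisfile.py"; False when the file is imported.
    main()
```

Importing this file from another module gives access to greet without printing anything, because the guarded call to main is skipped.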
In addition to the above coding tips, we should be familiar
with Python coding style. The Python style guide is available at https://peps.python.org/pep-0008/.
Programmers should adhere to the organization's coding
conventions, which are usually derived from Python’s standard conventions.
One Liner Cheat Sheet
Python has some nifty tricks to make our statements compact.
These one-liners are easy to read. They are equivalent to writing multiple
lines of conditional statements. However, we may want to use a more verbose
version if the code is intended as a teaching example to make it understandable
to beginner or intermediate programmers.
1. List & Sequence Operations
flat = [x for row in matrix for x in row]  # flatten a list of lists
rev = lst[::-1]  # reverse a list
uniq = list(set(items))  # unique elements (order is not preserved)
transposed = list(zip(*matrix))  # transpose a matrix
chunks = [lst[i:i+n] for i in range(0, len(lst), n)]  # chunk a list
cleaned = list(filter(None, lst))  # remove None and other falsy values
from collections import Counter; freq = Counter(lst)  # count occurrences
dupes = [x for x, c in Counter(lst).items() if c > 1]  # find duplicates
flatten = lambda lst: sum(([x] if not isinstance(x, list) else flatten(x) for x in lst), [])  # flatten a nested list recursively
s = ",".join(map(str, lst))  # list to CSV
2. Comprehensions & Transformations
squares = [x*x for x in nums]  # square numbers
evens = [x for x in nums if x % 2 == 0]  # filter even numbers
d = {x: x*x for x in nums}  # dictionary comprehension
swapped = {v: k for k, v in d.items()}  # swap keys and values
merged = {**a, **b}  # merge two dictionaries
filtered = {k: v for k, v in d.items() if v > 10}  # filter a dictionary
inv = {v: k for k, vals in d.items() for v in vals}  # invert a dictionary of lists
merged = {k: v for d in dicts for k, v in d.items()}  # merge a list of dictionaries
3. Searching & Filtering
first = next(x for x in lst if x > 10)  # find first match (raises StopIteration if there is none)
idxs = [i for i, x in enumerate(lst) if x == target]  # find all indices of a value
has_even = any(x % 2 == 0 for x in nums)  # check for any match
all_positive = all(x > 0 for x in nums)  # check if all match
idx = nums.index(max(nums))  # index of the largest value
from itertools import groupby; groups = {k: list(v) for k, v in groupby(sorted(items, key=keyfunc), key=keyfunc)}  # group items by key (sort by the same key first)
4. Math & Algorithms
s = sum(map(int, str(n)))  # sum of digits
dist = sum((a-b)**2 for a, b in zip(p, q))**0.5  # Euclidean distance
is_prime = n > 1 and all(n % i for i in range(2, int(n**0.5)+1))  # check for prime
import itertools; cumsum = list(itertools.accumulate(nums))  # find cumulative sum
norm = [(x-min(lst))/(max(lst)-min(lst)) for x in lst]  # normalize a list
result = [[sum(a*b for a, b in zip(r, c)) for c in zip(*B)] for r in A]  # matrix multiplication or dot product
rotated = list(zip(*matrix[::-1]))  # rotate a matrix by 90 degrees clockwise
import heapq; topk = heapq.nlargest(k, nums)  # top k largest
5. Files & OS
lines = open("file.txt").read().splitlines()  # read a file into a list
open("out.txt", "w").write("\n".join(lines))  # write a list to a file
count = sum(1 for _ in open("file.txt"))  # count file lines
import glob; pyfiles = glob.glob("*.py")  # list files with the extension .py
import os; size = os.path.getsize("file.txt")  # find the file size
6. Networking & Web
import requests; r = requests.get(url).text  # GET request
open("out", "wb").write(requests.get(url).content)  # download a file from a URL
data = requests.get(url).json()  # parse JSON from a URL
7. Concurrency & Async
# ThreadPoolExecutor in one line
from concurrent.futures import ThreadPoolExecutor; out = list(ThreadPoolExecutor().map(func, items))
# ProcessPoolExecutor in one line
from concurrent.futures import ProcessPoolExecutor; out = list(ProcessPoolExecutor().map(cpu_task, items))
# asyncio gather (asyncio.run expects a coroutine, so gather is wrapped in one)
async def run_all(): return await asyncio.gather(*(task(x) for x in items))
results = asyncio.run(run_all())
# timeout with asyncio (only valid inside an async function)
result = await asyncio.wait_for(coro(), timeout=2)
# run blocking code in a thread using asyncio (inside an async function, Python 3.9+)
result = await asyncio.to_thread(func, arg)
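Because await works only inside a coroutine, the asyncio one-liners above are easier to follow in a complete program. The sketch below assumes Python 3.9 or later for asyncio.to_thread; the task coroutine, the delays, and the input values are hypothetical.

```python
import asyncio


async def task(x):
    """Hypothetical I/O-bound job: wait briefly, then return a result."""
    await asyncio.sleep(0.01)
    return x * 2


async def main(items):
    # Run all tasks concurrently and collect results in input order.
    results = await asyncio.gather(*(task(x) for x in items))

    # Give up on a slow operation after a timeout.
    try:
        await asyncio.wait_for(asyncio.sleep(1), timeout=0.05)
        timed_out = False
    except asyncio.TimeoutError:
        timed_out = True

    # Run a blocking function in a worker thread.
    total = await asyncio.to_thread(sum, results)
    return results, timed_out, total


results, timed_out, total = asyncio.run(main([1, 2, 3]))
print(results, timed_out, total)
```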
8. Utilities & Python Internals
a, b = b, a  # swap two variables
import json; print(json.dumps(obj, indent=2))  # pretty print JSON
import random; x = random.choice(lst)  # random choice from a list
random.shuffle(lst)  # shuffle a list in place
import uuid; uid = uuid.uuid4()  # generate UUID
import time; t = time.time(); func(); print(time.time() - t)  # find function runtime
import sys; size = sys.getsizeof(obj)  # find size of an object in memory
python -m http.server  # run HTTP server (a shell command, not Python code)
# memoize any function (the decorator must sit on its own line)
from functools import lru_cache
@lru_cache(None)
def f(x): ...
compose = lambda f, g: lambda x: f(g(x))  # compose two functions
C = type("C", (object,), {"x": 42})  # create dynamic class
if (n := len(lst)) > 10: print(n)  # assignment expression (Python 3.8+)
exec(code, {"__builtins__": {}}, {})  # execute with builtins removed; this limits available names but is not a secure sandbox for untrusted code
# deep get with default
from functools import reduce; deepget = lambda d, *k, default=None: reduce(lambda c, i: c.get(i, default) if isinstance(c, dict) else default, k, d)
# partition a list by predicate
true, false = (list(filter(p, lst)), list(filter(lambda x: not p(x), lst)))
9. Generators & Itertools Magic
# Fibonacci generator (clearer as a short generator function than as a one-liner)
def fib():
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b
import itertools; nth = lambda it, n: next(itertools.islice(it, n, None))  # nth item of an iterator
import itertools; flat = list(itertools.chain.from_iterable(nested))  # flatten one level with itertools
import itertools; prod = list(itertools.product(a, b))  # Cartesian product
import itertools; perms = list(itertools.permutations(items))  # find all permutations
import itertools; combs = list(itertools.combinations(items, 2))  # find all combinations of two items
windows = list(zip(*(lst[i:] for i in range(n))))  # sliding windows of n items over a list
10. One-Liners with Lambda and For Loops
# square each number
squares = list(map(lambda x: x*x, [x for x in nums]))
# filter even numbers
evens = list(filter(lambda x: x % 2 == 0, [x for x in nums]))
# pair each item with its index in a list
indexed = list(map(lambda i: (i, items[i]), [i for i in range(len(items))]))
# convert a list of strings into ints
ints = list(map(lambda s: int(s), [s for s in strings]))
# apply a function to only the positive numbers in a list
processed = [f(x) for x in nums if (lambda y: y > 0)(x)]
# tag each item as even or odd
tags = list(map(lambda x: (x, "even" if x%2==0 else "odd"), [x for x in nums]))
# compute string lengths
lengths = list(map(lambda s: len(s), [s for s in words]))
# normalize values
norm = list(map(lambda x: x/max(vals), [x for x in vals]))
# multiply each number by its index
scaled = list(map(lambda t: t[0]*t[1], [(i, nums[i]) for i in range(len(nums))]))
# extract an object attribute
names = list(map(lambda o: o.name, [o for o in objs]))
# apply two functions to each element
results = [(lambda x: (f(x), g(x)))(x) for x in nums]
# sort a list of tuples by the second element
sorted_items = sorted(items, key=lambda t: t[1])
# extract the "id" field from each record in a list of dictionaries
fields = list(map(lambda d: d["id"], [d for d in records]))
# flatten a matrix and double each value using lambda
processed = list(map(lambda x: x*2, [x for row in matrix for x in row]))
# filter strings by substring
matches = list(filter(lambda s: "py" in s, [s for s in words]))
# generate a dictionary from a list
d = {x: (lambda y: y*y)(x) for x in nums}
# apply lambda to zipped pairs
products = list(map(lambda t: t[0]*t[1], [(a, b) for a, b in zip(xs, ys)]))
11. Miscellaneous
# numpy histogram bins
import numpy as np; hist = np.histogram(data, bins=10)[0]
# test if a list is sorted
is_sorted = all(a <= b for a, b in zip(lst, lst[1:]))
# create a dictionary of frequencies in a list
from collections import Counter; print(dict(Counter(lst)))
# time an expression
import time; (lambda t: (time.sleep(1), print(time.time()-t)))(time.time())
# convert a list of dicts into CSV rows (named csv_text to avoid shadowing the csv module)
csv_text = "\n".join(",".join(str(d[k]) for k in d) for d in rows)
# convert a dict of lists into a list of dicts
rev = [dict(zip(d.keys(), vals)) for vals in zip(*d.values())]
# frequency-sorted list
sorted_by_freq = sorted(lst, key=lambda x: (-lst.count(x), x))
# remove duplicates while preserving the order
seen = set(); out = [x for x in lst if not (x in seen or seen.add(x))]
# convert snake case to camel case
camel = lambda s: s.split("_")[0] + "".join(w.title() for w in s.split("_")[1:])
# convert camel case to snake case
import re; snake = re.sub(r'(?<!^)(?=[A-Z])', '_', s).lower()
# flatten a dict of dicts by one level
flat = {f"{k}.{ik}": iv for k, v in d.items() for ik, iv in v.items()}
# merge a list of lists into a dict of counts
from collections import Counter; merged = Counter(x for sub in lst for x in sub)
# convert a list of booleans to a bit string
bits = "".join("1" if x else "0" for x in flags)
# create a dict from two lists
d = {k: v for k, v in zip(keys, values)}
# read a CSV and show the first 5 rows using pandas
import pandas as pd; df = pd.read_csv("data.csv").head()
# filter rows by condition using pandas
filtered = df[df["value"] > 10]
# group by column and compute the mean
means = df.groupby("category")["amount"].mean()
# add a new column using a lambda in a pandas DataFrame
df["ratio"] = df.apply(lambda r: r["a"] / r["b"], axis=1)
# select multiple columns
subset = df[["name", "score", "rank"]]







