A Python application can be written as a standalone program or as part of a larger AI project. As we learned in the previous chapter, AI is a very broad topic. Often, when we hear about an AI application, it is simply a client application that calls a large language model or generative AI API from a vendor such as OpenAI, Google (Gemini), or Anthropic. Some applications are Model Context Protocol (MCP) agents that can invoke APIs from commercial or open-source MCP frameworks such as LangChain or Llama. In these cases, we do not need to know how the APIs are implemented or which AI algorithms are used under the hood.
However, Python can do much more than that. We can leverage the vast ecosystem of community libraries to build our own AI or ML pipeline. A pipeline is a series of steps or modules that together handle data collection, extraction, processing, training, and testing, apply techniques such as machine learning (ML), natural language processing (NLP), deep learning, neural networks, and visualization, and finally expose the result through an end-user interface such as a chatbot or web application.
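As a rough illustration of the pipeline idea, the following minimal sketch chains a few steps as plain Python functions. The step names (`collect`, `clean`, `summarize`), the `run_pipeline` helper, and the sample records are all hypothetical, chosen only for this example; a real pipeline would use dedicated libraries for each step.

```python
from functools import reduce
from statistics import mean


def collect():
    """Data collection step: return raw records (hard-coded sample data)."""
    return [" 21.5", "19.0 ", "n/a", "23.5"]


def clean(raw):
    """Processing step: strip whitespace and drop values that are not numbers."""
    cleaned = []
    for item in raw:
        try:
            cleaned.append(float(item.strip()))
        except ValueError:
            continue  # skip records that cannot be parsed
    return cleaned


def summarize(values):
    """Analysis step: compute simple statistics over the cleaned values."""
    return {"count": len(values), "mean": mean(values), "max": max(values)}


def run_pipeline(source, *steps):
    """Feed the output of each step into the next one, in order."""
    return reduce(lambda data, step: step(data), steps, source())


result = run_pipeline(collect, clean, summarize)
print(result)
```

Keeping each step as a separate function makes it possible to replace one stage, for example swapping the hard-coded source for a database query, without touching the others.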
An AI or machine learning application in Python can take various forms. Some popular Python libraries are listed below, grouped by category. This is not a complete list: new libraries appear regularly, and there are also many commercial libraries and APIs.
· Data Analytics, Math, and Exploration Applications:
These applications are used for loading, cleaning, preprocessing, extraction, transformation, and analysis of data in various formats. They may involve file processing, NoSQL database operations, or SQL queries. Some packages used in this category are:
1) SQLite 3 (the standard-library sqlite3 module) is used for quick experiments and learning. It is usually not the database chosen for large, multi-user production systems: https://docs.python.org/3/library/sqlite3.html
2) python-sql is a library for writing SQL statements in a Pythonic way; it builds the SQL text and parameters, which a database driver then executes: https://pypi.org/project/python-sql/
3) Psycopg2 is a database adapter for PostgreSQL: https://www.psycopg.org/docs/
4) MySQL-python (imported as MySQLdb) is a MySQL database library. It supports Python 2 only; mysqlclient is its maintained fork for Python 3: https://pypi.org/project/MySQL-python/
5) SciPy provides algorithms for optimization, integration, interpolation, eigenvalue problems, algebraic equations, differential equations, statistics, and many other classes of problems: https://scipy.org/
6) SQLAlchemy is a Python SQL toolkit and object-relational mapper (ORM): https://www.sqlalchemy.org/
7) cx_Oracle is an Oracle Database driver that supports both SQL and PL/SQL: https://pypi.org/project/cx-Oracle/
8) PyMySQL is another MySQL database library, written in pure Python: https://pymysql.readthedocs.io/en/latest/
9) PyMongo is the Python driver used for data processing with MongoDB: https://pymongo.readthedocs.io/en/stable/
10) Beautiful Soup is used for web scraping by parsing HTML and XML documents: https://pypi.org/project/beautifulsoup4/
11) Scrapy is a web crawling and scraping framework: https://pypi.org/project/Scrapy/
12) NumPy is a math library for array-based numerical data processing: https://numpy.org/doc/stable/index.html
13) pandas is a data analysis and manipulation tool: https://pypi.org/project/pandas/
14) Polars is an alternative to the pandas DataFrame. It is one of the fastest data processing solutions on a single machine: https://pypi.org/project/polars/
15) PySpark is the Python API for Apache Spark, a unified analytics engine for large-scale data processing. Spark provides an interface for programming clusters and lets us perform real-time, large-scale data processing in a distributed environment using Python: https://pypi.org/project/pyspark/
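To make the data-handling category concrete, here is a minimal sketch using the standard-library sqlite3 module with an in-memory database. The `sales` table, its columns, and the sample rows are assumptions made for this example only.

```python
import sqlite3

# An in-memory database disappears when the connection closes,
# which is convenient for quick experiments.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")

# Load a few hypothetical rows; "?" placeholders keep values out of the SQL text.
rows = [("north", 120.0), ("south", 80.0), ("north", 30.0)]
conn.executemany("INSERT INTO sales (region, amount) VALUES (?, ?)", rows)
conn.commit()

# Aggregate per region with plain SQL and collect the result into a dict.
totals = dict(
    conn.execute(
        "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
    ).fetchall()
)
conn.close()
print(totals)
```

The same load, query, and aggregate pattern carries over to the other database drivers in the list, although connection setup and placeholder syntax differ between them.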
· Visualization, Charting, and Web Applications:
We can develop charts,
dashboards, and interactive web applications to present different aspects of a
dataset and enable users to slice and dice it to discover hidden patterns and
trends. Some frameworks and libraries used in this category are:
1) Matplotlib is the most popular plotting library; it is worth mastering for both prototyping and presentation: https://matplotlib.org/
2) Plotly is an open-source graphing library for Python; the linked page covers its charts for artificial intelligence and machine learning: https://plotly.com/python/ai-ml/
3) Bokeh is used to create interactive visualizations for web browsers: https://bokeh.org/
4) Seaborn is a Python data visualization library based on Matplotlib. It provides a high-level interface for creating attractive, informative statistical graphics and can also be used to customize Matplotlib charts: https://seaborn.pydata.org/
· Deep Learning and Machine Learning Libraries
These are among the most commonly used libraries in research, academia, and real-world AI projects.
1) PyTorch provides two high-level features, tensor computation (like NumPy) with strong GPU acceleration and deep neural networks: https://github.com/pytorch/pytorch
2) PyCaret is an open-source, low-code machine learning library in Python that automates machine learning workflows: https://pycaret.gitbook.io/docs/
3) PyFlux is a library for time series analysis and prediction: https://pyflux.readthedocs.io/en/latest/
4) TensorFlow is an end-to-end open-source platform for machine learning. It has a comprehensive, flexible ecosystem of tools, libraries, and community resources: https://www.tensorflow.org/
5) TensorBoard is TensorFlow's all-in-one visualization toolkit. It provides the measurements and visualizations needed throughout the machine learning workflow: tracking experiment metrics such as loss and accuracy, visualizing the model graph, projecting embeddings into a lower-dimensional space, and more: https://www.tensorflow.org/tensorboard/get_started
6) scikit-learn is a Python module for machine learning built on top of SciPy and is distributed under the 3-Clause BSD license: https://scikit-learn.org/stable/
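The libraries above automate model training and, in the case of TensorBoard, the tracking of metrics such as loss. As a minimal sketch of what that involves, written in plain Python rather than with any of these libraries, the loop below fits a one-variable linear model by gradient descent and records the loss after every epoch. The data, the learning rate, and the function name `train_linear` are assumptions for illustration.

```python
def train_linear(xs, ys, lr=0.05, epochs=500):
    """Fit y = w * x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    loss_history = []
    for _ in range(epochs):
        # Prediction errors for the current parameters.
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        # Mean squared error: the kind of metric a dashboard charts per epoch.
        loss_history.append(sum(e * e for e in errors) / n)
        # Gradients of the loss with respect to w and b.
        grad_w = 2 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = 2 * sum(errors) / n
        # Step against the gradient.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b, loss_history


# Hypothetical data generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
w, b, losses = train_linear(xs, ys)
print(round(w, 2), round(b, 2), losses[0] > losses[-1])
```

Frameworks such as PyTorch and TensorFlow compute the gradients automatically and run the arithmetic on GPUs, but the overall shape of the loop (predict, measure loss, update parameters) stays the same.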
· Natural Language Processing (NLP)
1) NLTK is a leading platform for building Python programs to work with human language data. It provides easy-to-use interfaces to over 50 corpora and lexical resources, such as WordNet, along with a suite of text-processing libraries for classification, tokenization, stemming, tagging, parsing, and semantic reasoning, and wrappers for industrial-strength NLP libraries: https://www.nltk.org/
2) spaCy is an open-source NLP library used to build ML products or gather real insights from text: https://spacy.io/
3) fastText, developed by Facebook AI, is designed for fast text classification and word embeddings. It can handle large datasets efficiently: https://fasttext.cc/
4) Keras is an open-source library that provides a Python interface for artificial neural networks: https://keras.io/
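Tokenization and simple word statistics are among the text-processing tasks mentioned above. The sketch below uses only the standard library (a regular expression and `collections.Counter`) as a stand-in; it is not how NLTK or spaCy tokenize text, and the stop-word list and function names are assumptions for this example.

```python
import re
from collections import Counter

# A tiny, hypothetical stop-word list; real libraries ship much larger ones.
STOP_WORDS = {"the", "a", "is", "and", "of", "to"}


def tokenize(text):
    """Lowercase the text and split it into word tokens (letters and apostrophes)."""
    return re.findall(r"[a-z']+", text.lower())


def word_frequencies(text):
    """Count tokens after removing stop words."""
    return Counter(tok for tok in tokenize(text) if tok not in STOP_WORDS)


sample = "The cat chased the mouse, and the mouse escaped."
freqs = word_frequencies(sample)
print(freqs.most_common(2))
```

A regular expression like this ignores many cases that dedicated NLP libraries handle, such as numbers, hyphenated words, and non-English text, which is one reason to reach for those libraries in real projects.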
· Computer Vision (CV)
1) OpenCV provides a real-time optimized computer vision library, along with tools and hardware. It also supports model execution for machine learning (ML): https://opencv.org/
2) YOLO is a fast multi-object detection algorithm that uses a convolutional neural network (CNN) to detect and identify objects: https://opencv-tutorial.readthedocs.io/en/latest/yolo/yolo.html