An introduction to Python in 2022
Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren’t special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one— and preferably only one —obvious way to do it.
Although that way may not be obvious at first unless you’re Dutch.
Now is better than never.
Although never is often better than right now.
If the implementation is hard to explain, it’s a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea — let’s do more of those!
Python’s transition from version 2 to 3 was interesting to behold as a spectator. Would there be a schism in the language? Like PHP and unlike Perl, it managed to survive and come out stronger than ever. It remains one of the most popular languages in the world and is growing in popularity every year. Recent releases have continuously improved its support for concurrent processing (asyncio and async/await) and introduced optional type hints. Type hints reduce the need for some unit tests by letting an IDE, linter, or type checker catch many mistakes before the code ever runs. They also enable IDEs to offer great suggestions and automated refactoring tools.
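As a rough illustration of both features, here is a minimal sketch that combines type hints with async/await. The function names and the delays are made up for this example, and asyncio.sleep merely stands in for real I/O such as a network call.

```python
import asyncio


async def fetch_length(name: str, delay: float) -> int:
    """Pretend to wait on some I/O, then return the length of `name`."""
    # asyncio.sleep stands in for a network or disk wait (an assumption
    # made for this sketch; no real I/O happens here).
    await asyncio.sleep(delay)
    return len(name)


async def main() -> list[int]:
    jobs = [fetch_length("spam", 0.02), fetch_length("eggs and ham", 0.01)]
    # gather runs the coroutines concurrently and returns their results
    # in the order the coroutines were passed in.
    return await asyncio.gather(*jobs)


if __name__ == "__main__":
    print(asyncio.run(main()))  # [4, 12]
```

A type checker such as mypy would reject a call like fetch_length(42, 0.1) without running anything, because 42 is not a str. The interpreter itself ignores the annotations at runtime.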
If you’re a beginning programmer and interested in developing systems (backend processes running on servers) or machine learning, Python is the best language to start out with. If you come from a UI-orientated background (like JavaScript or Swift) and are looking to move into backend development, it can also be a great choice. It’s easy to pick up, and its standard library is broad without being bloated.
Node.js (backend JavaScript) has not yet caught on in many enterprises, but those enterprises do interview Python developers for generalist or backend-specific roles. Once you’re hired, you may still be better off learning another language, and Python prepares you well for that because it features many concepts used in other languages. Object-orientated and functional programming are both first-class citizens. Now that it supports optional typing, you can (and should!) start practicing writing typed code, which makes a later transition to a statically typed language even easier.
Python is frequently used to create command line interfaces (including AWS’s CLI). You can even build desktop applications with it. Its computational performance hasn’t improved much over the years. However, the Python community has recently had renewed interest in improving performance. For most systems, correctness and features are more important than optimizing computing costs. The Python ecosystem includes many packages which have bindings to performant C code. Python continues to be an excellent choice for ‘glue code’: applications where developer costs outweigh performance needs.
Avoid writing compute-bound workloads in pure Python. Another major limitation of the language is the global interpreter lock: only one thread can execute Python code at a given time. To work around this, Python code must use multiple processes to process data in parallel, which has more overhead than threads. The abstractions provided by the standard library make this really easy.
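One of those standard library abstractions is concurrent.futures.ProcessPoolExecutor. The sketch below is only an illustration: the prime-counting function is a made-up stand-in for a compute-bound task, and the worker count of 2 is an arbitrary choice.

```python
import math
from concurrent.futures import ProcessPoolExecutor


def count_primes(limit: int) -> int:
    """Count the primes below `limit` by trial division (deliberately CPU-heavy)."""
    count = 0
    for candidate in range(2, limit):
        # A number is prime if no divisor up to its square root divides it.
        if all(candidate % d for d in range(2, math.isqrt(candidate) + 1)):
            count += 1
    return count


def count_primes_parallel(limits: list[int], workers: int = 2) -> list[int]:
    # Each worker is a separate process with its own interpreter and its
    # own global interpreter lock, so the calls can run on several cores.
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(count_primes, limits))


if __name__ == "__main__":
    # The guard matters: on platforms that spawn fresh interpreters,
    # worker processes import this module and must not start pools themselves.
    print(count_primes_parallel([20_000, 30_000, 40_000]))
```

The arguments and results are pickled and sent between processes, which is where the extra overhead comes from. For small tasks that cost can exceed the time saved.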
This article serves as a starting point for learning more about what Python and its ecosystem have to offer. I won’t be going into much detail. Most bullet points will have a link to the Python manual or the library/tool’s homepage. If you’re reading this article on a ‘two-handed form factor’ screen (think tablet and above), you’ll see a list of icons on the right whenever you select some text. Use these to commence an in-depth search. Search queries from this page are prefixed with Python.
Python language features
- Automatic arbitrary precision numbers (bignums)
- Classes including multiple inheritance
- Operator overloading
- Slices
- Function arguments passable by name
- Multiple return values
- Array (list) comprehensions and generator expressions
- Closures and lambda functions
- Enums and data classes
- Concise assertions
- Decorators
- Resource management (with statement)
- Pattern matching
Standard library
- Regular expressions
- Binary data parsing
- Times and dates and timezones
- Data structures:
  - Deque
  - ChainMap
  - Counter (multiset)
  - OrderedDict
  - Heap
  - PriorityQueue
- Complex numbers
- Decimal numbers (decimal floating point with configurable precision)
- Fractions
- Randomness including seeded PRNGs, shuffling and sampling
- Statistical analysis
- Iteration helpers
- Memoization & caching
- OS file path manipulation
- OS file comparison
- OS file globbing
- SQLite
- Gzip, tarballs, LZMA
- CSV
- INI
- JSON
- base64
- IPs
- URLs
- HTML
- Plist
- Cryptographic and password-safe hashing
- UUIDs
- HMAC
- Benchmarking
- Internationalization
- MIME types
- Command line options parser
- Logging
- Ncurses
- C FFI
- Python language parsing
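OS fundamentals (subprocesses, files, streams and pipes, platform information) are of course supported as well. Python offers extensive metaprogramming support. The language parsing module (ast in the standard library) can even parse and rewrite the syntax tree of Python code at runtime, before it is compiled to bytecode. Built-in XML parsers and HTTP request/server modules are included as well, but should be avoided in favor of third-party packages: the official documentation warns that the XML modules are not secure against maliciously constructed data and that http.server is not recommended for production use.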
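A handful of the modules from the list above can be sketched in a few lines. The data, the table name, and the path below are all made up for illustration, and the SQLite database lives only in memory.

```python
import heapq
import json
import sqlite3
from collections import Counter, deque
from decimal import Decimal
from fractions import Fraction
from functools import lru_cache
from pathlib import PurePosixPath

# Counter is a multiset: it tallies hashable items.
top_two = Counter("mississippi").most_common(2)

# A deque with maxlen keeps only the most recent items.
recent = deque(maxlen=3)
for n in range(5):
    recent.append(n)

# heapq turns a plain list into a priority queue (smallest item first).
tasks = [(2, "write"), (1, "plan"), (3, "ship")]
heapq.heapify(tasks)
first = heapq.heappop(tasks)

# Decimal and Fraction avoid binary floating-point rounding errors.
exact_sum = Decimal("0.1") + Decimal("0.2")
half = Fraction(1, 3) + Fraction(1, 6)


@lru_cache(maxsize=None)
def fib(n: int) -> int:
    """Naive recursion made fast by memoization."""
    return n if n < 2 else fib(n - 1) + fib(n - 2)


# pathlib builds paths with the / operator; nothing touches the disk here.
path = PurePosixPath("/srv/app") / "data" / "report.csv"

# sqlite3 ships with Python; ":memory:" creates a throwaway database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE langs (name TEXT, year INTEGER)")
conn.executemany(
    "INSERT INTO langs VALUES (?, ?)", [("Python", 1991), ("Perl", 1987)]
)
oldest = conn.execute(
    "SELECT name FROM langs ORDER BY year LIMIT 1"
).fetchone()[0]
conn.close()

payload = json.dumps({"oldest": oldest, "fib_30": fib(30)}, sort_keys=True)
```

None of this needs a single install step, which is a large part of why Python works so well for small scripts and glue code.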
Editors
Project management
- Poetry is just as easy to use as npm. It replaces pip, virtualenv, and more.
- PyPI is the official package repository for Python
- Cookiecutter provides templates to bootstrap projects with
- Jupyter renders your code and its output inline, like a “notebook”
- Sphinx is the documentation generator used by most Python projects
Quality
- Black formats your code in a standard style
- Mypy verifies the type hints in your code
- Pytest (https://pytest.org/) is the standard testing framework
  - The built-in unittest module is available if you can’t install dependencies
- Coverage.py provides code coverage metrics
- Freezegun mocks datetime
- Pytest plugins can combine all these tools into one command:
  - pytest-cov integrates coverage
  - pytest-black integrates black
  - pytest-mypy integrates mypy
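To show what these tools operate on, here is a minimal sketch of a typed function with two pytest-style tests. The slugify function and the suggested file name test_slug.py are hypothetical and exist only for this example.

```python
import re


def slugify(title: str) -> str:
    """Turn a title into a lowercase, hyphen-separated slug."""
    # Keep only runs of ASCII letters and digits; everything else separates words.
    words = re.findall(r"[a-z0-9]+", title.lower())
    return "-".join(words)


# Pytest collects functions whose names start with "test_" and treats
# a failing plain assert as a test failure.
def test_slugify_joins_words_with_hyphens() -> None:
    assert slugify("An introduction to Python") == "an-introduction-to-python"


def test_slugify_drops_punctuation() -> None:
    assert slugify("Hello, World!") == "hello-world"
```

Saved as test_slug.py, the file can be run with the pytest command. Mypy would check the str annotations, and Black would normalize the layout.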
Essential dependencies
Times & dates
- Dateutil extends the built-in datetime
- Arrow is easier to work with than datetime, especially when it comes to handling timezones
Debugging
- PySnooper logs what each line of your function is doing
- Stackprinter adds context to exceptions
Command line interfaces
- Click is a general framework for CLIs
- Rich formats text
- Fire automatically generates a CLI for an object or class
- PyInquirer handles multi-stage prompts
Web servers
- Django is the batteries-included web framework for Python, like Rails is to Ruby
- Flask is for simple web servers
- Gunicorn is a popular server-runner for Python web applications
Databases
- SQLAlchemy is an ORM for SQL databases
- Psycopg2 is a driver for PostgreSQL databases
- Mysql-connector-python is the official driver for MySQL databases
- Pymongo is the official driver for MongoDB
- Kafka-python is a client to Kafka
- Redis-py is the official Python client to Redis
- Walrus makes redis-py easier to use
Desktop applications
Scientific computing
- NumPy provides high-performance array routines
  - Many libraries in the scientific computing and machine learning sections use NumPy
- Pandas processes tabular (rows and columns) data efficiently
- SciPy contains countless mathematical functions, algorithms, and data structures
  - K-means clustering algorithms (commonly used in machine learning) are provided as well
- Matplotlib can output over a dozen (interactive) plot types to image formats, HTML, and more
- Seaborn wraps matplotlib to provide statistical data visualizations
- Bokeh creates rich interactive HTML visualizations
Machine learning
- OpenCV enables realtime computer vision applications
- TensorFlow is a fully-featured framework for machine learning models from Google
- PyTorch is a fully-featured framework for machine learning models from Facebook
- Keras is built on top of TensorFlow and makes deep learning approachable
- Scikit-learn provides machine learning algorithms for data analysis
Text processing
- NLTK offers classification, tokenization, stemming, tagging, parsing, and semantic reasoning
- PyParsing makes it easy to write parsers
Image processing
- Scikit-image provides hundreds of functions for image processing
- Pillow
Web scraping
- BeautifulSoup parses HTML and XML
- Scrapy is a framework for web scraping
- Selenium controls a real web browser like Chrome, Firefox, or Safari
  - Selenium is often used for end-to-end (e2e) tests
- RoboBrowser combines BeautifulSoup and Requests to provide a Selenium-like API