User Tools

Site Tools


python

Python

Versions

Python 3 is NOT backward compatible with version 2.

  • Python 2.7 - release 2010, end of support 2020
  • Python 3.8 - release 2019-Oct, end of support 2024-Oct
  • Python 3.9 - release 2020-Oct, end of support 2025-Oct

print

  • in version 2, print is a statement: print 'hello'
  • in version 3, print is a function: print('hello')

string

  • in v2, strings are ASCII by default. Use u'to force unicode' in v2.
  • in v3, strings are UTF-8 by default

division

  • in v2, 3/2 returns 1. Use 3.0/2.0 to force float division.
  • in v3, 3/2 returns 1.5

https://www.appdynamics.com/blog/engineering/the-key-differences-between-python-2-and-python-3/

Components

Physically, python is a set of files on a computer.

Executables

/usr/bin/python3.8

  • python - interpreter
  • pip - package installer
  • pyclean
  • pycompile

Configuration files

/usr/bin/x86_64-linux-gnu-python3.8-config # an executable script

/etc/python3/debian_config # an ini file

Standard Library

$ /usr/lib/python3.8

Installed Packages

$ ls -al /usr/lib/python3/dist-packages
$ pip list

Installable Packages

See https://pypi.org/

$ pip search

Data Types

String

Encoding

https://docs.python.org/3/howto/unicode.html

Strings, bytes, unicode.

  • In python2, string and byte are identical.
  • In python3, string and byte and bytearray are different.
  • Python source code is UTF-8.
  • By default a quoted string literal is a UTF-8 encoded string. As in ‘str’.
bytes = str.encode(‘utf-8’)  # convert a string to a byte object
str = bytes.decode(‘utf-8’)  # convert a byte object to a string

Formatting

r'raw string'
b'byte array'
f'formatted string'

Mutable vs Immutable

  • immutable: integer, string, tuple, set
  • mutable: list, dict
  • mutables have methods that change the data: append(), insert(), push(), etc.
  • when you set a variable to an immutable, you create a new object
x = 1    # integer is immutable
x = 2    # does not change x, creates a new x
a = [1,2,3]    # list is mutable
t = (1,2,3)    # tuple is immutable
s = (1,2,3,1)  # set is immutable

d = {'a':1, 'b':2}  # dict is mutable
s = 'a'        # immutable string
s += 'b'       # slow, creates new object 
l = ['a']      # mutable list
l.append('b')  # fast, modifies existing object

Collections

Collections are objects.

Native Classes

list

mutable, square brackets

a = list(1,2,3)  # constructor for object of class list
a = [1,2,3]  # shortcut using square brackets
tuple

immutable, parens

a = tuple(1,2,3)  # constructor for object of class tuple
a = (1,2,3)  # shortcut using parens
set

immutable, no order, no dupes, curly braces

a = set(1,2,3,1)  # constructor for object of class set
a = set(1,2,3,1)  # shortcut
len(set) => 3   # does not accept dupe value
a = {}    # does NOT create empty set, creates empty dict
dict

mutable, key value pairs, curly braces

a = dict('name'='John', age=32)  # constructor for object of class dict
a = {'name'='John', age=32}    # shortcut with curly braces

More Collections

from collections import namedtuple  # derived from tuple
from dataclasses import dataclass   # derived from dict

source:
https://www.youtube.com/watch?v=W8KRzm-HUcc
https://www.youtube.com/watch?v=daefaLgNkw0

Functions

The following terms are generic programming concepts, not limited to Python.

First-Class Function

A first-class function, like a first-class citizen can be treated the same as any other entity. It can be:

  • assigned to a variable
  • passed as an argument
  • returned from a function
Higher Order Function

Accepts a function as an argument, or returns a function as output.

First Order Function

Not a higher order function.

Closure

A inner-function returned as output that retains access to variables of it's outer function.

def outer_func():
  message = 'Hi'
  def inner_func():
    print(message)      # free variable, not defined  in inner function
  return inner_funcbb   # return a function

clo = outer_func()
clo()                  # 'Hi', still has access to free variable: message
print(clo.__name__)    # 'inner_func'

Arguments

function(*args, **kwargs)

Objects

Everything is an object.

Formal OOP can be done using the class keyword, complete with inheritance.

Identity

Every object has an id.
The builtin function id(object) gives up the id.
The “is” keyword compares for identical objects.

a = [1,2,3]      # create object
id(a)            # get its id
b = a            # copy pointer to object
id(a) == id(b)   # True
a is b           # equivalent to above, compares ids

Iterables and Iterators

An iterable is an object with an __iter__() method.

The __iter__() returns an iterator.

An iterator is an object with a __next__() method, and an __iter__() method that returns itself.

Comprehensions

Reverse the syntax of a for-loop, when copying a list, set, or dict.

nums = [1,2,3,4]

evens = []   # for-loop syntax
for n in nums:
  if n%2 == 0:
    evens.append(n*n)

evens = [n*n for n in nums: if n%2 == 0]  # comprehension syntax

Generators

What does a generator generate? Well, I suppose it generates a kind of pseudo collection.

It allows you to process a bunch of things without physically building a collection in memory.

It can assume the syntax of a function, or an expression.

Generator Function

Like an iterator, but does not build collection in memory.

The yield keyword is used to give control to the caller.

def gen_func(nums):
  for n in nums:
    yield n*n  # returns to calling loop with next item

my_gen = gen_func([1,2,3,4])

for i in my_gen:
  print i   # yield calls back here

Generator Expression

Similar to Comprehension syntax, but with parens, and it does not build a collection in memory.

my_gen = (n*n for n in [1,2,3,4])

for i in my_gen:
  print   

Decorators

In decorating a function, we create a wrapper function that adds functionality to that original function. This can be done with a class, but is usually done with a function. A decorator function is a closure that takes the original function as an argument, and returns the wrapper function that calls that original function.

def decorator_function(original_function):
  def wrapper_function():
    return original_function()
def display():
  print('display function ran')
decorated_display = decorator_function(display)
decorated_display()  # prints the output 'display function ran'

Now the @ syntax.

@decorator_function
def display():
  print('display function ran')

is equivalent to:

def display():
  print('display function ran')
display = decorator_function(display)

Decorators can be stacked.

from functools import wraps

def my_logger(orig_func):
  import logging
  logging.basicConfig(filename='{}.log'.format(orig_func.__name__),level=logging.INFO)
  
  @wraps(orig_func)
  def wrapper(*args, **kwargs):
    logging.info(f'Ran with args: {args}, and kwargs: {kwargs}')
    return wrapper
    
def my_timer(orig_func):
  import time
  
  @wraps(orig_func)
  def wrapper(*args,**kwargs):
    t1 = time.time()
    result = orig_func(*args, **kwargs)
    t2 = time.time() - t1
    print(f'{orig_func.__name__} ran in: {t2} sec'
    return result
  return wrapper

@my_logger
@my_timer
def display_info(name,age):
  print(f'display_info ran with arguments({name}, {age}f')

display_info('Hank',30)

source: https://www.youtube.com/watch?v=FsAPt_9Bf3U&t=997s

Context Manager

keyword with

There have always been amateur programmers who don't know what the fuck they're doing. They refuse to check return codes, they forget to close files, they forget to free memory handles… And instead of taking responsibility, fixing their mistakes and vowing to do better in the future, they point the finger at the language. Then they jump through hoops trying to fix the language and in the process they fuck things up for the professional programmers who do know what they're doing.

First example, once we had a bunch of amateurs writing unmanageable spaghetti code. Instead of either training them or firing them, some group of boneheads decided it was the fault of the goto statement. So they outlawed the goto statement. That made it more difficult for responsible programmers to work, and we still had those incompetents running around causing trouble.

Next case, they said “I shouldn't have to check return codes”, and came up the the try-catch construct, which is an abomination in any language. Now, the competent programmers have to check return codes AND catch errors. Double work.

Other bullshit constructs:

  • the Entity Bean in Java, for people who don't understand relational database
  • the Promise in Javascript, and believe me if you're not capable of writing asynchronous event-driven code, you're not going to be able to use a Promise either.

And now we come to the case at hand. Amateurs often forget to release resources. Instead of admitting a mistake and improving their technique, they blame the language. “The system should know when I'm finished with that file and close it for me.” Hence, the context manager.

What if this happened in other walks of life? What if you took your kid to school, and forgot to pick him up?

A competent coder writes like this:

f = open('myfile.text', 'w')
f.write('I hate stupid people.')
f.close()

The shit-for-brains who can't pay attention long enough to close the file would have us write it like this:

with open('myfile.text', 'w') as f:
  f.write('I hate stupid people.')

The “with” keyword calls upon a context manager.

A context manager can be implemented as a class like this:

class Open_File():
  def __init__(self, filename, mode):
    self.filename = filename
    self.mode = mode
  def __enter__(self):
    self.file = open(self.filename, self.mode)
    return self.file
  def __exit__(self):
    self.close()

A context manager can also be implemented as a decorated function like this:

from contextlib import contextmanager

@contextmanager
def open_file(file,mode):
  f = open(file, mode)
  yield f
  f.close()

Prove to yourself the file was closed like this:

print(f.closed)

Structure

module, package, library, framework, application

module = source code file

package = folder tree of files

Folder and filenames, all lowercase.

Top- level folder of a package

  • Name is name of package
  • Does not contain source code files
  • Contains file named init.py which may be empty

import module

  • built-in
  • third party
  • my own

source: https://docs.python-guide.org/writing/structure/

Concretely, the import modu statement will look for the proper file, which is modu.py in the same directory as the caller if it exists. If it is not found, the Python interpreter will search for modu.py in the “path” recursively and raise an ImportError exception if it is not found.

import modu

Makes the module's functions available in the module namespace.

from modu import *

Makes the module's functions available in the local namespace of the caller.

Any directory with an init.py file is considered a Python package.

Modules

A module is a file.

For filename mymod.py, the module name is mymod.

A module can be run directly by filename.

A module can be imported by module name.

import

Importing a module runs that file.

sys.modules - a list of modules already imported with module name and filename. A module is imported only once.

After running, the variables in that module are available as qualified by module name.

import mymod
print(mymod.x)

from mymod import x
print(x)

from mymod import x as y
print(y)

from mymod import *
print(x)

import

  • find the file by searching sys.path
  • if the module is already present in sys.modules, ignore
  • run the file
  • insert modulename and filename into sys.modules

sys.path

Modules are loaded from a directory in the sys.path list.

import sys
print(sys.path)
sys.path.append('~/.local/python')  # sys.path is a list

Directories on the path include in order:

  1. The directory from which the current module was loaded.
  2. Python path environment variable
    • $ export PYTHONPATH='~/.local/python'
  3. Standard libraries directories
  4. The site packages directories

standard library

import random
import math
import datetime
import sys
import os
import calendar
import antigrav

Examine a Module

if __name__ == '__main__':
  print('run directly')
else:  # __name__ == modulename
  print('run on import')

print(antigrav.__file__)  # show the directory and filename of the module
dir(antigrav)             # list variables and functions in the module

Packages

A package is a directory of files.

__init__.py is an optional file that, if present, is run when the package is imported. NOTE. Much of the online documentation says an __init__.py file is required to make the directory a package. That is no longer true as of python version 3.3.

Like a module, a package can create a namespace, depending on how it is imported.

import pack.mod
pack.mod.func()

from pack import mod
mod.func()

Package Management

pip list # all installed packages

PIP3 Install

Local

pip3 install websockets
~/.local/lib/python3.8/site-packages/

Global

sudo pip3 install websockets
/usr/local/lib/python3/dist-packages/

As of Oct 2020, on RacerSwift, I have installed these packages, all global.

  • numpy
  • opencv-python
  • websockets
  • python-networkmanager
  • psutil

As of Oct 2020, on a2hosting, sudo is not available. All installs are local, within a virtual environment.

PIP

Pip Installs Packages (PIP)

Python Package Index (PyPI) hosted by Python Software Foundation

Included by default in Python 3.4 and later.

$ pip install opencv-python
$ pip uninstall opencv-python
$ pip --version
$ pip help
$ pip list

apt

Installs packages from Ubuntu repositories hosted by Canonical.

Compared to pip, apt supports fewer packages, only one version of each package, sometimes different package names, no support for virtual env.

$ sudo apt install python-opencv

Virtual Environments

Python programmers are not big on backward compatibility. Python version 3 is not compatible with version 2. Django version 3 is not compatible with version 2. When you upgrade to new versions, applications importing those packages may break. So, do you upgrade and rewrite, or stick with your old versions?

You don't have to make that choice. You can create multiple environments, each with a specific set of python and libraries. If necessary, you can create a separate environment for every project.

A virtual environment creates a local area in which to install python and packages. This allows multiple versions of python and/or packages on one machine. A2 hosting uses this to keep shared host customers separate.

There are three mechanisms for managing virtual environments, described here in chronological order: virtualenv, venv, pipenv.

virtualenv

$ virtualenv myenv1           # create a directory amed myenv1
$ source myenv/bin/activate   # sets aliases
$ which python                # display path of env is active
$ which pip
$ pip list                    # immediately after create, pip and setup tools only
$ deactivate                  # reset aliases to revert to global python
$ rm -rf myenv                # delete the environment

$ pip freeze --local >requirements.txt

venv

More modern version of virtualenv, part of standard library. Can NOT be used to install older versions of python.

venv 

pipenv

Also part of the standard library, collapses venv syntax into pip.

pipenv 

Capabilities

CLI

  • argparse - standard library
  • click
  • docopt
  • plac
  • cliff
  • cement
  • fire

repl

REPL stands for Read, Evaluate, Print, Loop. The REPL is how you interact with the Python Interpreter.

Sometimes aka serial terminal.

  • pyserial
  • tinycom
  • GNU Readline - not a python package, but a linux utility

Run Shell Commands

Your choice of two packages, os and subprocess.

$ os.system('nmcli dev wifi connect JASMINE')   # return 0

$ subprocess.run()
$ subprocess.check_output()

GUI

  • tkinterIO
  • Qt
  • WxPython
  • Remi
  • PySimpleGUI - wraps tkinter or Qt
  • PySimpleWx - wraps WxPython
  • PySimpleGUIWeb - wraps Remi

Web Framework

Top 3

  • Django (43%) - full-stack framework
  • Flask (41%) - micro framework
  • Tornado (6%) - micro asynchronous framework

Networking

  • sockets - widely-supported multi-platform protocol for IPC
  • http.server - implements a complete webserver
  • websockets - defined in html 5, built on asyncio

socket

Two ways to address a socket:

  • address_family=AFI_UNIX, server_address='50000' # str: socket number
  • address_family=AFI_INET, server_address=('127.0.0.1',8080) # tuple: (host, port)

The first is restricted to sockets on the same computer. The second can be used across the network.

A websocket cannot talk directly to a socket because the data packet is framed differently. A middleman is needed to translate.

Datetime

timestamp

$ import uuid
$ uuid.uuid4() # generate a unique id

Concurrency

Threading refers to the concept of threading multiple threads of processing through a single CPU.

Process vs thread. In OS theory, an executable becomes a process and the programmer can split the processing into multiple threads. A process can have multiple child threads.

In Linux, a thread is a type of process. Threads and processes are both created with the clone() function. Task is the generic term to refer to either a process or a thread.

Task switching is performed by the scheduler.

Context switching is the sometimes expensive process of swapping memory and registers when preparing to switch from one task to another.

The one OS scheduler is used to switch context to a process and to a thread within a process. The scheduler does preemptive task switching. Because of the global interpreter lock (GIL), a Python process only runs one thread at a time.

To write a program that can take advantage of multiple CPU’s and run multiple tasks simultaneously, one must use multiprocessing, as opposed to threading.

threading - The stock Python threading package uses OS threads and the OS scheduler running in kernel space. This system uses a preemptive threading model. This system has heavy context-switching costs and is therefore not scalable; increased numbers of threads slow the system exponentially.

preemptive threading - The scheduler interrupts a thread at any time, at any point in its processing. The Linux kernel scheduler is preemptive.

cooperative threading - There is no scheduler preemptively switching between threads. The programmer voluntarily gives up control at specific points in the code. Implementations in Linux are custom and running in user space. These systems usually have low context-switching costs and are scalable to large numbers of threads.

Asynchronous IO - A language-independent model describing an implementation of cooperative threading.

asyncio - A Python library implementing asynchronous IO. In asyncio, a thread loop is called a coroutine. Two new Python language keywords were created for use with the asyncio package: async and await.

Green-threads, greenlet, tasklets, eventlet, gevent - Alternative implementations of cooperative threading in Python.

Python library CPU’s processes threads Speed up:
threading 1 1 multiple IO-bound tasks
asyncio 1 1 1 IO-bound tasks
multiprocessing multiple multiple optional CPU-bound tasks

Parallel Processing

Threads and Processes

import threading
t = threading.**Thread**(target=do_something)
t.start()
t.join()
import multiprocessing
p = multiprocessing.**Process**(target=do_something, args=[1.5])
p.start()
process.join()
import concurrent.futures
with concurrent.futures.**Thread**PoolExecutor() as executor:
      executor.map(do_something, list_of_things)
      
with concurrent.futures.**Process**PoolExecutor() as executor:
      executor.map(do_something, list_of_things)

source: https://www.youtube.com/watch?v=fKl2JW_qrso&t=1s

asyncio

The asyncio package simulates threading, with supposedly less overhead.

Along with the package come two new keywords:

  • async - defines a coroutine
  • await - calls a coroutine, and yields control to other virtual threads

A coroutine is defined with async def. The first coroutine is called with asyncio.run(fn()). This is the corollary to creating a thread. The coroutine may call only coroutines and must use the await keyword to do so. The await keyword gives control up for other virtual threads to run.

NOTE: Websockets are built on asyncio.

software architecture options

synchronous

  • a set of tasks must be executed one-at-a-time in a specific order *

async

  • a set of tasks can be run in any order, and maybe simultaneously
  • each task may be interruptible

blocking

  • with a synchronous design, long-running tasks run until completion, blocking any other tasks
  • python's native http libraries all block between request and response
  • * over 1700 python libraries for http, and every one of them blocks:
  • * * requests.get( url) blocks
  • * * http.client.getResponse() blocks
  • * * urllib3.PoolManager().request() blocks
  • * * etc
  • this is the opposite of http's design intention: an unthinkable * implementation
  • the requests library has the tagline: “http for humans”
  • * this implies:
  • * * the library authors find http progamming difficult
  • * * so they are here to save the day by simplifying it
  • * in fact:
  • * * a better tagline would be: “http for amateurs”

event-driven

  • event-driven design is one example of async design
  • using callbacks or signals
  • make an http request and return immediately
  • on receipt of the response, set a signal or call a callback

cooperative multitasking

  • asyncio is an example of cooperative multitasking
  • the programmer inserts an await directive into a function
  • * where it is appropriate to interrupt the function and allow other tasks to run

preemptive multitasking

  • the os decides when to interrupt a task

asyncio

  • a python library
  • implements cooperative multitasking
  • def async functionname(): - defines an interruptible function
  • await - a directive placed within an async function to yield cpu to other async functions
  • all threads run in one process, which means
  • * they can theoretically run async (non-blocking), but not simultaneously
  • the python http libraries do not allow a place to put the await directive
  • * between the request and the response,
  • * so the asyncio library does not solve the http blocking problem
  • because we call requests.get() within a subprocess
  • * the combinatino of asyncio and subprocess might work

threading

  • a python library
  • programmer can start multiple threads and assign functions to each thread
  • all threads run in one process
  • the global interpreter lock (GIL) - in CPython, a predominate python interpreter -
  • * prevents two threads from executing python code simultaneously,
  • therefore, threads can execute async (non-blocking), but not simultaneously

multiprocessing

  • a python library
  • similar to threading, but using “process” instead of “thread”
  • each process can be assigned to a different cpu or core,
  • * allowing multiple tasks to run simultaneously
python.txt · Last modified: 2024/05/23 11:04 by jhagstrand

Except where otherwise noted, content on this wiki is licensed under the following license: Public Domain
Public Domain Donate Powered by PHP Valid HTML5 Valid CSS Driven by DokuWiki