Feineigle.com - Software Architecture with Python


Published: August 29, 2018
Tags:  Programming · Python



The book in...
One sentence:
Architecture (high level) versus design (low level).

Five sentences:
You can't cover every aspect of architecture (or design) in one book. Every chapter's topic is more of a crash course and could easily be (and has been) the subject of its own book. That said, it was an interesting read for someone who has no plans or desire to build a system at the scale the book discusses. So who should read this, and when? It is probably best suited to someone with programming experience who wants to scale up their career.



Thoughts

Architecture is different from design. It is done from a higher, enterprise, level and sees everything from the top down. The components of the architecture can be delegated to subsystems architects, each responsible for their piece. (security, technology, data, etc)

Even if you didn’t explicitly map out an architecture you still have one. You always have one, even if it might not be formal.

Some things to think about at the architecture level:

Some of the documentation practice is silly. A loop contains the following:

  # sleeps for 1 second every time
  time.sleep(1)

No shit! From what I have learned, it is best to write self-documenting code. In this case the time.sleep() function is pretty self-documented.

There is some suspect advice on potentially using multiple return types. It also mentions returning objects that can hold data (like url.content for example) or returning tuples that contain error codes.
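To show why mixed return types make me uneasy, here is a minimal sketch of my own (not the book's code). The names fetch_mixed, fetch_consistent, and FetchResult are hypothetical, and the "fetch" is faked with a string check so the example runs offline.

```python
from typing import NamedTuple


class FetchResult(NamedTuple):
    """Hypothetical container: one return type for every outcome."""
    ok: bool
    content: str
    error: str = ""


def fetch_mixed(url):
    """Suspect style: a str on success, a (code, message) tuple on failure."""
    if not url.startswith("http"):
        return (400, "bad url")
    return "<html>...</html>"


def fetch_consistent(url):
    """Always returns a FetchResult, so callers never need isinstance checks."""
    if not url.startswith("http"):
        return FetchResult(ok=False, content="", error="bad url")
    return FetchResult(ok=True, content="<html>...</html>")
```

With fetch_mixed every caller has to inspect the type of what came back before using it. With fetch_consistent the caller checks one field.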

Some tools to make your code cleaner:

Some things to be on the lookout for within your code:

Chapter three talks about testability and test strategies. It offers advice on using stubs and mocks and when to white/black box your tests.
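The stub/mock distinction is easier to see in code. This is a small sketch of my own using the standard library's unittest.mock. The function get_title and the fake fetchers are hypothetical names for illustration only.

```python
from unittest import mock


def get_title(fetch, url):
    """Return the upper-cased title from whatever `fetch(url)` returns."""
    page = fetch(url)
    return page["title"].upper()


def stub_fetch(url):
    """Stub: a canned answer, with no record of how it was called."""
    return {"title": "hello"}


# Stub usage: only the output of get_title is checked.
assert get_title(stub_fetch, "http://example.com") == "HELLO"

# Mock usage: the test also verifies the interaction itself.
mock_fetch = mock.Mock(return_value={"title": "hello"})
assert get_title(mock_fetch, "http://example.com") == "HELLO"
mock_fetch.assert_called_once_with("http://example.com")
```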

Specifically it mentions unittest, nose, and py.test.

Code coverage is covered here alongside integration tests. For example, a unittest test exercises a single unit, while an integration test (with selenium, say) exercises the whole system.

And you won’t get out of the chapter without the buzzword test driven development. (not that it is terrible)

Chapter four is definitely adjacent to chapter three, literally and figuratively. It talks about testing, but for performance. Should you test for performance throughout development (I say no) or should you do it at the end (the end? modern software has that?)?

There is discussion on big O complexity, data structures, and some tools to aid you in squeezing every bit out of your code.
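A quick way to feel the difference a data structure makes is to time membership tests on a list against a set. This is my own sketch using the standard timeit module, not an example from the book. The sizes and repeat counts are arbitrary assumptions.

```python
import timeit

n = 100_000
as_list = list(range(n))
as_set = set(as_list)
needle = n - 1  # worst case for the list: the last element

# Membership is O(n) for a list and O(1) on average for a set.
list_time = timeit.timeit(lambda: needle in as_list, number=200)
set_time = timeit.timeit(lambda: needle in as_set, number=200)
print(f"list: {list_time:.4f}s  set: {set_time:.4f}s")
```

The exact numbers depend on the machine, but the list lookup should be slower by several orders of magnitude.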

With chapter five it is all about scalability (horizontal and vertical). Can you take advantage of concurrency? What kind of latency and performance do you have?

An example of a producer/consumer to output thumbnails is given.
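For reference, the general shape of a producer/consumer looks something like the sketch below. This is not the book's thumbnail code: the image work is replaced by a string stand-in, and run, producer, consumer, and SENTINEL are names I made up.

```python
import queue
import threading

SENTINEL = None  # tells a consumer to stop


def producer(q, image_urls, n_consumers):
    """Put work on the queue, then one sentinel per consumer."""
    for url in image_urls:
        q.put(url)
    for _ in range(n_consumers):
        q.put(SENTINEL)


def consumer(q, results, lock):
    """Take URLs off the queue and 'thumbnail' them until the sentinel arrives."""
    while True:
        url = q.get()
        if url is SENTINEL:
            break
        thumb = f"thumb_{url}"  # stand-in for the real resize-and-save step
        with lock:
            results.append(thumb)


def run(image_urls, n_consumers=2):
    """Start the consumers, feed them, and wait for them to finish."""
    q = queue.Queue(maxsize=10)  # bounded, so the producer cannot run far ahead
    results, lock = [], threading.Lock()
    workers = [threading.Thread(target=consumer, args=(q, results, lock))
               for _ in range(n_consumers)]
    for w in workers:
        w.start()
    producer(q, image_urls, n_consumers)
    for w in workers:
        w.join()
    return results
```

Calling run(["a.png", "b.png"]) returns the two stand-in thumbnail names, in whatever order the consumers finished.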

Should you use locks or semaphores (which were faster by 4x)?

Under certain conditions, can you throttle aspects of your program? If you are going to use concurrency, will you take advantage of threads or processes?
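One common way to throttle with threads is a semaphore that caps how many tasks run at once. The sketch below is mine, not the book's. MAX_CONCURRENT and the sleep are placeholder assumptions standing in for a real limit and real work.

```python
import threading
import time

MAX_CONCURRENT = 3  # assumed limit, pick whatever the resource can bear
slots = threading.BoundedSemaphore(MAX_CONCURRENT)
stats = {"active": 0, "peak": 0}
stats_lock = threading.Lock()


def throttled_task():
    """Run only while a slot is free; record the peak concurrency seen."""
    with slots:  # blocks while MAX_CONCURRENT tasks are already running
        with stats_lock:
            stats["active"] += 1
            stats["peak"] = max(stats["peak"], stats["active"])
        time.sleep(0.01)  # stand-in for real work
        with stats_lock:
            stats["active"] -= 1


threads = [threading.Thread(target=throttled_task) for _ in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print("peak concurrency:", stats["peak"])
```

Ten threads are started, but the recorded peak never exceeds the semaphore's limit.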

When you deploy, which interpreter will you choose?

Finally it ends with some suggested best practices.

Chapter six deals with security, but is a little lacking. This can be forgiven since the book tries to cover a lot and security deserves its own book.

It starts with what kind of protection you are looking for: confidentiality, integrity, availability?

How is access going to be handled: authentication, authorization, non-repudiation?

There is a warning that eval is unsafe and that you should never unpickle data from unknown sources (use JSON or YAML instead).
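As a rough illustration of the safer alternatives (my own sketch, standard library only): ast.literal_eval accepts only Python literals, and json moves plain data around without executing anything. The "malicious" string is a made-up example of what eval would happily run.

```python
import ast
import json

untrusted = "[1, 2, {'a': 3}]"

# ast.literal_eval only accepts Python literals; it never calls functions.
parsed = ast.literal_eval(untrusted)

malicious = "__import__('os').system('echo pwned')"
rejected = False
try:
    ast.literal_eval(malicious)
except ValueError:
    rejected = True  # a call expression is not a literal

# For data exchange, JSON round-trips plain data without executing anything.
payload = json.dumps({"user": "alice", "roles": ["admin"]})
restored = json.loads(payload)
```

If you go the YAML route with PyYAML, the same caution applies: use yaml.safe_load rather than yaml.load on untrusted input.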

They don’t say it but I will: never trust your users. Always validate and sanitize input.

Don’t use the old template method; instead use "{}".format().

There is a little on passwords. Basic stuff is covered, like: don’t store passwords, store hashes; and don’t reference passwords in functions. Also, use a password library. There is no need to reinvent the wheel (and yours will probably be poorly implemented).

While I personally like to keep everything in-house to reduce reliance on libraries as much as reasonable, passwords are something you shouldn’t take a risk with.
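To make "store hashes, not passwords" concrete, here is a minimal sketch that sticks to the standard library (hashlib.pbkdf2_hmac and hmac.compare_digest). It is not the book's code. ITERATIONS is an assumed work factor, and a dedicated password library would be the better choice in production.

```python
import hashlib
import hmac
import os

ITERATIONS = 200_000  # assumed work factor; tune for your hardware


def hash_password(password, salt=None):
    """Return (salt, digest) using PBKDF2-HMAC-SHA256."""
    if salt is None:
        salt = os.urandom(16)  # fresh random salt per password
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"),
                                 salt, ITERATIONS)
    return salt, digest


def verify_password(password, salt, expected):
    """Recompute the digest and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected)
```

Only the salt and digest get stored. At login you recompute with the stored salt and compare.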

Keeping with the thin content, chapter seven touches on design patterns. You can read many (better) books on this (like The Gang of Four).

They break the patterns down into: Creational, Structural, and Behavioral patterns.

Chapter eight is again a bit skimpy with limited coverage of the model/view/controller architecture.

It mentions event driven architecture and some examples and use cases: chat servers, select, sockets, twisted, eventlet, gevent.
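The core of the event driven style is a loop that waits for readiness events and dispatches to callbacks. Below is a tiny sketch of my own using the standard selectors module (a higher-level wrapper over select). A socketpair stands in for a real network connection so it runs in one process, and on_readable is a hypothetical callback name.

```python
import selectors
import socket

sel = selectors.DefaultSelector()
server_side, client_side = socket.socketpair()  # in-process stand-in for a connection
received = []


def on_readable(conn):
    """Callback invoked when `conn` has data; echo it back upper-cased."""
    data = conn.recv(1024)
    received.append(data)
    conn.sendall(data.upper())


# Register interest in "readable" events and attach the callback as data.
sel.register(server_side, selectors.EVENT_READ, on_readable)

client_side.sendall(b"hello")

# One turn of the event loop: wait for events, dispatch to callbacks.
for key, _mask in sel.select(timeout=1):
    key.data(key.fileobj)

reply = client_side.recv(1024)

sel.unregister(server_side)
server_side.close()
client_side.close()
sel.close()
```

Frameworks like twisted, eventlet, and gevent wrap this same wait-and-dispatch idea in friendlier APIs.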

My least favorite thing, microservices, comes next. They are not an architecture for a particular problem and can be used in many places. What they don’t mention is the fragility this can lead to.

Pipe and filter architecture

It was touched on before, kind of, when they mentioned WSGI, but chapter nine is all about deploying. How are you going to set up your dev/testing/staging/production environment?

You should install/setup python packages with pip and run in a virtualenv. There is some discussion on pypi and packaging for pypi distribution: structure and imports, setup.py.

Interestingly they never mention docker, but do talk about managing with fabric vs (the better) ansible (can be run multiple times, won’t change what doesn’t need changing).

A few deployment patterns:

The final chapter, ten, is on debugging. It starts with the simple techniques: print statements peppered throughout, eliminating blocks of code, and bailing out early with sys.exit().

It circles back on itself and covers mocking again, this time with random data generation via the schematics module.

To save some bandwidth (and time and money) you can use caching, so you don’t need to hit external APIs or make expensive database calls. Some ways to handle the cache include:

Finally it looks at more advanced tools:


Table of Contents


· 01: Principles of Software Architecture

page 3:

· 02: Writing Modifiable and Readable Code

page 39:
page 64:

· 03: Testability - Writing Testable Code

page 105:
$ pip install coverage
page 117:

· 04: Good Performance is Rewarding

page 159:
$ pip3 install line_profiler
page 161:
$ pip3 install memory_profiler
page 179:
>>> from collections import OrderedDict
>>> cities = ['Jakarta','Delhi','Newyork','Bonn','Kolkata',
              'Bangalore','Bonn','Seoul','Delhi','Jakarta','Mumbai']
>>> cities_odict = OrderedDict.fromkeys(cities)
>>> print(cities_odict.keys())
odict_keys(['Jakarta', 'Delhi', 'Newyork', 'Bonn', 'Kolkata',
            'Bangalore', 'Seoul', 'Mumbai'])
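
A side note of my own: since Python 3.7 plain dicts preserve insertion order, so the same order-preserving dedupe works without OrderedDict. A short sketch, with unique_cities as my own variable name:

```python
cities = ['Jakarta', 'Delhi', 'Newyork', 'Bonn', 'Kolkata',
          'Bangalore', 'Bonn', 'Seoul', 'Delhi', 'Jakarta', 'Mumbai']

# dict.fromkeys keeps the first occurrence of each key, in insertion order.
unique_cities = list(dict.fromkeys(cities))
print(unique_cities)
```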

· 05: Writing Applications that Scale

page 191:
page 235:

· 06: Security - Writing Secure Code

· 07: Design Patterns in Python

page 339:
class Borg(object):
  """ I ain't a Singleton """

  # Class-level dict shared by every instance (and every subclass instance)
  __shared_state = {}

  def __init__(self):
    self.__dict__ = self.__shared_state

class IBorg(Borg):
  """ I am a Borg """

  def __init__(self):
    Borg.__init__(self)
    self.state = 'init'

  def __str__(self):
    return self.state

>>> i1 = IBorg()
>>> i2 = IBorg()
>>> print(i1)
init
>>> print(i2)
init
>>> i1.state='running'
>>> print(i2)
running
>>> print(i1)
running
>>> i1==i2
False
page 340:
>>> i1.x='test'
>>> i2.x
'test'
page 341:
>>> class ABorg(Borg):pass
...
>>> class BBorg(Borg):pass
...
>>> class A1Borg(ABorg):pass
...
>>> a = ABorg()
>>> a1 = A1Borg()
>>> b = BBorg()

#Now let's attach a dynamic attribute x to a with value 100:

>>> a.x = 100
>>> a.x
100
>>> a1.x
100

# Let's check if the instance of the sibling class Borg also gets it:

>>> b.x
100

· 08: Python – Architectural Patterns

page 439:
page 443:
# pipe_recent_gen.py
# Using generators, print details of the most recently modified file
# matching a pattern.

import glob
import os
from time import sleep

def watch(pattern):
  """ Watch a folder for modified files matching a pattern """
  while True:
    files = glob.glob(pattern)
    # sort by modified time
    files = sorted(files, key=os.path.getmtime)
    recent = files[-1]
    yield recent
    # Sleep a bit
    sleep(1)

def get(input):
  """ For a given file input, print its meta data """
  for item in input:
    data = os.popen("ls -lh " + item).read()
    # Clear screen
    os.system("clear")
    yield data

if __name__ == "__main__":
  import sys
  # Source + Filter #1
  stream1 = watch('*.' + sys.argv[1])
  # Filter #2 + sink (created once; the generator keeps its position)
  stream2 = get(stream1)
  while True:
    print(next(stream2))
    sleep(2)

· 09: Deploying Python Applications

continuous
blueGreen
canary
A/B testing
induced chaos

· 10: Techniques for Debugging

page 497:
import hashlib
import json
import os

import requests  # third-party; used by api_search below

def unique_key(address, site):
  """ Return a unique key for the given arguments """
  return hashlib.md5(''.join((address['name'],
                              address['street'],
                              address['city'],
                              site)).encode('utf-8')).hexdigest()

def filecache(func):
  """ A file caching decorator """
  def wrapper(*args, **kwargs):
    # Construct a unique cache filename
    filename = unique_key(args[0], args[1]) + '.data'
    if os.path.isfile(filename):
      print('from file')
      # Return cached data from file
      with open(filename) as f:
        return json.load(f)
    # Else compute and write into file
    result = func(*args, **kwargs)
    with open(filename, 'w') as f:
      json.dump(result, f)
    return result
  return wrapper

@filecache
def api_search(address, site='yellowpages.com'):
  """ API to search for a given business address
      on a site and return results """
  # NOTE: get_api_key and search_api are not defined in this excerpt
  req_params = {}
  req_params.update({'key': get_api_key(site),
                     'term': address['name'],
                     'searchloc': '{0}, {1}, {2}'.format(address['street'],
                                                         address['city'],
                                                         address['state'])})
  return requests.post(search_api % locals(), params=req_params)
page 498:
from redis import StrictRedis

def memoize(func, ttl=86400):
  """ A memory caching decorator """
  # Local redis as in-memory cache
  cache = StrictRedis(host='localhost', port=6379)
  def wrapper(*args, **kwargs):
    # Construct a unique key
    key = unique_key(args[0], args[1])
    # Check if it's in redis
    cached_data = cache.get(key)
    if cached_data is not None:
      print('from cache')
      return json.loads(cached_data)
    # Else calculate and store with a TTL (in seconds)
    result = func(*args, **kwargs)
    cache.set(key, json.dumps(result), ex=ttl)
    return result
  return wrapper