
Monday, December 9, 2013

A few words on boolean casting/mapping in python

So I thought I'd write a few words about truth value testing in python. It's quite common to write if some_object conditionals, and I believe that programmers are not always aware of what actually happens when such a line is evaluated; they tend to perceive it only as len(some_object) > 0 if it's a collection, or some_object is not None in other cases. We may refer to the python docs to verify it:

Called to implement truth value testing and the built-in operation bool(); should return False or True, or their integer equivalents 0 or 1. When this method is not defined, __len__() is called, if it is defined, and the object is considered true if its result is nonzero. If a class defines neither __len__() nor __nonzero__(), all its instances are considered true.
So basically, when we write a conditional such as if some_obj: it is evaluated like this:

some_obj.__nonzero__() if hasattr(some_obj, '__nonzero__') else len(some_obj) != 0 if hasattr(some_obj, '__len__') else True

So basically, conditionals should be written with premeditation, not out of laziness, especially when using 3rd-party libraries/frameworks. For example, when using python requests:

>>> import requests
>>> res = requests.get('')
>>> res
<Response [404]>
>>> bool(res)
False
>>> res = requests.get('')
>>> res
<Response [200]>
>>> bool(res)
True

In the case of python-requests, http errors are not raised by default. Instead, if a status code of 4XX or 5XX is returned, the __nonzero__ method returns False. Note that res is not None is True in both of the above cases, so the two checks are not interchangeable.
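The three rules quoted from the docs can be tried out with a few toy classes. This is only a sketch: Response, Basket and Plain are hypothetical names made up for illustration (Response is not the requests class), and __bool__ is aliased to __nonzero__ so the snippet runs on both python 2 and python 3.

```python
class Response(object):
    """Hypothetical stand-in for an HTTP response; not the requests class."""

    def __init__(self, status_code):
        self.status_code = status_code

    def __bool__(self):
        # Assumed rule for this sketch: 4XX and 5XX count as "false".
        return self.status_code < 400

    __nonzero__ = __bool__  # python 2 looks up this name instead


class Basket(object):
    """Defines only __len__, so its truth value follows its length."""

    def __init__(self, items):
        self.items = list(items)

    def __len__(self):
        return len(self.items)


class Plain(object):
    """Defines neither method, so every instance is considered true."""


not_found = Response(404)
if not not_found:
    print("falsy, although it is not None")
```

The point of the sketch is the gap between the two checks: not_found is not None holds, yet if not_found takes the else branch.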

Thursday, February 21, 2013

Python logging introspection

How many times have you used a print instruction instead of logger.debug or a similar logging call? Well, I used to do it frequently. The thing is, setting up a logger in an application that already has many of its own is problematic. There is a tool, however, that may help you identify the right place for your logger (or identify the logger you want to use).

~ $ pip install logging_tree

So what does this package do? In practice it is a logging introspection tool that recreates the tree structure of your current loggers (along with their handlers and filters). This is very useful, since you may immediately identify which logger you should use, or at least confirm that adding a new logger is necessary.

For example, if you type the following in a python terminal:

>>> import logging_tree
>>> logging_tree.printout()
   Level WARNING

Well, this is quite obvious: no modules are loaded, thus no custom logger was registered. On the other hand, let's look at the logging tree of a young django app:

In [1]: import logging_tree

In [2]: logging_tree.printout()

   Level WARNING
   |   Level INFO
   |   Handler File '/tmp/wt.log'
   |   Handler Stream <open file '<stderr>', mode 'w' at 0x7f21e9890270>
   |     Filter <django.utils.log.RequireDebugTrue object at 0x1676350>
   |   |
   |   o<--[django.db]
   |   |   |
   |   |   o<--"django.db.backends"
   |   |
   |   o<--"django.request"
   |       Level ERROR
   |       Handler <django.utils.log.AdminEmailHandler object at 0x1676790>
   |   |
   |   o<--""
   |   |
   |   o<--"nose.config"
   |   |
   |   o<--"nose.core"
   |   |
   |   o<--"nose.failure"
   |   |
   |   o<--"nose.importer"
   |   |
   |   o<--"nose.inspector"
   |   |
   |   o<--"nose.loader"
   |   |
   |   o<--"nose.plugins"
   |   |   |
   |   |   o<--"nose.plugins.attrib"
   |   |   |
   |   |   o<--"nose.plugins.capture"
   |   |   |
   |   |   o<--"nose.plugins.collect"
   |   |   |
   |   |   o<--"nose.plugins.cover"
   |   |   |
   |   |   o<--"nose.plugins.doctests"
   |   |   |
   |   |   o<--"nose.plugins.isolation"
   |   |   |
   |   |   o<--"nose.plugins.logcapture"
   |   |   |
   |   |   o<--"nose.plugins.manager"
   |   |   |
   |   |   o<--"nose.plugins.multiprocess"
   |   |   |
   |   |   o<--"nose.plugins.testid"
   |   |
   |   o<--"nose.proxy"
   |   |
   |   o<--"nose.result"
   |   |
   |   o<--"nose.selector"
   |   |
   |   o<--"nose.suite"
   |   |
   |   o<--"py.warnings"
   |       Handler Stream <open file '<stderr>', mode 'w' at 0x7f21e9890270>
   |         Filter <django.utils.log.RequireDebugTrue object at 0x1676350>
       Handler <south.logger.NullHandler object at 0x20c9350>

It's much easier to read this tree output than to get familiar with your application's logging configuration along with the documentation of every other package that uses the logging module.
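For a rough idea of where such information comes from, here is a stdlib-only sketch that lists the registered loggers with their levels and handler counts. It is not how logging_tree is implemented; describe_loggers is a hypothetical helper, and it relies on logging.Logger.manager.loggerDict, a registry that the logging module keeps but does not formally document.

```python
import logging


def describe_loggers():
    """Return (name, level name, handler count) for the root logger and all registered loggers."""
    root = logging.getLogger()
    rows = [("", logging.getLevelName(root.level), len(root.handlers))]
    registry = logging.Logger.manager.loggerDict
    for name in sorted(registry):
        entry = registry[name]
        if isinstance(entry, logging.Logger):
            rows.append((name, logging.getLevelName(entry.level), len(entry.handlers)))
        else:
            # A placeholder marks a parent name that was never requested itself.
            rows.append((name, "(placeholder)", 0))
    return rows


logging.getLogger("demo.child").setLevel(logging.DEBUG)
for name, level, handler_count in describe_loggers():
    print("%-20r %-15s handlers=%d" % (name, level, handler_count))
```

A flat listing like this loses the parent/child layout, filters and handler details, which is exactly what the tree printout adds.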

Hope this saves you a lot of time.


Thursday, February 14, 2013

A non-production function decorator

As most developers know, not every piece of code is meant to be run on a production server. Instead of using a lot of "ifs" here and there I suggest implementing a framework specific "non_production" decorator. A simple django-specific implementation could look like this:

def non_production(func):

    def is_production():
        #django specific 
        from django.conf import settings 
        return getattr(settings, "PRODUCTION", False) 

    def wrapped(*args, **kwargs):
        if is_production():
            # func.__name__ names the offending function in the error message
            raise Exception("%s is not meant to be run on a production server" % \
                            func.__name__)
        return func(*args, **kwargs)
    return wrapped

Now all you have to do is apply it to your dev/test-only functions:

@non_production
def test_something(a, b):
    pass  # dev/test-only code goes here
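The same idea can be tried outside of django. The sketch below is a framework-agnostic variant under one loud assumption: production is signalled by an APP_PRODUCTION=1 environment variable, a name made up for this example. It also uses functools.wraps, so the decorated function keeps its own name, and reset_test_data is just a hypothetical function to decorate.

```python
import functools
import os


def non_production(func):
    """Refuse to run func when the process is marked as production."""

    def is_production():
        # Assumption for this sketch: APP_PRODUCTION=1 marks a production server.
        return os.environ.get("APP_PRODUCTION") == "1"

    @functools.wraps(func)
    def wrapped(*args, **kwargs):
        if is_production():
            raise RuntimeError(
                "%s is not meant to be run on a production server" % func.__name__)
        return func(*args, **kwargs)

    return wrapped


@non_production
def reset_test_data(a, b):
    return a + b
```

Checking the flag at call time rather than at import time means the same module can be imported everywhere and only fails when a guarded function is actually invoked.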


Thursday, January 31, 2013

Python code readability - PEP 8

Everybody familiar with python should be aware that there is a set of documents called PEPs (Python Enhancement Proposals). As the name states, these documents are intended to help improve python, not only by adding new features to the interpreter and enhancing the standard library; they also give guidelines (proposals) on matters around the code itself, such as coding conventions. Many experienced and respected members of the python community participate in the PEP life-cycle, so the documents are really reliable.

The thing I would like to talk about is PEP 8, also known as the Style Guide for Python Code.
As you may know, the proposed naming convention differs a bit from other high-level programming languages (like Java, which is known for its UltimateLongAndDescriptiveClassName naming convention, along with other funny things like evenLongerAndMoreSophisticatedMethodNames). Anyway, it's very intuitive: 4 spaces instead of a tab, underscores inside multi-word variable names, etc. Sounds cool; you may even download a package that checks your coding convention against PEP 8:

~ $ pip install pep8

Now you may check your (supposedly) perfectly PEP 8-compatible source code:

~ $ pep8 app/

and to your wonderment get a result similar to this:

app/ E128 continuation line under-indented for visual indent
app/ E502 the backslash is redundant between brackets
app/ W293 blank line contains whitespace
app/ E303 too many blank lines (5)
app/ E501 line too long (82 > 79 characters)
app/ E501 line too long (95 > 79 characters)
app/ E501 line too long (103 > 79 characters)
app/ E501 line too long (82 > 79 characters)
app/ E501 line too long (83 > 79 characters)
app/ E302 expected 2 blank lines, found 1
app/ W391 blank line at end of file

A clean and tidy source file prints so many errors... well, yes it does. In fact, without additional IDE features it's nearly impossible to write PEP 8 valid code. If you're using vim, you can get a PEP 8 validation plug-in that opens a quick-fix buffer with a list of PEP 8 incompatible statements. This is good for shaping your coding habits, but don't get too orthodox: don't ever change an existing project's coding convention, just keep to the current one.
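To get a feeling for how mechanical these checks are, here is a toy sketch that detects two of the reported problems: E501 (line too long) and W293 (blank line contains whitespace). It is not how the pep8 tool is implemented; check_lines is a hypothetical helper, and only the two codes and the 79-character limit are taken from the output above.

```python
def check_lines(source, max_length=79):
    """Return (line number, code, message) tuples for two simple style checks."""
    problems = []
    for number, line in enumerate(source.splitlines(), start=1):
        if len(line) > max_length:
            problems.append((number, "E501", "line too long (%d > %d characters)"
                             % (len(line), max_length)))
        if line and not line.strip():
            # The line is not empty, yet it holds nothing but whitespace.
            problems.append((number, "W293", "blank line contains whitespace"))
    return problems


sample = "x = 1\n   \ny = " + "1" * 80 + "\n"
for problem in check_lines(sample):
    print("%d %s %s" % problem)
```

The real tool covers far more rules (indentation, blank lines, whitespace around operators), which is why an editor plug-in is more practical than hand-rolled checks.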

~Thus spoketh KR,

Tuesday, January 29, 2013

The forloop django template variable

Django template for-loop iteration is usually executed on QuerySets (paginated or not). For front-end purposes it's sometimes good to provide row numbers (especially if the data is supposed to be presented in a specific order). The first non-django solution that comes to my mind is something like this:

def test_view(request):
    ctx = {}
    objects = MyModel.objects.all()
    ctx['objects'] = zip(range(1,objects.count()+1), objects)
    return render_to_response("app/test.html", ctx)

And for the template:

{% for el in objects %}
        {{ el.0 }} {# row number; the object itself is el.1 #}
        {% comment %} Other fields.... {% endcomment %}
{% endfor %}

This is a great python solution, but it's not a django solution. First of all, by using the zip function we force the records to be fetched from the database, so adding pagination now would be a bit complicated (using the PaginationMiddleware won't be efficient, since it works well only on lazy QuerySets). Another disadvantage is the need to address the counter and the data by index. Of course we could make the counter an attribute of the data objects... but it's still not THE solution.

Django provides a better way of solving this problem. Inside each for-loop you may access a forloop template variable. According to the django docs the following attributes are available:

  • forloop.counter: The current iteration of the loop (1-indexed)
  • forloop.counter0: The current iteration of the loop (0-indexed)
  • forloop.revcounter: The number of iterations from the end of the loop (1-indexed)
  • forloop.revcounter0: The number of iterations from the end of the loop (0-indexed)
  • forloop.first: True if this is the first time through the loop
  • forloop.last: True if this is the last time through the loop
  • forloop.parentloop: For nested loops, this is the loop "above" the current one

Now the template could look like this:

{% for el in objects %}
        {{ forloop.counter }} {# row number #}
        {% comment %} Other fields.... {% endcomment %}
{% endfor %}

This is more elegant and practical. It's also easy to combine with the PaginationMiddleware. All you need to do is offset forloop.counter by each page's start index.
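To make the meaning of the forloop attributes concrete, they can be mimicked in plain python. This is a sketch only, not django's implementation: forloop_values is a hypothetical helper, and the "row" entry together with the page_start_index parameter is an illustrative assumption showing how a page offset could be folded in.

```python
def forloop_values(items, page_start_index=1):
    """Yield (info, item) pairs, where info mirrors django's forloop attributes."""
    items = list(items)
    total = len(items)
    for index, item in enumerate(items):
        info = {
            "counter": index + 1,
            "counter0": index,
            "revcounter": total - index,
            "revcounter0": total - index - 1,
            "first": index == 0,
            "last": index == total - 1,
            # Row number across pages, assuming a 1-based start index per page.
            "row": page_start_index + index,
        }
        yield info, item


for info, item in forloop_values(["a", "b", "c"], page_start_index=11):
    print("%d %s" % (info["row"], item))
```

Note that the helper has to materialise the items to know the total, which is the same reason revcounter and last need the length of the sequence up front.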


Saturday, January 26, 2013

Setting up a development tmux session

Let's face it, it takes some time to set up a development session. Besides the IDE there are usually lots of other scripts and tools that need to be run and monitored throughout the development process. Usually you need to run each script/tool in a separate terminal. This is quite inconvenient, since multiple tools could share one screen (you usually do not need a full-screen version of htop running, and the same goes for logfiles). This problem may be solved using a terminal multiplexer, in our case tmux. If you're missing it, installing it is a must:

~ $ sudo apt-get install tmux

Using tmux enables organizing your dev session in a better/tidier manner. The greatest advantage of using tmux over screen or a tabbed gnome-terminal is the possibility of splitting a pane (both horizontally and vertically), which eventually enables you to set up your scripts/tools in any way you can imagine. The term 'eventually' was not used accidentally: it usually takes some typing to achieve the intended results. We programmers are lazy and like automating things, so why not create a script that sets up tmux with our predefined panes? This can be achieved by using tmuxinator. You can install it using gem.

~ $ sudo gem install tmuxinator

Now we can create a new session definition:

~ $ tmuxinator new fxbot

And type in some basic instructions to ~/.tmuxinator/fxbot.yml:

# ~/.tmuxinator/fxbot.yml

project_name: FXBot
project_root: ~/prj/forex/forex_bot/
tabs:
    - editor: vim .
    - console:
        layout: main-vertical
        panes:
            - #bash
            - ipython
    - stats:
        layout: main-vertical
        panes:
            - htop
            - tail -f logs.txt

Now when you run:

~ $ tmuxinator start fxbot

You will end up running tmux with three tabs. The editor tab will contain a running instance of vim. The console tab will be split vertically: the left pane will have a bash terminal, while the right one will have ipython running. The final stats tab will display htop and the last entries of a logfile. Well, maybe it's not a hard session to set up manually, but imagine setting up 5+ panes with split screens and various tasks running.

This tool may also be configured to execute tasks before starting (like setting up the database server). More information is available on the project's site.



Saturday, January 19, 2013

Implementing advanced python decorators

Python decorators may save a lot of time, thanks to the ease of applying them to other functions/methods. In practice python decorators are something between the decorator pattern and macros. An example decorator that enables running a function only in a single process at a time is presented here.

Today I'll present how to implement a decorator that enables a few ways to handle the embedded function. Such a mechanism is used in the celery project.

A quick celery task usage example:

@celery.task  # assuming celery is a Celery application instance
def add(x, y):
    return x + y

res = add.delay(5,6) #asynchronous task
res = add(5,6) #synchronous task 

This is quite convenient, you may have such decorators applied to functions and still dynamically choose how the decorator should act in a specific context. An implementation of a similar decorator is presented below (with some test code):

class SomeDecorator(object):

    def __call__(self, func, some_flag=False):

        def f(*args, **kwargs):
            if some_flag:
                print "Some flag is set"
            print "Some decorator routines"
            return func(*args, **kwargs)  # run the embedded function

        setattr(f, "with_flag", lambda *args, **kwargs: \
                self.__call__(func, some_flag=True)(*args, **kwargs))

        return f

@SomeDecorator()
def test(a, b):
    print a, b

if __name__ == '__main__':
    print "*****"
    test(5, b="test")
    print "*****"
    test.with_flag(7, b="test2")

We have to be aware that after a decorator is applied, any code referring to test will instead refer to f (a function object). To achieve our goal we have to give f the capability of performing different types of decorator activities. In this example code I have added a function attribute (with_flag, set through setattr) which executes the default __call__ method with some_flag set, and still passes the arguments/keyword arguments to the embedded function.

The standard output of running the presented code:

*****
Some decorator routines
5 test
*****
Some flag is set
Some decorator routines
7 test2

Feel free to use / modify / upgrade this code to achieve desired results.
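As one possible modification, here is a sketch of the same mechanism that returns values instead of printing, which makes it easier to test, and that keeps the wrapped function's name via functools.wraps. FlaggedDecorator and the "plain"/"flagged" labels are hypothetical names for this example; the snippet runs on both python 2 and python 3.

```python
import functools


class FlaggedDecorator(object):
    """Decorator whose wrapped function gains an alternative with_flag entry point."""

    def __call__(self, func, some_flag=False):

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            mode = "flagged" if some_flag else "plain"
            return mode, func(*args, **kwargs)

        # Rebuild the wrapper with the flag switched on, then call it right away.
        wrapper.with_flag = lambda *args, **kwargs: \
            self(func, some_flag=True)(*args, **kwargs)
        return wrapper


@FlaggedDecorator()
def add(x, y):
    return x + y


print(add(5, 6))
print(add.with_flag(5, 6))
```

The trade-off of this design is that with_flag builds a fresh wrapper on every call; for hot paths the flagged wrapper could be created once and stored instead.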


Tuesday, January 8, 2013

Incremental backups with rsync

There are two types of computer users in the world: those who back up their data, and those who eventually will back up their data. Making regular backups consumes some time, but saves a lot of nerves and time (money) when the primary data source crashes. Let's face it, backups are important.

Source code is usually not a problem: there are distributed version control systems with remote repositories, which are ideal not only for sharing but also for backing up the data.

Usually there are gigabytes of data that you will want to back up besides your source code. A good place for such backups is remote storage or an external drive. If you'd like to automate the backup process as much as possible, I suggest using rsync. It enables making incremental backups, which saves your Internet bandwidth and makes it faster to synchronize external drives. The following script makes an incremental copy of some directories located in the home directory.

declare -a SOURCE_DIRS=("a" "b" "img")
BACKUP_DIR="/path/to/backup"  # placeholder: set this to your backup location

for source_dir in "${SOURCE_DIRS[@]}"
do
    echo "Current directory: $HOME/$source_dir"
    rsync -a "$HOME/$source_dir" "$BACKUP_DIR"
done
echo "Backup complete"

This works exceptionally well. The -a option stands for:

  • -r, --recursive recurse into directories
  • -l, --links copy symlinks as symlinks
  • -p, --perms preserve permissions
  • -t, --times preserve modification times
  • -g, --group preserve group
  • -o, --owner preserve owner
  • --devices preserve device files
  • --specials preserve special files
More is available in the rsync manual.
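If the backup is to be driven from python (for example from a scheduled job), the command lines of the script above can be assembled like this. It is a sketch under assumptions: build_rsync_commands and its parameters are hypothetical names, the paths in the example are made up, and the only rsync options used are -a and --dry-run (which lists what would be transferred without copying anything).

```python
import os


def build_rsync_commands(source_dirs, backup_dir, home=None, dry_run=False):
    """Return one rsync argument list per source directory located under home."""
    home = home or os.path.expanduser("~")
    options = ["-a"]
    if dry_run:
        # Only report what would be transferred, without copying anything.
        options.append("--dry-run")
    return [["rsync"] + options + [os.path.join(home, name), backup_dir]
            for name in source_dirs]


for command in build_rsync_commands(["a", "b", "img"], "/mnt/backup", dry_run=True):
    print(" ".join(command))
```

Each returned list can be passed to subprocess.check_call as is; building lists instead of one shell string avoids quoting problems with directory names that contain spaces.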
