Submit Blog  RSS Feeds

Tuesday, July 31, 2012

Potential problems with custom primary keys in django ORM

I had a strange situation lately that encouraged me to do some research about how django really handles primary keys.

As for a mature web framework django's ORM generates automatic integer primary keys if none are present. This feature is cool, cause we are relieved from generating our own primary keys (if its problematic), but we loose some easily accessible semantic information - for example: a car's primary key may be it's registration number instead of a meaningless integer. Whats more, a primary key is used in the default object __eq__ / __cmp__ functions, which makes it easier to filter data (no need for subselects).

But let's look at this situation:

class Dude(models.Model):
    name = models.CharField(primary_key=True, max_length=100)
    #and so on

We have a 'Dude' that should have a unique name, primary keys are in general unique so there's no problem here. Suppose we have some form where you add 'dudes'. So we execute:

    Dude(name='John', **some_fields).save()
    #some more code
    Dude(name='John', **some_other_fields).save()
    print "You don\'t expect to see this message"
except IntegrityError as ie:

The logic states, that if you try to create another entry with a same value of a unique field, django should raise an integrity error. Well it will not, because this is also a primary key, if a 'dude' with a specific name exists, django interprets it like this:


For me this was a bit unintuitive, but when think about it - this approach makes sense. All in all, I would recommend using a custom primary_key field only if there is some semantic/logical background behind it.


Sunday, July 29, 2012

A quick look into HTML5 canvas tag

Lately I've experimenting with the new HTML5 features. There are many new tags, and attributes that add new functionality to existing tags. I wouldn't want to get to much into a general preview of these elements, instead I want to focus on single tag: <canvas>. A canvas element/property frequently appears in high level GUI designing programs or interfaces, it allows to draw custom graphics/primitives on a panel. The HTML5 canvas semantic is similar: it allows drawing graphics on the fly! Yes, no more CSS tricks to get an animation running effectively, now all you need is a browser supporting HTML5 and some basic JS knowledge.

Canvas and 2D graphics usually are connected with games, so after getting familiar with some basic concepts I started implementing a simple game - tetris similar. Here's what I have so far:

The currently implemented mechanics include:
  • falling bricks (1 type)
  • brick stacking (if one lands on another)
  • moving bricks left/right (arrow keys, it may not work via iframe)
The current code is available at

Feel free to have a peek at the code, it's far from being tidy - this was just a fast canvas experiment.

The most popular browsers support HTML5, so I don't see any disadvantages in using it in day to day programming.


Tuesday, July 24, 2012

Cheating online games /polls / contests by using anonumous HTTP proxies / python

This post is indeed about cheating. You know those browser-based game profile refs that provide you with some benefits each time a persons clicks it? Thats right, everybody spammed them here and there, some people had many visitors, some had not. I wanted to gain some extra funds too, but the thing I hate more than not-having-extra-funds is spamming... I just felt bad about posting that ref anywhere it was possible, like:

Check out these hot chicks <a href="#stupid_game_ref#">photos</a>

It's quite obvious, that clicking the link again and again yourself didn't have much effect. Only requests with unique IP numbers (daily) where generating profit.  So the question was how to access an URI from many IP from one PC (my PC that is). The answer is simple: by using anonymous HTTP proxies.

There are many sites that aggregate lists of free proxies, like It's best to find a site that enables fetching proxy IP/PORT data in a script processable way (many sites have captchas). 

Let's get to the fun part, the following script executes specific actions for each proxy in the provided list.

  1 import socket
  2 import urllib
  3 import time
  5 HEADERS = {"User-Agent" : "Mozilla/5.0" }
  7 proxies = ["", ""]
  9 #timeout in case proxy does not respond
 10 socket.setdefaulttimeout(5)
 12 for proxy in proxies:
 13     opener = urllib.FancyURLopener({"http" : proxy})
 14     opener.addheaders = HEADERS.items()
 15     try:
 16         res ="http://some.uri/?ref=123456")
 18         time.sleep(3)
 19     except:
 20         print "Proxy %s is probably dead :-(" % proxy

Line 5 contains a basic User-Agent header, you can find more about setting the appropriate headers here. Line 10 sets the default socket timeout on 5 seconds - many proxies tend not to work 24/7, so best catch those exceptions.  Finally we create an opener from each IP and request some resources (our ref link), you might replace this simple request with a set of actions, or even make bots that act via proxies. Just make sure you're proxy is truly anonymous (easy to verify by a simple PHP script) .

This may be called cheating, but at least it's not spamming :-)


Thursday, July 19, 2012

Frontend number formating - thousands separators and rounding

It's good for a backend developer to do some JS scripting from time to time. Usually this scripting considers the user interface or other presentation mechanisms. Everybody knows that a good user interface has a great impact for attracting people to use your service.

Let's skip positioning/css stuff and move along to data presentation. Usually there is a difference between the data we actually process and the data shown to users. So we have a variable containing the cost of some imaginary service, i.e. 4231.1578. This looks just like an old-school float representation - and in fact it is. So whats wrong with the users?... oops... I mean why is  this representation not friendly to users?

Firstly, a common decimal delimiter for European countries is a comma, not a dot - we should respect it.

Secondly, humans do not process text the way machines - the larger the presented amount, the longer it takes to evaluate its value, or even the order of magnitude. Thats why you should provide thousands separators, all in all we get something like this: 4.231,1578

Now all it takes round the value a bit, we don't need so many decimal places. Below the code that converts your js floats to a formatted and readable string (val  is the float value, while dec_places is the number of decimal places).

1 function round_repr(val, dec_places){
2     var tmp = Math.round((val*Math.pow(10, dec_places))).toString();
3     tmp = tmp.slice(0, tmp.length-dec_places) + ',' + tmp.slice(tmp.length-dec_places, tmp.length);
4     var int_length = tmp.length - dec_places - 1 - 3;
5     var i = int_length;
6     for (i=int_length; i>0; i-=3){
7         tmp = tmp.slice(0, i) + "." + tmp.slice(i,tmp.length);
8     }
9    return tmp;
11 }


Monday, July 9, 2012

Make websites recognise python/urllib as a webbrowser - (part 3: managing sessions); loading Firefox cookies to a cookiejar

Most browser based games/applications require a user to authenticate himself before providing access to other functions. A typical login process usually consists of filling out a form (with user credentials and other some other information) and posting it to the server. If the authentication process is completed successfully the server adds a Set-Cookie header with your session cookie to the HTTP response. Setting up urllib to manage cookies may be achieved the following way:

  1 import urllib
  2 import urllib2
  3 import cookielib
  5 opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(\
  6         cookielib.CookieJar()))
  7 urllib2.install_opener(opener)
  9 login_form = urllib.urlencode({
 10     'user' : 'john',
 11     'password' : 'secrect_password',
 12     })
 14 req = urllib2.Request('', login_form)
 15 res = urllib2.urlopen(req)

The fun part starts when the authentication is more sophisticated (requires a captcha or other means of security features that discourage the use of robots). We'll just have to login via Firefox and use it's cookies! Firefox stores it's cookies in a sqlite database, we'll just have to open it and fetch them.

  1 import urllib2
  2 import cookielib
  3 from sqlite3 import dbapi2
  5 host = ''
  6 ff_cookie_file= '/home/%s/.mozilla/firefox/%s/cookies.sqlite' % ("user_name", "profile_name")
  8 file = open("cookie.txt", "w")
  9 file.write("#LWP-Cookies-2.0\n")
 10 match = '%%%s%%' % host
 12 con = dbapi2.connect(ff_cookie_file)
 13 cur = con.cursor()
 14 cur.execute("select name, value, path, host from moz_cookies where host like ?", [match])
 15 for item in cur.fetchall():
 16     cookie = "Set-Cookie3: %s=\"%s\"; path=\"%s\";  \
 17     domain=\"%s\"; expires=\"2038-01-01 00:00:00Z\"; version=0\n" % (
 18     item[0], item[1], item[2], item[3],
 19     )
 20 file.write(cookie)
 21 file.close()
 23 cj = cookielib.LWPCookieJar()
 24 cj.load("cookie.txt")
 26 opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
 27 urllib2.install_opener(opener)

In order to make use of this code you have to locate your Firefox cookie file, if you are using linux it will be probably under a path like presented in line 6. Lines 12-19 select cookie data from moz_cookies table and writes them in a LWPCookieJar compatible way in a text file (match filters cookies for a specific domain). Next these cookies are loaded to a cookiejar and installed inside a cookie processor which is added to the default urllib handler list.

This is great, because you can share your session between a webbrowser and web robots.


By the way: it is best to make a copy of Firefox cookies - when the browser is running the cookie file may be locked which may crash your script or prevent your from getting access to the session.
free counters