Monday, July 9, 2012

Make websites recognise python/urllib as a webbrowser - (part 3: managing sessions); loading Firefox cookies to a cookiejar

Most browser-based games and applications require users to authenticate before they can access other functions. A typical login process consists of filling out a form (with user credentials and some other information) and posting it to the server. If authentication succeeds, the server adds a Set-Cookie header carrying your session cookie to the HTTP response. urllib2 can be set up to manage cookies as follows:

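    import urllib
    import urllib2
    import cookielib

    # Build an opener whose cookie processor keeps cookies in a CookieJar,
    # then install it so that every urllib2.urlopen() call goes through it.
    opener = urllib2.build_opener(
        urllib2.HTTPCookieProcessor(cookielib.CookieJar()))
    urllib2.install_opener(opener)

    # URL-encode the login form fields.
    login_form = urllib.urlencode({
        'user': 'john',
        'password': 'secret_password',
    })

    # Passing data turns the request into a POST. The session cookie from
    # the response lands in the jar and is sent with later requests.
    req = urllib2.Request('', login_form)
    res = urllib2.urlopen(req)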
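
For readers on Python 3, where urllib2 and cookielib were renamed to urllib.request and http.cookiejar, the cookie round trip described above can be sketched without contacting a real server. Everything specific here is an assumption made for illustration: the FakeResponse class, the example.com URLs and the sessionid value are hypothetical, and the login request is only built, never sent.

```python
import email.message
import http.cookiejar
import urllib.parse
import urllib.request


class FakeResponse:
    """Hypothetical stand-in for an HTTP response.

    CookieJar.extract_cookies() only calls info() on the response,
    so that is all this class provides.
    """

    def __init__(self, set_cookie_values):
        self._headers = email.message.Message()
        for value in set_cookie_values:
            # Assigning the same header name repeatedly appends headers.
            self._headers["Set-Cookie"] = value

    def info(self):
        return self._headers


jar = http.cookiejar.CookieJar()

# The login request: form data must be bytes in Python 3.
form = urllib.parse.urlencode(
    {"user": "john", "password": "secret_password"}
).encode("ascii")
login_request = urllib.request.Request("http://example.com/login", data=form)

# Pretend the server answered the login with a session cookie.
login_response = FakeResponse(["sessionid=abc123; Path=/"])
jar.extract_cookies(login_response, login_request)

# Any later request to the same host gets the cookie attached.
follow_up = urllib.request.Request("http://example.com/profile")
jar.add_cookie_header(follow_up)
print(follow_up.get_header("Cookie"))
```

In real use you would not call extract_cookies() and add_cookie_header() yourself: an opener built with urllib.request.HTTPCookieProcessor(jar) makes both calls for every request it handles.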

The fun part starts when the authentication is more sophisticated (it requires a captcha or other security features that discourage robots). In that case we just log in via Firefox and use its cookies! Firefox stores its cookies in an SQLite database, so all we have to do is open it and fetch them.

  1 import urllib2
  2 import cookielib
  3 from sqlite3 import dbapi2
  4 # Domain to filter on, and the path to the Firefox cookie database.
  5 host = ''
  6 ff_cookie_file = '/home/%s/.mozilla/firefox/%s/cookies.sqlite' % ("user_name", "profile_name")
  7 # Start an LWP-format cookie file; the loader rejects files without this header.
  8 file = open("cookie.txt", "w")
  9 file.write("#LWP-Cookies-2.0\n")
 10 match = '%%%s%%' % host
 11 # Fetch the cookies whose host column contains the domain.
 12 con = dbapi2.connect(ff_cookie_file)
 13 cur = con.cursor()
 14 cur.execute("select name, value, path, host from moz_cookies where host like ?", [match])
 15 for item in cur.fetchall():
 16     cookie = ('Set-Cookie3: %s="%s"; path="%s"; '
 17               'domain="%s"; expires="2038-01-01 00:00:00Z"; version=0\n'
 18               % (item[0], item[1], item[2], item[3]))
 19     # Write inside the loop so that every matching cookie is saved.
 20     file.write(cookie)
 21 file.close()
 22 con.close()
 23 cj = cookielib.LWPCookieJar()
 24 cj.load("cookie.txt")
 25 # Route all urllib2 requests through the loaded cookie jar.
 26 opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
 27 urllib2.install_opener(opener)

To use this code you first have to locate your Firefox cookie file; on Linux it will probably be under a path like the one shown in line 6. Lines 12-20 select the cookie data from the moz_cookies table and write it to a text file in an LWPCookieJar-compatible format (match restricts the query to cookies for a specific domain). The cookies are then loaded into a cookie jar and installed in a cookie processor, which is added to the default urllib2 handler list.
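
As a variation for Python 3, the same rows can be turned into cookie objects directly, skipping the intermediate cookie.txt file. This is only a sketch under a few assumptions: load_firefox_cookies and make_opener are hypothetical helper names, only the four columns used in the query above are read, and the 2038 expiry date is carried over from the listing rather than taken from the database.

```python
import calendar
import http.cookiejar
import sqlite3
import urllib.request

# Same fixed expiry as in the listing above, as seconds since the epoch.
FAR_FUTURE = calendar.timegm((2038, 1, 1, 0, 0, 0))


def load_firefox_cookies(db_path, host):
    """Return a CookieJar with the cookies whose host column contains `host`."""
    con = sqlite3.connect(db_path)
    try:
        rows = con.execute(
            "SELECT name, value, path, host FROM moz_cookies WHERE host LIKE ?",
            ("%" + host + "%",),
        ).fetchall()
    finally:
        con.close()

    jar = http.cookiejar.CookieJar()
    for name, value, path, domain in rows:
        dotted = domain.startswith(".")
        jar.set_cookie(http.cookiejar.Cookie(
            version=0, name=name, value=value,
            port=None, port_specified=False,
            domain=domain, domain_specified=dotted, domain_initial_dot=dotted,
            path=path, path_specified=True,
            secure=False, expires=FAR_FUTURE, discard=False,
            comment=None, comment_url=None, rest={},
        ))
    return jar


def make_opener(jar):
    """Build an opener that sends and stores cookies through `jar`."""
    return urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
```

Calling urllib.request.install_opener(make_opener(jar)) afterwards mirrors the last two lines of the listing, so that plain urllib.request.urlopen() calls reuse the browser session.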

This is great, because you can share your session between a web browser and web robots.


By the way: it is best to work on a copy of the Firefox cookie file. While the browser is running the file may be locked, which may crash your script or prevent you from getting access to the session.
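
One way to follow that advice, sketched in Python 3, is to copy the database to a temporary file and query only the copy. The helper names snapshot_cookie_db and cookie_names are hypothetical, and the query is kept to a single column to stay short.

```python
import os
import shutil
import sqlite3
import tempfile


def snapshot_cookie_db(src_path):
    """Copy the cookie database to a fresh temp file and return its path."""
    fd, copy_path = tempfile.mkstemp(suffix=".sqlite")
    os.close(fd)  # only the path is needed; shutil reopens the file
    shutil.copyfile(src_path, copy_path)
    return copy_path


def cookie_names(src_path):
    """Read cookie names from a private copy, leaving the original untouched."""
    copy_path = snapshot_cookie_db(src_path)
    try:
        con = sqlite3.connect(copy_path)
        try:
            return [row[0] for row in con.execute("SELECT name FROM moz_cookies")]
        finally:
            con.close()
    finally:
        os.remove(copy_path)  # the copy holds session secrets, so clean up
```

The copy is deleted as soon as the query finishes, so the session data does not linger in the temp directory.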
