
Monday, July 9, 2012

Make websites recognise python/urllib as a webbrowser - (part 3: managing sessions); loading Firefox cookies to a cookiejar

Most browser-based games and applications require users to authenticate before they can access other functions. A typical login process consists of filling out a form (with user credentials and possibly some other information) and posting it to the server. If authentication succeeds, the server adds a Set-Cookie header carrying your session cookie to the HTTP response. urllib can be set up to manage cookies in the following way:

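  import urllib
  import urllib2
  import cookielib

  # Build an opener that stores cookies from responses
  # and sends them back on later requests.
  opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(
          cookielib.CookieJar()))
  # Make it the default opener used by urllib2.urlopen().
  urllib2.install_opener(opener)

  # The field names depend on the site's login form.
  login_form = urllib.urlencode({
      'user' : 'john',
      'password' : 'secret_password',
      })

  # Passing form data as the second argument makes this a POST request.
  req = urllib2.Request('http://some.site/login_resource', login_form)
  res = urllib2.urlopen(req)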
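
For readers on Python 3, where urllib2 and cookielib were folded into urllib.request and http.cookiejar, a rough equivalent might look like the sketch below. The URL and form fields are the same placeholders as above, and build_login_request is a hypothetical helper name, not part of any library.

```python
import urllib.parse
import urllib.request
from http.cookiejar import CookieJar


def build_login_request(url, user, password):
    """Return a POST request carrying the url-encoded login form."""
    # Field names are placeholders; a real site defines its own.
    form = urllib.parse.urlencode({"user": user, "password": password})
    # In Python 3 the request body must be bytes, not str.
    return urllib.request.Request(url, data=form.encode("ascii"))


# The jar collects cookies from responses and replays them on later requests.
jar = CookieJar()
opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
urllib.request.install_opener(opener)

req = build_login_request(
    "http://some.site/login_resource", "john", "secret_password")
# res = urllib.request.urlopen(req)  # uncomment when pointing at a real site
```

After a successful login the session cookie should end up in jar, so every later urllib.request.urlopen() call sends it automatically.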


The fun part starts when authentication is more sophisticated (it requires a captcha or other security features that discourage robots). In that case we can simply log in via Firefox and use its cookies. Firefox stores its cookies in a SQLite database, so we just have to open it and fetch them.

  1 import urllib2
  2 import cookielib
  3 from sqlite3 import dbapi2
  4
  5 host = 'some.site'
  6 ff_cookie_file= '/home/%s/.mozilla/firefox/%s/cookies.sqlite' % ("user_name", "profile_name")
  7
  8 file = open("cookie.txt", "w")
  9 file.write("#LWP-Cookies-2.0\n")
 10 match = '%%%s%%' % host
 11
 12 con = dbapi2.connect(ff_cookie_file)
 13 cur = con.cursor()
 14 cur.execute("select name, value, path, host from moz_cookies where host like ?", [match])
 15 for item in cur.fetchall():
 16     cookie = "Set-Cookie3: %s=\"%s\"; path=\"%s\";  \
 17     domain=\"%s\"; expires=\"2038-01-01 00:00:00Z\"; version=0\n" % (
 18     item[0], item[1], item[2], item[3],
 19     )
 20     file.write(cookie)
 21 file.close()
 22
 23 cj = cookielib.LWPCookieJar()
 24 cj.load("cookie.txt")
 25
 26 opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
 27 urllib2.install_opener(opener)

To use this code you have to locate your Firefox cookie file; on Linux it will probably be under a path like the one in line 6. Lines 12-20 select cookie data from the moz_cookies table and write it to a text file in a format that LWPCookieJar can load (match restricts the query to cookies for a specific domain). These cookies are then loaded into a cookie jar, which is wrapped in a cookie processor and installed in the default urllib2 opener.
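
The intermediate text file is not strictly necessary. Here is a minimal Python 3 sketch that builds the cookie jar straight from the database rows. It assumes the same four moz_cookies columns used above and the same fixed 2038 expiry date; load_firefox_cookies and opener_with_cookies are hypothetical helper names.

```python
import calendar
import sqlite3
import urllib.request
from http.cookiejar import Cookie, CookieJar

# Same fixed expiry as the listing above: 2038-01-01 00:00:00 UTC.
FAR_FUTURE = calendar.timegm((2038, 1, 1, 0, 0, 0))


def load_firefox_cookies(db_path, host):
    """Return a CookieJar with every cookie whose host contains `host`."""
    con = sqlite3.connect(db_path)
    try:
        rows = con.execute(
            "select name, value, path, host from moz_cookies "
            "where host like ?",
            ("%" + host + "%",),
        ).fetchall()
    finally:
        con.close()

    jar = CookieJar()
    for name, value, path, domain in rows:
        jar.set_cookie(Cookie(
            version=0, name=name, value=value,
            port=None, port_specified=False,
            domain=domain, domain_specified=True,
            domain_initial_dot=domain.startswith("."),
            path=path, path_specified=True,
            secure=False, expires=FAR_FUTURE, discard=False,
            comment=None, comment_url=None, rest={},
        ))
    return jar


def opener_with_cookies(jar):
    """Wrap the jar in a cookie processor, as in the listing above."""
    return urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
```

Calling urllib.request.install_opener(opener_with_cookies(jar)) then makes the browser session available to plain urlopen() calls.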

This is great, because you can share your session between a web browser and web robots.

~KR

By the way: it is best to work on a copy of the Firefox cookie file. While the browser is running the file may be locked, which may crash your script or prevent you from getting access to the session.
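
One way to do that, sketched in Python 3 with only the standard library, is to copy the file into a temporary directory and open the copy. copy_cookie_db is a hypothetical helper name.

```python
import os
import shutil
import tempfile


def copy_cookie_db(src_path):
    """Copy the cookie database to a fresh temp directory; return the new path."""
    tmp_dir = tempfile.mkdtemp(prefix="ff_cookies_")
    dst_path = os.path.join(tmp_dir, os.path.basename(src_path))
    # copy2 copies the file contents and keeps its timestamps.
    shutil.copy2(src_path, dst_path)
    return dst_path
```

The returned path can be passed to the SQLite connection in place of the original cookie file, and the temporary directory can be deleted once the cookies are loaded.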
