Python: display refreshing status (like top)

In scripts I often want to display some status information (e.g. progress). This can be achieved with something like:

PYTHON:
  import sys
  # "\r" moves the cursor back to the start of the line; end="" suppresses the newline
  print("progress: %i %%\r" % i, end="")
  sys.stdout.flush()

but this only works for a single line. I wanted what top or less do: open a new "window" and be able to write anywhere in that window, not just on the last line.

Found out that curses does exactly that - but I didn't find a stripped-to-the-bare-necessities-example, so here you are:

PYTHON:
  import curses, time
  from datetime import datetime
  w = curses.initscr()
  try:
    while True:
      w.erase()
      w.addstr("some status..\ncurrent time\n%s" % datetime.now())
      w.refresh()
      time.sleep(1)
  finally:
    curses.endwin()
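
A variation on the snippet above, as a rough sketch: curses.wrapper from the standard library does the initscr()/endwin() bookkeeping and restores the terminal even if an exception escapes. The helper status_lines and the function name main are names I made up for this illustration; addstr(row, column, text) is what lets you write anywhere in the window.

```python
import curses
import time
from datetime import datetime


def status_lines(now):
    """Build the lines to display; kept apart from the drawing code."""
    return ["some status..", "current time", str(now)]


def main(stdscr):
    # stdscr is the window object that curses.wrapper passes in
    while True:
        stdscr.erase()
        for row, line in enumerate(status_lines(datetime.now())):
            stdscr.addstr(row, 0, line)  # explicit (row, column) position
        stdscr.refresh()
        time.sleep(1)

# To start it from a real terminal (stop with Ctrl-C):
# curses.wrapper(main)
```

The call to curses.wrapper is left commented out because it needs a real terminal to run.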

 

Python: Print list of dicts as ascii table

Sometimes I want to print a list of dicts as an ascii table, like this:

| Programming Language | Language Type | Years of Experience |
+----------------------+---------------+---------------------+
| python               | script        |                   4 |
| php                  | script        |                   5 |
| java                 | compiled      |                  11 |
| assember             | compiled      |                  15 |

I searched on Google - but without luck.

That's what I came up with - it's not particularly nice but it does the job:

PYTHON:
  def table_print(data, title_row):
      """data: list of dicts,
         title_row: e.g. [('name', 'Programming Language'), ('type', 'Language Type')]
      """
      # the title row is treated like a data row when measuring and printing
      data_copy = [dict(title_row)] + list(data)
      cols_order = [tup[0] for tup in title_row]
      max_widths = {}
      for col in cols_order:
          max_widths[col] = max(len(str(row[col])) for row in data_copy)

      def custom_just(col, value):
          # integers are right-adjusted, everything else is left-adjusted
          if isinstance(value, int):
              return str(value).rjust(max_widths[col])
          return str(value).ljust(max_widths[col])

      for i, row in enumerate(data_copy):
          row_str = " | ".join(custom_just(col, row[col]) for col in cols_order)
          print("| %s |" % row_str)
          if i == 0:
              # underline the title row
              underline = "-+-".join("-" * max_widths[col] for col in cols_order)
              print("+-%s-+" % underline)

Use it like this:

PYTHON:
  data = [dict(name='python', type='script', years_experience=4),
      dict(name='php', type='script', years_experience=5),
      dict(name='java', type='compiled', years_experience=11),
      dict(name='assember', type='compiled', years_experience=15)
      ]
  titles = [('name', 'Programming Language'),
      ('type', 'Language Type'),
      ('years_experience', 'Years of Experience')]
  table_print(data, titles)

It will produce the table printed above. It's not fancy - the only 'smart' thing it does is right-adjust integers, while strings are left-adjusted.

P.S. no, I don't have 15 years of experience with Assembler - I have just known it for 15 years - it's one of the first programming languages I learned - and I even wrote a text editor with it - then I learned that it's probably not the best language to write an editor in :-)

 

dealing with “MySQL backend does not support timezone-aware datetimes”

When fetching events from an iCal feed and saving them into a database, I got this error:

MySQL backend does not support timezone-aware datetimes

This did the trick for me:

Install pytz, then:

PYTHON:
  import pytz
  # convert to local time first, then drop the tzinfo so the datetime becomes naive
  naive_local_datetime = that_datetime_in_utc.astimezone(pytz.timezone('Europe/Zurich')).replace(tzinfo=None)

Caution: I don't really understand what I'm writing here - it feels like those posters in PHP forums who write a 'how to ...' and then try to explain something they have no clue about, but well.. :-)
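
To see what the one-liner does, here is a small sketch using only the standard library. The fixed +01:00 offset is my stand-in for Europe/Zurich in winter (it knows nothing about daylight saving time, which is what pytz is for), and to_naive_local and the sample date are made up for the illustration.

```python
from datetime import datetime, timedelta, timezone


def to_naive_local(aware_dt, local_tz):
    """Shift an aware datetime into local_tz, then strip the tzinfo."""
    return aware_dt.astimezone(local_tz).replace(tzinfo=None)


# Assumption: a fixed UTC+01:00 offset instead of a real time zone database entry
cet = timezone(timedelta(hours=1))
event_start_utc = datetime(2011, 1, 15, 12, 0, tzinfo=timezone.utc)

naive_start = to_naive_local(event_start_utc, cet)
# naive_start is 13:00 local time and carries no tzinfo any more
```

So the database ends up with local wall-clock times; the time zone itself is no longer stored anywhere, which is worth keeping in mind when reading the values back.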

 

Fetch publicly available google calendar data with python

I tried accessing the Google Data API with Python - that API seems either overly complicated or simply not suited for just grabbing events from a public Google calendar.

This worked for me:

  1. find out the iCal address (subscribe to the calendar, other calendars->arrow down->calendar settings)
  2. install icalendar

Then this is possible:

PYTHON:
  from urllib.request import urlopen
  from icalendar import Calendar

  ics = urlopen('http://www.google.com/calendar/ical/fchppllvcaupb6fgguigobkfj4@group.calendar.google.com/public/basic.ics').read()
  # icalendar 3.0 renamed Calendar.from_string to Calendar.from_ical
  ical = Calendar.from_ical(ics)
  for vevent in ical.subcomponents:
    if vevent.name != "VEVENT":
      continue
    title = str(vevent.get('SUMMARY'))
    description = str(vevent.get('DESCRIPTION'))
    location = str(vevent.get('LOCATION'))
    start = vevent.get('DTSTART').dt             # a datetime (a date for all-day events)
    end = vevent.get('DTEND').dt                 # a datetime (a date for all-day events)

 

Send javascript errors by mail

I'm running a Django-powered site for a closed user group and added a bit of JavaScript magic here and there (mainly Prototype and Tooltip).

Now Django sends me a mail whenever a 404 or 500 error occurs. But when one of my users encounters a JavaScript error, I'm not informed. I thought someone on the web must have solved this problem already, but I didn't find anything, so here's my take: just send any error to the server using Ajax (here: using Prototype's Ajax abstraction)

JAVASCRIPT:
  window.onerror = mailError;
  function mailError(msg, url, line) {
    // encodeURIComponent keeps '&', '=' and non-ASCII characters from breaking the POST body
    var postBody = 'url=' + encodeURIComponent(url) + '&line=' + encodeURIComponent(line) +
      '&message=' + encodeURIComponent(msg) + '&useragent=' + encodeURIComponent(navigator.userAgent) +
      '&user=' + encodeURIComponent(user_name);
    new Ajax.Request('/api/jserror/', {method: 'post', postBody: postBody});
  }

user_name is a JavaScript variable holding the Django username (so I know whom I can inform when the error is fixed).

On the server side, I just mail myself the JavaScript error message, the username and the user agent:

PYTHON:
  from django.core.mail import mail_admins
  from django.http import HttpResponse

  def jserror(request):
    # error messages I don't want to be mailed about
    omit_messages = ['pointerobj is not defined', 'tipobj is not defined', 'ns6 is not defined', 'enabletip is not defined']
    post = request.POST
    if post.get('message', '') not in omit_messages:
      message = "url: %s (%s)\n%s\nuser-agent: %s\nusername: %s\n" % (
        post.get('url', ''), post.get('line', ''), post.get('message', ''),
        post.get('useragent', ''), post.get('user', ''))
      # Django has already URL-decoded the values in request.POST
      mail_admins("javascript error", message)
    return HttpResponse()

Yeah, that's all very trivial but I wonder what other solutions exist for this problem...

 

Django: Serve big files via fcgid

I've got a Django project running which requires you to log in to access files.
That means that I have to serve the files via Python, like this:

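PYTHON:
  import os
  from django.contrib.auth.decorators import login_required
  from django.http import HttpResponse

  @login_required
  def download(request, filename):
      # ... some code specific to my site ...
      # (it sets postUpload, original_filename and filename_path)
      response = HttpResponse(content_type=postUpload.mimetype)
      response['Content-Disposition'] = 'attachment; filename="%s"' % original_filename
      response['Content-Length'] = os.path.getsize(filename_path)
      # binary mode, so the bytes are passed through unchanged
      with open(filename_path, 'rb') as f:
          response.write(f.read())
      return response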

The problem: if the download of a file took longer than 5 minutes (big files and/or low bandwidth), it was canceled on the server side by a timeout. This Apache configuration for mod_fcgid solved the problem (see the mod_fcgid documentation for BusyTimeout; current releases call the directive FcgidBusyTimeout):

CODE:
  <IfModule mod_fcgid.c>
    BusyTimeout 1200
  </IfModule>

The problem was that the Apache module scans every minute for processes that have been running for more than BusyTimeout seconds. These processes are potentially in bad health (infinite loop et al.) and get killed. Not so with my processes (since I know what I'm doing..). Setting the busy timeout to 1200 seconds now lets my processes run for a maximum of 20 minutes.

As this setting can't be overridden in an .htaccess file by default, I needed to bug my web hosting provider with the request, which was handled within 24 hours, so thanks for that one!

PS: If you know of another way to serve protected static files via a single sign-on (no HTTP basic auth), please let me know.
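
Unrelated to the timeout, one thing that might be worth changing in the view above: response.write(open(...).read()) pulls the whole file into memory before anything is sent. A possible sketch of reading it piece by piece instead; file_chunks and the 64 KB chunk size are my own choices for illustration, not something from the original setup. Newer Django versions have a StreamingHttpResponse that accepts such an iterator.

```python
def file_chunks(path, chunk_size=64 * 1024):
    """Yield the file at `path` piece by piece instead of reading it whole."""
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                # an empty read means end of file
                break
            yield chunk
```

In the view this would replace the response.write(...) line, with the generator handed to the response object. It does not shorten the download itself, so the busy timeout still applies.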

 

Python: Sort a list of dicts by dict-key

I always forget that one and end up searching for half an hour:

PYTHON:
  >>> import operator
  >>> rows = [ dict(a=1,b=2,c=3),
  ...   dict(a=2,b=2,c=2),
  ...   dict(a=3,b=2,c=1)]
  >>> rows.sort(key=operator.itemgetter('c'))
  >>> rows
  [{'a': 3, 'b': 2, 'c': 1}, {'a': 2, 'b': 2, 'c': 2}, {'a': 1, 'b': 2, 'c': 3}]
  >>> rows.sort(key=operator.itemgetter('a'))
  >>> rows
  [{'a': 1, 'b': 2, 'c': 3}, {'a': 2, 'b': 2, 'c': 2}, {'a': 3, 'b': 2, 'c': 1}]
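
Two related variations, as a quick sketch: sorted() returns a new list and leaves the original alone, and itemgetter accepts several keys at once, which sorts by the first key and breaks ties with the next one. The variable names are just for this example.

```python
from operator import itemgetter

rows = [dict(a=1, b=2, c=3),
        dict(a=2, b=2, c=2),
        dict(a=3, b=2, c=1)]

# sorted() builds a new list; rows itself keeps its order
by_c = sorted(rows, key=itemgetter('c'))

# several keys: compare by 'b' first, then by 'a'; reverse=True flips the order
by_b_a_desc = sorted(rows, key=itemgetter('b', 'a'), reverse=True)
```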

 

Python: Find out cpu time of a certain process

To find out what percentage of the CPU a certain process uses:

PYTHON:
  import os, time

  # find out the pid by username.
  # "-o pid h" omits the header and just prints the pid
  # (this only works if the user runs exactly one process)
  pid = os.popen('ps -U my_user_name -o pid h').read().strip()

  def cpu_jiffies(pid):
      # 14th column is utime, 15th column is stime (list indexes 13 and 14):
      # The time the process has been scheduled in user/kernel mode
      # The time value is in jiffies. One jiffy is approx. 1/100 second
      # see man proc for more info
      # (assumes the process name contains no spaces)
      with open('/proc/%s/stat' % pid) as f:
          fields = f.read().split()
      return int(fields[13]) + int(fields[14])

  cpu_time1 = cpu_jiffies(pid)
  time1 = time.time()

  time.sleep(1)
  cpu_time2 = cpu_jiffies(pid)
  time2 = time.time()

  # jiffies per second, which equals percent when one jiffy is 1/100 second
  print("%.1f%%" % ((cpu_time2 - cpu_time1) / (time2 - time1)))

I don't know though if the number is accurate.
What is "cpu time" anyway? It's the time the process has actually been running on the CPU; the percentage is that time divided by the wall-clock time that passed in between.
Then, jiffies don't seem to be a safe unit for time measurements: the length of one tick is whatever sysconf(_SC_CLK_TCK) reports, not necessarily 1/100 second.
But for relative measurements it should do the trick.
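
A slightly more careful sketch of the same measurement, under the assumption of a Linux /proc file system: it asks the system for the tick length via os.sysconf("SC_CLK_TCK") instead of assuming 1/100 second, and it copes with process names that contain spaces. parse_cpu_ticks and cpu_percent are names I made up here.

```python
import os
import time


def parse_cpu_ticks(stat_line):
    """Return utime + stime (in clock ticks) from one /proc/<pid>/stat line."""
    # The command name (field 2) is wrapped in parentheses and may contain
    # spaces, so only the part after the last ')' is split.
    after_comm = stat_line.rsplit(")", 1)[1].split()
    # after_comm[0] is field 3 (state), so field 14 (utime) is index 11
    # and field 15 (stime) is index 12.
    return int(after_comm[11]) + int(after_comm[12])


def cpu_percent(pid, interval=1.0):
    """Percentage of one CPU used by pid during `interval` seconds."""
    ticks_per_second = os.sysconf("SC_CLK_TCK")

    def read_ticks():
        with open("/proc/%d/stat" % pid) as f:
            return parse_cpu_ticks(f.read())

    ticks1, time1 = read_ticks(), time.time()
    time.sleep(interval)
    ticks2, time2 = read_ticks(), time.time()
    cpu_seconds = (ticks2 - ticks1) / ticks_per_second
    return 100.0 * cpu_seconds / (time2 - time1)
```

For example, cpu_percent(os.getpid()) measures the script itself. With several threads on several cores the result can exceed 100.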

 

add bandwidth to a file download in python

Another short code snippet I came up with while writing the module download script.

The class in the full entry keeps a download under a certain rate and prints live download stats (KB/s).

 

set timeout for a shell command in python

I wanted to run a shell command in Python without knowing whether the command is going to exit within a reasonable time (it was adplay, which sometimes simply hangs).

Update: the "task" module by Rob Hooft seems to solve this exact problem. At the time I wrote this, the python.net website was down. I leave my solution in the full entry just for archive purposes.
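
For reference, a minimal sketch of how this can be done with today's standard library, which is neither my archived solution nor the "task" module: subprocess.run takes a timeout argument, kills the child when the time is up and raises subprocess.TimeoutExpired. The wrapper name run_with_timeout and the decision to return None on timeout are my own.

```python
import subprocess


def run_with_timeout(cmd, timeout_seconds):
    """Run cmd (a list of arguments); return its stdout, or None on timeout."""
    try:
        result = subprocess.run(cmd, stdout=subprocess.PIPE,
                                timeout=timeout_seconds)
    except subprocess.TimeoutExpired:
        # subprocess.run has already killed the child at this point
        return None
    return result.stdout
```

Something like run_with_timeout(["adplay", "some_file"], 60) would then give up after a minute instead of hanging forever; the file name is a placeholder.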