Optimizing a slow Django view

Here I'll outline some attempts to optimize a slow view – the facilitator view – in UELC.

To find out where time was being spent, I added print('a'), print('b'), etc. throughout the view's get function, and loaded the view while watching the logs.
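A slightly more informative variant of the same idea, as a minimal sketch: wrap each section in a timer built on time.perf_counter, so the log shows durations instead of bare letters. The timed helper and the timings list are hypothetical names for illustration, not part of UELC, and the sleep call stands in for a slow step in the view.

```python
import time
from contextlib import contextmanager

# (label, seconds) pairs collected while the view runs. Hypothetical name.
timings = []


@contextmanager
def timed(label):
    """Measure how long the wrapped block takes, record it, and print it."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        timings.append((label, elapsed))
        print(f"{label}: {elapsed:.3f}s")


# Stand-in for one slow step inside the view's get function.
with timed("render columns"):
    time.sleep(0.01)
```

Wrapping each suspicious section this way gives one labelled duration per section, which makes it easier to see which part of the view dominates.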

After reducing the query count by removing duplicate calls to Pagetree's .block(), I noticed that this page loads four columns of data, one for each "group user" in the template. Each iteration of the loop that calculates these columns takes about half a second. Without really thinking things through, I figured: why not calculate the columns concurrently, splitting the work across multiple processes that could run on different CPUs? After reading about Python's multiprocessing library, I refactored the body of the for loop into a function, render_user_gates, and came up with this:

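    import multiprocessing

    # One worker process per column of the page.
    pool = multiprocessing.Pool(processes=4)
    # pool.map passes each inner list to render_user_gates as one argument.
    args_list = [[u, hierarchy, section, hand, gateblocks]
                 for u in cohort_users]
    user_sections = pool.map(render_user_gates, args_list)
    print('done', len(user_sections))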
  
(Adapted from an example I found online.)

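The multiprocessing version doesn't work. I think it may even have messed up my database: I got a "multiple hierarchies returned" error from Pagetree, so I had to get a new database. One plausible cause is that the worker processes forked by the pool inherit the parent's open database connection, which is not safe to share between processes. Because my render_user_gates function makes database queries, it's not an option to just disconnect from the database at the beginning of that function.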
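For comparison, here is a minimal sketch of the same pool.map pattern with a stand-in worker that makes no database queries, just to show how the arguments are passed. The names render_user_gates_stub and build_args are hypothetical, and the values passed in are placeholders for the real hierarchy, section, hand, and gate blocks.

```python
import multiprocessing


def render_user_gates_stub(args):
    """Stand-in worker: pool.map hands over each inner list as ONE argument."""
    user, hierarchy, section, hand, gateblocks = args
    # The real function would query the database here. This stub only pairs
    # the user with the number of gate blocks it received.
    return (user, len(gateblocks))


def build_args(cohort_users, hierarchy, section, hand, gateblocks):
    """Build one argument list per user, in the same shape the view uses."""
    return [[u, hierarchy, section, hand, gateblocks] for u in cohort_users]


if __name__ == "__main__":
    args_list = build_args(["alice", "bob"], "h", "s", "hand", ["g1", "g2"])
    # The worker must be a module-level function so it can be pickled.
    with multiprocessing.Pool(processes=2) as pool:
        # Results come back in the same order as args_list.
        user_sections = pool.map(render_user_gates_stub, args_list)
    print("done", len(user_sections))
```

A worker like this is safe to run in a pool because it touches only its own arguments. The real function is different precisely because it talks to the database from inside each worker process.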

I asked on #django whether anyone uses multiprocessing.Pool with Django's ORM. Someone named "moldy" said probably not, because it's a really strange thing to do. I hadn't even considered how this would behave when deployed on a server, running under gunicorn or something similar.

Moldy mentioned that this could be a use case for Celery. I know we use it on some other projects, like PMT. I'm reading more about Celery now. I don't want to put in the time to set everything up until I'm reasonably sure that it would work and that it would actually improve performance in this view.