WordPress: Under Siege
As I mentioned in my last post, I recently switched my website from my homegrown Django blogging app to WordPress. Before installing WordPress on my VPS, I installed it on a VM so I could test the waters before jumping in. I created an Ubuntu 12.04 VM using VirtualBox and gave it a gigabyte of RAM to work with. After I had WordPress up and running, I created some test posts and played around with various plugins and themes that I could find on the WordPress directory. I was dismayed to discover that WordPress has terrible performance out of the box, even if you disable all installed plugins. The WordPress dashboard served by my Ubuntu VM would easily take 4–5 seconds to load, and individual posts would take at least 2–3 seconds to load. I found this unacceptable, so I started searching StackOverflow and the excellent WordPress StackExchange for answers.
The two most straightforward performance optimizations that I could find were:
- Install a PHP opcode cache.
- Install a page caching plugin.
Installing an opcode cache on Ubuntu is easy:
$ sudo apt-get install php-apc
No extra configuration is required on Ubuntu. If you use a different distro, read the php-apc documentation on the PHP website.
Installing WP Super Cache is similarly easy and a number of excellent tutorials for setting it up are scattered around the Web. Here is a good one for Apache, and here is one for nginx. I also recommend perusing this GitHub repository that contains a complete set of configuration files for serving WordPress through nginx.
The Numbers
The numbers that follow are for a fresh install of WordPress 3.5.1 running on an Ubuntu 12.04 VM with 1GB of RAM, served by nginx 1.1.19, php-fpm 5.3.10, and backed by MySQL 5.5.29. The host OS is Mac OS X 10.8.2 running on a MacBook Pro. All testing was done with Siege 2.74 hitting different pages of the WordPress website in a random order.
$ siege -d5 -c100 -i -f url_list.txt -t5m
Note that these numbers only reflect a general trend in WordPress performance under load. Real world page load performance depends on many factors, including network latency, page size, whether you’re using a CDN or not, the number of separate JavaScript/CSS/image files per page, etc. The following numbers only indicate how quickly WordPress can push HTML to the client.
Despite the flawed testing methodology, these numbers are useful as rough indicators of the effectiveness of opcode and page caching.
Fresh Install Without Opcode Cache or Page Cache
Measurement | Value |
---|---|
Transactions | 2710 hits |
Availability | 100.00 % |
Elapsed time | 299.47 secs |
Data transferred | 9.59 MB |
Response time | 8.37 secs |
Transaction rate | 9.05 trans/sec |
Throughput | 0.03 MB/sec |
Concurrency | 75.74 |
Successful transactions | 2710 |
Failed transactions | 0 |
Longest transaction | 9.58 |
Shortest transaction | 0.20 |
Fresh Install With WP Super Cache and php-apc
Measurement | Value |
---|---|
Transactions | 11833 hits |
Availability | 100.00 % |
Elapsed time | 299.70 secs |
Data transferred | 23.62 MB |
Response time | 0.02 secs |
Transaction rate | 39.48 trans/sec |
Throughput | 0.08 MB/sec |
Concurrency | 0.75 |
Successful transactions | 11881 |
Failed transactions | 0 |
Longest transaction | 0.42 |
Shortest transaction | 0.00 |
Bonus: Numbers for ankursethi.in (this website)
This website uses the same software as my testing VM. The only difference is that it is hosted somewhere in Germany on a Hetzner VQ12 VPS, and I’m hitting it with Siege from New Delhi in India.
Measurement | Value |
---|---|
Transactions | 8566 hits |
Availability | 98.73 % |
Elapsed time | 299.54 secs |
Data transferred | 40.03 MB |
Response time | 0.55 secs |
Transaction rate | 28.60 trans/sec |
Throughput | 0.13 MB/sec |
Concurrency | 15.72 |
Successful transactions | 8577 |
Failed transactions | 110 |
Longest transaction | 5.39 |
Shortest transaction | 0.38 |
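As a sanity check on the table above, the availability figure is consistent with successful transactions divided by all attempted transactions. Here is a rough sketch in Python that assumes that simple ratio. Siege's own definition is phrased in terms of socket connections, so treat the function name and the formula as illustrative rather than as Siege's implementation.

```python
def availability(successful, failed):
    """Percentage of attempted transactions that succeeded."""
    attempted = successful + failed
    return 100.0 * successful / attempted


# Figures from the ankursethi.in run above.
live_site = availability(successful=8577, failed=110)
print(round(live_site, 2))
```

The 110 failed transactions are what pull the live site below the 100% seen on the VM.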
Closing Words
My test blog went from 9 transactions per second to 39 transactions per second with these two simple performance optimizations, and page load time went from ~8 seconds to 0.02 seconds. This page load time is for users who have not logged in or left a comment. I see a more modest 1.5–2 second load time for logged in users, which is still a 4x improvement. The concurrency number went from 75.74 to 0.75, which is a good thing in this case.
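Why is a lower concurrency number a good thing? Siege's concurrency is the average number of requests in flight, which works out to the transaction rate multiplied by the average response time. A faster server finishes requests sooner, so fewer of them are open at once. Here is a quick check against the two VM tables, as a sketch: the helper name is mine, and small differences come from rounding in the reported figures.

```python
def in_flight(transaction_rate, response_time):
    """Average number of simultaneous requests (rate x mean response time)."""
    return transaction_rate * response_time


# (transactions/sec, mean response time in secs, concurrency reported by Siege)
runs = {
    "no caching": (9.05, 8.37, 75.74),
    "php-apc + WP Super Cache": (39.48, 0.02, 0.75),
}

for name, (rate, resp, reported) in runs.items():
    estimate = in_flight(rate, resp)
    print(f"{name}: estimated {estimate:.2f}, reported {reported}")

# Throughput gain from the two optimizations.
print(f"speedup: {39.48 / 9.05:.1f}x")
```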
These optimizations should be enough for a majority of low to medium traffic self-hosted WordPress blogs. For more advanced optimization techniques, I recommend reading this excellent article on the New Relic blog.
Okay WordPress, You Win This Round
One of my biggest blogging crimes so far has been this: every time I get an itch to write a blog post, I start thinking of possible workflows I could use to make my writing process simpler. A few hours later, instead of the post I wanted, I have UI mockups for The Perfect Blogging App™ that will never be built. It’s a disease, I tell ya.
In the summer of 2012, though, I did finally get around to building a blogging app. I called it Can ‘O Beans, and it was nothing like my mockups. I had to write my posts using a text editor and then paste them into my app’s autogenerated admin interface—the one you get out of the box with Django. I neither added Markdown support to the app, nor a WYSIWYG editor; instead, I marked up all my posts using plain ol’ HTML, which made for a frustrating writing experience. The app didn’t even support uploading images to the server. I had envisioned a slick, AJAX-y media manager for Can ‘O Beans but, since it would have taken some time to build, I kept putting off hacking on it. Real life commitments kept me from improving the usability of Can ‘O Beans, and the bad usability of Can ‘O Beans prevented me from ever thinking beyond the app itself and concentrating on what mattered: writing.
Hacking on Can ‘O Beans was a lot of fun, but it was only fun for the part of my brain that enjoyed engineering puzzles. The part that wanted to just fucking write was frustrated with the whole deal.
The astute reader may now ask why I had to go and write my own blogging app when I could just have used WordPress like every other sane person. A year ago, I would have told said astute reader that WordPress was a terrible piece of software. I would have gone on to mention that it was slow and unresponsive, that it had a cluttered UI that was anathema to the creative mind, that it was insecure, that it was impossible for a sane person to customize WordPress because it was written in PHP, etc. etc.
Of course, the real reason I didn’t use WordPress was that I was a complete ass.
(The astute reader may also ask why I didn’t use one of the many static blog generators that are all the rage these days. The reason is that I prefer having a dashboard I can log into from any Internet connected computer I have access to and manage my blog from there. Also: plugins and themes.)
When faced with a problem, it’s tempting for a hacker to immediately switch to her text editor and start building something completely new. Down this path lies madness and despair, for what thou codeth, thou maintaineth, and I’ve learned this lesson well after Can ‘O Beans. More often than not, mature, well-tested, actively maintained software exists that satisfies ~99% of your needs. There are people out there who have spent years building, deploying, securing and scaling software that, with a tiny bit of customization, does everything you want it to. This was something I always knew, but I had to actually experience it for myself before I internalized it. And internalize it I did—ankursethi.in is now (proudly?) powered by WordPress. I can stop worrying about having to maintain Can ‘O Beans, and concentrate on blogging. Phew.
I still have some minor issues with the WordPress admin UI, and I still don’t like PHP and MySQL, but I’m willing to let all that slide because what WordPress lets me do is fucking write. With php-apc and WP Super Cache enabled, it performs shockingly well even on low-end servers. On my slightly-better-than-low-end Hetzner VPS, it flies.
Time to roll up my sleeves and write!
2012: Year in Review
The Good
- Finished my four years of college. The nightmare ends at last.
- Rediscovered my love for hip-hop music, all thanks to /r/HipHopHeads. More on this in another post.
- Technology for the sake of technology is no fun. In the past, I wasted a lot of my time and energy on mildly interesting tools and programming languages when I could’ve been building useful software using more practical technology. 2012 was the year practicality finally beat purity.
- Started writing software for myself. I intend to eventually polish up and release this software for public consumption.
- Started maintaining proper information hygiene. More on this in another post.
- Became more organized. More on this in another post.
The Bad
- Have to take a few more exams in May before I get my degree.
- Stopped exercising. Shameful.
- Only read five books in the entire year. Doubly shameful.
- Almost everyone I know is now in Bangalore while I’m stuck in Delhi.
The Ugly
- Lost a friend to cancer.
- Started a business, but shuttered it after a few months of work.
Highlights
- Best movie watched: Forrest Gump
- Best book read: Skinny Legs and All by Tom Robbins
- Best musicians discovered: Eyedea, MF DOOM
- Favorite albums of the year: First Born by Eyedea & Abilities, The Many Faces of Oliver Hart by Eyedea, Theatre is Evil by Amanda Palmer and the Grand Theft Orchestra
- Favorite new programming language: Go
- Favorite new software: Sublime Text 2, Clementine, LastPass
What Next?
- Release one or two useful webapps into the wild.
- Learn Go and build a webapp with it.
- Further streamline my web development and deployment workflow.
- Learn my technology stack better. This means learning more about the capabilities of nginx, PostgreSQL, uWSGI and of course Django and Flask.
- Start keeping track of how I spend my time. If you can measure it, you can optimize it.
- Start keeping track of the movies I watch, the books I read and the music I listen to.
- Read more. Twenty-five books is an achievable goal for 2013.
- Read less SciFi and genre fiction.
- Exercise. I have been intending to start with the Convict Conditioning bodyweight training program for the last three months. The new year is as good an excuse as any to start having some fun!
- Explore New Delhi more and get involved in local community events.
- Move to Bangalore by the end of the year.
- Super secret goal.
This was a year of many small epiphanies, and one or two big ones. Those deserve little mention, though, since us silly inexperienced twentysomethings have epiphanies all the fucking time.
Anyway, happy new year to everyone reading this. Make it one worth remembering!
Mobile Tweaks and Chrome Extension
I just made some changes to my website’s CSS to make it more readable on small screens. I’m not completely happy with how it looks—the header looks jarringly out of place and code samples are all messed up—but at least now you don’t have to play with your browser’s zoom settings to be able to read the text properly. Baby steps.
I also wrote a little Chrome extension for adding bookmarks to my homegrown bookmarks manager. I was pleasantly surprised at how easy it is to extend Chrome. The APIs are well-designed and well-documented, and a number of examples are available from Google’s developer website. I can see myself writing more extensions for Chrome in the future.
My bookmarks extension is nothing to write home about. It consists of the mandatory manifest.json file, four or five lines of JavaScript that open a new tab when a toolbar button is clicked, and a terrible icon I quickly designed in Photoshop. The code is on GitHub so that my friends and family can laugh at my terrible Photoshop skills.
Bookmarks
There are three reasons why I don’t like using web browsers for storing my bookmarks. First, browsers make it hard for me to access my bookmarks from my phone or a friend’s computer; second, I find the browsers’ interface for managing bookmarks unwieldy and confusing; and third, I switch between browsers very often, which means I inadvertently have my bookmarks spread out between Chrome, Firefox, Safari, Opera Mobile and the Android Browser. Web-based bookmarking services are accessible via any Internet-connected computer, have user interfaces that are in line with my personal preferences and are browser agnostic. They alleviate pretty much all the issues I have with browser-based bookmarking.
After thinking about it for longer than I should have, I built my own little bookmarking tool yesterday evening and deployed it to my VPS. It’s a part of Can ‘O Beans and lives behind this humble bookmarks page. It lets me add bookmarks to my collection, assign descriptions and tags to them, and search through my collection by tag or hostname. Next up on my TODO list is a Chrome extension. Also in the works are a better search, an RSS feed, an import/export feature and several UI tweaks.
Building this little tool must have taken me about three hours. Time well spent, I’d say.
Scripting tmux
tmux ranks highly on the list of programs that I cannot live without. I consider its split-screen and terminal multiplexing capabilities absolutely essential for day-to-day hacking. It belongs to that rare breed of software that has measurably improved my development productivity, software that makes me genuinely happy.
This is what my tmux pane layout usually looks like when I’m working on a project:
---------------------------------------------------------------
|                              |                              |
|                              |                              |
|                              |                              |
|                              |             repl             |
|                              |                              |
|  directory operations        |                              |
|  git interactions            |------------------------------|
|  etc.                        |                              |
|                              |  directory watcher           |
|                              |  server                      |
|                              |  logger                      |
|                              |  etc.                        |
|                              |                              |
---------------------------------------------------------------
Since I use this layout for nearly every project that I work on, it makes sense to have tmux automatically set it up for me so that I don’t have to type a bunch of keyboard shortcuts every time I start working on a new project. Luckily, tmux is highly scriptable. Here’s a BASH function that automatically sets up the three-pane layout from above:
setup_tmux_layout() {
    # Create a new window.
    tmux new-window -a -n "$1" -c "$2"

    # Now split it twice, first horizontally and then vertically.
    tmux split-window -h
    tmux split-window -v
}
Now I can run the following from my shell:
$ setup_tmux_layout <window_name> <starting directory>
(Note that at least one tmux session must be active for this to work. This function affects the most recently active tmux session.)
This is good, but we can go a step further. I have several Django projects, and whenever I start working on one of them there are a number of additional Python/Django-specific actions that I take.
- Switch to the project directory in all three panes.
- Activate the project’s virtualenv in all three panes.
- Run a git status in the large pane on the left.
- Start a Django shell (python manage.py shell) in the top right pane.
- Run the Django development server (python manage.py runserver) in the bottom right pane.
Once again, it makes sense to automate these actions. Here is the actual code I have in my ~/.bash_profile to do that.
workon_project() {
    if [ $# -lt 2 ]
    then
        echo "Usage:"
        echo -e "\tworkon_project <project directory> <virtualenv name>"
        return 1
    fi

    # Create a new window.
    tmux new-window -a -n "$2" -c "$1"

    # Send keys to the large pane on the left.
    tmux send-keys "workon $2" C-m
    tmux send-keys "git status" C-m

    # Split the window horizontally.
    tmux split-window -h -c "$1"

    # Send keys to the top right pane.
    tmux send-keys "workon $2" C-m
    tmux send-keys "python manage.py shell" C-m

    # Split the window again, this time vertically.
    tmux split-window -v -c "$1"

    # Send keys to the bottom right pane.
    tmux send-keys "workon $2" C-m
    tmux send-keys "python manage.py runserver" C-m
}
Brilliant! Now I can perform all those tedious actions in one fell swoop by running this in my shell:
$ workon_project <project directory> <virtualenv name>
(In case you are wondering, the workon command above comes from virtualenvwrapper.)
This little function has saved me hundreds of keystrokes in the last few months, and it only scratches the surface of what tmux is capable of. If you are a regular tmux user, I highly recommend skimming the tmux manpage at least once.
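If you would rather drive tmux from Python than from a shell function, the same sequence of commands can be built as argument lists and handed to subprocess. This is only a sketch: the function names are made up for illustration, and it uses nothing beyond the tmux subcommands and flags that already appear in the shell versions above.

```python
import subprocess


def layout_commands(window_name, start_dir):
    """Return the tmux invocations for the three-pane layout, in order."""
    return [
        # New window after the current one, named and rooted at start_dir.
        ["tmux", "new-window", "-a", "-n", window_name, "-c", start_dir],
        # Split into left and right panes; the new right pane becomes active.
        ["tmux", "split-window", "-h", "-c", start_dir],
        # Split the right-hand pane into top and bottom.
        ["tmux", "split-window", "-v", "-c", start_dir],
    ]


def setup_layout(window_name, start_dir):
    """Run the commands. Like the shell version, this needs an active tmux session."""
    for argv in layout_commands(window_name, start_dir):
        subprocess.check_call(argv)
```

Keeping the command list separate from the code that runs it makes it possible to print or inspect the commands without touching a live tmux session.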
A Django Admin Wishlist
It is okay to skim this post and only read the parts that you find interesting.
When I was just learning how to use Django, I dismissed django.contrib.admin (“the admin” from now on) as a nice-to-have extra, a marginally useful demo of framework functionality, but not much more. I didn’t even enable it for most of the apps I wrote as a Django novice. The simple, honest django.forms.Form and the darkly magical django.forms.ModelForm together did everything I needed. Learning to customize the admin was a waste of my time. Or so I thought.
As the scope of my projects grew, my attitude towards the admin thawed a little. I had heavily leveraged it for an app that I had written for my personal use and found that it was more customizable than I had initially believed. I was surprised to discover that I could make it mostly work the way I wanted by making only minor adjustments to my Python code. While experimenting with some third-party Django apps, I found that all of them use the admin in one way or another. Mezzanine, Cartridge, django-cms, Zinnia, Satchmo, django-filebrowser … all of them hook into and extend the admin framework instead of using something home-grown. Of course, sometimes the admin framework doesn’t have the functionality these apps need, and the developers end up building the missing functionality from scratch, but it’s the existing functionality that gets them 70–75% of the way.
Consequently, when I was starting work on an internal webapp for my company, I decided to not waste my time writing a custom management UI for it and instead use the admin from day one. This was about a month ago. Since then I have learned a lot about how to customize and extend the admin to my app’s needs. I have also, in the course of customizing the admin, come across some annoying limitations in the admin framework. Some of these limitations are easily overcome using third-party apps, others require a bit of extra work on the developer’s part, and still others need to be looked at by the core Django developers. This post details the limitations I ran into, and—where possible—ways to overcome them.
Limitation #1: The admin cannot display nested inlines
Let’s say you have three models called Group, Person and EmailAddress. A Group can have multiple Persons, and a Person can have multiple EmailAddresses. You might be tempted to do this in your admin.py:
# In admin.py.
from django.contrib import admin

from appname import models  # "appname" is a placeholder for your app.

class EmailAddressInline(admin.TabularInline):
    model = models.EmailAddress
    extra = 1

class PersonInline(admin.TabularInline):
    model = models.Person
    extra = 3
    # Silently ignored: the admin does not nest inlines.
    inlines = [
        EmailAddressInline,
    ]

class GroupAdmin(admin.ModelAdmin):
    inlines = [
        PersonInline,
    ]

admin.site.register(models.Group, GroupAdmin)
This will not work. The admin framework doesn’t expect to ever find an InlineModelAdmin nested inside another InlineModelAdmin, so the inlines list defined on PersonInline makes no sense to it and it will happily ignore it. When you try to edit or add a Group in your admin, you will see a list of fields for editing or adding related Persons at the bottom of the page, and that’s it. The list of fields for editing or adding EmailAddresses that you expected to see attached to each Person will not appear in the admin UI.
If you want nested inlines, you will have to extend the admin template for your model to handle them. You will also have to create a custom ModelForm subclass to deal with the data that you get from your new template. See this StackOverflow question for some more details.
Support for displaying nested inlines in the admin is an oft-requested feature that even has a working patch, but it hasn’t made its way into Django yet because it doesn’t meet the community’s quality standards. See ticket #9025.
Limitation #2: All staff users can access all AdminSite instances
If you have multiple admin sites in your project, then all Users who have their is_staff flag set to True have access to all those admin sites. Often, this is not desirable. For example, if you’re building a forum application, you might want to have separate dashboards for forum administrators and forum moderators. Or, if you’re building an app for managing your company’s payroll, you might want separate dashboards for accountants and employees. You don’t want someone from one of these groups to accidentally stumble across the other group’s admin interface.
The solution I use to get around this limitation is a little roundabout, but it works. First, I create a new model to serve as my AUTH_PROFILE_MODULE:
# In models.py.
class ForumUserProfile(models.Model):
    pass

# In settings.py.
AUTH_PROFILE_MODULE = 'appname.ForumUserProfile'
On this model I define one permission for each admin site in my project:
class ForumUserProfile(models.Model):
    class Meta:
        permissions = (
            ('access_user_admin', 'Can access user admin.'),
            ('access_mod_admin', 'Can access moderator admin.'),
            ('access_admin_admin', 'Can access main admin.'),
        )
Then, I create a middleware that returns an HTTP 403 error if a user tries to access an admin he doesn’t have the permission to access:
# In middleware.py.
import re

from django.core.exceptions import PermissionDenied

class RestrictAdminMiddleware(object):
    # (URL pattern, permission) pairs. has_perm() expects permissions
    # qualified with the app label.
    RESTRICTED_URLS = (
        ('/user/', 'appname.access_user_admin'),
        ('/mod/', 'appname.access_mod_admin'),
        ('/admin/', 'appname.access_admin_admin'),
    )

    def process_request(self, request):
        user = request.user
        # If user is not authenticated, we'll let the admin
        # framework deal with him. We just care about staff users.
        if not user.is_authenticated():
            return None
        # Superusers are allowed to access everything, so we'll
        # let them through.
        if user.is_superuser:
            return None
        # is_superuser and is_staff are plain attributes, not methods.
        if user.is_staff:
            for pattern, perm in self.RESTRICTED_URLS:
                # re.match anchors the pattern at the start of the path.
                if re.match(pattern, request.path):
                    if not user.has_perm(perm):
                        raise PermissionDenied()
        return None
# In settings.py.
MIDDLEWARE_CLASSES = (
    'django.middleware.common.CommonMiddleware',
    'django.contrib.sessions.middleware.SessionMiddleware',
    'django.middleware.csrf.CsrfViewMiddleware',
    'django.contrib.auth.middleware.AuthenticationMiddleware',
    'django.contrib.messages.middleware.MessageMiddleware',
    # Our shiny new middleware!
    'appname.middleware.RestrictAdminMiddleware',
)
Finally, I set up the appropriate permissions for my users in the admin and I’m good to go.
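The path-to-permission lookup at the heart of the middleware can be pulled out into plain Python and tested without Django. The sketch below is an illustration, not the code running on my site: the function names are hypothetical, the permission strings are qualified with the placeholder app label appname (the form Django's has_perm takes), and it uses prefix matching rather than a regular expression search so that a path such as /blog/admin/ is not mistaken for the admin site.

```python
RESTRICTED_PREFIXES = (
    ("/user/", "appname.access_user_admin"),
    ("/mod/", "appname.access_mod_admin"),
    ("/admin/", "appname.access_admin_admin"),
)


def required_permission(path):
    """Return the permission needed for path, or None if it is unrestricted."""
    for prefix, permission in RESTRICTED_PREFIXES:
        if path.startswith(prefix):
            return permission
    return None


def may_access(path, user_permissions, is_superuser=False):
    """Decide access from a plain set of permission strings."""
    if is_superuser:
        return True
    needed = required_permission(path)
    return needed is None or needed in user_permissions


# A moderator can reach the moderator dashboard but not the main admin.
moderator = {"appname.access_mod_admin"}
print(may_access("/mod/threads/", moderator))
print(may_access("/admin/", moderator))
```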
Limitation #3: It is not possible to reorder, group or hide models on the admin index page without extending the index template
When the number of models you want to manage using the admin becomes large, it becomes necessary to organize them into logical groups for easy navigation. The admin currently groups models by app, which is fine when your apps have a small number of models. This organization scheme quickly becomes unwieldy when the number of models in a single app grows large. It becomes even worse when you have two related models in two separate apps that you think should logically be displayed under one group.
While the admin app itself doesn’t currently have this functionality, it is easy to add. You can:
- Extend the admin index template (admin/index.html) and create groups yourself.
- Use a third party app that allows you to customize the admin using a straightforward Python API. Two such apps are django-admin-tools and django-grappelli.
I personally use django-grappelli on all my websites.
Limitation #4: ImageFields are not displayed as thumbnails
If you have an ImageField on any of your models, nine times out of ten you want to see a thumbnail of the image when you’re editing an existing instance of that model. There are several third party apps that add this feature to the admin, and the Django wiki lists a variety of solutions.
Limitation #5: Select boxes cannot be “chained”
This is a use-case that occurs very often. You have two select boxes on a page, and the list of choices in the second select box depends on the choice the user has made in the first select box. For example, the first select box could contain a list of Indian states and the second one a list of cities. You want to dynamically change the list of cities in the second box depending on which state is selected in the first box. The Django admin does not allow you to do that out of the box.
Fortunately, the solution is very simple: use django-smart-selects.
Limitation #6: “Save and continue editing” reloads the page
I think the purpose of the “Save and continue editing” button is lost if it reloads the page every time you click it. If you’re editing a complex model that has many fields, reloading the page means you lose your position on the page and have to scroll around to find the field you were editing. While composing long-form text in a <textarea>—like I’m doing right now—clicking “Save and continue editing” means losing your position within the five-hundred or so words you just wrote. This is not an issue if you use the admin for one-off edits, but it becomes a major usability problem if you spend most of your time in the admin.
A full solution involves quite a bit of JavaScript, but you can get halfway there with a simple asynchronous POST request. Here’s an example:
# In admin.py.
class JournalAdmin(admin.ModelAdmin):
    class Media:
        js = ('ajax_submit.js',)
// In ajax_submit.js.
// NOTE: I'm not very good at writing idiomatic JavaScript.
// If you think this snippet can be improved, do
// let me know.
// WARNING: this code is meant for demonstration purposes.
// Do not use it in production. It fails on several edge cases
// and, in the process, destroys your data, empties your bank
// account, kidnaps your children and takes down reddit for a
// month.
"use strict";
var AJAXSubmit = function () {
    // The admin ships its own copy of jQuery as django.jQuery.
    var $ = django.jQuery;

    function ajax_submit(e) {
        e.preventDefault();
        var data = {
            // Don't forget the CSRF middleware token!
            "csrfmiddlewaretoken":
                $("input[name='csrfmiddlewaretoken']").val(),
            "title": $("textarea#id_title").val(),
            "slug": $("input#id_slug").val(),
            "author": $("select#id_author").val(),
            "published_on_0": $("input#id_published_on_0").val(),
            "published_on_1": $("input#id_published_on_1").val(),
            "content": $("textarea#id_content").val()
        };
        if ($("input#id_published").is(":checked")) {
            // POST keys are field names, not element IDs.
            data["published"] = "on";
        }
        $.ajax({
            type: "POST",
            url: "",
            data: data,
            success: function() {
                alert("Saved!");
            }
        });
        return false;
    }

    $(document).ready(function() {
        var btn = $("div.submit-row input[name='_continue']");
        btn.click(ajax_submit);
    });
}();
Warning: don’t use this code in production. I’ve only tested it on Firefox and Chrome, where it fails on several edge-cases. If you use this in production and lose data, don’t blame me.
Limitation #7: Generic relationships are displayed poorly
Currently, the admin treats generic relationships just like any other data. Consider this:
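# In models.py.
from django.contrib.contenttypes import generic
from django.contrib.contenttypes.models import ContentType
from django.db import models

class TaggedItem(models.Model):
    tag = models.SlugField()
    content_type = models.ForeignKey(ContentType)
    object_id = models.PositiveIntegerField()
    content_object = generic.GenericForeignKey()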
In the admin for TaggedItem, content_type will be displayed like any other foreign key: a select box containing a list of all content types in your project. object_id will not get any special treatment either: it will be displayed as a simple <input> box. This kind of treatment makes it impossible to figure out at a glance which object the generic foreign key actually refers to.
What I’d like the admin to do here is display the URL returned by the content_object’s get_absolute_url() in a read-only field right after the object_id field. Alternatively, the read-only URL field could simply contain a link to the content_object’s admin page.
Limitation #8: While editing a model instance with a relationship to another model, it is possible to add a new instance of the other model to participate in the relationship, but impossible to edit an existing related instance
Let’s say you’ve got a BlogPost model with a ManyToMany relationship to a Tag model. While editing your BlogPost in the admin, you see a nice multi-select box where you can choose the Tags you want to associate with your BlogPost. There’s a small “+” button next to the multi-select box. Clicking this button opens a pop-up window where you can add new Tag instances. Now, what happens when you create a new tag using this feature but accidentally give it an incorrect name? You’ll find that there’s no edit button next to the multi-select box. To fix the name of your newly-created tag, you will have to navigate to the admin page for that tag and fix the name from there. This is another minor inconvenience that can become a major usability issue if you use the admin a lot.
I haven’t sorted this issue out yet, but I can think of a couple of solutions that could work. I’ll post them here in a separate blog post after I’ve implemented them in my own apps.
Closing Remarks
The Django admin app is incredibly useful. With a little tweaking, it makes writing custom management interfaces for your webapps completely unnecessary. Yes, it has some shortcomings, but most of them can be overcome with a little extra code. Besides, Django development happens at a very fast pace, so you can expect some of these shortcomings to be fixed in future releases of the framework.
Cache all the Things!
Yesterday night I installed memcached on the server that runs AnkurSethi.in and hooked it up to Django’s caching framework. Since then, page load times are significantly lower and so is the server load. This little machine is now ready to handle whatever the Internet throws at it.
According to the Chrome developer tools, the slowest loading resource on AnkurSethi.in is now PT Sans, the font that I use for all the text on this website. Since I set up caching, resources from my own server take less than 700ms to reach me in New Delhi. On the other hand, PT Sans as served from the Google CDN takes up to 1 second to get here. This doesn’t bother me much, though, since the browser caches the font after it is downloaded once, which means further page loads are blazing fast. Besides, web fonts are loaded asynchronously, so you can still read the page content while the font is loading in the background. Optimizing font loading time is simply not a productive endeavor.
Setting up caching in Django was easy enough, but I did run into one major issue: cache expiry. When I finish writing a journal entry, I often go back and edit it multiple times in order to fix lingering typos and grammatical errors. Sometimes I even do this after I’ve published it. I never want to serve up a journal entry with incorrect spelling or grammar. However, I had set a one hour expiry limit on the items in my cache. This meant that, even after I had made corrections to an existing journal entry, the server would continue to serve the incorrect entry from the cache for some time. In the worst case, “some time” could mean an hour, which was unacceptable.
The solution I used to fix this issue was simple, if inelegant. I hooked up the post_save signal that is defined on every Django model to a function that simply nukes the entire cache. It may sound like a terrible solution, but it works great in practice. Sure, rebuilding the cache is expensive, but not expensive enough to warrant a more complex solution. Having to rebuild the entire cache is a huge issue for large, distributed, high-traffic webapps where running one database query may take tens of seconds. It is, however, not an issue for my humble personal blog, which contains a few kilobytes of data, runs off a single machine and is lucky to see a few hundred visitors per day. On the whole, I’m very happy with my current setup.
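To make the trade-off concrete, here is a toy model of the behaviour in plain Python. It is not Django's caching framework or its signal machinery: TinyCache, save_entry and render are hypothetical names, and the cache is just a dictionary with an expiry time per entry. It shows why an edit stays invisible until the entry expires, and how clearing everything on save fixes that.

```python
import time


class TinyCache:
    """In-memory cache with a per-entry time-to-live."""

    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self._store = {}

    def get(self, key, now=None):
        now = time.time() if now is None else now
        entry = self._store.get(key)
        if entry is None or now >= entry[1]:
            return None
        return entry[0]

    def set(self, key, value, now=None):
        now = time.time() if now is None else now
        self._store[key] = (value, now + self.ttl)

    def clear(self):
        self._store.clear()


def save_entry(entries, cache, slug, text):
    """Persist an entry, then drop everything cached (the post_save step)."""
    entries[slug] = text
    cache.clear()


def render(entries, cache, slug, now=None):
    """Serve from the cache when possible, otherwise rebuild and cache."""
    page = cache.get(slug, now)
    if page is None:
        page = "<article>%s</article>" % entries[slug]
        cache.set(slug, page, now)
    return page
```

Writing to entries directly, without going through save_entry, reproduces the stale-page problem: render keeps returning the old HTML until the one hour expiry passes.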
One more item crossed off the TODO list for this website. Next item: support for syntax highlighting.
A Whole New Can of Beans
I’m envious of people who have consistently posted to their blogs for years. I, too, have blogged semi-regularly over the years, but I’ve never managed to stick to a single blog for long.
All I remember about my first blog is that it was mostly a link dump, but I do remember my second one. Titled Infuriated, it was a rant blog where the 15-year-old me wrote angry, vitriol-laden posts about things that annoyed me. A few months later, just when I was running out of swear words, someone introduced me to the world of Free Software. I scrapped Infuriated and started a short-lived blog called The Free Speech Blog, where I wrote exactly one holier-than-thou post about something disgusting that Sony had done. Then I proceeded to neglect all my blogs for a couple of months until one day I got angry at something that had happened at school and created yet another rant blog on WordPress.com (you see a pattern here?). This one was called Badger Alert!, and ended up being very different from Infuriated. While Infuriated was just pure vitriolic fun, Badger Alert! had content that was more spontaneous and sincere than anything I had written before or have written since. You see, by this time I was well into my rebellious teenage years, and the smallest things would tick me off. Badger Alert! was my “personal space” on the internet, somewhere I could talk about my feelings and occasionally vent my anger. It got very pretentious sometimes, but on the whole it was the most fun I ever had writing.
The last blog I wrote before I dropped off the grid for close to two years was titled A Series of Uncool Events, hosted at Uncool.in (that URL redirects to AnkurSethi.in now). I didn’t have much fun with it, and I wasn’t particularly proud of what I posted there, yet it ended up being my most popular blog. The most popular post on Uncool.in, titled Computer Science FAIL: Higher Education in India, was a list of dangerously incorrect information that a professor at my college had fed my class during our first Introduction to Computer Science lecture. The post stayed on the Hacker News front page for a little while and sparked a handful of comments. It was more popular on reddit, where it spawned a long comment thread complete with flamewars and bad puns. It was blogged and re-blogged across the Internet, leading to over a hundred thousand page views in the space of a week or two. At one point, it even showed up on the Cincom Smalltalk community blog. The resulting increase in PageRank ensured at least a few hundred hits every day. For the first time ever, my blog showed up as the number one result when you ran a search for my name on Google. Eventually I got rid of Uncool.in too, because I have a medical condition commonly known as being a complete idiot.
After Uncool.in, I became apathetic about blogging. I didn’t feel I had enough to say, or that what I had to say mattered, or that blogging was worth it at all. I’d purchased a nice beefy VPS for hosting my personal website and blog, but I didn’t set it up with a blogging tool. For many months both AnkurSethi.in and Uncool.in pointed at a plain HTML page with my email address and Twitter handle on it.
If you’re part of the tech community, a blog or a personal webpage is a wonderful thing to have. I wouldn’t say it’s essential, because some of the best engineers out there don’t care for or need a web presence, but it’s definitely a huge plus. Blogging leads to good things. For me, it was one of the ways I found people who shared my interests. Every time one of my posts got more than a few hundred hits, I’d get emails from people who wanted to talk or just say hi. Blogging was also a way to get involved in discussions about topics I really cared about. Instead of merely participating in discussions that were trending on Hacker News or reddit, I could start discussions that I wanted to have. Now, it wasn’t as if I wrote insightful posts that sparked deeply nested comment threads across the Internet. No. The best discussions I had via my blog were simply the ones I had with people I already knew, and most of these took place over IRC, IM or beer. Still, there was the occasional post that would get attention, and that was nice.
As soon as I stopped blogging, I started feeling the negative effects. I noticed a drop in communication sent my way, and participated in fewer online discussions on the whole (IRC flamewars are not discussions). As my long-form writing output decreased, my thinking started to get muddled. Plus, there were so many things I was mulling over but was unable to discuss with anyone. I wanted to talk about how great Racket was, about the state of programming documentation, about things I was doing at college, about my little side projects and hacks, about Linux and free software, about privacy, about the books I was reading and the music I was listening to. I tried writing in a real, physical, paper-and-pencil journal for a while, but it didn’t feel the same as writing publicly on a blog. Twitter and Facebook were useless for the kind of conversations I wanted to have, and reddit and Hacker News had their own issues. Eventually, I installed WordPress on my new VPS so I could start blogging again.
Despite having spent several hours setting up the new WordPress blog and getting it to behave just right, I never really posted to it because my frustrations with WordPress had grown to a point where I couldn’t even stand to look at the interface (a blog post for some other day). The blog just sat there, attracting only bots and spammers.
But then something wonderful happened. Earlier this year, I started getting into web development and, as a way of getting to know how to build, deploy and maintain a Django app, I decided to replace WordPress with a custom blogging tool of my own creation. Can ‘O Beans was born.
If you go look at the Can ‘O Beans code on GitHub, you’ll notice two things. One, some of the code, especially the template code, is atrocious. Once you get over that, the second thing you’ll notice is that Can ‘O Beans is as simple as a blogging tool can possibly be. It consists of exactly one database model, plus some templates and CSS. I didn’t even build a publishing interface, choosing instead to use the Django admin for writing new posts.
Can ‘O Beans as I’m using it now was “finished” a few weeks after I started building it, but I didn’t deploy it for months because I couldn’t bring myself to dive into the big bad world of WSGI servers and reverse proxies and databases and what have you. Then last month, I found that I needed to deploy several Django apps for a project I’m working on. This forced me to sit down for three days and absorb everything I could about using Django in production. I wanted to experiment with what I had learned on my own server before I went live with it on the server my project is hosted on, so Can ‘O Beans finally got a chance to shine.
And kids, that is why AnkurSethi.in once again has the pleasure of hosting real content instead of an unfriendly white HTML page.
Episode 11: New Year Special, 2011 Edition
Note: I originally posted this article on a personal blog I ran when I was in my late teens and early twenties. I discovered in May 2020 that the Internet Archive had preserved the contents of that blog in its entirety, including some of the media. That blog was an important part of my personal history, so I reposted all of that content on this website for archival purposes. While my politics, opinions, and outlook on the world have changed radically since I wrote those posts between 2009 and 2011, it’s good to know that I was as much of an idiot then as I am now.
It’s January 2. Unsolicited new year’s wishes are no longer clogging my inbox. Partygoers have managed to somehow stumble home. Hangovers have been cured by home remedies whose effectiveness cannot satisfactorily be explained by science. People who resolved to be productive this year have already spent over 5 hours looking at allegedly funny pictures of allegedly cute animals. Those who resolved to read more books have finished reading the first three paragraphs of the prologue to a vampire love story they heard about on the telly. Nihilists who were claiming that the holiday season has no meaning and, hence, we should all be ashamed of enjoying ourselves have moved on to claiming that January has no meaning and, hence, we should all be ashamed of January. Existentialists are currently in an introspective stupor. They’ll wish you a happy new year sometime in April. Absurdists are throwing balls of yarn at Existentialists.
I, of course, am trying not to fail yet another semester. Happy new year, indeed.
To counter the feeling of despondency brought on by the variety of digital modulation techniques I’m having to cram into my pretty little head, I have taken some time off to bring you my list of the ups and downs of the year 2010. Neatly categorized and methodically packaged for mass consumption, just like last year.
Here goes.
The Good
- Made some progress on Goonj, but Pratul and I ultimately decided to can the project. The failure taught us so much about software development that I’m putting it at the top of the “good” list.
- Rediscovered Lisp. Land of Lisp is easily the best programming book to ever come out of the great publishing corporations of planet Earth.
- Rediscovered my love for reading. This year’s highlights include: Stephenson’s Anathem, nearly everything by Haruki Murakami, Asimov’s Foundation and Vonnegut’s Cat’s Cradle. Thanks, /r/books!
- Rediscovered my love for writing. Didn’t write anything, though. Maybe this year.
- Internalized a few important (and obvious) life lessons: unbridled curiosity is harmful, focus is important, hard work is the only way to get anywhere in life, the joy of creation is the greatest joy known to man, and the first draft of everything is crap.
- Traveled alone for the first time in my life. Hope to do more traveling this year.
- Met nearly everyone from #hackers-india.
- Became more confident and less self-conscious than ever before. I doubt the people responsible for the change have any idea about how much they helped me.
- Got my diet under control and started exercising. Lost weight. This was easier than I thought. Thanks, /r/fitness!
The Bad
- Did not hang out much with Akshay and Apoorv. You guys have no clue how much I want to spend an evening with you.
- Got carried away by the hype surrounding web frameworks. Wasted too much time wrestling with them. My next project will use minimalist frameworks like Flask or neatly modularized frameworks like Pyramid.
- Architectural astronautery and NIH. Why do you think Goonj failed?
The Ugly
- Spent most of the year navel-gazing. Did not accomplish anything meaningful.
- Learned lessons about love, the hard way. I’m more clueless about the opposite gender than I thought.
- Despite much prodding from Dipanshu and Pulkit, did not clear the exams I had failed in 2008 and 2009. Failed more exams. The major goal this year is to pass every single exam I previously failed so I can get my degree.
- Spent too much time thinking about things I would never have thought of before. I think there was a period when I was just not myself. The good news is that this is a pretty normal developmental phase that everyone goes through. It’s over now, and I’m probably better off for it.
So there it is, my life in the year 2010 condensed to a few bullet points. Last year was about life lessons, internal conflict and growth. It was as painful, frustrating and unproductive as it could possibly have been. Whatever. It’s over now. Pratul said 2011 is the year of actualization and I have a feeling he might be right.
Goodbye. Have a great 2011 :)