We’ve mentioned before how Oyster.com’s Python-based web servers cache large amounts of static content in huge Python dicts (hash tables). Well, we recently saved over 2 GB in each of four 6 GB server processes with a single line of code: using __slots__ on our Image class.
Here’s a screenshot of RAM usage before and after deploying this change on one of our servers:
We allocate about a million instances of a class like the following:
    class Image(object):
        def __init__(self, id, caption, url):
            self.id = id
            self.caption = caption
            self.url = url
            self._setup()

        # ... other methods ...
By default Python uses a dict to store an object’s instance attributes. This is usually fine, and it allows fully dynamic things like setting arbitrary new attributes at runtime.
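To make the default behaviour concrete, here is a minimal sketch. The Point class and its attribute names are hypothetical and chosen only for illustration. It shows that instance attributes live in a per-instance `__dict__`, and that new attributes can be attached at runtime.

```python
class Point(object):
    """Hypothetical class used only to illustrate the default behaviour."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


p = Point(1, 2)

# Nothing in the class declares 'label', but the per-instance dict accepts it.
p.label = "corner"

# The attributes are stored as ordinary dict entries on the instance.
print(p.__dict__)
```

Every instance carries its own dict like this one, which is where the per-object overhead comes from.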
However, for small classes that have a few fixed attributes known at “compile time”, the dict is a waste of RAM, and this makes a real difference when you’re creating a million of them. You can tell Python not to use a dict, and only allocate space for a fixed set of attributes, by setting __slots__ on the class to a fixed list of attribute names:
    class Image(object):
        __slots__ = ['id', 'caption', 'url']

        def __init__(self, id, caption, url):
            self.id = id
            self.caption = caption
            self.url = url
            self._setup()

        # ... other methods ...
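As a rough way to see the effect, the sketch below compares the shallow size of one instance with and without `__slots__`. PlainImage and SlottedImage are hypothetical stand-ins for our class (without `_setup()` or the other methods), and `sys.getsizeof` only counts the object itself plus, for the plain version, its attribute dict. The exact byte counts depend on the Python version and platform, so treat the numbers as indicative rather than as a measurement of our servers.

```python
import sys


class PlainImage(object):
    """Hypothetical stand-in: attributes live in a per-instance dict."""

    def __init__(self, id, caption, url):
        self.id = id
        self.caption = caption
        self.url = url


class SlottedImage(object):
    """Same attributes, stored in fixed slots instead of a dict."""

    __slots__ = ['id', 'caption', 'url']

    def __init__(self, id, caption, url):
        self.id = id
        self.caption = caption
        self.url = url


plain = PlainImage(1, "Pool at dusk", "/images/1.jpg")
slotted = SlottedImage(1, "Pool at dusk", "/images/1.jpg")

# Shallow per-instance overhead only; the attribute values are not counted.
plain_bytes = sys.getsizeof(plain) + sys.getsizeof(plain.__dict__)
slotted_bytes = sys.getsizeof(slotted)
print("with dict: ", plain_bytes, "bytes")
print("with slots:", slotted_bytes, "bytes")

# A slotted instance has no __dict__, so undeclared attributes are rejected.
try:
    slotted.photographer = "someone"
    rejected = False
except AttributeError:
    rejected = True
print("undeclared attribute rejected:", rejected)
```

The trade-off is visible in the last few lines: the memory saving comes at the cost of no longer being able to attach arbitrary attributes to an instance.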
Note that you can also use collections.namedtuple, which allows attribute access, but only takes the space of a tuple, so it’s similar to using __slots__ on a class. However, to me it always feels weird to inherit from a namedtuple class. Also, if you want a custom initializer you have to override __new__ rather than __init__.
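For comparison, here is a sketch of what the namedtuple route might look like. The ImageTuple name and the caption clean-up inside `__new__` are assumptions made up for this example, not what our code does. Note the empty `__slots__` on the subclass, which keeps it from growing a per-instance dict again.

```python
from collections import namedtuple


class ImageTuple(namedtuple("ImageTuple", ["id", "caption", "url"])):
    """Hypothetical namedtuple-based variant of the Image class."""

    # Without this, the subclass would get a __dict__ per instance.
    __slots__ = ()

    def __new__(cls, id, caption, url):
        # Tuples are immutable, so any custom initialization has to happen
        # here, before the tuple is created, rather than in __init__.
        caption = caption.strip()  # illustrative clean-up step only
        return super().__new__(cls, id, caption, url)


img = ImageTuple(1, "  Pool at dusk  ", "/images/1.jpg")
print(img.caption)

# Still a plain tuple underneath, so unpacking works.
image_id, caption, url = img
```

Unlike a class with `__slots__`, instances here are immutable, so this only fits if the attributes never change after creation.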
Warning: Don’t prematurely optimize and use this everywhere! It’s not great for code maintenance, and it only really saves you memory when you have thousands of instances.