Python Descriptor

Python objects have attributes that can be accessed, modified or deleted. Python provides a default manner in which these operations are performed but what if we wish to customize these operations? This is where descriptors are helpful.

Descriptors can be used to manage access to an attribute, validate attribute values, customize error messages, track data corruption bugs, or perform dynamic computations on attribute lookup. Descriptors are defined at class level. At module level, the related hooks __getattr__ and __dir__ can be used to display deprecation warnings or implement lazy loading.

Hettinger notes that learning about descriptors gives a deeper understanding of how Python works. Descriptors are commonly used by library developers.

Discussion

  • What are some use cases of Python descriptors?
    An example descriptor that computes size on demand. Source: Hettinger 2021.

    Suppose there's a Directory class that has a size attribute to reflect the number of files in that directory. Naturally, its value can change dynamically as files are added or removed. Without descriptors, we might implement this as a method get_size(). The method will query the file system and obtain the current size.

    Descriptors allow us to get the size dynamically while also accessing the size as a data attribute. Thus, we can say d.size rather than d.get_size(). Moreover, the behaviour of querying the file system was previously within Directory.get_size(). With descriptors, the behaviour is nicely encapsulated within the descriptor class DirectorySize.
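
    The Directory example can be sketched as follows. This is a minimal sketch along the lines of the example in the Descriptor HowTo Guide; the use of a temporary directory and the variable names before and after are assumptions made only for the demonstration.

```python
import os
import tempfile


class DirectorySize:
    """Descriptor that counts directory entries on every lookup."""

    def __get__(self, obj, objtype=None):
        if obj is None:              # accessed on the class, not an instance
            return self
        return len(os.listdir(obj.dirname))


class Directory:
    size = DirectorySize()           # descriptor stored as a class variable

    def __init__(self, dirname):
        self.dirname = dirname       # regular instance attribute


# Usage: d.size reads like a data attribute but is recomputed each time.
with tempfile.TemporaryDirectory() as tmp:
    d = Directory(tmp)
    before = d.size                  # empty directory
    open(os.path.join(tmp, "a.txt"), "w").close()
    after = d.size                   # one file has been added
```

    The file system query lives in DirectorySize, so Directory itself needs no get_size() method.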

    Another use case of descriptors is to manage data access. For example, public attribute age is exposed in a Person class but managed privately as _age within a descriptor class. The descriptor class could also perform validations. For example, when assigning a value to age it can check if the value is a non-negative integer.
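
    A validating descriptor for age might look like the sketch below. The class name NonNegativeInteger and the choice of exception types are assumptions for illustration; rejecting bool values is also a choice made by this sketch.

```python
class NonNegativeInteger:
    """Data descriptor that validates values before storing them privately."""

    def __set_name__(self, owner, name):
        self.private_name = "_" + name        # e.g. 'age' is stored as '_age'

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        return getattr(obj, self.private_name)

    def __set__(self, obj, value):
        # bool is a subclass of int, so it is excluded explicitly here.
        if isinstance(value, bool) or not isinstance(value, int):
            raise TypeError(f"expected an int, got {type(value).__name__}")
        if value < 0:
            raise ValueError("expected a non-negative value")
        setattr(obj, self.private_name, value)


class Person:
    age = NonNegativeInteger()                # public name, privately managed

    def __init__(self, name, age):
        self.name = name
        self.age = age                        # goes through __set__()


p = Person("Mary", 30)
```

    Assigning p.age = -1 raises ValueError and p.age = "thirty" raises TypeError, while vars(p) shows the value stored under _age.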

    A real-world example of a descriptor is GenericForeignKey in Django.

  • How can I implement Python's descriptor protocol?
    Illustrating __get__ and __set__ methods. Source: Adapted from Beaumont 2013, slides 16-17.

    Assume one or more instances of TextWrapper class are instantiated as members of Email class. TextWrapper is the descriptor class and Email is the owner class. To support the descriptor protocol, TextWrapper implements the following methods (but not necessarily all):

    • __get__(self, instance, owner=None): Called when the attribute is accessed via the owner class or its instance. E.g. print(e.sender).
    • __set__(self, instance, value): Called to set the attribute on an instance of the owner class. E.g. e.sender = 'foo'.
    • __delete__(self, instance): Called to delete the attribute on an instance of the owner class. E.g. del e.sender.
    • __set_name__(self, owner, name): Called when the owner class is created. This allows descriptor instances to access their own name as defined in the owner class.

    An object that has defined one or more of __get__, __set__ or __delete__ is a descriptor. Descriptors work only as class variables of the owner class.

    The statement e.sender += 'bar' will automatically call both __get__ and __set__.
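
    The four methods can be put together as in the sketch below. The names TextWrapper, Email and sender come from the text above; what TextWrapper actually does with the value (stripping surrounding whitespace) and the subject attribute are assumptions for illustration.

```python
class TextWrapper:
    """Descriptor class implementing the full descriptor protocol."""

    def __set_name__(self, owner, name):
        # Called once when the owner class is created.
        self.name = name
        self.storage = "_" + name

    def __get__(self, instance, owner=None):
        if instance is None:         # accessed as Email.sender
            return self
        return getattr(instance, self.storage)

    def __set__(self, instance, value):
        # Hypothetical behaviour: normalise the text before storing it.
        setattr(instance, self.storage, str(value).strip())

    def __delete__(self, instance):
        delattr(instance, self.storage)


class Email:                         # owner class
    sender = TextWrapper()           # descriptors are class variables
    subject = TextWrapper()


e = Email()
e.sender = "  foo "                  # calls __set__
e.sender += "bar"                    # calls __get__, then __set__
```

    After these statements e.sender is 'foobar', and del e.sender removes the stored value through __delete__.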

  • What's the difference between data and non-data descriptors?

    A descriptor that defines __set__ and/or __delete__ is a data descriptor. A descriptor that defines only __get__ is a non-data descriptor.

    A data descriptor takes precedence over an entry of the same name in the instance dictionary, whereas an instance dictionary entry takes precedence over a non-data descriptor.

    For a descriptor that doesn't define __get__, attribute access will return the value from the instance dictionary. If this doesn't exist, the descriptor object is returned.

    Non-data descriptors are typically used for methods. In fact, Python methods are implemented as non-data descriptors, including those decorated with @staticmethod and @classmethod. Instances can override these methods. On the other hand, property() function is implemented as a data descriptor, meaning that instances can't override its behaviour.
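
    The precedence rules can be checked with a small experiment. In this sketch the class and attribute names are hypothetical; values are written directly into the instance dictionary to bypass the descriptors.

```python
class NonData:
    """Defines only __get__, so it is a non-data descriptor."""

    def __get__(self, obj, objtype=None):
        return "from non-data descriptor"


class Data:
    """Defines __get__ and __set__, so it is a data descriptor."""

    def __get__(self, obj, objtype=None):
        return "from data descriptor"

    def __set__(self, obj, value):
        raise AttributeError("assignment not allowed")


class Owner:
    nd = NonData()
    d = Data()

    def method(self):                # plain functions are non-data descriptors
        return "original method"


o = Owner()
o.__dict__["nd"] = "from instance dict"   # shadows the non-data descriptor
o.__dict__["d"] = "from instance dict"    # ignored: data descriptor wins

nd_result = o.nd                     # 'from instance dict'
d_result = o.d                       # 'from data descriptor'

o.method = lambda: "overridden on instance"   # instances can override methods
```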

  • Why do we need the descriptor protocol when we have __getattr__, __setattr__ and __delattr__?
    The descriptor method __set__ is better than __setattr__. Source: Adapted from Egan 2018, 4:50, 16:17.

    Methods __getattr__, __setattr__ and __delattr__ allow us to customize the behaviour of attribute access. These methods are defined on the owner class. If the owner class has many attributes to manage, these methods could become complex to maintain. Method implementations may have ugly if-else constructs to manage the different attributes.

    Descriptors offer a more modular, maintainable and extensible approach. Instead of the owner class controlling how attributes should be managed, the control is now with the class of the attribute being managed. In other words, descriptors have a say on how their instances should be managed.

    In the figure, we see how any value assigned to a name attribute is capitalized. Method __setattr__ of Person class can be used for this but requires the attribute be named "name". The descriptor protocol offers a better way. Class CapitalizedValue is a data descriptor that does the capitalization. Person class defines an attribute of name "name" but any other name could have been given.
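
    The two approaches can be compared side by side. This is a sketch, not the code from the talk: the class PersonWithSetattr, the city attribute and the use of str.capitalize() are assumptions for illustration.

```python
class PersonWithSetattr:
    """Owner class controls the behaviour and must know the attribute name."""

    def __setattr__(self, name, value):
        if name == "name":           # hard-coded attribute name
            value = value.capitalize()
        super().__setattr__(name, value)


class CapitalizedValue:
    """Data descriptor: the behaviour travels with the attribute."""

    def __set_name__(self, owner, name):
        self.storage = "_" + name

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        return getattr(obj, self.storage)

    def __set__(self, obj, value):
        setattr(obj, self.storage, value.capitalize())


class Person:
    name = CapitalizedValue()
    city = CapitalizedValue()        # reused under any other name


p = Person()
p.name = "alice"
p.city = "paris"

q = PersonWithSetattr()
q.name = "bob"
q.city = "rome"                      # untouched: __setattr__ only handles 'name'
```

    Extending PersonWithSetattr to capitalize city needs another branch in __setattr__, whereas Person only needs another CapitalizedValue() line.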

  • How do Python descriptors work under the hood?
    Python internals of attribute get/set/delete. Source: Adapted from Cohen 2017b.

    Understanding descriptors under the hood comes down to understanding how Python performs attribute access. Python applies a set of rules in a well-defined order of precedence. This is shown in the figure for both get and set/delete operations. These rules check whether the class overrides the access methods __getattribute__ and __getattr__, whether class or instance access is performed, whether the attribute is a descriptor object, and which parts of the descriptor protocol it implements.

    We can see that for a get operation, data descriptor has higher precedence than instance attribute, which in turn has higher precedence than non-data descriptor.

    We can summarize the lookups for instance and class attribute accesses:

    • Get: obj.x ⇒ Cls.__dict__['x'].__get__(obj, Cls) and Cls.x ⇒ Cls.__dict__['x'].__get__(None, Cls)
    • Set: obj.x = v ⇒ Cls.__dict__['x'].__set__(obj, v) and Cls.x = v ⇒ Regular override
    • Delete: del obj.x ⇒ Cls.__dict__['x'].__delete__(obj) and del Cls.x ⇒ Regular deletion
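
    These translations can be observed with a descriptor that records each call it receives. The Recorder class is hypothetical and exists only for this demonstration.

```python
class Recorder:
    """Data descriptor that logs every protocol call it receives."""

    def __init__(self):
        self.calls = []

    def __get__(self, obj, objtype=None):
        self.calls.append(("get", obj, objtype))
        return 42

    def __set__(self, obj, value):
        self.calls.append(("set", obj, value))

    def __delete__(self, obj):
        self.calls.append(("delete", obj))


class Cls:
    x = Recorder()


obj = Cls()
desc = Cls.__dict__["x"]     # raw descriptor object, no protocol call involved

obj.x                        # desc.__get__(obj, Cls)
Cls.x                        # desc.__get__(None, Cls)
obj.x = 1                    # desc.__set__(obj, 1)
del obj.x                    # desc.__delete__(obj)
calls = list(desc.calls)

Cls.x = 5                    # regular override: the descriptor is replaced
```

    After the last statement Cls.__dict__['x'] is the integer 5, since setting an attribute on the class itself doesn't invoke __set__.
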
  • What other features of Python are implemented by descriptors?

    The built-ins classmethod(), staticmethod() and property(), as well as functools.cached_property() from the standard library, are all implemented as descriptors.

    A property attribute is one that triggers method calls when accessed. Property is implemented as a data descriptor. It has an easier interface than a descriptor and a different abstraction. The methods are typically defined in the same class in which the attribute resides. There are two ways to link methods to a property: (i) using the property(fget=None, fset=None, fdel=None, doc=None) built-in function; (ii) using decorators @property, @x.setter and @x.deleter where x is the name of the property and also the name of the methods so decorated.
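
    Both ways of defining a property are sketched below. The Circle classes and the radius validation are assumptions for illustration.

```python
class CircleA:
    """Way (i): pass accessor functions to the property() built-in."""

    def __init__(self, radius):
        self._radius = radius

    def _get_radius(self):
        return self._radius

    def _set_radius(self, value):
        if value < 0:
            raise ValueError("radius must be non-negative")
        self._radius = value

    radius = property(fget=_get_radius, fset=_set_radius, doc="Circle radius")


class CircleB:
    """Way (ii): use the @property, @x.setter and @x.deleter decorators."""

    def __init__(self, radius):
        self._radius = radius

    @property
    def radius(self):
        return self._radius

    @radius.setter
    def radius(self, value):
        if value < 0:
            raise ValueError("radius must be non-negative")
        self._radius = value

    @radius.deleter
    def radius(self):
        del self._radius


a = CircleA(1)
a.radius = 2                 # triggers _set_radius()
b = CircleB(1)
b.radius = 3                 # triggers the method decorated with @radius.setter
```

    In both classes the object stored in the class dictionary under radius is a property instance, which defines __get__, __set__ and __delete__.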

    Another Python feature is slots, available as a class variable __slots__. This variable can be assigned all allowed attributes of the class and its instances. Thus, instances can't add new attributes. Instances also don't contain __dict__ and __weakref__ unless explicitly included in __slots__. Every item in __slots__ is implemented as a descriptor.
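
    The effect of __slots__ can be inspected directly. The Point class is hypothetical; the checks below describe behaviour as observed in CPython.

```python
class Point:
    __slots__ = ("x", "y")           # the only attributes instances may have

    def __init__(self, x, y):
        self.x = x
        self.y = y


p = Point(1, 2)

# Each slot name is stored in the class dictionary as a descriptor object.
slot = Point.__dict__["x"]
is_data_descriptor = hasattr(slot, "__get__") and hasattr(slot, "__set__")
via_protocol = slot.__get__(p, Point)     # same value as p.x

# Attributes outside __slots__ can't be added.
try:
    p.z = 3
    added = True
except AttributeError:
    added = False

has_dict = hasattr(p, "__dict__")         # no per-instance dictionary
```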

  • Could you share some tips for working with Python descriptors?

    For most use cases, or at least for beginners, property is a simpler interface than defining a descriptor from scratch. In fact, property has been called a "high-level descriptor builder" or a "descriptor factory".

    A read-only data descriptor can be constructed by defining both __get__ and __set__, with the latter coded to raise an AttributeError exception.
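
    A read-only descriptor along these lines might look like the following sketch. The names ReadOnly, Config and version are hypothetical.

```python
class ReadOnly:
    """Data descriptor whose __set__ always refuses assignment."""

    def __init__(self, value):
        self.value = value

    def __get__(self, obj, objtype=None):
        return self.value

    def __set__(self, obj, value):
        raise AttributeError("attribute is read-only")


class Config:
    version = ReadOnly("1.0")


c = Config()
try:
    c.version = "2.0"
    blocked = False
except AttributeError:
    blocked = True
```

    Defining __set__ is what matters here: without it, ReadOnly would be a non-data descriptor and the assignment would simply create an instance attribute that shadows it.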

    We can use a descriptor to implement a one-time initialization. This is done by setting an internal flag in __set__.
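
    One possible sketch of one-time initialization follows. Here the presence of the private storage attribute on the instance serves as the per-instance flag; the names SetOnce, Account and account_id are hypothetical.

```python
class SetOnce:
    """Data descriptor that accepts only the first assignment per instance."""

    def __set_name__(self, owner, name):
        self.name = name
        self.storage = "_" + name

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        return getattr(obj, self.storage)

    def __set__(self, obj, value):
        # The stored value doubles as the "already initialized" flag.
        if self.storage in vars(obj):
            raise AttributeError(f"{self.name} can only be set once")
        setattr(obj, self.storage, value)


class Account:
    account_id = SetOnce()

    def __init__(self, account_id):
        self.account_id = account_id      # first and only allowed assignment


a = Account(1)
try:
    a.account_id = 2
    reassigned = True
except AttributeError:
    reassigned = False
```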

    Descriptors enable many other use cases elegantly. Logging all accesses of an attribute can be implemented by a descriptor. When an attribute changes, we can update all dependent attributes, such as, updating area when circle radius changes. Lazy evaluation is another use case in which behaviour is invoked only when necessary.
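
    Lazy evaluation can be sketched with a non-data descriptor that caches its result in the instance dictionary. The Lazy and Circle names and the computations counter are assumptions for illustration; functools.cached_property offers similar behaviour in the standard library.

```python
import math


class Lazy:
    """Non-data descriptor: compute on first access, then cache on the instance."""

    def __init__(self, func):
        self.func = func

    def __set_name__(self, owner, name):
        self.name = name

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        value = self.func(obj)
        # The instance dictionary entry shadows this non-data descriptor,
        # so later lookups never reach __get__ again.
        obj.__dict__[self.name] = value
        return value


class Circle:
    def __init__(self, radius):
        self.radius = radius
        self.computations = 0        # counts how often area is computed

    @Lazy
    def area(self):
        self.computations += 1
        return math.pi * self.radius ** 2


c = Circle(1)
first = c.area                       # computed here
second = c.area                      # served from c.__dict__
```

    Note that this cache is not refreshed when radius changes. Keeping dependent attributes up to date would need a data descriptor on radius that invalidates or recomputes the cached area.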

Milestones

Jan
1994

Python 1.0 is released. Descriptors don't exist in this initial release.

Dec
2001

Python 2.2 is released. This introduces descriptors into the language via PEP 252. Descriptors themselves have the following attributes: __name__, __doc__, __get__(), __set__() and __delete__(). Descriptors are now used to support static methods and class methods. Slots and properties, both new in this release, are also kinds of descriptors.

Jun
2011

Python 2.7.2 is released. In the accompanying documentation, Hettinger states in the Descriptor HowTo Guide that,

"Learning about descriptors not only provides access to a larger toolset, it creates a deeper understanding of how Python works and an appreciation for the elegance of its design."

Feb
2015

In PEP 487, Teichmann proposes the use of __set_name__() initializer for descriptors. While a descriptor knows its owner when __get__() is called, it doesn't know its own name, that is, the instance name that the owner has defined. Method __set_name__() solves this problem. This feature is accepted for Python 3.6 (December 2016).

Sep
2017

In PEP 549, Hastings proposes instance descriptors, since currently Python supports descriptors only as members of the type of an object. This proposal is rejected in preference to a simpler proposal by Levkivskyi in PEP 562. The idea is to support __getattr__() and __dir__() at module level. These allow customization of module attribute access. Two use cases for this are deprecation warnings and lazy loading. This feature is accepted for Python 3.7 (June 2018).

Aug
2018
Use of the descriptor protocol in public repositories at GitHub. Source: Egan 2018, 8:08.

At PyCon Australia, Egan presents his analysis of the use of the descriptor protocol in public GitHub repositories. Method __get__() is implemented 1.2 million times. Method __set_name__() is implemented the least since it's been around only since Python 3.6.

Sample Code

  • # Source: https://docs.python.org/3/howto/descriptor.html
    # Accessed: 2021-12-25
    # An example of managed access in which obj._age is not exposed publicly
     
    import logging
     
    logging.basicConfig(level=logging.INFO)
     
    class LoggedAgeAccess:
     
        def __get__(self, obj, objtype=None):
            value = obj._age
            logging.info('Accessing %r giving %r', 'age', value)
            return value
     
        def __set__(self, obj, value):
            logging.info('Updating %r to %r', 'age', value)
            obj._age = value
     
    class Person:
     
        age = LoggedAgeAccess()             # Descriptor instance
     
        def __init__(self, name, age):
            self.name = name                # Regular instance attribute
            self.age = age                  # Calls __set__()
     
        def birthday(self):
            self.age += 1                   # Calls both __get__() and __set__()
     
    >>> mary = Person('Mary M', 30)         # The initial age update is logged
    INFO:root:Updating 'age' to 30
    >>> dave = Person('David D', 40)
    INFO:root:Updating 'age' to 40
     
    >>> vars(mary)                          # The actual data is in a private attribute
    {'name': 'Mary M', '_age': 30}
    >>> vars(dave)
    {'name': 'David D', '_age': 40}
     
    >>> mary.age                            # Access the data and log the lookup
    INFO:root:Accessing 'age' giving 30
    30
    >>> mary.birthday()                     # Updates are logged as well
    INFO:root:Accessing 'age' giving 30
    INFO:root:Updating 'age' to 31
     
    >>> dave.name                           # Regular attribute lookup isn't logged
    'David D'
    >>> dave.age                            # Only the managed attribute is logged
    INFO:root:Accessing 'age' giving 40
    40

References

  1. Arias, Pablo. 2018. "Python Descriptors Are Magical Creatures." Blog, November 25. Accessed 2021-12-21.
  2. Beaumont, Chris. 2013. "Python Descriptors Demystified." Via SlideShare, April 24. Accessed 2021-12-22.
  3. Cohen, Lior. 2017a. "Trespassing the Python Property and Staying Alive — Part I." On Medium, November 30. Accessed 2021-12-25.
  4. Cohen, Lior. 2017b. "Trespassing the Python Property and Staying Alive — Part II." On Medium, December 17. Accessed 2021-12-22.
  5. Egan, Matthew. 2018. "Describing Descriptors." PyCon AU, on YouTube, August 25. Accessed 2021-12-22.
  6. Hastings, Larry. 2017. "PEP 549 -- Instance Descriptors." Python.org, September 4. Updated 2021-02-09. Accessed 2021-11-21.
  7. Hettinger, Raymond. 2011. "Descriptor HowTo Guide." Documentation, Python HOWTOs, v2.7.2, June 11. Updated 2013-09-08. Accessed 2021-12-22.
  8. Hettinger, Raymond. 2021. "Descriptor HowTo Guide." Documentation, Python HOWTOs, v3.10.1, December 21. Accessed 2021-12-21.
  9. Kuchling, A.M. 2002. "What’s New in Python 2.2." Release 2.2.2, October 14. Updated 2021-12-21. Accessed 2021-12-21.
  10. Levkivskyi, Ivan. 2017. "PEP 562 -- Module __getattr__ and __dir__." Python.org, September 9. Updated 2018-07-08. Accessed 2021-11-21.
  11. Pranskevichus, Elvis, (ed). 2018. "What’s New in Python 3.7." Release 3.7, June 27. Updated 2021-12-21. Accessed 2021-12-21.
  12. Pranskevichus, Elvis and Yury Selivanov, (eds). 2016. "What’s New in Python 3.6." Release 3.6, December 23. Updated 2021-12-21. Accessed 2021-12-21.
  13. Python. 2011. "Python 2.7.2." Downloads, Python, June 11. Accessed 2021-12-22.
  14. Python Docs. 2021a. "Data Model." Section 3, The Python Language Reference, v3.10.1, December 21. Accessed 2021-12-21.
  15. Python Docs. 2021b. "Built-in Functions." The Python Standard Library, v3.10.1, December 21. Accessed 2021-12-21.
  16. Starostin, Alex. 2012. "Introduction to Python descriptors." Tutorial, IBM Developer, IBM, June 27. Updated 2012-06-26. Accessed 2021-12-21.
  17. Teichmann, Martin. 2015. "PEP 487 -- Simpler customisation of class creation." Python.org, February 27. Updated 2020-03-30. Accessed 2021-11-21.
  18. van Rossum, Guido. 2001. "PEP 252 -- Making Types Look More Like Classes." Python.org, April 19. Updated 2021-12-21. Accessed 2021-11-21.
  19. van Rossum, Guido. 2009. "A Brief Timeline of Python." Blog, The History of Python, January 20. Accessed 2021-12-21.

Further Reading

  1. Hettinger, Raymond. 2011. "Descriptor HowTo Guide." Documentation, Python HOWTOs, v2.7.2, June 11. Updated 2013-09-08. Accessed 2021-12-22.
  2. Cohen, Lior. 2017b. "Trespassing the Python Property and Staying Alive — Part II." On Medium, December 17. Accessed 2021-12-22.
  3. Egan, Matthew. 2018. "Describing Descriptors." PyCon AU, on YouTube, August 25. Accessed 2021-12-22.
  4. Gardner, Jonathan. 2019. "Managing Attribute Access and Descriptors." In: Theory of Python, Real Physics, on YouTube, November 30. Accessed 2021-12-22.
  5. Karan. 2020. "Demystifying Python’s Descriptor Protocol." Blog, DeepSource, April 16. Accessed 2021-12-21.
  6. Mărieș, Ionel Cristian. 2015. "Understanding Python metaclasses." Blog, ionel's codelog, February 9. Updated 2020-11-19. Accessed 2021-12-22.

Cite As

Devopedia. 2022. "Python Descriptor." Version 4, January 8. Accessed 2022-01-18. https://devopedia.org/python-descriptor

Last updated on
2022-01-08 12:35:25