
[Python] In Python 3.12, why does 'Öl' take less memory than 'Ö'?

Discussion in 'Python' started by Stack, October 1, 2024 at 06:23.

  1. Stack

    Stack (Active Member)

    I just read PEP 393 and learned that Python's str type uses different internal representations, depending on the content. So, I experimented a little bit and was a bit surprised by the results:

    >>> import sys
    >>> sys.getsizeof('')
    41
    >>> sys.getsizeof('H')
    42
    >>> sys.getsizeof('Hi')
    43
    >>> sys.getsizeof('Ö')
    61
    >>> sys.getsizeof('Öl')
    59


    I understand that in the first three cases, the strings don't contain any non-ASCII characters, so an encoding with 1 byte per char can be used. Putting a non-ASCII character like Ö in a string forces the interpreter to use a different encoding. Therefore, I'm not surprised that 'Ö' takes more space than 'H'.
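To make the width rule concrete, here is a minimal sketch of how I read PEP 393: the per-character width (1, 2, or 4 bytes) follows from the highest code point in the string, and ASCII-only strings additionally get a smaller object header. The helper name `pep393_char_width` is hypothetical, not a CPython API, and the sizes printed by `sys.getsizeof` depend on the interpreter version and build.

```python
import sys


def pep393_char_width(text: str) -> int:
    """Bytes per code point that PEP 393 prescribes for `text` (1, 2, or 4).

    Hypothetical helper for illustration; it mirrors the rule in the PEP,
    it does not inspect the interpreter's actual storage.
    """
    highest = max(map(ord, text), default=0)
    if highest < 0x100:      # fits in Latin-1 (this includes all of ASCII)
        return 1
    if highest < 0x10000:    # fits in the Basic Multilingual Plane
        return 2
    return 4                 # anything beyond the BMP


for sample in ['', 'H', 'Hi', 'Ö', 'Öl']:
    # str.isascii() tells whether the smaller ASCII-only header can apply.
    print(repr(sample), len(sample), pep393_char_width(sample),
          sample.isascii(), sys.getsizeof(sample))
```

By this rule both 'Ö' and 'Öl' use 1 byte per character, so the width alone should not make the shorter string larger.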

    However, why does 'Öl' take less space than 'Ö'? I would have assumed that whatever internal representation is used for 'Öl' would also allow an even shorter representation of 'Ö'.

    I'm using Python 3.12; apparently this is not reproducible in earlier versions.
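To compare interpreters, a small script like the following can be run under each Python version. It is only a sketch: `size_report` is a hypothetical helper name, and it prints whatever `sys.getsizeof` reports on the running interpreter, so no particular numbers are implied.

```python
import sys


def size_report(samples):
    """Return (text, size_in_bytes) pairs as reported by sys.getsizeof."""
    return [(text, sys.getsizeof(text)) for text in samples]


# Print the interpreter version first so runs can be told apart.
print("Python", sys.version.split()[0])
for text, size in size_report(['', 'H', 'Hi', 'Ö', 'Öl']):
    print(f"{text!r:6} {size}")
```

Running the same file with different interpreters and comparing the printed tables shows whether the 'Ö' versus 'Öl' difference is specific to one version.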
