How Variable Memory works inside Python
Notes :
Everything is an Object in Python
The examples here use CPython, the reference Python interpreter written in C (other implementations include Jython, IronPython, and PyPy)
- PyObject is defined as a struct in C
In CPython, objects do not move around in memory: each one keeps a fixed memory address for its whole lifetime. When a reference to an object is created, its reference count is incremented (from 0 to 1 for the first reference). When no reference points to the object any more, its reference count drops to zero and CPython frees it right away; a separate cyclic garbage collector runs periodically to reclaim objects that only refer to each other
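- When the same variable is assigned a value of a different data type (for example, a string after an integer), a new object is created to hold the new value and the variable is rebound to it
- Python variables are names bound to objects, not storage locations; when a reference is removed, the reference count of the object is decremented, the object is not deleted directly
- When del is used in Python, the name is unbound and the object's reference count is decremented; the object is freed only once no references remain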
- Mutable and Immutable Objects in Python
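The reference counting and name rebinding described above can be observed with a small sketch. It assumes CPython and uses sys.getrefcount, whose absolute numbers include a temporary reference held by the call itself, so only the differences between readings are meaningful. The function and variable names are illustrative only.

```python
import sys


def refcount_demo():
    """Return reference counts before, during, and after creating an alias."""
    data = [1, 2, 3]                # a fresh list object
    before = sys.getrefcount(data)  # absolute value includes a temporary reference
    alias = data                    # bind a second name to the same object
    during = sys.getrefcount(data)
    del alias                       # removes the name, not the object
    after = sys.getrefcount(data)
    return before, during, after


before, during, after = refcount_demo()
print(during - before)  # how many references the alias added
print(after == before)  # whether the count dropped back after del

# Rebinding a name to a value of another type creates a new object.
x = 10
first_type = type(x)
x = "hello"
second_type = type(x)
print(first_type.__name__, second_type.__name__)
```

Only relative changes are compared here because the exact counts can differ between interpreter versions.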
Lists in Python
- Lists can hold multiple values, and their internal objects can be changed
- Unlike simple objects that use the plain PyObject layout, lists are built on PyVarObject (a PyObject with an additional size field)
- The size field stores the number of objects the list holds
- The value field does not store the elements directly; it points to the memory location of a backing array that holds pointers to the objects
- For array indexing to work each element in the array must have the same size
- A pointer to any type (string, integer, etc.) has the same size, which is why objects of different types can be stored in the same Python list
- On a 64-bit system, a pointer is 64 bits (8 bytes). As before, the actual objects are created somewhere in heap memory, and in this simplified picture each starts with a reference count of 1
- The list object's reference count is also 1 initially
- When multiple references point to the same object, its reference count increases
- When a new object is added to the list, a new pointer is added to the array and the size field of the list object is incremented
- The backing array has a fixed capacity, which is the contiguous block of memory currently allocated for the pointers
- As you keep adding elements, the array eventually fills up; Python then allocates a larger array and moves the pointers into it. CPython over-allocates in proportion to the current size (by roughly one eighth plus a small constant, not by doubling), so most appends need no reallocation
Then the internal value pointer is updated and the memory allocated to the older array is released
- The address of the list object (the PyVarObject) never changes in memory; only the backing storage array is moved around
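A rough way to watch this from Python, assuming CPython: id() of the list stays constant while sys.getsizeof() (the list header plus its backing array) grows in occasional jumps rather than on every append. The variable names are made up for this sketch, and the exact byte sizes vary by version and platform, so none are shown.

```python
import sys

items = []
list_id = id(items)  # identity of the list object itself
sizes = []

for i in range(100):
    items.append(i)
    # Size in bytes of the list object including its backing array
    sizes.append(sys.getsizeof(items))

same_object = id(items) == list_id  # the list object never moved
distinct_sizes = len(set(sizes))    # number of different capacities seen

print(same_object)
print(distinct_sizes, "distinct sizes over", len(sizes), "appends")

# Because the array stores pointers, elements of different types can be mixed.
mixed = [1, "two", 3.0, None]
print(len(mixed))
```

Far fewer distinct sizes than appends indicates that the backing array is over-allocated and only occasionally replaced.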
When copying a list
- When copying a list, only the pointers are copied; the actual objects are never moved. Copy operations are therefore cheap, since a pointer is small compared to the object it refers to
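A minimal sketch of this, using list() to make the copy (slicing or the copy() method would behave the same way). The names original and shallow are illustrative only.

```python
original = ["alpha", [1, 2], 3.5]
shallow = list(original)  # new list object, same element pointers

different_lists = shallow is not original
same_elements = all(a is b for a, b in zip(original, shallow))

print(different_lists)  # True: the copy is a separate list object
print(same_elements)    # True: both lists point to the same element objects

# Mutating a shared element is visible through both lists.
shallow[1].append(3)
print(original[1])  # [1, 2, 3]
```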
None in Python
- None is used to represent an empty value, similar to a null pointer in other languages
- Like everything else, None is also an object
- When x = None is executed, x becomes a reference to the None object in memory; the object itself already exists, since the interpreter creates it at startup
- When y = None is executed afterwards, no new None object is created; x and y both point to the same None object
- This can be verified with the built-in id() function, which shows what a reference points to (in CPython, id() returns the object's memory address). Reusing the same object in this way is called interning
- At startup, Python pre-loads and caches a few of the most commonly used objects, so it does not need to create them when they are needed
- In CPython, integers in the inclusive range -5 to 256 are also cached, so new objects are not created every time we use them
- If we use a value that is not cached, e.g. 257, a new object is created for it
- Strings that look like identifiers (e.g. some_string) are interned; other strings are generally not interned in the current implementation, and this may change in future versions of Python
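These points can be checked with a short sketch, assuming CPython. The integers are built at runtime with int() because literals written in the same source file may be merged by the compiler, which would hide the effect. The identity result for 257 is an implementation detail, so it is computed but not relied upon.

```python
x = None
y = None
same_none = x is y and id(x) == id(y)
print(same_none)  # True: there is only one None object

# Small integers come from CPython's cache.
small_a = int("256")
small_b = int("256")
small_shared = small_a is small_b
print(small_shared)  # True on CPython

# 257 is outside the cached range, so each call typically builds its own object.
big_a = int("257")
big_b = int("257")
big_shared = big_a is big_b  # implementation detail, do not rely on it
print(big_a == big_b)        # True: the values are equal either way
```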
Intern in Python
It is possible to intern custom strings using the intern() function from the sys module
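A small sketch of sys.intern, with the strings assembled at runtime so that they start out as separate objects with equal values. The variable names are illustrative.

```python
import sys

# Two equal strings built at runtime
a = " ".join(["hello", "world!"])
b = " ".join(["hello", "world!"])
equal = a == b

# sys.intern returns the single interned copy for a given string value.
interned_a = sys.intern(a)
interned_b = sys.intern(b)
shared = interned_a is interned_b

print(equal)   # True: same characters
print(shared)  # True: both names now refer to one interned object
```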
Difference between == and is
- The == operator compares values; for two lists, it compares them element by element
- The “is” operator compares object identity (the memory address in CPython)
- If x and y refer to the same object, “is” returns True
- The “is” operator should be used to check whether a variable is None, since only one None object ever exists in the Python interpreter during the lifetime of the program
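The difference can be sketched as follows; the variable names are chosen for illustration.

```python
x = [1, 2, 3]
y = [1, 2, 3]  # a separate list with equal contents
z = x          # a second name for the same list

print(x == y)  # True: same elements
print(x is y)  # False: two distinct objects
print(x is z)  # True: one object, two names

result = None
is_missing = result is None  # identity check against None
print(is_missing)            # True
```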
How variables are passed to a function
- Arguments are passed to a function by object reference: the parameter name is bound to the same object the caller passed, and nothing is copied
- When nums is passed in, nums and the parameter my_list point to the same list object
- If my_list is assigned a new list inside the function, my_list points to the new list while nums still points to the original
- When the function returns, the new list is removed by the memory manager because nothing refers to it any more, so rebinding the parameter inside the function does not change the original nums
- To change the contents of the list that was passed in, use the methods provided by the data type
- my_list.append(10) modifies the original list that nums refers to
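The notes refer to nums and my_list without showing the code, so here is a minimal reconstruction. The function names rebind and mutate, and the values used, are assumptions made for this sketch.

```python
def rebind(my_list):
    # Rebinds the local name only; the caller's list is untouched.
    my_list = [100, 200]


def mutate(my_list):
    # Changes the object that both the caller and my_list refer to.
    my_list.append(10)


nums = [1, 2, 3]

rebind(nums)
after_rebind = list(nums)
print(after_rebind)  # [1, 2, 3]

mutate(nums)
after_mutate = list(nums)
print(after_mutate)  # [1, 2, 3, 10]
```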
Default Mutable Parameters
- Everything is an object in python including functions
- A function is an object of the function class, created at some memory address
- Function objects have a __defaults__ attribute, which stores the default values as a tuple
- When the def statement for a function with a default of my_list=[] is executed, the function object is created in memory; the default list object is created at that moment too, and a reference to it is stored in the function object's __defaults__
- The function body is not evaluated at this point; it only runs when the function is called, and every call that omits the argument receives a reference to the same default object
- This behavior is the same for a dictionary or any other mutable container. Immutable defaults such as int or bool work the same way, but cause no surprises because there is no way to mutate their internal state
- A better approach is to use None as the default value for mutable parameters and create the object inside the function after an if check
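A short sketch of both the pitfall and the None-based fix. The function names add_item and add_item_safe are hypothetical; only the my_list parameter name comes from the notes.

```python
def add_item(item, my_list=[]):
    # The default list is created once, when def runs, and then reused.
    my_list.append(item)
    return my_list


first = add_item(1)
second = add_item(2)
print(first is second)        # True: both calls returned the same default list
print(add_item.__defaults__)  # ([1, 2],)


def add_item_safe(item, my_list=None):
    # Create a fresh list on each call that omits the argument.
    if my_list is None:
        my_list = []
    my_list.append(item)
    return my_list


safe_first = add_item_safe(1)
safe_second = add_item_safe(2)
print(safe_first, safe_second)  # [1] [2]
```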
The += Operator
- The same applies to the other augmented arithmetic operators
- Create two variables that refer to the same list and check their identity
- Using only the + operator: a new list is created and assigned to x
The original list remains unchanged
- When += is used, the new elements are appended to the end of the existing list, so x and the other variable still have the same memory reference
Reason for this behavior
When += is used, Python calls the object's underlying in-place implementation (__iadd__ for lists), which receives self as another reference to the same object and mutates its internal state
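The two cases can be compared side by side in a minimal sketch; the variable names are illustrative.

```python
# Case 1: plain + builds a new list and rebinds x.
x = [1, 2]
y = x
x = x + [3]
plus_same = x is y
print(plus_same)  # False: x now refers to a new list
print(y)          # [1, 2]: the original list is unchanged

# Case 2: += extends the existing list in place.
x = [1, 2]
y = x
x += [3]
iadd_same = x is y
print(iadd_same)  # True: still one shared list
print(y)          # [1, 2, 3]
```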
Why is this understanding useful?
Even without round brackets, a comma-separated pair of values declares a tuple with two elements, which can be verified using the type() function
- Tuples are immutable and cannot be changed once they are created
- An exception (TypeError) is raised when we try to modify the tuple
In another example, we can still modify an element inside a tuple when that element is a list, because the tuple only holds a reference to the list, which lives at another memory location
- When append is used, only the list is modified, not the tuple
Even though this may seem unexpected, it works
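A compact sketch of these tuple examples; the names point and record and the values are assumptions, since the original code is not included in the notes.

```python
# A comma-separated pair is a tuple even without parentheses.
point = 1, 2
print(type(point).__name__)  # tuple

# Assigning to a tuple slot raises TypeError.
error_raised = False
try:
    point[0] = 99
except TypeError:
    error_raised = True
print(error_raised)  # True

# The tuple stores a reference to the list, so the list can still change.
record = (1, [2, 3])
inner_id = id(record[1])
record[1].append(4)
print(record)                      # (1, [2, 3, 4])
print(id(record[1]) == inner_id)   # True: the tuple still holds the same list
```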