Python Variables and References
Python variables do not store values directly; each variable holds a reference to the object that contains the value.
In [1]:
a = "a" # variable a references a string object which contains the value: a
type(a)
Out[1]:
In [2]:
id(a) # let's check the identity of a
Out[2]:
In [3]:
b = a # let's create another reference
print id(b)
print id(a)
print a == b # value comparison
print a is b # identity comparison
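The difference between value comparison and identity comparison is easier to see with mutable objects, which are never shared implicitly. The following is a minimal sketch; the names `first`, `alias` and `twin` are illustrative only.

```python
first = [1, 2, 3]
alias = first        # a second name bound to the same list object
twin = [1, 2, 3]     # a separate list object that holds an equal value

same_value = (first == twin)       # True: the values compare equal
same_object = (first is twin)      # False: two distinct objects
alias_is_first = (alias is first)  # True: both names reference one object
```

As a rule of thumb, `==` answers "do these hold the same value?" while `is` answers "is this the very same object?".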
In [4]:
# Assigning a new string to b rebinds b to a different object; a still references the original string
# (strings are immutable, so the original object cannot be changed in place)
b = "b"
print id(b)
print a
print id(a)
In [5]:
a = "a"
b = "b"
print id(a+b)
print id(b+a)
print id(a+b), id(b+a) # Note that the IDs of these objects can be the same. Python
# first concatenates "a" and "b" into a newly allocated string "ab".
# This string is passed to id() and then deallocated, since it is no
# longer referenced. Because of the way CPython manages memory, the next concatenation
# of "b" and "a" is often allocated at the same location, hence the result
In [6]:
print a+b is b+a # Here Python has to keep both concatenated strings in memory at the same time,
# so they cannot occupy the same location, which results in different IDs
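The same effect can be made explicit by binding each result to a name, which keeps both objects alive. This is a small sketch with illustrative names (`left`, `right`); it relies only on the guarantee that two objects that exist at the same time never share an ID.

```python
a = "a"
b = "b"

left = a + b    # "ab", kept alive by the name `left`
right = b + a   # "ba", created while `left` still exists

# Two objects with overlapping lifetimes always have different IDs.
distinct = id(left) != id(right)
```

An ID is only unique for the lifetime of its object, so IDs of objects that have already been deallocated should not be compared.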
In [7]:
print a+b
print a+b is "ab" # Although both sides evaluate to the string "ab", the IDs will be different:
# a+b is computed at run time by BINARY_ADD, which creates a new object.
# The literal "ab" is loaded by LOAD_CONST from the constants of the code object,
# so every reference to that constant points to the same object
In [8]:
print 256 is 256 # Integers are immutable objects; CPython caches the small ones (-5 through 256)
a = 256
b = 256
print a is b
In [9]:
print 257 is 257 # Both literals are compiled together, so CPython stores a single constant
# and both operands reference it, which is why their IDs match
c = 257
d = 257
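print c is d # But they may not match when the objects are created by separately compiled statements,
# as in an interactive session: integers outside the range -5 through 256 are not cached,
# and whether a string is shared (interned) is likewise a CPython implementation detail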
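Because integer caching is an implementation detail, code should compare numbers by value and reserve `is` for singletons such as `None`. The sketch below builds the integers at run time with `int("257")` so that compile-time constant sharing plays no role; the variable names are illustrative.

```python
x = int("257")  # created at run time, not loaded from a shared constant
y = int("257")

values_equal = (x == y)  # True on every Python implementation
same_object = (x is y)   # implementation detail; do not rely on the result

missing = None
is_none = (missing is None)  # identity is the right test for singletons
```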
Notes on identity and mutability
- In Python, everything is represented by objects, including code.
- Every object has an identity, a type and a value; once an object is created, its ID and type never change.
- The value of some objects can change; these are called mutable. Objects whose value cannot change are called immutable.
- The value of an immutable container can still appear to change if a mutable object it references changes.
- Objects are never explicitly destroyed; unreferenced objects, however, may be reclaimed by the garbage collector.
- Objects which reference external resources like files should be explicitly closed (e.g. file.close()).
- Immutable objects of the same value are not guaranteed to have the same ID
- Newly created mutable objects are guaranteed to have different IDs, even when their values are equal: [] is [] => False
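For the note about external resources, a `with` block is a common way to make sure a file is closed without relying on the garbage collector. This is a minimal sketch; the scratch file created through `tempfile` is an assumption made only so that the example is self-contained.

```python
import os
import tempfile

# Hypothetical scratch file, created only for this illustration.
fd, path = tempfile.mkstemp(suffix=".txt")
os.close(fd)

with open(path, "w") as handle:
    handle.write("hello")
    open_inside = not handle.closed  # the file is open inside the block

closed_after = handle.closed  # the with block closed it on exit
os.remove(path)               # clean up the scratch file
```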
In [10]:
a = [1, 2, 3]
b = a # b now references the same list object as a; nothing is copied
b[1] = 100
# Because _list_ type is mutable and both variables reference the same list object,
# changing b will affect the shared object and consequently the value of a
print a
In [11]:
a = [1, 2, 3]
b = a[:] # This creates a shallow copy of object referenced by a
b[1] = 100
print a
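A slice copy is shallow: it creates a new outer list but reuses the inner objects. The sketch below, with illustrative names `outer` and `shallow`, shows where that matters once the list contains other mutable objects.

```python
outer = [[1, 2], [3, 4]]
shallow = outer[:]      # new outer list, same inner list objects

shallow.append([5, 6])  # changes only the copy
shallow[0].append(99)   # mutates an inner list that both lists share

# outer is now [[1, 2, 99], [3, 4]]
# shallow is now [[1, 2, 99], [3, 4], [5, 6]]
```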
In [12]:
a = ([], [], []) # Let's create a tuple of lists. The tuple is immutable, so its items
# cannot be rebound to other objects
a[0] = [1, 2, 3] # Trying to reference a new list object will raise **TypeError**
In [13]:
print a
a[0].extend([1, 2]) # Extending existing (mutable) object is possible
print a
In [15]:
def foo(bar):
    bar.append("Bob")
    print id(bar)
some_list = []
foo(some_list) # Passing reference to the list
print id(some_list)
print some_list # List is a mutable object, it can be changed through any name that references it
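Mutating an argument and rebinding a parameter name behave differently, which is a frequent source of confusion. The following sketch uses two hypothetical helper functions, `mutate` and `rebind`, to contrast them.

```python
def mutate(items):
    items.append("Bob")  # changes the object the caller also references


def rebind(items):
    items = ["Alice"]    # binds the local name to a new list
    return items         # the caller's list is left untouched


names = []
mutate(names)
replacement = rebind(names)

# names is ["Bob"]; replacement is ["Alice"]
```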
Default argument evaluation in functions
- A function's default arguments are evaluated once, when the function is defined, not each time it is called
- If a mutable object is used as a default argument and is mutated, all subsequent calls to the function will see the mutated object
- This behaviour can be "exploited" to maintain state between calls of a function (often used in caching functions)
In [19]:
def bar(sth=[]):
    sth.append(1)
    print id(sth)
    print sth
bar()
bar()
bar()
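Two common patterns follow from this behaviour: a `None` sentinel when a fresh list is wanted on every call, and a deliberately shared default when state should persist. Both functions below are illustrative sketches; the names `append_one`, `square` and `_cache` are assumptions, not part of any library.

```python
def append_one(sth=None):
    # A new list is created on each call that omits the argument.
    if sth is None:
        sth = []
    sth.append(1)
    return sth


def square(n, _cache={}):
    # The default dict is created once, so results persist between calls.
    if n not in _cache:
        _cache[n] = n * n
    return _cache[n]


first = append_one()
second = append_one()

square(4)
square(4)  # served from the cache on the second call
cached_keys = sorted(square.__defaults__[0])  # the shared default dict
```

The shared cache is reachable through the function's `__defaults__` attribute, which shows that the default really is a single object attached to the function.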
Copying objects
- Assignment statements in Python do not copy objects, they create a binding/reference between a target and an object
- For objects/collections that are mutable/contain mutable items, a copy is needed in order to change one copy without changing the other
- A shallow copy constructs a new object and then inserts references to the objects found in the original
- A deep copy constructs a new object and recursively inserts copies of all objects found in the original
- Recursive objects (directly or indirectly referencing themselves) could cause an endless loop in a naive deep copy; copy.deepcopy() avoids this by keeping a memo dictionary of objects it has already copied
- Because a deep copy copies everything, it may copy too much, including objects that should be shared between copies rather than duplicated
- The deep copying mechanism can be controlled in a class by defining a __deepcopy__() method (__copy__() for shallow copy)
- Lists can be shallow copied by assigning a slice of the entire list: copied_list = original_list[:]
- Deep copying does not copy types like module, file, socket, array or similar; define __deepcopy__() to properly reinitialize such members
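The note about recursive objects can be checked directly: `copy.deepcopy()` copies a self-referencing list without looping and preserves the self-reference in the copy. A minimal sketch with illustrative names `node` and `clone`:

```python
import copy

node = [1, 2]
node.append(node)  # the list now contains a reference to itself

clone = copy.deepcopy(node)  # finishes thanks to the memo dictionary

# clone is a new list whose last item refers back to clone, not to node.
```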
In [44]:
import copy
f = open("/tmp/test001.txt", "w")
obj = object()
print id(f), id(obj)
In [45]:
a = {"file": f, "foo": obj} # Reference objects
print id(a["file"]), id(a["foo"])
In [46]:
b = copy.copy(a) # Create a shallow copy
print id(b["file"]), id(b["foo"]) # the IDs are the same as original
print b["file"].closed # File object is still open
In [47]:
c = copy.deepcopy(a) # Create a deep copy
print id(c["file"]), id(c["foo"]) # IDs are different
print c["file"].closed # The file object is closed
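When a member such as a file or socket should be shared rather than copied, a class can define `__deepcopy__()` itself. The sketch below is one possible approach under stated assumptions: `Connection` is a hypothetical stand-in for an external resource and `Job` is an illustrative container class.

```python
import copy


class Connection(object):
    """Hypothetical stand-in for a file, socket or similar resource."""


class Job(object):
    def __init__(self, conn, tags):
        self.conn = conn
        self.tags = tags

    def __deepcopy__(self, memo):
        # Share the connection, but deep copy the ordinary data.
        clone = Job(self.conn, [])
        memo[id(self)] = clone  # register the copy before recursing
        clone.tags = copy.deepcopy(self.tags, memo)
        return clone


job = Job(Connection(), ["a", "b"])
job_copy = copy.deepcopy(job)

# job_copy.conn is the same object as job.conn; job_copy.tags is a new list.
```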