Assembly Tutorial – The Data Section

So far we have only used the data section of our programs to define data and to read that data at runtime. However, we can also write to the data section! Try running the following code through the debugger:

.section .data
var:
.byte 5

.section .text

.globl _start
_start:

movq $10, var

movq $60, %rax
movq $0, %rdi
syscall

Using the techniques we learned in a previous post, you will see that the line

movq $10, var

changes the value of the byte of memory named var from 5 to 10. (Strictly speaking, movq writes a full 8 bytes starting at var, so the 7 bytes that follow it are zeroed as well; to write just the single byte we would use movb $10, var.)

If you want to define read-only data in your binary, you can do so with the .rodata section. This works just like the data section; however, if you try to write to the memory here, you will get a seg-fault.

Assembly Tutorial – Advanced Debugging Techniques

We have already seen the basics of debugging assembly code with GDB. We covered assembling our code with debug symbols, setting break points, stepping through the code as it executes and inspecting the contents of registers. Now it is time to learn some more advanced techniques!

Command Line Arguments

Sometimes when we run an executable we pass in command line arguments. We can also do this when debugging with GDB. There are two ways to do it. We can put the command line arguments after the run command r. So if echo_input were the name of a binary, and we wanted to debug it with the command line arguments “Hello World”, we would load it into GDB as normal and then start execution with

r Hello World

Alternatively we can load the executable with our command line arguments directly by using the --args option. So, in our example we would execute:

gdb --args ./echo_input Hello World

This is very useful when debugging the code in our previous posts covering command line arguments!

Inspecting Memory

We know how to read the data stored in registers, but when we’re debugging we often want to read values stored in memory. Suppose, for example, we have a register that we are using as a pointer. We can use the info registers command to see what memory address is stored in the register. To see what data is stored in that memory address we can use the x command.

The x command prints out the value at a given memory address. We provide a suffix to specify how much memory to read and how to display it.

Let’s have a look at an example. Suppose we have defined a byte in our data section with value 12, and that we have moved the address of this byte into the register rax. In GDB we use the command

info registers rax

to read this value from rax. Let’s say that the output of this command is

rax            0x40200b            4202507

So the address of our data is 4202507, or 0x40200b in hex. We can read the value stored at this memory address with the command

x/bd 0x40200b

The output of this will be 0x40200b: 12, that is, the address followed by the value 12, as we would expect.

The suffix bd tells GDB to read a byte (b) of memory and display the result as a decimal (d). We can display the value as hexadecimal with x, octal with o, binary with t and unsigned decimal with u. We can specify the size to read as a byte with b, 2 bytes with h, 4 bytes with w and 8 bytes with g.

Let’s say we have defined a short in our data section named numShort that has value 256. This will take up more than one byte: because x86-64 is little-endian, it will appear in memory as the byte 00000000 followed by 00000001. So the command x/bt will read the first byte, 00000000, and the command x/ht will read two bytes giving 0000000100000000.
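As a quick sanity check outside GDB, we can reproduce this little-endian layout with Python's struct module (just a sketch to illustrate the byte ordering; the variable names are our own):

```python
import struct

# Pack 256 as a little-endian 2-byte (half-word) value, as it would
# appear in the .data section on x86-64.
raw = struct.pack("<H", 256)

# The low byte comes first in memory, then the high byte.
print(raw)                    # b'\x00\x01'
print(format(raw[0], "08b"))  # 00000000  (what x/bt would show)
print(format(raw[1], "08b"))  # 00000001
```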

We can also output more than one value at once, by adding a multiplier to the suffix. So if we apply the command x/2bt to the memory address referenced by the name numShort our output will be:

0x40200c:	00000000	00000001

We can also read and output character values. Suppose we have declared a string in our data section with the value “outputFile”. If we inspect the address of this memory with x/bx, we will get 0x6f. If you look this value up in an ASCII table, you will see this is the hex value of the character ‘o’. This is, of course, the first character of our string.

To output these values as characters directly we use the c suffix like so:

x/bc 4202496

the output will be: 111 'o'. That is, the decimal ASCII value of the character ‘o’ followed by the character itself. We can even read multiple values at once. For example the command

x/5bc 4202496

will output:

111 'o'	117 'u'	116 't'	112 'p'	117 'u' 

that is, we have read five characters starting at the memory address 4202496. Of course, reading individual characters like this would be a little tedious; luckily, we can read entire strings out of memory. The command:

x/s 4202496

will output the entire string: "outputFile".
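We can check these character values for ourselves: Python's ord gives the same decimal ASCII codes that x/5bc printed above (a quick cross-check, not GDB itself):

```python
text = "outputFile"

# Decimal ASCII codes of the first five characters, as x/5bc displays them.
codes = [ord(c) for c in text[:5]]
print(codes)               # [111, 117, 116, 112, 117]

# The first byte in hex, as x/bx displays it.
print(hex(ord(text[0])))   # 0x6f
```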

The x command doesn’t just work on the data section; we can use it to inspect any memory address! For example we can check the values in our buffers defined in the .bss section with x.

Our instructions are also stored in memory, and we can read those with x as well. The register rip is the instruction pointer and contains the address of the next instruction that will be executed. So, we can get that address in GDB with the command:

info registers rip

If we use x/i on the returned address we will see something like:

=> 0x401000 <_start>:	mov    $0x32,%rax

Once we specify a particular size and format, when we execute x without a suffix, it will use the same size and format. The default is to read 4 bytes and display in hexadecimal.

In our example we read the memory addresses from registers. However, we can also use the names of memory addresses directly with the & operator. For example:

x/s &filename

will print the content of the memory address named filename. This works for buffers defined in the .bss section and for data declared in the .data section. We can also use arithmetic expressions when specifying the memory address to inspect. For example, to read the byte that is located 3 bytes past the memory address 0x40200b we would use the command

x/bx 0x40200b + 3

The units we are counting in are the same as the units we are reading, so if we wanted to read the fifth 2-byte value after the memory address 0x40200b we would use:

x/hx 0x40200b + 5

Conditional Breakpoints and Watches

To help with the monotony of debugging we can use conditional breakpoints. These are breakpoints that come with a logical condition on a register; the breakpoint will only stop execution when the condition is satisfied. Suppose our source code is in a file named echo_input.s, and we would like to break at line 12 whenever the register rbx has value 4. We can do that with the following command:

b echo_input.s:12 if $rbx == 4

Note that we put a dollar sign $ before the name of the register. This is a little confusing, as we usually use dollar signs for constant values. We can use any of the typical binary comparison operators, ==, !=, <, >, <= and >=, when specifying the condition. One thing that can catch you out here is that the breakpoint condition is evaluated before the instruction on that line executes.

We can also set watches, which are very similar to conditional breakpoints but more general. With a watch, we specify a condition on a register, and code execution will halt whenever that condition is satisfied. We do not have to specify a particular line of code. To set a watch that will halt whenever rcx is greater than 5, we use the command:

watch $rcx > 5

Note that we don’t specify the file name, and we still use the dollar sign.

I have covered some pretty technical stuff in this post, so I’d recommend you experiment with it all yourself to get a feel for these techniques!

Assembly Language – Command Line Parsing Part 2

So, we’ve already seen how the stack works and how we can read the name of the currently executing binary off the stack. Now it’s time to actually parse command line arguments that come after the name of the binary. Reading the arguments is a little bit more complicated because we do not know in advance how many there will be.

Luckily for us, the kernel gives us a little help. Before execution of our program begins, the kernel reads all the space-separated arguments after the binary name. It puts these arguments into one chunk of contiguous memory as null-terminated strings. Then it puts the address of this piece of memory on the stack. Finally, it puts the number of command line arguments it saw onto the stack.

So, the first thing we do is read the number of command line arguments off the stack. Then we read the address of the start of the strings. Then we iterate over this block of memory, printing each string as we see it. We know how many arguments to expect, so when we have seen that many we quit.
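To make the looping logic concrete before we drop into assembly, here is a small Python sketch of the same algorithm: scan a contiguous block of null-terminated strings, collecting each one as we find its terminating null. The buffer contents here are made up for illustration, and the comments note which register each counter corresponds to in the assembly below.

```python
# A made-up block of memory: two null-terminated argument strings,
# laid out contiguously as the kernel does for the arguments.
block = b"Hello\x00World\x00"
argc = 2

nulls_seen = 0        # r8 in the assembly
since_last_null = 0   # r9
up_to_last_null = 0   # r10
total_seen = 0        # r12

args = []
while nulls_seen < argc:
    byte = block[total_seen]
    since_last_null += 1
    total_seen += 1
    if byte == 0:
        # The current string starts up_to_last_null bytes into the block
        # and is since_last_null bytes long (including the trailing null).
        start = up_to_last_null
        args.append(block[start:start + since_last_null - 1].decode())
        up_to_last_null += since_last_null
        since_last_null = 0
        nulls_seen += 1

print(args)  # ['Hello', 'World']
```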

OK, it’s time to look at the code!

.equ NULL, 0

.section .data
NEW_LINE: .byte 10
.section .text

.globl _start
_start:

popq %rbx

decq %rbx

cmpq $0, %rbx
je exit_with_error

popq %r13 # The name of the currently executing binary, which we discard
popq %r13 # The address of the argument strings

movq $0, %r8   # number of nulls seen so far
movq $0, %r9   # number of characters since last null 
movq $0, %r10  # number of characters up to last null
movq $0, %r12  # number of characters seen so far

loop_start:
movq (%r13,%r12,1),%rax

incq %r9
incq %r12

cmp $NULL,%al
jne loop_start 

movq %r9, %rdx
movq %r13, %rsi
addq %r10, %rsi
movq $1, %rax
movq $1, %rdi
syscall

movq $1, %rdx
movq $NEW_LINE, %rsi
movq $1, %rax
movq $1, %rdi
syscall

addq %r9, %r10
movq $0,%r9

incq %r8
cmpq %r8,%rbx
je exit

jmp loop_start

exit:

movq $60, %rax
movq $0, %rdi
syscall

exit_with_error:
movq $60, %rax
movq $-1, %rdi
syscall

First we pop the number of command line arguments off the stack into the rbx register. We decrement this value with decq because it includes the name of the binary itself. We check that there is a non-zero number of command line arguments. Then we pop the binary name, which we won’t be using. After that we pop the memory address of the actual arguments into r13.

Next we set up four different registers as counters we will use when looping over the command line arguments. As the strings in memory are null-terminated we can keep track of them via null characters. So, the counters are: the number of nulls we have seen so far, the number of characters we have seen since the last null, the number of characters that came before the last null and the total number of characters we have seen overall.

movq $0, %r8   # number of nulls seen so far
movq $0, %r9   # number of characters since last null 
movq $0, %r10  # number of characters up to last null
movq $0, %r12  # number of characters seen so far

Now the loop itself begins. The first part of this loop indexes into the memory location beginning at r13 until we see a null character, incrementing r9 and r12 as we go.

loop_start:
movq (%r13,%r12,1),%rax

incq %r9
incq %r12

cmp $NULL,%al
jne loop_start 

Whenever we do see a null character we proceed to the next section. This is where we write the current string to the terminal.

movq %r9, %rdx
movq %r13, %rsi
addq %r10, %rsi
movq $1, %rax
movq $1, %rdi
syscall

movq $1, %rdx
movq $NEW_LINE, %rsi
movq $1, %rax
movq $1, %rdi
syscall

The register r9 contains the number of characters since the last null, so that is the length of the current string. The memory address of the start of this block of memory is in r13, and the number of characters that we saw up to the last null is in r10, so the memory address of the start of this string is the sum of those two values. We also print a newline to make our output a little prettier.

Once we have output the current string, we update our other counters and check to see if we have read all the command line arguments.

addq %r9, %r10
movq $0,%r9

incq %r8
cmpq %r8,%rbx
je exit

jmp loop_start

First we update the value in r10 to contain the number of characters up to the null we have just seen by adding on the value in r9. Then we reset r9 to zero. Register r8 keeps track of the number of nulls we have seen so far, so we increment it and compare against rbx. If they are equal, we jump straight to the exit. Otherwise we jump back to the start of the loop and carry on.

So, we now know how to parse our command lines. The kernel also copies the current environment variables into memory and leaves a pointer to them on the stack. These are a little bit harder to parse, so we will ignore them for now.

Assembly Language – Arithmetic Instructions

Before we continue with command line parsing, we will have a brief diversion covering how arithmetic instructions work. We have already seen the increment and decrement instructions, incq and decq. These add one and subtract one from the value in a register. Now we will be covering more general arithmetic operations. In a previous post we saw how to use comparison instructions. Arithmetic instructions are really quite similar.

If we wish to add two quad-word values, we use the addq instruction. The syntax is:

addq X, Y

where X is the name of a register or a constant value and Y is the name of a register. So the instruction addq $17, %rax adds 17 to the value in register rax and stores the result in rax. To subtract we use the subq instruction which uses the exact same syntax.

There are two different multiplication instructions. The first, imulq, works just like the addq and subq instructions. This performs signed multiplication. However, as the result is stored in a single 64 bit register this instruction can quite easily lead to an overflow. Indeed, if we try to use constant values that are too large the assembly step will fail. For example the instruction

imulq $0x80000000, %rax 

will cause an error when you try to assemble. This is, roughly, because immediate values are limited to 32 bits, and the largest positive value a signed 32-bit immediate can hold is 0x7FFFFFFF. However, you can still move this value into a register and multiply that way.

There is another multiplication syntax that allows us to multiply 64 bit numbers without overflow. This syntax uses two instruction names, mulq and imulq, but takes a single register operand. The instruction imulq performs signed multiplication and the instruction mulq performs unsigned multiplication. These instructions multiply the value in the supplied register by whatever value is in the rax register and store the result across rdx and rax. The lower 64 bits are stored in rax and the upper 64 bits in rdx.
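We can model the rdx:rax split with Python's arbitrary-precision integers to see how a full 128-bit product decomposes into high and low 64-bit halves. This is a model of the unsigned semantics, not the instruction itself; the function name mulq is borrowed from the instruction for illustration.

```python
MASK64 = (1 << 64) - 1

def mulq(rax, value):
    """Model of unsigned mulq: multiply rax by a register value,
    returning the (rdx, rax) halves of the 128-bit product."""
    product = rax * value
    rdx = (product >> 64) & MASK64  # upper 64 bits
    rax = product & MASK64          # lower 64 bits
    return rdx, rax

# 2**64 doesn't fit in one register: the product of 2**32 with itself
# is 1 in rdx and 0 in rax.
print(mulq(2**32, 2**32))  # (1, 0)
```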

To perform division we use idivq and divq. As before, idivq is signed division and divq is unsigned division. The division instructions take a single argument, the name of a register. With these instructions, the CPU treats the values in rdx and rax as a single 128 bit value: rdx is the upper 64 bits and rax is the lower 64 bits. It divides this value by the value in the supplied register. The quotient is then stored in rax and the remainder in rdx.
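The same trick models the unsigned division: glue rdx:rax into one 128-bit value, then divide. Again this is just a sketch of the semantics, with the function name divq borrowed from the instruction.

```python
MASK64 = (1 << 64) - 1

def divq(rdx, rax, divisor):
    """Model of unsigned divq: divide the 128-bit value rdx:rax by
    divisor, returning the quotient (new rax) and remainder (new rdx)."""
    value = (rdx << 64) | rax
    quotient, remainder = divmod(value, divisor)
    return quotient & MASK64, remainder

# Dividing rdx:rax = 1:0 (i.e. 2**64) by 3.
print(divq(1, 0, 3))  # (6148914691236517205, 1)
```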

There is also a unary negation operation negq, that negates the value in a register.

Many of these instructions can overflow. And, unlike in some higher level languages, our program will continue to execute happily with whatever values the registers now contain. To detect this we use a special instruction: jo. This is the jump-on-overflow instruction. Whenever an arithmetic operation overflows, the CPU sets the overflow flag. The jo instruction jumps conditioned on this flag: if the flag is set, execution jumps to the address supplied.
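Python integers never overflow, but we can model the 64-bit wraparound and the overflow flag that jo checks. This is a simplified model of signed addition only, with the function name addq borrowed from the instruction.

```python
def addq(a, b):
    """Model signed 64-bit addition: return the wrapped result and
    whether the overflow flag would be set."""
    MASK64 = (1 << 64) - 1
    raw = (a + b) & MASK64
    # Interpret the 64-bit pattern as a signed value.
    result = raw - (1 << 64) if raw >= (1 << 63) else raw
    # OF is set when the true sum doesn't fit in a signed 64-bit value.
    overflow = not (-(1 << 63) <= a + b < (1 << 63))
    return result, overflow

INT64_MAX = (1 << 63) - 1
# Adding 1 to the largest signed value wraps to the smallest one.
print(addq(INT64_MAX, 1))  # (-9223372036854775808, True)
```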

There are also versions of the above arithmetic operations for non-quad words. However we aren’t particularly interested in them right now.

Assembly Language – Command Line Parsing Part 1

We know from a previous post that when our program starts the Linux kernel will have stashed some helpful values for us on the stack. At the top of the stack we have the number of command line arguments. This value includes the name of the binary being executed. So, for example, the command line “binaryName arg1 arg2” would give 3. Next in the stack is a pointer to the name of the binary. Finally we have a pointer to the command line arguments themselves; this doesn’t include the name of the binary.

We’re going to see how to parse these arguments. First we’re going to read and display the name of the binary that is currently executing. To do this, we will need to read a value off the stack, get the length of this string by searching for the null character and then print it.

Let’s look at some code:

.equ NULL, 0

.section .data
NEW_LINE: .byte 10
.section .text

.globl _start
_start:

popq %r13

cmpq $1, %r13
jne exit_with_error

popq %r13

movq $0, %r12  # number of characters seen so far

loop_start:

movq (%r13,%r12,1),%rax

incq %r12

cmp $NULL,%al
jne loop_start 

movq %r12, %rdx
movq %r13, %rsi
movq $1, %rax
movq $1, %rdi
syscall

movq $1, %rdx
movq $NEW_LINE, %rsi
movq $1, %rax
movq $1, %rdi
syscall

exit:

movq $60, %rax
movq $0, %rdi
syscall

exit_with_error:

movq $60, %rax
movq $-1, %rdi
syscall

Firstly, I should mention that I’ve decided to use the higher numbered registers in this example, as it will make it easier to modify this code for the next post.

The very first thing we do is define a constant with equ. The constant is named NULL and has value 0. Next we define a byte in our data section named NEW_LINE with value 10. It will not surprise you to learn that 0 is the ASCII code for a null character and 10 is the ASCII code for a new line.

An important subtlety here is the difference between an equ constant and a value defined in the .data section. The constant values defined with equ are filled in when the assembler runs. The data section becomes part of the binary, and is copied into memory when our program is run. So we can reference a value in the .data section by its memory address, whereas an equ constant is really just a special name for a value.

In our text section, our first instruction is:

popq %r13

This pops the top value off the stack into the r13 register. We know that this value is the number of command line arguments. So, we check if it is equal to 1, and if not, exit with an error:

cmpq $1, %r13
jne exit_with_error

Now we pop the next value off the stack into the r13 register. This will be a pointer to the name of the currently executing binary. To get the length of this string we loop over it looking for the null character like so:

movq $0, %r12  # number of characters seen so far

loop_start:

movq (%r13,%r12,1),%rax

incq %r12

cmp $NULL,%al
jne loop_start 

To index into the string that contains the name of the binary we use indexed addressing mode: (%r13,%r12,1). This reads the value stored at the memory address equal to the value in r13 plus 1 times the value in r12. The r12 register is our counter that keeps track of the current character we are looking at, so as we loop we are iterating through the string.

We check to see if the current character is null with cmp $NULL,%al. The important point here is that we are reading characters, which are represented as bytes. So if we have just read the null character, we would expect the lowest byte of rax to be zero; there is probably lots of junk in the higher bytes of rax that we don’t care about. We know already that the lowest byte of rax is named al, so we compare this to 0. Also, we put the register second in our comparison; if we do not, we will get an assembler error. This is because the size of the second operand determines the memory size we are comparing, in this case a byte.
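The al check can be modelled in Python by masking off the low byte of a 64-bit value; the junk in the upper bytes simply doesn't matter (the register values here are made up for illustration):

```python
# A register value whose upper bytes contain junk left over from an
# 8-byte read, but whose lowest byte is the character we just read.
rax = 0x1234_5678_9ABC_DE6F  # low byte is 0x6F, the character 'o'

al = rax & 0xFF              # what cmp $NULL, %al inspects
print(chr(al))               # o
print(al == 0)               # False: not the null terminator

rax = 0x1234_5678_9ABC_DE00  # low byte is 0x00
print((rax & 0xFF) == 0)     # True: we found the null
```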

Once we find the null character we output the string to the command line as normal:

movq %r12, %rdx
movq %r13, %rsi
movq $1, %rax
movq $1, %rdi
syscall

Then we use the NEW_LINE data value we defined earlier to output a new line:

movq $1, %rdx
movq $NEW_LINE, %rsi
movq $1, %rax
movq $1, %rdi
syscall

And finally we exit with exit code 0. If you assemble, link and run this code, you should see the name of your binary file printed to the command line. In our next post we will see how to parse the command line arguments that come after the name of the binary.

What Level of Object Orientation are you?

I’ve seen a lot of arguments, both in real life and on the internet, about object orientation. One of the problems with these arguments is that people don’t always agree on what object orientation actually is. In my mind, the phrase has three distinct meanings. I don’t claim that any of what follows is the strict academic definition, it is just what I think people are usually talking about when they say “object oriented”.

Simply Using Classes and Methods

Sometimes when people say object orientation, they really just mean using classes and methods. Classes allow us to define composite types and to define functions on those types which we call methods. This isn’t really object orientation. You’ll find user defined composite types in plenty of languages that aren’t object oriented like Haskell. It is nice to be able to define a function on your new type as a method. However the fact that methods can be called only with the appropriate object is really just static typing.

Mutating State

The property that really characterises object orientation for me is mutable state. Object orientation doesn’t really make sense without mutability. Objects allow us to expose an API and abstract away the underlying implementation. This means that the objects we create contain some inner state and expose methods that use and potentially change this data. Many functional languages discourage mutability. For example in F# all variables are immutable by default and you have to use the mutable keyword to make them mutable.

Inheritance and Polymorphism

The final flavour of object orientation is the deepest and the darkest. This is the use of inheritance and polymorphism. This really is unambiguously object orientation by anyone’s definition. Real, old school object orientation will involve vast and complicated hierarchies of classes. You will have to deal with an advanced taxonomy of user defined types.

Inheritance does not only allow for simple code re-use. It allows you to take advantage of polymorphism. This means that at run-time different methods will be called depending on the type of your variable. So you might define a virtual method in your abstract base class, and have a different implementation for that method in each of the subclasses.
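The mechanism can be sketched in a few lines of Python; the class names here are made up, and this is only a minimal illustration of runtime dispatch, not a recommendation of deep hierarchies:

```python
from abc import ABC, abstractmethod

class Shape(ABC):
    """Abstract base class with a 'virtual' method each subclass overrides."""
    @abstractmethod
    def area(self):
        ...

class Square(Shape):
    def __init__(self, side):
        self.side = side
    def area(self):
        return self.side * self.side

class Circle(Shape):
    def __init__(self, radius):
        self.radius = radius
    def area(self):
        return 3.14159 * self.radius ** 2

# Which implementation runs is decided at run-time by the actual type
# of each object, not by the declared type of the variable.
shapes = [Square(2), Circle(1)]
print([round(s.area(), 2) for s in shapes])  # [4, 3.14]
```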

This sort of stuff, complex inheritance hierarchies and runtime polymorphism, has fallen out of fashion, but you will still see a lot of it in enterprise software, particularly at large financial institutions.

The Tricky Little Differences Between C# and C++ – part 1

There are a lot of big differences between C++ and C#. Really, despite their similar syntax, they are completely different languages. C# has garbage collection, C++ has manual memory management. C# is compiled to a special intermediate language that is then just in time compiled by the run time. C++ is compiled to machine code.

But, when you transfer between the two languages you are probably not going to get caught out by obvious things like that. It’s much more likely that the small subtle differences will trip you up. I’m going to cover some of these little differences that I’ve come across.

In this post we’ll be looking at uninitialised strings. Suppose we create a string in C++ without instantiating it directly ourselves and then print its contents. Like this:

string s;
cout << s << endl; 

This will just print an empty line, because the string s is empty. However, if you try the same thing in C#, it won’t work quite the same. If we use C# and do something like:

string s;
Console.WriteLine(s);

we’ll get the following error when we compile:

Program.cs(10,31): error CS0165: Use of unassigned local variable 's'

This is a little surprising; normally C# is the more user friendly language. But in this case C++ gives us a friendly default behaviour, whereas C# refuses to compile at all. Why?

This is because, in our C++ example, we created a string object on the stack and dealt with it directly. When we created it, C++ called the default constructor, which creates an empty string. However, in C# strings are reference types. This means that whenever we create one we are creating an object on the (managed) heap, and an unassigned local is a reference that points at nothing, which the compiler refuses to let us use. So our C# is really equivalent to the following C++ code:

string* s = NULL;
cout << *s << endl;

If you run this you’ll end up with a seg fault; that’s because a null pointer points to memory address 0, which is inaccessible to you.

Don’t Get Caught out by Covariance!

Downcasting is bad: you shouldn’t do it, it’s a code smell and an anti-pattern. Unfortunately, in real world code you will see plenty of it, so you need to be aware of the potential pitfalls of downcasting. One that caught me out recently is covariance.

In C++, you can quite freely cast between types. You can take a pointer to some memory, cast it to any type you like, and read out what you get. Generally speaking you won’t get anything useful, and if you’re not careful you’ll get “undefined behaviour”. If you have data stored in memory, the assumption is that you know what type that data is, and you are responsible for using it appropriately.

In languages like Java and C#, things are different. Here, the runtime checks our casts and will throw an exception if it thinks you have gone wrong. The consequences of this difference can sometimes be surprising.

Let’s look at an example. Suppose we have a base class Security defined like so:

public class Security
{
    int Id;
    public Security(int id)
    {
        Id = id;
    }
}

and two subclasses, Stock:

public class Stock : Security
{
    public string Name;
    public Stock(int id, string name) : base(id)
    {
        Name = name;
    }
}

and Bond:

public class Bond : Security
{
    double Rate;
    public Bond(int id, double rate) : base(id)
    {
        Rate = rate;
    }
}

Now if we have a reference of type Security, that actually points to a Stock, we can happily downcast it to a reference of type Stock, like this:

Security s1 = new Stock(1, "GOOG");
Stock AsStock = (Stock)s1;
Console.WriteLine(AsStock.Name);

Because, Stock is a subtype of Security we can cast a reference to a Stock to a reference to a Security. This works because every Stock will have all the fields of a Security, in the same relative locations in memory.

However, if you have a reference of type Security that points to a Security or a Bond, and try and cast it to a Stock, you’ll have trouble. If we run the following code

Security s1 = new Bond(1, 0.02);
Stock AsStock = (Stock)s1;
Console.WriteLine(AsStock.Name);

we will see a run time exception of the form:

Unhandled exception. System.InvalidCastException: Unable to cast object of type 'Casting.Bond' to type 'Casting.Stock'.

This makes sense: the runtime knows that the object we have a reference to is not a Stock, and that we can’t cast this object to a Stock in a sensible way. So the runtime stops us by throwing an exception.

Let’s look at a slightly more complicated example. Suppose, instead of a single object, we had a whole array of them. The following code:

Security[] securities = new Stock[]{new Stock(1, "ABC"), new Stock(2, "DEF")};
Stock[] stocks = (Stock[]) securities; 
Console.WriteLine(stocks[0].Name);

will run happily. We can cast a reference of type Security[] that points to a Stock[], to a reference of type Stock[]. However if we try

Security[] securities = new Security[]{new Stock(1, "ABC"), new Stock(2, "DEF")};
Stock[] stocks = (Stock[]) securities; 
Console.WriteLine(stocks[0].Name);

we will get an InvalidCastException:

Unhandled exception. System.InvalidCastException: Unable to cast object of type 'Casting.Security[]' to type 'Casting.Stock[]'.

This might seem a little surprising. The objects in our array are still actually Stocks. We know that we can cast a reference of type Security that points to a Stock to a reference of type Stock. Why can’t we cast the Security[] reference to a Stock[] reference?

It’s a subtle one. When we cast an array reference, we are not casting the objects in the array; we are casting the array itself. So in the first array example, we are casting a reference of type Security[] to a reference of type Stock[]. The runtime knows that the reference actually points to an object of type Stock[], so this is fine: there will only ever be Stock objects in this array. Even though we have a reference of type Security[] pointing to this array, we can’t do something like:

Security[] securities = new Stock[]{new Stock(1, "ABC"), new Stock(2, "DEF")};
securities[0] = new Bond(2, 0.02);

the runtime knows that our array is of type Stock[], so it throws an exception:

Unhandled exception. System.ArrayTypeMismatchException: Attempted to access an element as a type incompatible with the array.

However, in the second example, we have a reference of type Security[] that points to an array of type Security[]. Although this array only contains stocks, the runtime cannot in general say whether that is true or not. Suppose we had done something like this instead:

Security[] securities = new Security[]{new Stock(1, "ABC"), new Stock(2, "DEF")};
securities[0] = new Bond(2, 0.02);
Stock[] stocks = (Stock[]) securities; 

We can of course add a Bond to an array of type Security[]. The runtime doesn’t keep track of what we have put into the array, which is why it has no way of knowing whether the securities array really does contain only objects of type Stock or something else.

The name for this type of casting is covariance. Eric Lippert, one of the original designers of C#, has a pretty good blog post about it. The really important point is that when we have an array of some base type, even if we are able to downcast the individual members of that array, the runtime might stop us from downcasting the entire array.

This tripped me up in my day job last week. I was performing a data load that returned an array of type Security[], I knew this array contained only objects of type Stock and so I cast it to Stock[]. I merged this into master, but then our regression tests failed. Thankfully my mistake was caught before making it into prod.

Don’t Use Objects in Python – part 3

So, my previous blog posts were posted on reddit and I got a lot of interesting feedback. I’ve decided to address the issues that were brought up all together in one place, and this is it! I’ve grouped the different ideas that came up into their own sections below.

This is against community guidelines

Some people were upset that what I recommended was against the community guidelines. They’re correct, it is. However, that isn’t really an argument against doing something. Community guidelines and standards aren’t some holy scripture that must always be obeyed. The entire point of my post was that the common usage of object orientation in Python is bad; I doubt there is any way I could say that without also advocating against the normal Python standards. There is certainly some value in educating people about community standards, but the value of an idea cannot be reduced to how compliant it is with Python coding guidelines.

Dictionaries are Actually Objects

So a couple of people pointed out that internally Python has inheritance, and that datatypes like dictionaries actually inherit from the Python object type. Sure, that’s fine; I don’t have a problem with how the internals of Python implement the various data structures. Indeed the internals are overwhelmingly written in C, so this is not at all relevant when we’re talking about actual Python code.

Objects are Abstractions of Python Dictionaries

Some commenters argued that objects in Python are abstractions sitting on top of dictionaries. This just isn’t true. An abstraction hides the details below it. For example, variables are an abstraction that hides the details of how registers and memory access actually work; because they limit what you see and what you can do, they make using memory a lot easier. Python objects don’t hide any of the details of a dictionary; they are just syntactic sugar. All the operations of a normal Python dictionary are available right there on the object with slightly different syntax, and you can even access the dictionary itself with the __dict__ attribute.
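As a quick illustration of the point, every dictionary operation is available directly on an object's attributes (the class here is made up for the example):

```python
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

p = Point(1, 2)

# The attributes are right there in __dict__...
print(p.__dict__)        # {'x': 1, 'y': 2}

# ...and mutating the dictionary mutates the object.
p.__dict__["z"] = 3
print(p.z)               # 3

# Attributes can be deleted at runtime, just like dict keys.
del p.__dict__["x"]
print(hasattr(p, "x"))   # False
```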

It’s all just an implementation Detail

So a few people took issue with the fact that I was talking about the specifics of how objects and classes are actually implemented, rather than talking about them in an abstract sense. I think this misses the point. I am not quibbling with how objects are implemented. My point is that objects don’t give any extra functionality beyond what dictionaries provide, and I brought up the implementation as a way to demonstrate this. My argument doesn’t rest on the fact that there is an actual dictionary inside every object. It rests on the fact that objects are really just collections of attributes with string names, that these attributes can be added, deleted and modified at runtime, and that there isn’t anything more of value in the Python object. The fact that this collection of attributes happens to be implemented with a dictionary isn’t important.
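The runtime mutability is easy to demonstrate (the Config class here is a made-up example):

```python
class Config:
    pass

c = Config()

# Attributes are just string-keyed entries that can be
# added, modified and deleted at runtime.
setattr(c, "debug", True)   # add
c.debug = False             # modify
delattr(c, "debug")         # delete

print(hasattr(c, "debug"))  # False
```

However the interpreter stores these entries internally, nothing about the class constrains which attributes an instance ends up with.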

In particular, some people stressed that what I was describing is just an implementation detail unique to CPython, and that it is somehow different in some other Python implementation. I can’t stress this enough: it doesn’t matter what the actual implementation of an object’s attributes is, it is still bad. But also, almost no one actually uses PyPy, Jython or IronPython. They are basically irrelevant.

Having methods defined on types is Convenient

I think this was probably the best point raised. I concede that if you want to maintain some state and mutate it at runtime, it is convenient to wrap that state in an object and have specific methods defined on it. The alternative is to store your state using some built-in types and operate on it with functions. However, what it seems you really want here, restricting functions to work only with specific types, is static typing, and this is a roundabout way to achieve it. If you want static typing, you’re not going to be happy with Python. Really, what you are doing is associating certain bits of data with certain functionality, as in an object oriented language, but without any guarantee of what the data actually is.
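To make the trade-off concrete, here is a small sketch of the two styles (the counter is a made-up example, not from the original post):

```python
# Style 1: state as a plain dictionary plus free functions.
def make_counter():
    return {"count": 0}

def increment(counter):
    counter["count"] += 1

c = make_counter()
increment(c)
print(c["count"])  # 1

# Style 2: the same state wrapped in a class. The method is bundled
# with the data, which is convenient, but nothing guarantees that
# self.count is actually a number.
class Counter:
    def __init__(self):
        self.count = 0

    def increment(self):
        self.count += 1
```

Both versions behave identically; the class version only groups the data and the function under one name.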

Python Objects are Different if you use __slots__

A lot of people brought up __slots__. Yes, if you use the __slots__ attribute Python objects work differently and my criticism doesn’t apply exactly. But, by default, Python objects don’t use __slots__. The default implementation matters, because it is the one that is going to get used most often. In fact I’ve never seen anyone use __slots__ in real code; I’ve only ever come across it as an obscure trick recommended to Python experts. It’s not a robust defence of a language feature to say, “actually there’s this special thing you can do that totally changes how it is implemented”. That’s really a sign that the default implementation is not good.

So __slots__ does change how objects work, and makes them a lot less like a thin wrapper around a dictionary. But I don’t think it resolves the fundamental problems with using objects in Python. They are still basically just a collection of attributes; we cannot say anything about them apart from how many attributes there are and what their names are. In particular, they don’t specify the shape of the data.
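A small sketch of what __slots__ does and doesn’t buy you (SlottedPoint is a made-up example):

```python
class SlottedPoint:
    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y

p = SlottedPoint(1, 2)

# There is no per-instance __dict__, and attributes outside
# the declared slots are rejected at runtime...
try:
    p.z = 3
except AttributeError as e:
    print("rejected:", e)

# ...but nothing constrains what the named attributes hold.
p.x = "not a number"
```

So __slots__ fixes the attribute names, but still says nothing about the shape of the data each attribute contains.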

One benefit of __slots__ is that your objects will use memory more efficiently. I would say, however, that you shouldn’t be worrying about memory optimisations like this in a high level language. And if your language does offer memory optimisations, they shouldn’t be controlled in an opaque, unintuitive way like the __slots__ attribute.

Assembly Language Tutorial – The Stack

Unfortunately we’re going to have to be a little theoretical in this post. We’ll be covering how we use memory in our assembly programs, in particular how the stack works.

When we execute our code, it is loaded into a nice big block of contiguous memory. The very first address in this block, address 0, is kept inaccessible to our program. If you have programmed in C before you will be familiar with null pointers: a null pointer is a pointer that points to memory address 0. The fact that this address is inaccessible to our program is why we can define a pointer with numeric value 0 to be null.

After address 0, there are various other things loaded into memory. Our instructions and data are loaded at address 0x08048000. After this there is a big empty space; the boundary between our program’s data and this empty space is called the system break. If you try to access either the memory before 0x08048000 or the empty space after our instructions, you will get a segmentation fault.

At the very top of memory is the stack. This is a special region of expandable memory that our code can use to store values on the fly. Reading and writing the stack is slower than reading and writing registers, but sometimes we need to use it. We typically use the stack for two reasons: we don’t have enough space in the registers, or we are changing context and want to save the current register values.

When we add new values to the stack it grows downwards, into the empty space after our instructions and data. There really is a lot of space there, so you shouldn’t worry too much about filling it up. We maintain the stack with a special register, rsp, which should always contain the memory address of the top of the stack. We can access the contents of the stack using the rsp pointer directly. When we do this we have to be careful to maintain the value in the register so that it always points to the top of the stack.
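For example, using the AT&T syntax from the earlier posts, we could manage stack space by adjusting rsp ourselves (a sketch, not a complete program):

```
subq $16, %rsp        # grow the stack downwards by 16 bytes
movq $7, (%rsp)       # store 7 at the new top of the stack
movq (%rsp), %rax     # read it back into rax
addq $16, %rsp        # shrink the stack back, restoring rsp
```

Note that after moving rsp ourselves, we must move it back by the same amount before returning or pushing anything else, otherwise rsp no longer points at the true top of the stack.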

We can also access the stack with the pushq and popq instructions. The pushq instruction moves the stack pointer down by eight bytes and copies the value in the named register into this new top of the stack. Similarly, the popq instruction copies the value that the stack pointer currently points at into the named register and moves the stack pointer back up by eight bytes.
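As a sketch, each instruction is equivalent to adjusting rsp and copying by hand:

```
pushq %rax            # same as: subq $8, %rsp  then  movq %rax, (%rsp)
popq %rbx             # same as: movq (%rsp), %rbx  then  addq $8, %rsp
```

This is why pushq and popq always operate in quadword (eight byte) units and always leave rsp pointing at the current top of the stack.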

The stack doesn’t start off empty. When our program begins, the stack will contain the following, from the bottom of the stack to the top:

  • Various environment variables
  • Command line arguments
  • The name of the program
  • The number of command line arguments

So when our program begins executing, the stack register, rsp, will be pointing at the topmost value on the stack, that is, the memory location containing the number of command line arguments passed.

In our next post we will see how to read command line arguments off the stack.