Singly and doubly linked lists in python

Singly-linked list implementation with python:

class Node(object):
 
    def __init__(self, data, next):
        self.data = data
        self.next = next
 
 
class SingleList(object):
 
    head = None
    tail = None
 
    def show(self):
        print "Showing list data:"
        current_node = self.head
        while current_node is not None:
            print current_node.data, " -> ",
            current_node = current_node.next
        print None
 
    def append(self, data):
        node = Node(data, None)
        if self.head is None:
            self.head = self.tail = node
        else:
            self.tail.next = node
        self.tail = node
 
    def remove(self, node_value):
        current_node = self.head
        previous_node = None
        while current_node is not None:
            if current_node.data == node_value:
                # if this is the first node (head)
                if previous_node is not None:
                    previous_node.next = current_node.next
                else:
                    self.head = current_node.next
 
            # needed for the next iteration
            previous_node = current_node
            current_node = current_node.next
 
 
s = SingleList()
s.append(31)
s.append(2)
s.append(3)
s.append(4)
 
s.show()
s.remove(31)
s.remove(3)
s.remove(2)
s.show()

Doubly-linked list implementation in python

class Node(object):
 
    def __init__(self, data, prev, next):
        self.data = data
        self.prev = prev
        self.next = next
 
 
class DoubleList(object):
 
    head = None
    tail = None
 
    def append(self, data):
        new_node = Node(data, None, None)
        if self.head is None:
            self.head = self.tail = new_node
        else:
            new_node.prev = self.tail
            new_node.next = None
            self.tail.next = new_node
            self.tail = new_node
 
    def remove(self, node_value):
        current_node = self.head
 
        while current_node is not None:
            if current_node.data == node_value:
                # if it's not the first element
                if current_node.prev is not None:
                    current_node.prev.next = current_node.next
                    current_node.next.prev = current_node.prev
                else:
                    # otherwise we have no prev (it's None), head is the next one, and prev becomes None
                    self.head = current_node.next
                    current_node.next.prev = None
 
            current_node = current_node.next
 
    def show(self):
        print "Show list data:"
        current_node = self.head
        while current_node is not None:
            print current_node.prev.data if hasattr(current_node.prev, "data") else None,
            print current_node.data,
            print current_node.next.data if hasattr(current_node.next, "data") else None
 
            current_node = current_node.next
        print "*"*50
 
 
d = DoubleList()
 
d.append(5)
d.append(6)
d.append(50)
d.append(30)
 
d.show()
 
d.remove(50)
d.remove(5)
 
d.show()

Implementing a fake filesystem using a trie in python

A trie (also sometimes called a prefix tree or radix tree) is an ordered tree data structure. I was looking at ways to implement such a thing the other day, and came up with 3 possible applications, off the top of my head:

  • a URL shortening service
  • a virtual filesystem
  • a search engine

The idea behind a trie is that you can store shared parts of data in an ordered fashion. If you have a URL shortening service that needs to keep track of the url “http://baseurl.com/ABCDE” and “http://baseurl.com/ABCDEF”, you’re storing almost 2 identical strings, with a 1 character difference. How could we do this better? This is what a trie looks like:

secondTrie

It’s basically a map (or dictionary in python) of nested maps. It can be handy to have a delimiter when a node is the last one, but this is not necessary. To exemplify how it works in practice, I’ve started coding a quick and dirty fake filesystem in python. The idea behind using a trie to hold data structures if the same: if I have a directory /var, a directory, /var/www and another directory /var/www/mysite, it’s not necessary to repeat the “var” part 3 times, nor the “www” twice.

There are tons of tricky things about filesystems that complicates matters such as hidden files, creating intermediate directories when they don’t exist, permissions, etc. but I’ve ignored them for the most part here. I mainly wanted to implement an “add” function to see if this worked with a trie, and it sort of looks like this:

    @_input_path_sanitizer
    def add(self, path):
        logging.info("Adding directory {}".format(path))
        pieces = self.get_pieces(path)
        temp_trie = self._paths
        for piece in pieces:
            # if we're at the end of the trie, take out the delimiter.
            if piece in temp_trie and self.__end in temp_trie[piece]:
                del temp_trie[piece][self.__end]
            # setdefault() tries to get the key, if it exists. if it doesn't set it to the second parameter
            temp_trie = temp_trie.setdefault(piece, {})
        # set the delimiter here in all cases
        temp_trie.setdefault(self.__end, self.__end)

The whole thing is on github, but this function does most of the heavy lifting. Then you can do stuff like :

fs = FakeFs()
fs.mkdir("/poo/woo/soo/doo/poo")
fs.mkdir("/etc")
fs.mkdir("/")
fs.mkdir("/tmp")
fs.mkdir("/tmp/whatever")
fs.mkdir("/var/www/mysite")
fs.dir_exists("/sadsad")
fs.dir_exists("/etc")
print(fs.paths)

Which yields something along the lines of:

2014-08-23 12:15:42,449 Adding directory /poo/woo/soo/doo/poo
2014-08-23 12:15:42,449 Adding directory /etc
2014-08-23 12:15:42,449 Adding directory /
2014-08-23 12:15:42,449 Adding directory /tmp
2014-08-23 12:15:42,450 Adding directory /tmp/whatever
2014-08-23 12:15:42,450 Adding directory /var/www/mysite
2014-08-23 12:15:42,450 /sadsad does not exist
2014-08-23 12:15:42,450 /etc exists
['/', '/etc', '/poo/woo/soo/doo/poo', '/tmp/whatever', '/var/www/mysite']