Gaurav's Blog

Reversing Bits

The problem is, Given a number of n bits, reverse the bits. Normally, I would have used 2n bitwise operations to do this.

In CSE548, for a problem we reversed a string recursively, by reversing the two halves of the string, and then exchanging the positions of the two reversed halves. A string of bits, is a string after all.

So, we start with reversing the odd-even bits, then reversing pairs of bits, then reversing sets of 4 bits, and so on. So, this requires 3*log(n) bitwise operations, which is better. Here is an efficient implementation for reversing a 32-bit unsigned int.

unsigned int reverse(unsigned int x)
{
    x = ((x & 0x55555555)<<1)  | ((x & 0xAAAAAAAA)>>1);
    x = ((x & 0x33333333)<<2)  | ((x & 0xCCCCCCCC)>>2);
    x = ((x & 0x0F0F0F0F)<<4)  | ((x & 0xF0F0F0F0)>>4);
    x = ((x & 0x00FF00FF)<<8)  | ((x & 0xFF00FF00)>>8);
    x = ((x & 0x0000FFFF)<<16) | ((x & 0xFFFF0000)>>16);
    return x;
}

void wrapper()
{
    cout << reverse(0xF00BAAF0);
}
Read more

Heaps Demystified

I think the first time I learnt heaps was in an undergraduate course on Data Structures. Though I understood the basic idea, it was never crystal clear for me to code during a competition.

I took up the task of learning it afresh today. Here is a short gist.

A heap is a data structure that can be used to make fast (constant time) queries of the type ‘what is the maximum element in the array’, with fast updates (logarithmic time).

A heap is usually implemented as a balanced binary tree, but there are other variants like fibonacci heap, which I won’t be discussing here. In the below discussion and code, I explain how I would implement a max-heap. For min-heaps, use the exact same techniques with small changes like flipping the comparison operators.

We need the following operations when implementing a heap:

  • topElem()
  • insertElem(t)
  • deleteTopElem(t)

All the operations should be done in $O(\log{n})$ time. Internally in a binary-heap, the root element is always atleast as big as both of its children, and this property of the heap exists recursively. (In min-heaps, root is as small, or smaller than both of its children)

It is easy to see that topElem() can be executed in constant time, since the maximum element would be at the root. (In min-heaps the minimum element would be at the root)

Now, when we need to remove the maximum (or the minimum element in min-heaps), we need to remove the root and preserve the heap property. Removing the root is simple, but what it leaves behind is not so trivial. Either the left, or the right child would be the biggest (or the smallest, in min-heaps) element left now. So, we compare the two. The bigger (or smaller. You get the idea now, I guess) of the two is placed in the root, leaving a void at the place where it existed. So, now we need to correct either the left, or the right sub-tree of the root, until the sub-tree is empty. This operation will be executed $O(\log{n})$ times, (costing $O(1)$ time, each time). So the complexity is $O(\log{n})$. Thus, we are done with deleteTopElem().

insertElem(t) is even simpler, we place the new element in the lower-most level of the binary tree, where we have a place, or create a new level if we don’t. Now, it is possible that the parent of the newly inserted element has lost the heap property. Thus, if the parent of the newly inserted element is greater than it, we swap the parent with the newly inserted element, and check if the new-parent has also lost the heap property. We continue this till we reach the root. This operation will be executed $O(\log{n})$ times, (costing $O(1)$ time, each time). So the complexity is $O(\log{n})$.

Here is a messy little implementation:

template < class T >
class Heap
{
	std::vector<T> store;
	int capacity, used;

	void bubbleUp();
	void pushDown();

	public:
	Heap();

	void insertElem(T elem);
	T topElem();
	T deleteTopElem();
};

template< class T >
Heap< T >::Heap()
{
	capacity = 2;
	used = 1;
	store.resize(2);
}

template< class T >
void Heap< T >::insertElem(T elem)
{
	if(used == capacity)
	{
		capacity = (((capacity-1)<<1) + 1) + 1;
		store.resize(capacity);
	}
	store[used++] = elem;
	bubbleUp();
}

template<class T>
T Heap<T>::topElem()
{
	// There is no element in the heap.
	if(used - 1 == 0)
		return 0;

	return store[1];
}

template<class T>
void Heap<T>::bubbleUp()
{
	int ptr = used - 1;
	while(ptr != 1)
	{
		if(store[ptr] > store[ptr>>1])
		{
			swap(store[ptr], store[ptr>>1]);
			ptr >>= 1;
		}
		else break;
	}
}

template<class T>
T Heap<T>::deleteTopElem()
{
	// Empty heap
	if(used - 1 == 0)
		return 0;

	T ret = store[1];
	swap(store[1], store[used - 1]);
	used--;

	if(used - 1 > 0)
		pushDown();

	return ret;
}

template<class T>
void Heap<T>::pushDown()
{
	int ptr = 1;
	while(ptr < used)
	{
		if(ptr<<1 >= used) break;
		T left = store[ptr<<1];

		T right;
		bool right_branch = true;
		if ((ptr<<1) + 1 < used) right = store[(ptr<<1)+ 1];
		else right_branch = false;

		if(store[ptr] >= left && (!right_branch || store[ptr] >= right))
			break;

		if(left >= store[ptr])
		{
			if(!right_branch || left >= right)
			{
				swap(store[ptr], store[(ptr<<1)]);
				ptr = (ptr<<1);
				continue;
			}
			else
			{
				swap(store[ptr], store[(ptr<<1)+1]);
				ptr = (ptr<<1) + 1;
				continue;
			}
		}
		else if(!right_branch)
		{
			swap(store[ptr], store[(ptr<<1)+1]);
			ptr = (ptr<<1) + 1;
			continue;
		}
	}
}

void wrapper()
{
	Heap<int> h;
	int arr[] = { 4, 3, 7, 1, 10};

	for(int i = 0; i <  5; i++)
	{
		h.insertElem(arr[i]);
		cout << h.topElem() << endl;
	}
	cout << endl;
	for(int i = 4; i >=  0; i--)
	{
		cout << h.deleteTopElem() << endl;
	}

	Heap<string> s;
	string sarr[] = { "abc", "def", "ghi", "xyz", "zzz"};
	for(int i = 0; i <  5; i++)
	{
		s.insertElem(sarr[i]);
		cout << s.topElem() << endl;
	}
	cout << endl;
	for(int i = 4; i >=  0; i--)
	{
		cout << s.deleteTopElem() << endl;
	}
}

If you notice the code, I am inserting elements one by one, costing $O(\log{n})$ on each insert. And hence, $O(n\log{n})$ for $n$ elements. However, if we know the entire set of elements before hand, heap construction can be done in linear time, in a manner similar to insertElem(t). Hint: Recursion.

You might also want to overload the comparison operators for custom data-structures.

I used this presentation (ppt) to refresh my memory.

Read more

Recursion is ubiquitous...

… and so simple and elegant. In some sense, it is a piece of art, and one needs artists to appreciate it. I had this semi-non-trivial question in my mind:

Print all the permutations of a given string S.

I had solved this before, but all I could recollect was that I had probably made it a little too dirty that time. The thing is, even with non-trivial problems, elegant solutions are possible with simple application of tools like Recursion / Divide & Conquer.

Prof. Bender asked us to visualize all the algorithmic techniques we learnt in CSE548 as ‘tools in our toolbox’. Recursion / D&C is a very important tool IMHO.

How do we break this problem into a smaller problem?

Simple.

We only decide the character that will appear in the beginning of the permutation, and find all the permutations of the remaining string, and just stick the first character in front of each string.

When we have only one character left, the first character is the only permutation.

The code is as follows:

string str;

void permute(int i)
{
  if(i == str.size()-1)
  {     
    cout << str << endl;
    return;     
  }     

  permute(i+1);
  for(int j = i+1; j < str.size(); j++)
  {
    swap(str[i], str[j]);
    permute(i+1);
    swap(str[i], str[j]);
  }

}

void wrapper()
{
  str = "abcd";
  sort(str.begin(), str.end());
  permute(0);
}

Isn’t it beautiful? Its magical how we apparently do no work in every step, and it all stitches together so beautifully.

Read more