CS 5238 Combinatorial methods in bioinformatics, 2004/2005 Semester 1
Lecture 8: Finding structural similarities among proteins (II)
Lecturer: Prof Jean-Claude Latombe
Scribe: Cheng Chi Kan, Lee Pern Chern and Moritz Buck

1 Voting scheme with hash table

Aligning protein structures requires many-to-many comparisons. To avoid repeating the same work for every comparison, the computation needs to be organized better. This can be achieved by pre-computing indexes of the target proteins and storing them in a hash table. Queries are then evaluated with a voting scheme that uses the hash table.

This voting scheme replaces the seed generation process. In this lecture, we look into the voting scheme used in 3d SEARCH [2]. The algorithm is based on the concept of geometric hashing [1], developed in the field of computer vision. The basic idea is to represent all secondary structure elements (SSEs) from all target proteins in a large, highly redundant hash (or index) table. Once the table has been constructed, every SSE from a given query structure can be compared simultaneously to the entire set of SSEs of the target structures by indexing the SSE into the table. The hash table is a 3-dimensional regular grid of cubic bins (2 Å on a side); its construction is described next.
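To make the bin structure concrete, here is a minimal Python sketch of a sparse 3-dimensional grid, assuming the bins are axis-aligned cubes with a 2 Å edge and that only occupied bins are stored. The names `bin_key` and `table` are illustrative and do not come from 3d SEARCH.

```python
import math
from collections import defaultdict

BIN_SIZE = 2.0  # cube edge in angstroms, as stated in the text


def bin_key(point, bin_size=BIN_SIZE):
    """Map a 3D point to the integer indices of the cube bin containing it."""
    return tuple(math.floor(c / bin_size) for c in point)


# Sparse table: a dictionary keyed by bin indices, so empty bins cost nothing.
table = defaultdict(list)
table[bin_key((3.1, -0.5, 7.9))].append("entry A")
table[bin_key((2.2, -1.9, 6.0))].append("entry B")
```

Both points fall in the same cube, so the two entries end up in a single bin. Using `math.floor` rather than `int` keeps negative coordinates in the correct bin.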

For each target structure, we compute a coordinate system P for every pair of vectors (i, j) (Figure 1).

Figure 1: The coordinate system for the vectors (i, j). The z-axis is coincident with i and the y-axis is parallel to j.

Then, we transform all remaining vectors in the protein to the coordinate system P. Finally, for each remaining vector k, we place an entry (with the orientation, coordinate system and type of k) into the hash table at the location specified by the coordinates of the mid-point of k (Figure 2). Since the grid is sparsely occupied, so is the hash table. A structure with n SSEs contributes n (n.
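The construction described above can be sketched in Python as follows. This is an illustration under several assumptions that the notes do not state: each SSE is a tuple `(type, start, end)`, pairs (i, j) are taken as ordered pairs, the origin of P is the start point of i, the y-axis is the direction of j with its component along i removed (so the frame is orthonormal), and nearly parallel pairs are skipped. All function names are hypothetical.

```python
import math
from collections import defaultdict
from itertools import permutations

BIN_SIZE = 2.0  # cube edge in angstroms, as stated in the text


def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def unit(a):
    n = math.sqrt(dot(a, a))
    return tuple(x / n for x in a)


def make_frame(i, j, eps=1e-9):
    """Return (origin, (x, y, z)) for the ordered pair (i, j), or None if parallel."""
    _, i_start, i_end = i
    _, j_start, j_end = j
    z = unit(sub(i_end, i_start))            # z-axis along i
    d = sub(j_end, j_start)
    # Assumption: y is j's direction made perpendicular to z.
    perp = sub(d, tuple(dot(d, z) * c for c in z))
    if dot(perp, perp) < eps:
        return None                          # i and j (nearly) parallel: no frame
    y = unit(perp)
    x = cross(y, z)                          # completes a right-handed frame
    return i_start, (x, y, z)                # assumption: origin at start of i


def build_table(structures):
    """Fill a sparse hash table with one entry per (frame, remaining vector k)."""
    table = defaultdict(list)
    for name, sses in structures.items():
        for a, b in permutations(range(len(sses)), 2):
            frame = make_frame(sses[a], sses[b])
            if frame is None:
                continue
            origin, axes = frame
            for c, (kind, start, end) in enumerate(sses):
                if c in (a, b):
                    continue
                mid = tuple((s + e) / 2 for s, e in zip(start, end))
                local_mid = tuple(dot(sub(mid, origin), ax) for ax in axes)
                local_dir = tuple(dot(unit(sub(end, start)), ax) for ax in axes)
                key = tuple(math.floor(v / BIN_SIZE) for v in local_mid)
                # Entry: orientation of k, coordinate system id, type of k.
                table[key].append((local_dir, (name, a, b), kind))
    return table


# Toy structure with three mutually perpendicular SSEs (made-up coordinates).
toy = {"toy": [("H", (0, 0, 0), (0, 0, 4)),
               ("E", (1, 0, 0), (1, 3, 0)),
               ("H", (2, 2, 2), (4, 2, 2))]}
table = build_table(toy)
```

In this toy case, the frame built from the first two SSEs coincides with the original axes, so the third SSE, whose mid-point is (3, 2, 2), is filed under bin (1, 1, 1).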