Data Structures Semester 2 2020: Summary Semester 1 Data Structures

LINKED LIST

Linked list is a collection of elements. Its difference with an array is that it uses nodes instead of the actual elements to record the collection of data. These nodes are not stored right next to each other but instead can be stored in different locations in the memory. The data stored can only be accessed in a sequential manner.

To allocate memory dynamically in C/C++, we use malloc.

To create a single link list in C/C++, we have to define the node. We do this by doing a making a struct and making a variable of a struct pointer to point to the next element in the list. We also have to initialize the head of the list, where we start reading the list from.

When we want to add a new value to the list, we have to first allocate the memory for the node, assign values to it, then connect it to the list. For the example of putting it in front of the head:

This will insert the new element in front of the head, making it the head or where the list starts.

To delete the value, we have to find the value we want to delete and use the free function in the malloc library. After that, we have to reconnect the elements so that the sequence is not broken. There are two conditions that it can be in: it is a head node, it is not a head node.

3 more types of linked list:

- Circular Single Linked List

The last node (tail) of the tail is connected back to the head so the sequence goes in a circular motion.

- Double Linked List

Instead of only having a head pointer, there is also a prev pointer to point to the node previous of the current one,

- Circular Double Linked List

Similar to the circular single linked list but there is also a prev pointer on each node.

STACK AND QUEUE

A stack is a collection of data with two important functions: push and pop. The data are stored as LIFO (last in first out). We can use both an array and a linked list to make the stack.

Array - There is a top variable to indicate the top of the stack and a max variable for the maximum number of elements in the array. If the top is at NULL, it means the stack is empty, if it is at max-1, the stack is full (arrays end at max-1).

Linked List - Like a regular single linked list, each node will have a value and a pointer to the next node. The head is used at the top. The new nodes are inserted before the head (push) and all deletions are always done at the head (pop). If top = NULL, the stack is empty.

Prefix - operator before operand [+ 1 2]

Infix - operator between operand [1 + 2]

Postfix - operator after operand [ 1 2 +]

We use prefix and postfix so we don't need to use brackets and it is easier for computers to evaluate.

To convert infix to postfix, search for highest precedence operator, put operator behind operands, keep repeating.

Using stack:

To convert infix to postfix, search for highest precedence operator, put operator before operands, keep repeating. To apply to stack, we do the same thing but read the input from right to left.

A queue is a collection of data with two important functions: push and pop. Its difference with stack is that its data is stored as FIFO (first in first out). All new elements are inserted from the rear and all deletion are from the front.

Array - there is a front and rear to indicate the front and back of the collection. We insert behind the rear index.

Linked List - Similar to stack but it has two variables, start and rear.

A circular queue is a queue that wraps around so that more elements can be inserted.

A deque is a queue where you can insert and deete from both the fron and rear.

A priority queue is a queue where the elements inserted have a priority to be processed and so are processed according to that.

TREE AND BINARY TREE

A tree starts with a root node and has lines called edges that connect to the children nodes called leaves. Nodes that come from the same parent are called sibling nodes. The degree of a node is the total sub tree of the node, The height of a tree is the max number of nodes.

A binary tree is a data structure where every node has at most two children.

Types of binary tree:

- Perfect

Every level are of same depth.

- Complete

All levels except for maybe the last are completely filled and all nodes are as far left as possible.

- Skewed

Every node has 1 child at most

- Balanced

No children are farther away from the root than other children.

Implemented using link list:

Implementation of pre/post/infix:

A threaded binary tree is a tree with NULL pointers.

A binary search tree is where there is fast searching, rapid sorting, easy insertion and easy deletion. The left subtree of every node are all elements that are lower value than the node and the right subtree are all the greater value nodes. Its three main functions are find, insert and remove.

HASH AND HASH TABLES

Hashing is method that is usually used to store and retrieve strings that are transformed to shorter length values or keys to represent them. These shortened strings act as indexes or addresses in the hash table for the original string. This method id used as it is easier to find shorter length items than their original length.

A hash table is where we store the string in an array format. The index of the table/array is the hashed key while the value of the index is the original string.

Hash function are functions to hash a string.

Types of hash functions:

- Mid-square
The string is squared and the a specified number of bits from the middle part is taken to get the hash key. If the key is a string, it is converted into a number.

- Division
Get the added up ascii of the string and modulo it by a specified number (usually the table size).

- Folding
Divide the value into equal digit parts, except for the last part which can contain less or more digits. Add all the parts together and ignore the last carry. The result is the hashed value.

- Digit Extraction
Obtain a predefined digit from the original value and the hashed value is obtained.

- Rotating Hash
Use another hash method then reverse the obtained value to obtain the hashed value.

If we continuously use the same hash method for many strings, some of the hash values might be the same.This is called a collision.

There are two way to handle a collision:
- Linear Probing
Look for the next available slot in the table and put the value there instead.

This method though makes it harder to find a specific string as when you calculate the hash key of a string, the hash key might not be where it is stored in the table, causing you to have to loop through the table until you find the wanted value. This makes it very ineffective if there are many collisions.

- Chaining
Make a link list in the slot where collisions occur.

When a collision happens a link list is used to store the next value with same hash key. This is so that when we want a specified string and it experiences collision, we only need to iterate through the link list in the slot, not the entire table.

Data Structures Semester 2 2020

Monday, April 6, 2020

Summary Semester 1 Data Structures

No comments:

Post a Comment