Intro of data structure and algorithm
A data structure is an expression of objects in computing, which describes an organization of data with rules of operations within. An algorithm is a finite collection of clear executions, composed of inputs, orders, repeats, conditions, and outputs. Programming is a realization of algorithms and data structures, and how to use them affects the performance of a system.
How to measure algorithm performance
- Method
- Experimental: measure the execution time of a program, machine-dependent
- Analytical: measure the number of steps of execution, machine-independent
- Evaluation metric of analyitical approach: Complexity
- Space complexity: Required memory size (static+dynamic)
- Static: predetermined size by a program
- Dynamic: varies depending on the number of input data
- Time complexity: The number of steps of execution. We care about both the average and the worst (longest).
- Both complexities are functions of the number of input data sizes, up to the leading order: Big O expression, O(n)
- Space complexity: Required memory size (static+dynamic)
Memory structure for a program
Once you execute a program, the program is loaded into the memory. Then the processor read the data and commands in the memory to operate. The memory allocated during this process has the following structure.
- Code: For code (execution commands). The memory size is fixed at compile. The memory is allocated (released) at the beginning (end) of the program.
- Data: For global, static variables. The memory size is fixed at compile. The memory is allocated (released) at the beginning (end) of the program.
- Heap: For dynamic memory allocation. The memory size changes over runtime.
- Stack: For local variables and parameters. The memory size is fixed at compile. The memori is allocated (released) when a function is called.
Where the choice of a data structure makes a difference?
- Memory size
- Speed at creation
- Speed of insertion/deletion of a node
- Speed of search
Array
- A set of an index and value pairs.
- Static memory allocation.
- Locate consecutively in memory, following the order of index.
- The size of the interval between two consecutive elements is equal to the size of the data type of the element.
- Pros
- Simple deployment.
- Fast access to data of any position using its index.
- Cons
- Allocated with a fixed memory size: potential waste of memory space when the input data size is unknown.
- Deletion and insertion of an element is inefficient: requires copy and paste of the entire subsequent elements.
Structure
- A collection of relevant data.
- Different members of a structure can have different data types.
- A structure can have a member whose data type is a pointer of itself (self referential structure). Such pointer member variable is called a “link” because it links the other structure instance(s) to its own.
Node
- An element of a data structure is called a node.
- Every node contains data and a link(s) to another node(s).
- Different data structures have different node structures.
Leave a comment