A data structure is an expression of objects in computing, which describes an organization of data with rules of operations within. An algorithm is a finite collection of clear executions, composed of inputs, orders, repeats, conditions, and outputs. Programming is a realization of algorithms and data structures, and how to use them affects the performance of a system.

How to measure algorithm performance

  • Method
    • Experimental: measure the execution time of a program, machine-dependent
    • Analytical: measure the number of steps of execution, machine-independent
  • Evaluation metric of analyitical approach: Complexity
    • Space complexity: Required memory size (static+dynamic)
      • Static: predetermined size by a program
      • Dynamic: varies depending on the number of input data
    • Time complexity: The number of steps of execution. We care about both the average and the worst (longest).
    • Both complexities are functions of the number of input data sizes, up to the leading order: Big O expression, O(n)

Memory structure for a program

Once you execute a program, the program is loaded into the memory. Then the processor read the data and commands in the memory to operate. The memory allocated during this process has the following structure.

  • Code: For code (execution commands). The memory size is fixed at compile. The memory is allocated (released) at the beginning (end) of the program.
  • Data: For global, static variables. The memory size is fixed at compile. The memory is allocated (released) at the beginning (end) of the program.
  • Heap: For dynamic memory allocation. The memory size changes over runtime.
  • Stack: For local variables and parameters. The memory size is fixed at compile. The memori is allocated (released) when a function is called.

Where the choice of a data structure makes a difference?

  • Memory size
  • Speed at creation
  • Speed of insertion/deletion of a node
  • Speed of search

Array

  • A set of an index and value pairs.
  • Static memory allocation.
  • Locate consecutively in memory, following the order of index.
  • The size of the interval between two consecutive elements is equal to the size of the data type of the element.
  • Pros
    • Simple deployment.
    • Fast access to data of any position using its index.
  • Cons
    • Allocated with a fixed memory size: potential waste of memory space when the input data size is unknown.
    • Deletion and insertion of an element is inefficient: requires copy and paste of the entire subsequent elements.

Structure

  • A collection of relevant data.
  • Different members of a structure can have different data types.
  • A structure can have a member whose data type is a pointer of itself (self referential structure). Such pointer member variable is called a “link” because it links the other structure instance(s) to its own.

Node

  • An element of a data structure is called a node.
  • Every node contains data and a link(s) to another node(s).
  • Different data structures have different node structures.

Leave a comment