Abstract data types

What is it?

  1. A mathematical model of a data type
  2. Specifies:
    1. The operations supported
    2. The type of data stored (but not how it’s stored)
    3. Argument types and return types of these operations (but not how they are implemented)

Examples:

Array, List, Map, Queue, Set, Stack, Table, Tree, and Vector are ADTs.

Reading material

  1. https://web.mit.edu/6.005/www/fa15/classes/12-abstract-data-types/
  2. https://brilliant.org/wiki/abstract-data-types/

Representation Independence

Critically, a good abstract data type should be representation independent. This means that the use of an abstract type is independent of its representation (the actual data structure or data fields used to implement it), so that changes in representation have no effect on code outside the abstract type itself. For example, the operations offered by List are independent of whether the list is represented as a linked list or as an array.

You won’t be able to change the representation of an ADT at all unless its operations are fully specified with preconditions and postconditions, so that clients know what to depend on, and you know what you can safely change.

Important classes of abstract data types such as containers, dictionaries, and priority queues, have many different but functionally equivalent data structures that implement them. Changing the data structure does not change the correctness of the program, since we presumably replace a correct implementation with a different correct implementation. However, the new implementation of the data type realizes different tradeoffs in the time to execute various operations, so the total performance can improve dramatically. - Skiena

What are ADTs?

Abstract Data Type(ADT) is a data type, where only behavior is defined but not implementation.

Opposite of ADT is Concrete Data Type (CDT), where it contains an implementation of ADT.

Examples:

Array, List, Map, Queue, Set, Stack, Table, Tree, and Vector are ADTs. Each of these ADTs has many implementations i.e. CDT. The container is a high-level ADT of above all ADTs.

Abstract Data Types Overview

Abstract Data Type Other Common Names Commonly Implemented with data structure
List Sequence Array List, Linked List
Queue Array, Linked List
Set Tree Set, Hash Set, Linked Hash Set, Red-black Tree, Hash Table
Map Tree Map, Hash Map
Double-ended Queue Dequeue, Deque Array, Doubly-linked List
Stack Array, Linked List
Associative Array Dictionary, Hash Map, Hash, Map** Hash Table
Priority Queue Heap Heap

(Implementations for almost all of these* are provided by the Java Collections Framework in the java.util package.)

Example by considering a primitive data type

To answer this question, we’ll take a look at something we already know about: an integer data type. Virtually every modern programming language has some representation for an integer type. In Java, we’ll look at the primitive type int. An initialized Java int variable holds a 32-bit signed integer value between (-2 power 32) and (2 power 32) -1. So, we’ve established that an int holds data.

Operations can be performed on an int. We can assign a value to an int. We can apply the Java unary prefix and postfix increment and decrement operators (i++, ++i, i–, –i) to an int. We can use an int in binary operation expressions, such as addition, subtraction, multiplication, and division. We can test the value of an int, and we can use an int in an equality expression.

In performing these operations on an int variable, the user does not need to be concerned with the implementation of the operation. The internal mechanism by which these operations work is irrelevant. Examine the following simple code fragment, for example:

int i = 0;
i++;

The user knows that after the second statement is executed, the value of the i variable is 1. It isn’t important to know how the value became 1—just that in performing the increment in this example, i always will equal 1.

The user also does not need to know how the value is represented and stored internally. Things such as byte order again are irrelevant to the user in the preceding code example.

To summarize the built-in int data type, an int does the following:

  1. An int holds an item of data.
  2. Predefined operations can be performed on an int.
  3. An int has an internal representation that is hidden from the user in performing these operations.

If we consider the primitive data types in this light, it is easy to understand the definition we will give to ADTs. An ADT is defined as the following:

  1. An ADT is an externally defined data type that holds some kind of data. 1
  2. An ADT has built-in operations that can be performed on it or by it.
  3. Users of an ADT do not need to have any detailed information about the internal representation of the data storage or implementation of the operations.

So, in effect, an ADT is a data construct that encapsulates the same kinds of functionality that a built-in type would encapsulate. This does not necessarily imply that ADTs need to have addition or increment operations in order to be valid or useful, and it does not mean that any of the built-in operators will work with an ADT. It only means that the appropriate operations for the type created will be transparently available and that the user does not need to be concerned with the implementation details.

Why Use Abstract Data Types?

The String class provides a mechanism by which string literals may be stored, accessed, and manipulated. It provides methods with which we can compare, concatenate, copy, and convert strings. If a String class did not exist, string operations would have to be implemented from scratch each time they were needed.

A robust and reasonably generic String class gives us the capability to use these string operations at any time without having to reinvent the wheel each time. So ADTs provide us with code reusability. After we encapsulate the operations required to make a useful String class, we can reuse those facilities at any time in the future, with little or no additional development effort.

This also is the case with other ADTs. All ADTs in Java are implemented as classes. By designing our ADTs to be as generic as possible, we can reuse them in various situations and over several projects. Any time we develop an object or a group of related objects that can be reused, we reduce the overall development time of a project.

Container Abstract Data Types

Bags (Abstract Data Type)

Queues (Abstract Data Type)

Stacks (Abstract Data Type)

Dictionary Abstract Data Types

Tree Data Structure

  1. How to formulaically solve tree challenges?
  2. Binary Trees
  3. Binary Search Trees
  4. Height and Depth of Binary Tree
  5. Balanced Search Trees
  6. Priority Queue Data Structure or Heap Data Structure

Priority Queue Data Structure or Heap Data Structure

Hash Tables

Priority Queues

Graphs

  1. A collection of nodes and edges
    1. Each edge joins two nodes
    2. Edges can be directed or undirected
  2. Real-world “graph”:
    1. Road map
  3. Some uses:
    1. Tracking links between web pages
    2. Facebook
Operations Provided Efficiency
Find O(n)
Add/remove O(1) or O(n) or O(n2) Depends on implementation (time/space trade off)

Networks

  1. Graph whose edges have numeric labels
  2. Examples (labels):
    1. Road map (mileage)
    2. Airline’s flight map (flying time)
    3. Plumbing system (gallons per minute)
    4. Computer network (bits/second)
  3. Famous problems:
    1. Shortest path
    2. Maximum flow
    3. Minimal spanning tree
    4. Traveling salesman
    5. Four-coloring problem for planar graphs
  4. Understanding network data structures https://bookdown.org/markhoff/social_network_analysis/understanding-network-data-structures.html
  5. Methods for Network Analysis https://bookdown.org/markhoff/social_network_analysis/

Implementing collections

We now have two ways to represent collections of objects, arrays and linked lists. Arrays are built in to Java; linked lists are easy to build with standard Java records. These two alternatives, often referred to as sequential allocation and linked allocation, are fundamental.

Often, one particular ADT and implementation is best for the problem

  1. Which ADT to use?
    1. It depends. How do you access your data? By position? By key? Do you need to iterate through it? Do you need the min/max?
  2. Which implementation to use?
    1. It also depends. How important is fast access vs fast add/remove? Does the data need to be ordered in any way? How much space do you have?
  3. But real life is often messier…

Tags

  1. Abstraction
  2. Autoboxing
  3. Generics