Abstract data types
- What is it?
- Examples:
- Reading material
- Representation Independence
- High level overview
- What are ADTs?
- Why Use Abstract Data Types?
- APIs for Abstract Data Types
- Bags (Abstract Data Type)
- Queues (Abstract Data Type)
- Stacks (Abstract Data Type)
- Trees (Abstract data type)
- Graphs
- Networks
- Implementing collections
- Often, one particular ADT and implementation is best for the problem
- Tags
What is it?
- A mathematical model of a data type
- Specifies:
- The operations supported
- The type of data stored (but not how it’s stored)
- Argument types and return types of these operations (but not how they are implemented)
Examples:
Array, List, Map, Queue, Set, Stack, Table, Tree, and Vector are ADTs.
Reading material
- https://web.mit.edu/6.005/www/fa15/classes/12-abstract-data-types/
- https://brilliant.org/wiki/abstract-data-types/
Representation Independence
Critically, a good abstract data type should be representation independent. This means that the use of an abstract type is independent of its representation (the actual data structure or data fields used to implement it), so that changes in representation have no effect on code outside the abstract type itself. For example, the operations offered by List are independent of whether the list is represented as a linked list or as an array.
You won’t be able to change the representation of an ADT at all unless its operations are fully specified with preconditions and postconditions, so that clients know what to depend on, and you know what you can safely change.
High level overview
Abstract Data Type(ADT) is a data type, where only behavior is defined but not implementation.
Opposite of ADT is Concrete Data Type (CDT), where it contains an implementation of ADT.
Examples:
Array, List, Map, Queue, Set, Stack, Table, Tree, and Vector are ADTs. Each of these ADTs has many implementations i.e. CDT. The container is a high-level ADT of above all ADTs.
Abstract Data Types Overview
| Abstract Data Type | Other Common Names | Commonly Implemented with data structure |
|---|---|---|
| List | Sequence | Array List, Linked List |
| Queue | Array, Linked List | |
| Set | Tree Set, Hash Set, Linked Hash Set, Red-black Tree, Hash Table | |
| Map | Tree Map, Hash Map | |
| Double-ended Queue | Dequeue, Deque | Array, Doubly-linked List |
| Stack | Array, Linked List | |
| Associative Array | Dictionary, Hash Map, Hash, Map** | Hash Table |
| Priority Queue | Heap | Heap |
(Implementations for almost all of these* are provided by the Java Collections Framework in the java.util package.)
What are ADTs?
To answer this question, we’ll take a look at something we already know about: an integer data type. Virtually every modern programming language has some representation for an integer type. In Java, we’ll look at the primitive type int. An initialized Java int variable holds a 32-bit signed integer value between (-2 power 32) and (2 power 32) -1. So, we’ve established that an int holds data.
Operations can be performed on an int. We can assign a value to an int. We can apply the Java unary prefix and postfix increment and decrement operators (i++, ++i, i–, –i) to an int. We can use an int in binary operation expressions, such as addition, subtraction, multiplication, and division. We can test the value of an int, and we can use an int in an equality expression.
In performing these operations on an int variable, the user does not need to be concerned with the implementation of the operation. The internal mechanism by which these operations work is irrelevant. Examine the following simple code fragment, for example:
int i = 0;
i++;
The user knows that after the second statement is executed, the value of the i variable is 1. It isn’t important to know how the value became 1—just that in performing the increment in this example, i always will equal 1.
The user also does not need to know how the value is represented and stored internally. Things such as byte order again are irrelevant to the user in the preceding code example.
To summarize the built-in int data type, an int does the following:
- An int holds an item of data.
- Predefined operations can be performed on an int.
- An int has an internal representation that is hidden from the user in performing these operations.
If we consider the primitive data types in this light, it is easy to understand the definition we will give to ADTs. An ADT is defined as the following:
- An ADT is an externally defined data type that holds some kind of data. 1
- An ADT has built-in operations that can be performed on it or by it.
- Users of an ADT do not need to have any detailed information about the internal representation of the data storage or implementation of the operations.
So, in effect, an ADT is a data construct that encapsulates the same kinds of functionality that a built-in type would encapsulate. This does not necessarily imply that ADTs need to have addition or increment operations in order to be valid or useful, and it does not mean that any of the built-in operators will work with an ADT. It only means that the appropriate operations for the type created will be transparently available and that the user does not need to be concerned with the implementation details.
Why Use Abstract Data Types?
The String class provides a mechanism by which string literals may be stored, accessed, and manipulated. It provides methods with which we can compare, concatenate, copy, and convert strings. If a String class did not exist, string operations would have to be implemented from scratch each time they were needed.
A robust and reasonably generic String class gives us the capability to use these string operations at any time without having to reinvent the wheel each time. So ADTs provide us with code reusability. After we encapsulate the operations required to make a useful String class, we can reuse those facilities at any time in the future, with little or no additional development effort.
This also is the case with other ADTs. All ADTs in Java are implemented as classes. By designing our ADTs to be as generic as possible, we can reuse them in various situations and over several projects. Any time we develop an object or a group of related objects that can be reused, we reduce the overall development time of a project.
There are certain guidelines that need to be followed to make ADT’s reusable. In this book, we are primarily concerned with container ADTs. A container object’s primary purpose is to hold other objects. The containers will hold various types of data.
To make our containers reusable, we need to make them generic. Generic, in this sense, means that the containers need to follow three rules:
- Containers need to be able to store data of any kind.
- Containers should provide a public interface that encompasses only behaviors that would be useful in a general sense.
- Containers should be kept insulated from application-specific considerations.
APIs for Abstract Data Types
Several fundamental data types involve collections of objects. Specifically, the set of values is a collection of objects, and the operations revolve around adding, removing, or examining objects in the collection. In this section, we consider three such data types, known as the bag, the queue, and the stack. They differ in the specification of which object is to be removed or examined next.
Bags, queues, and stacks are fundamental and broadly useful. We use them in implementations throughout the book. Beyond this direct applicability, the client and imple- mentation code in this section serves as an introduction to our general approach to the development of data structures and algorithms.
One goal of this section is to emphasize the idea that the way in which we represent the objects in the collection directly impacts the efficiency of the various operations. For collections, we design data structures for representing the collection of objects that can support efficient implementation of the requisite operations.
A second goal of this section is to introduce generics and iteration, basic Java constructs that substantially simplify client code. These are advanced programming-language mechanisms that are not necessarily essential to the understanding of algorithms, but their use allows us to develop client code (and implementations of algorithms) that is more clear, compact, and elegant than would otherwise be possible.
A third goal of this section is to introduce and show the importance of linked data structures. In particular, a classic data structure known as the linked list enables implementation of bags, queues, and stacks that achieve efficiencies not otherwise possible. Understanding linked lists is a key first step to the study of algorithms and data structures.
For each of the three types, we consider APIs and sample client programs, then look at possible representations of the data type values and implementations of the data-type operations. This scenario repeats (with more complicated data structures) throughout this book. The implementations here are models of implementations later in the book and worthy of careful study.
As usual, we begin our discussion of abstract data types for collections by defining their APIs, shown below. Each contains a no-argument constructor, a method to add an item to the collection, a method to test whether the collection is empty, and a method that returns the size of the collection. Stack and Queue each have a method to remove a particular item from the collection. Beyond these basics, these APIs reflect two Java features that we will describe on the next few pages: generics and iterable collections.
Bags (Abstract Data Type)
Queues (Abstract Data Type)
Stacks (Abstract Data Type)
Trees (Abstract data type)
Graphs
- A collection of nodes and edges
- Each edge joins two nodes
- Edges can be directed or undirected
- Real-world “graph”:
- Road map
- Some uses:
- Tracking links between web pages
| Operations Provided | Efficiency |
|---|---|
| Find | O(n) |
| Add/remove | O(1) or O(n) or O(n2) Depends on implementation (time/space trade off) |
Networks
- Graph whose edges have numeric labels
- Examples (labels):
- Road map (mileage)
- Airline’s flight map (flying time)
- Plumbing system (gallons per minute)
- Computer network (bits/second)
- Famous problems:
- Shortest path
- Maximum flow
- Minimal spanning tree
- Traveling salesman
- Four-coloring problem for planar graphs
Implementing collections
To address the issue of implementing Bag, Stack and Queue, we begin with a simple classic implementation, then address improvements that lead us to implementations of the APIs articulated above.
See FixedCapacityStackOfStrings.java
Generics:
See FixedCapacityStack.java
Array resizing:
See ArrayResizing.java
Iteration:
See ReverseArrayIterator.java
The implementations of bags, queues, and stacks that support generics and iteration that we considered in this package provide a level of abstraction that allows us to write compact client programs that manipulate collections of objects. Detailed understanding of these ADTs is important as an introduction to the study of algorithms and data structures for three reasons.
- First, we use these data types as building blocks in higher-level data structures throughout this book.
- Second, they illustrate the interplay between data structures and algorithms and the challenge of simultaneously achieving natural performance goals that may conflict.
- Third, the focus of several of our implementations is on ADTs that support more powerful operations on collections of objects, and we use the implementations here as starting points.
We now have two ways to represent collections of objects, arrays and linked lists. Arrays are built in to Java; linked lists are easy to build with standard Java records. These two alternatives, often referred to as sequential allocation and linked allocation, are fundamental. Later in the book, we develop ADT implementations that combine and extend these basic structures in numerous ways. One important extension is to data structures with multiple links. For example, our focus in Sections 3.2 and 3.3 is on data structures known as binary trees that are built from nodes that each have two links. Another important extension is to compose data structures: we can have a bag of stacks, a queue of arrays, and so forth. For example, our focus in Chapter 4 is on graphs, which we represent as arrays of bags. It is very easy to define data structures of arbitrary complexity in this way: one important reason for our focus on abstract data types is an attempt to control such complexity.
Our treatment of BAGS, queues, and STACKS in this section is a prototypical example of the approach that we use throughout this book to describe data structures and algorithms. In approaching a new applications domain, we identify computational challenges and use data abstraction to address them, proceeding as follows:
- Specify an API.
- Develop client code with reference to specific applications.
- Describe a data structure (representation of the set of values) that can serve as
- the basis for the instance variables in a class that will implement an ADT that
- meets the specification in the API.
- Describe algorithms (approaches to implementing the set of operations) that
- can serve as the basis for implementing the instance methods in the class.
- Analyze the performance characteristics of the algorithms.
Often, one particular ADT and implementation is best for the problem
- Which ADT to use?
- It depends. How do you access your data? By position? By key? Do you need to iterate through it? Do you need the min/max?
- Which implementation to use?
- It also depends. How important is fast access vs fast add/remove? Does the data need to be ordered in any way? How much space do you have?
- But real life is often messier…