Data Structures — Containers for Data
After variables and functions, data structures are the next topic you meet most often when starting out in programming. The same data can make your code dramatically simpler or faster depending on which container you choose for it. This article covers nine commonly used data structures: what they are, how they behave, their time complexity, the standard implementations in each major language, and the spots where beginners often get confused.
1. About data structures
A data structure is a convention for how data is laid out in memory and how it is accessed. Most modern languages provide the core data structures via their standard library, so you rarely implement them by hand. Still, the more you know about what happens inside, the better you can pick the right tool.
The classic texts most often recommended for fundamental data structures are Introduction to Algorithms by Cormen, Leiserson, Rivest, and Stein (first edition 1990, abbreviated CLRS) and Robert Sedgewick's Algorithms (first edition 1983).
2. Array
A bundle of contiguous memory slots. Random access by index is fast.
| Operation | Time |
|---|---|
| Index access | O(1) |
| Append (amortized) | O(1) |
| Insert / delete in the middle | O(n) — everything behind shifts by one |
| Search (unsorted) | O(n) |
By language — JS Array · Python list · Java ArrayList · C++ std::vector.
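The costs in the table can be seen directly with Python's list, which is a dynamic array underneath (a minimal sketch; the values are arbitrary):

```python
# Dynamic-array behavior with Python's list.
arr = [10, 20, 30]

arr.append(40)     # amortized O(1): writes into spare capacity at the end
arr.insert(1, 15)  # O(n): every element after index 1 shifts right by one
x = arr[2]         # O(1) random access by index

print(arr)  # [10, 15, 20, 30, 40]
print(x)    # 20
```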
3. Linked list
Each node holds a reference to the next. Comes in singly, doubly, and circular variants.
| Operation | Time |
|---|---|
| Index access | O(n) |
| Insert at end / front (with pointer) | O(1) |
| Insert / delete in the middle (with node) | O(1) |
| Search | O(n) |
Compared to arrays, random access is slow and cache friendliness is poor, so dynamic arrays often win in modern environments. Still useful in spots like queues and LRU caches.
By language — Java LinkedList · C++ std::list. JS and Python ship no standard linked-list class; either roll your own or, in Python, reach for collections.deque, which plays a similar role.
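Rolling your own is short. Here is a minimal singly linked list sketch in Python; the names `Node` and `SinglyLinkedList` are illustrative, not any standard API:

```python
# A minimal singly linked list: each node points to the next.
class Node:
    def __init__(self, value):
        self.value = value
        self.next = None

class SinglyLinkedList:
    def __init__(self):
        self.head = None

    def push_front(self, value):  # O(1): only the head pointer changes
        node = Node(value)
        node.next = self.head
        self.head = node

    def find(self, value):        # O(n): walk node by node
        cur = self.head
        while cur is not None and cur.value != value:
            cur = cur.next
        return cur

lst = SinglyLinkedList()
for v in (3, 2, 1):
    lst.push_front(v)
# the list is now 1 -> 2 -> 3
```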
4. Hash table
A structure that derives an index by feeding a key into a hash function. Average O(1) lookup and insert.
| Operation | Average | Worst |
|---|---|---|
| Lookup | O(1) | O(n) — when all keys hash into one bucket |
| Insert | O(1) | O(n) |
| Delete | O(1) | O(n) |
Collision handling is usually chaining (linked list) or open addressing (probe the next empty slot). Most modern standard implementations are designed to keep good distribution even under collisions.
By language — JS Map · object ({}) · Python dict · set · Java HashMap · HashSet · C++ std::unordered_map.
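What dict and Map do underneath can be sketched with a toy chaining table. This is deliberately simplified — fixed bucket count, no resizing — to show the mechanism, not a real implementation:

```python
# A toy hash table with chaining: each bucket is a list of (key, value) pairs.
class ChainedHashTable:
    def __init__(self, buckets=8):
        self.buckets = [[] for _ in range(buckets)]

    def _bucket(self, key):
        # hash() picks the bucket; collisions land in the same list
        return self.buckets[hash(key) % len(self.buckets)]

    def put(self, key, value):   # average O(1); O(n) if one bucket fills up
        bucket = self._bucket(key)
        for i, (k, _) in enumerate(bucket):
            if k == key:
                bucket[i] = (key, value)  # overwrite existing key
                return
        bucket.append((key, value))

    def get(self, key):
        for k, v in self._bucket(key):
            if k == key:
                return v
        return None

t = ChainedHashTable()
t.put("a", 1)
t.put("a", 2)  # overwrites
```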
5. Stack
LIFO (Last In, First Out). Function calls, undo, depth-first search.
push(1) → [1]
push(2) → [1, 2]
push(3) → [1, 2, 3]
pop() → [1, 2], returns 3
Easy to implement with an array. JS array.push / array.pop · Python list.append / list.pop.
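A classic stack exercise is checking balanced brackets; a Python list with append/pop is all it takes (a sketch, with a hypothetical helper name `balanced`):

```python
# Balanced-bracket check: push openers, pop and match on closers.
def balanced(s):
    pairs = {")": "(", "]": "[", "}": "{"}
    stack = []
    for ch in s:
        if ch in "([{":
            stack.append(ch)                           # push
        elif ch in pairs:
            if not stack or stack.pop() != pairs[ch]:  # pop and compare
                return False
    return not stack  # balanced iff nothing is left over

print(balanced("([{}])"))  # True
print(balanced("(]"))      # False
```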
6. Queue
FIFO (First In, First Out). Job queue, breadth-first search.
enqueue(1) → [1]
enqueue(2) → [1, 2]
dequeue() → [2], returns 1
Removing from the front of an array is O(n), so it is inefficient for large queues. Python collections.deque and Java ArrayDeque give O(1) on both ends.
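In Python that looks like this — a minimal sketch using collections.deque so the dequeue is O(1):

```python
# A queue with collections.deque: O(1) at both ends.
from collections import deque

q = deque()
q.append(1)          # enqueue
q.append(2)
first = q.popleft()  # dequeue -> 1, O(1) (list.pop(0) would be O(n))

print(first)    # 1
print(list(q))  # [2]
```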
7. Tree
An acyclic graph where each node has zero or more children. Common variants:
- Binary Tree — at most two children.
- Binary Search Tree (BST) — smaller values on the left, larger on the right. Average lookup O(log n), worst O(n) (skewed tree).
- Self-balancing trees — AVL and red-black trees. Always O(log n).
- B-tree / B+tree — the de facto standard for DB indexes.
- Heap — parent is always greater than or equal to (max-heap) or less than or equal to (min-heap) its children. Foundation of priority queues. Insert and extract are both O(log n).
- Trie — strings share common prefixes. Autocomplete, dictionaries.
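The BST rule is small enough to sketch in a few lines. This is an unbalanced version, so inserting keys in sorted order degrades it to the O(n) skewed case mentioned above (function names are illustrative):

```python
# A minimal unbalanced BST: smaller keys go left, larger go right.
class BSTNode:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None

def bst_insert(root, key):
    if root is None:
        return BSTNode(key)
    if key < root.key:
        root.left = bst_insert(root.left, key)
    elif key > root.key:
        root.right = bst_insert(root.right, key)
    return root  # duplicates are ignored

def bst_contains(root, key):  # average O(log n), worst O(n)
    while root is not None:
        if key == root.key:
            return True
        root = root.left if key < root.key else root.right
    return False

root = None
for k in (5, 3, 8, 1):
    root = bst_insert(root, k)
```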
8. Graph
The general form of nodes and edges. Categorized by directed / undirected and whether edges are weighted.
Representations:
- Adjacency matrix — V × V. Good for dense graphs. Space O(V²).
- Adjacency list — a list of neighbors per node. Good for sparse graphs. Space O(V+E).
Notable algorithms — BFS · DFS · Dijkstra · Bellman-Ford · topological sort · MST (Kruskal · Prim).
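An adjacency list pairs naturally with BFS. A sketch with a plain dict and deque, on a small made-up undirected graph:

```python
# Adjacency list + breadth-first search.
from collections import deque

graph = {                # undirected, unweighted example graph
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A"],
    "D": ["B"],
}

def bfs_order(graph, start):
    seen = {start}
    order = []
    q = deque([start])       # the FIFO queue drives level-by-level visiting
    while q:
        node = q.popleft()
        order.append(node)
        for nxt in graph[node]:
            if nxt not in seen:
                seen.add(nxt)
                q.append(nxt)
    return order

print(bfs_order(graph, "A"))  # ['A', 'B', 'C', 'D']
```

Swapping the deque for a stack (pop from the end) turns the same loop into DFS.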
9. Time complexity at a glance
| Data structure | Access | Search | Insert | Delete |
|---|---|---|---|---|
| Array | O(1) | O(n) | O(n) | O(n) |
| Linked list | O(n) | O(n) | O(1)* | O(1)* |
| Hash table | — | O(1) avg | O(1) avg | O(1) avg |
| Binary search tree | O(log n) avg | O(log n) avg | O(log n) avg | O(log n) avg |
| Balanced tree (AVL · RB) | O(log n) | O(log n) | O(log n) | O(log n) |
| Heap | — | O(n) | O(log n) | O(log n) |
(* when you already hold a reference to the node)
10. Other paths
Specialized data structures:
- Bloom filter — probabilistic membership test. False positives allowed, no false negatives.
- Skip list — a probabilistic alternative to balanced trees. Used by Redis sorted sets.
- Segment tree · Fenwick tree (BIT) — range sum, range minimum, etc.
- Disjoint Set (Union-Find) — group merging. A building block of Kruskal's algorithm.
- LRU cache — combination of hash + doubly linked list.
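The hash + doubly linked list combination behind an LRU cache is exactly what Python's collections.OrderedDict packages up, so a sketch is short (the class name `LRUCache` is illustrative):

```python
# An LRU cache: O(1) lookup via the hash, O(1) recency updates via the list.
from collections import OrderedDict

class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.data = OrderedDict()

    def get(self, key):
        if key not in self.data:
            return None
        self.data.move_to_end(key)         # mark as most recently used
        return self.data[key]

    def put(self, key, value):
        if key in self.data:
            self.data.move_to_end(key)
        self.data[key] = value
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)  # evict least recently used

cache = LRUCache(2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")     # touch "a", so "b" is now least recently used
cache.put("c", 3)  # capacity exceeded: "b" is evicted
```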
11. Common shapes
```js
// JS
const arr = [1, 2, 3];
arr.push(4);
const map = new Map();
map.set("k", 1);
map.get("k");
map.has("k");
const set = new Set([1, 2, 2, 3]); // {1, 2, 3}
const stack = [];
stack.push(1); stack.push(2);
stack.pop();
```

```python
# Python
arr = [1, 2, 3]
arr.append(4)
d = {"k": 1}
"k" in d
s = {1, 2, 2, 3}
from collections import deque
q = deque([1, 2, 3])
q.append(4); q.popleft()
import heapq
h = []; heapq.heappush(h, 3); heapq.heappush(h, 1)
heapq.heappop(h)  # 1 (min heap)
```

```java
// Java
ArrayList<Integer> arr = new ArrayList<>();
arr.add(1);
HashMap<String, Integer> map = new HashMap<>();
map.put("k", 1);
HashSet<Integer> set = new HashSet<>();
ArrayDeque<Integer> queue = new ArrayDeque<>();
queue.offer(1); queue.poll();
PriorityQueue<Integer> pq = new PriorityQueue<>(); // min heap
TreeMap<String, Integer> sorted = new TreeMap<>(); // sorted keys
```
12. Common stumbles
JS object vs Map — {} coerces keys to strings (or symbols), can be polluted via the prototype chain, and orders integer-like keys ahead of insertion order, while Map takes keys of any type and guarantees pure insertion order. For data-style key-value storage, prefer Map.
Python list pop from front — list.pop(0) is O(n). Use deque for queues.
Java Vector · Hashtable — synchronized legacy. Use ArrayList and HashMap instead. For concurrency, ConcurrentHashMap.
Hash collision assumptions — when keys come from user input, intentional collision attacks (HashDoS) are possible. Most modern languages mitigate with random hash seeds, but the assumption is worth knowing.
Mixing up trees and hashes — "hash tables are fast, why bother with trees?" — when you need sorted traversal, range queries, or consistent worst-case time, trees win.
Memory — fast data structures cost more memory. Mobile and embedded shift the balance.
Closing thoughts
Data structures are containers for data. You rarely implement them yourself, but the more you know what happens inside, the better tools you can pick. Arrays, hash tables, trees (heap · BST · B-tree), and graphs cover 90% of daily work. The memory vs speed trade-off shifts with the environment (server / mobile / embedded).
Next
- big-o
- design-patterns
References — Introduction to Algorithms (CLRS) · Algorithms (Robert Sedgewick) · VisuAlgo · Data Structure Visualizations (University of San Francisco) · LeetCode · HackerRank · Baekjoon · NeetCode 150 · MDN JS Map · Set · Python collections module · Java Collections Framework.