Binary Trees 101
An introduction to binary trees.
I was watching a friend stream the other day, and they were prepping for an upcoming interview. The topic being studied? Binary trees.
This got me thinking about the role binary trees play in today”s programming landscape. They”re no longer just academic exercises; they”ve become something of a rite of passage, popularized by coding challenge platforms like LeetCode and Codewars.
In reality, most developers won”t build a red-black tree every day on the job. Still, understanding their structures, properties, and traversal methods pays dividends across parsing, priority queues, file-system hierarchies, and beyond.
What is a Binary Tree?#
A binary tree is a non-linear and hierarchical data structure in which each node has at most two children, commonly called left and right. Unlike linear data structures (arrays/lists, linked lists, stacks, queues), binary trees organise data in a tree-like structure that allows for efficient searching, insertion, and deletion operations.
- Node: Holds a value (data)
- Root: The topmost node (entry point)
- Edge: The link between any two nodes
- Child: A node reachable directly from another node
- Leaf: A node with no children
- Subtree: Any node plus all its descendants
- Height of a Node: Number of edges from the node to the deepest leaf
- Depth of a Node: Number of edges from the root to the node.
- Height of a Tree: Height of the root node or the depth of the deepest node.
Root (A), two children (B,C), leaves (D,E,F).
Representation#
Before we take a look at the many types and algorithms for binary trees, it”s important to understand how they are represented in code. The representation you choose can have a major impact on memory usage, performance, and which algorithms are practical.
There are two common ways to represent a binary tree in practice:
- Pointer (Node) Representation:
Each node is an object or struct, and has explicit references (pointers) to its left and right child nodes. - Array (Implicit) Representation:
The tree is mapped onto an array or list, with parent/child relationships determined by index calculations. This approach is especially efficient for “complete” binary trees, such as heaps.
Let”s see how these look.
Pointer#
Each node contains a value, and pointers or references to its left and right children.
This is the most general and flexible way to represent binary trees of any shape. It is also easy to insert, remove, or restructure the tree.
The trade off is that extra memory is used for pointers and may be slower for large trees.
Array#
A binary tree can also be stored as an array, where for any node at index i:
- The left child is at 2i + 1
- The right child is at 2i + 2
This is especially efficient for complete binary trees (e.g. heaps). Array representation does not require pointers, so memory is saved, and are cache friendly.
They are not so good for sparse, skewed, or highly unbalanced trees. They are also hard to restructure dynamically.
Which representation to use?#
- Use a pointer for general purpose trees or when you know the tree can be irregular.
- Use an array when you know the tree will always be complete or balanced.
Types of Binary Tree#
By Shape#
The shape of a binary tree has a major impact on performance and on which algorithms apply:
Shape | Definition | Example Use |
---|---|---|
Full | Every node has 0 or 2 children (no nodes with just one child). | Some balanced tree invariants |
Complete | All levels are completely filled except possibly the last, which is filled left to right. | Heaps (priority queues) |
Perfect | All internal nodes have 2 children; all leaves are at the same depth/level. | Theoretical analyses |
Skewed | Every node has only one child; all left (left-skewed) or all right (right-skewed). | Worst-case BST degeneracy |
Degenerate | Every node has only one child; can be left or right. | Linked lists; unbalanced BSTs |
Balanced | Heights of left/right subtrees at every node differ by at most 1 (or a constant). | AVL, Red-Black, fast search |
Note: Trees can satisfy multiple properties (e.g. a perfect tree is also full, complete, and balanced).
By Property or Use#
Binary trees are also classified by the properties of their nodes or how they are used in algorithms:
Property/Type | Definition | Example Use |
---|---|---|
Binary Search Tree (BST) | For each node, all values in the left subtree are less than the node, and all values in the right subtree are greater. Fast search, insert, delete. | Dictionary/map, database |
AVL Tree | Self-balancing BST (height difference at most 1 at each node). | Faster, guaranteed search |
Red-Black Tree | Self-balancing BST with red/black color rules. | Sets/maps in C++, Java |
Threaded Tree | Unused pointers point to in-order predecessor/successor for fast traversal. | Fast in-order traversal |
Heap | Complete tree with heap property (min or max at the root). | Priority queues, heapsort |
Expression Tree | Internal nodes are operators, leaves are operands. | Arithmetic evaluation |
Huffman Tree | Binary tree for optimal prefix code (compression). | Compression algorithms |
Basic Properties#
When working with binary trees, there are two fundamental properties you’ll regularly compute:
Size#
The size of a binary tree is simply the total number of nodes it contains. A straightforward way to compute this is using recursion, counting the current node plus all nodes in its left and right subtrees:
Height#
The height of a binary tree is defined as the length of the longest path from the root down to a leaf. The height can be measured either by counting nodes or edges. Most commonly, nodes are counted. Here’s a simple recursive implementation:
These basic properties serve as the foundation for more advanced tree operations and analyses you’ll encounter.
Other Useful Properties#
-
Depth (of a node): Number of edges from the root node to a given node (root has depth 0).
-
Level: Nodes at the same depth belong to the same level (root at level 0).
-
Width: Maximum number of nodes at any single level of the tree.
-
Diameter: Length of the longest path between any two nodes (doesn’t necessarily pass through the root).
- Root: A (level 0, depth 0)
- Children: B, C (level 1)
- Leaves: D, E, F (level 2)
- Width: 3 nodes at the lowest level (D, E, F)
- Diameter: Longest path is D → B → A → C → F (4 edges)
Practical Analogy: File-System Structure#
One of the best practical analogies is your computer’s file-system hierarchy.
- Directories are nodes that can contain both files and other folders.
- Files are leaves; they cannot contain anything else.
While actual directories can contain many children, we can still model this structure using a binary tree. The classic way to do this is called the left-child, right-sibling representation.
Left-Child, Right-Sibling Representation#
When a directory contains more than two entries (files or subdirectories), we use a trick:
The left child points to the first item (file or directory) inside the directory.
The right child points to the next sibling at the same directory level.
So, you’d traverse left from the parent for the first child, then follow right children to reach the rest.
What’s Next?#
Now that you’ve got a solid grasp on binary tree shapes, properties, and representations, you’re ready for the next step: tree traversals.
In the upcoming post, we’ll explore:
- Recursive vs iterative approaches
- Pre-order, in-order, post-order, and level-order traversals
- When, and why, each traversal shines
If you’re comfortable with nodes and tree structures, you’ll breeze through traversal patterns and see how these foundational concepts play out in real algorithms.
Stay tuned for Part 2: Tree Traversals!