Grouping image tokens is an intermediate step toward meaningful image representation and summarization. Token grouping is usually informed by perceptual cues such as Gestalt properties. However, such cues do not take into account structural continuities that can be derived from other tokens belonging to similar structures, irrespective of their location. We propose an image representation that encodes structural constraints emerging from local binary patterns (LBP) and thereby provides a long-distance measure of similarity in a structurally connected way. The representation groups pixels or larger image tokens without relying on numeric similarity measures and could therefore be extended to nonmetric spaces. It lends itself to common image processing applications such as connected component labeling and segmentation. We evaluate the representation on the perceptual grouping (segmentation) task using the popular Berkeley segmentation dataset (BSD500), where it achieves an average F-measure of 0.559 with respect to human-segmented images. The algorithm also achieves a high average recall of 0.787 and is therefore well suited to other applications such as object retrieval and category-independent object recognition. The proposed merging heuristic, based on levels of singular tree component, shows promising results on BSD500, where it currently ranks 12th among all benchmarked algorithms; unlike the others, it requires no data-driven training or specialized preprocessing.
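To make the two building blocks named above concrete, the following is a minimal sketch of a standard 8-neighbour LBP code followed by connected component labeling of pixels that share a code. It is not the representation or the merging heuristic proposed here, only an illustration under stated assumptions: the function names `lbp_codes` and `label_equal_codes`, the bit ordering, the zero bit for out-of-image neighbours, and the use of 4-connectivity are all choices made for this example. Note that the code only compares intensities by order and tests codes for equality; it never computes a numeric distance.

```python
from collections import deque

# Clockwise 8-neighbourhood offsets (row, col), starting at the top-left.
# The bit order is an assumption of this sketch.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_codes(image):
    """Return the 8-bit LBP code of every pixel of a 2D list of intensities.

    Neighbours that fall outside the image contribute a 0 bit (assumption).
    """
    rows, cols = len(image), len(image[0])
    codes = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            code = 0
            for bit, (dr, dc) in enumerate(OFFSETS):
                nr, nc = r + dr, c + dc
                inside = 0 <= nr < rows and 0 <= nc < cols
                # Only the ordering of the two intensities is used.
                if inside and image[nr][nc] >= image[r][c]:
                    code |= 1 << bit
            codes[r][c] = code
    return codes


def label_equal_codes(codes):
    """Label 4-connected components of pixels that share the same code.

    Returns (labels, number_of_components); labels start at 0.
    """
    rows, cols = len(codes), len(codes[0])
    labels = [[-1] * cols for _ in range(rows)]
    next_label = 0
    for r in range(rows):
        for c in range(cols):
            if labels[r][c] != -1:
                continue
            # Breadth-first flood fill from an unlabeled seed pixel.
            labels[r][c] = next_label
            queue = deque([(r, c)])
            while queue:
                cr, cc = queue.popleft()
                for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    nr, nc = cr + dr, cc + dc
                    if (0 <= nr < rows and 0 <= nc < cols
                            and labels[nr][nc] == -1
                            and codes[nr][nc] == codes[cr][cc]):
                        labels[nr][nc] = next_label
                        queue.append((nr, nc))
            next_label += 1
    return labels, next_label
```

Calling `label_equal_codes(lbp_codes(image))` on a grayscale image stored as a list of rows yields one label per pixel; on real images this tends to oversegment heavily, which is why a subsequent merging step is needed.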