Introduction

Block construction tasks, such as the original Kohs Block Design Test (Kohs, 1923) and numerous generations of the Block Design subtests of the Wechsler cognitive batteries for adults and children, including the Wechsler Adult Intelligence Scale (4th ed.; WAIS-IV; Wechsler, 2008) and the Wechsler Intelligence Scale for Children (5th ed.; WISC-V; Wechsler, 2014), have long played a role in neuropsychological and psychoeducational assessment of visual-spatial abilities and visual-motor integration. The WAIS-IV and WISC-V Block Design subtests use 1-in. plastic cubes with two red sides, two white sides, and two sides that are diagonally divided, half red and half white (see Fig. 1). Four or nine blocks are arranged to produce a 2 × 2 or 3 × 3 design depicted in a stimulus book. Design construction is timed in the standardized administration, but time limits may be adjusted to explore the contribution of motor speed or other factors to test performance (Flanagan & Alfonso, 2017; Kaufman & Lichtenberger, 1999; Sattler, 2008; Sattler & Ryan, 2009). Psychometric properties of block design tasks make them appealing to clinicians and researchers alike. For example, WAIS-IV Block Design is highly reliable (rxx = .87), loads highly on g (r = .73), and retains ample or adequate specificity at all age ranges (Sattler, 2008).

Fig. 1

Block Design task

Stimulus parameters that contribute to block design performance have long been well documented. These include perceptual cohesiveness, set size uncertainty, and partial components (i.e., local gestalts). Each is described below and illustrated in Fig. 2.

Fig. 2

Known Block Design stimulus parameters: Response uncertainty (a), perceptual cohesiveness (b), and partial components (c)

Uncertainty (U)

Informational uncertainty (U) has been defined as the sheer amount of information (number of “bits”) in a display, which is determined by the total number of its possible variants (Royer, 1977; Royer et al., 1984). Consider a 3 × 3 block design, as depicted in Fig. 2a, with four solid block elements (all red or all white) and five half-red elements. Choice of any solid block is constrained by the fact that only solid red and solid white faces are available; therefore, the number of binary choices, or “bits” of information, represented by this choice is log2 (two possible choices) = 1 bit. Choice of a half-red block is constrained by the number of orientations possible with this type of block face: red in the upper left, upper right, lower left, or lower right. Thus, the information represented by this type of block element is equal to log2 (four alternatives), or 2 bits of information. The number of bits may also be thought of as the number of sequential binary choices necessary to completely define the display in question. In Fig. 2a, the sum of the four 1-bit elements and the five 2-bit elements is 4(1) + 5(2) = 14 bits, meaning the entire design may be completely defined by 14 binary decisions. Uncertainty, then, may be thought of as the base-2 logarithm of the inverse of the probability of selecting the given display at random from among all of its possible (2^14) variants: the larger the uncertainty, the greater the amount of information that must be defined in order to reproduce the display.
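The bit-counting rule described above can be sketched in a few lines of Python. This is only an illustration: the face labels (`"R"`, `"W"`, `"UL"`, `"UR"`, `"LL"`, `"LR"`) and the example layout are hypothetical conventions introduced here, and the layout matches the element counts of the worked example rather than the actual design in Fig. 2a.

```python
from math import log2

# Alternatives per face type, following the text: a solid face is one of two
# colors; a half-red face has four possible positions for its red corner.
# The string labels are hypothetical and chosen only for this sketch.
ALTERNATIVES = {"R": 2, "W": 2, "UL": 4, "UR": 4, "LL": 4, "LR": 4}


def uncertainty_bits(design):
    """Sum log2(number of alternatives) over every block face in the design."""
    return sum(log2(ALTERNATIVES[face]) for row in design for face in row)


# Hypothetical layout with four solid and five half-red elements
# (same counts as the example in the text, not the actual Fig. 2a design).
example = [
    ["R", "UL", "W"],
    ["UR", "LL", "LR"],
    ["W", "UL", "R"],
]

print(uncertainty_bits(example))  # 4(1) + 5(2) = 14.0 bits
```

Under the same rule, a 3 × 3 design built only from half-red faces reaches the maximum of 9 × 2 = 18 bits.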

Perceptual cohesiveness (PC)

The block design task has long been conceptualized as a measure of the solver’s capacity for visual analysis and synthesis (Kaufman, 1990; Sattler, 2008). To select the correct block face, the solver must first visually isolate one block element from the rest of the design, and then rotate the block and place it appropriately. If the edges of the target block element are obscured by an abutting same-colored block (see Fig. 2b), segregating the target element becomes more effortful. Further, lacking the capacity for visual analysis renders the task nearly impossible. It has long been established that designs with more coherent edges (CEs, or the number of same-colored abutting edges in a design) tend to produce higher error rates and longer latencies when copied with blocks (Akshoomoff & Stiles, 1996; Royer, 1984; Royer et al., 1984; Royer & Weitzel, 1977). Further evidence is provided by the effects of “cueing” cohesive edges. When a design with high “perceptual cohesiveness” (PC; i.e., with numerous CEs) is overlaid with a grid corresponding to the edges of constituent block elements, effects of PC attenuate (Akshoomoff & Stiles, 1996). This effect manifests as an interaction between PC and cueing in studies that manipulate both variables (e.g., Miller & Skillman, 2008).
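Counting coherent edges can be illustrated with a small Python sketch. It assumes one plausible operationalization of "same-colored abutting edges": each face is encoded by the color along its four sides, and an interior boundary counts as a CE when the two sides that meet share a color. The labels and the edge encoding are assumptions made for illustration, not the scoring procedure of the cited studies.

```python
# Edge colors (top, right, bottom, left) for each hypothetical face label.
# "UL" means the red triangle fills the upper-left corner, so the top and
# left edges are red ("r") and the right and bottom edges are white ("w").
EDGES = {
    "R":  ("r", "r", "r", "r"),
    "W":  ("w", "w", "w", "w"),
    "UL": ("r", "w", "w", "r"),
    "UR": ("r", "r", "w", "w"),
    "LR": ("w", "r", "r", "w"),
    "LL": ("w", "w", "r", "r"),
}
TOP, RIGHT, BOTTOM, LEFT = range(4)


def coherent_edges(design):
    """Count interior boundaries where the two abutting edges share a color."""
    rows, cols = len(design), len(design[0])
    count = 0
    for r in range(rows):
        for c in range(cols):
            here = EDGES[design[r][c]]
            # Compare with the right-hand neighbor, if there is one.
            if c + 1 < cols and here[RIGHT] == EDGES[design[r][c + 1]][LEFT]:
                count += 1
            # Compare with the neighbor below, if there is one.
            if r + 1 < rows and here[BOTTOM] == EDGES[design[r + 1][c]][TOP]:
                count += 1
    return count


# A 2 x 2 red "diamond": every interior boundary joins two red edges.
diamond = [["LR", "LL"], ["UR", "UL"]]
print(coherent_edges(diamond))  # 4
```

Under this encoding, a grid of identically oriented half-red faces has no coherent edges, because a white edge always meets a red one.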

Partial components

Rozencwajg (1991) and Rozencwajg and Corroyer (2001) have explored the effects of multiblock subcomponents of block design stimuli (e.g., two-block “trapezoids” or four-block “diamonds”; see Fig. 2c) on test performance. These researchers demonstrated that the presence of visually salient subcomponents (e.g., stripes, squares, diamonds, chevrons) may guide the construction strategies of the most adept solvers, who tend to construct these smaller design elements first and then integrate them into the larger construction.

Apart from partial components, little research on block design has focused on differences in “appearance” as stimulus-based determinants of performance. Both U and PC are operationalized independently of design appearance (i.e., designs which “look” very different may be identical with respect to U and PC). Partial components are defined by same-colored regions, meaning PC correlates with the presence or absence of partial components. As currently defined, partial components are absent where PC is nil. However, it is possible to identify “shapes” within block design stimuli that involve no coherent edges. For example, Design M (see Fig. 3), with no coherent edges (PC = 0), includes two shapes (upper left and lower right) that might be described as “pinwheels.” These distinctive shapes seem to “pop out” based on their symmetry rather than the contours suggested by coherent regions of red or white.

Fig. 3

Designs with no perceptual cohesiveness (PC = 0) and high uncertainty (U = 18 bits). PC is defined as the number of same-colored abutting edges, or coherent edges (CE). Uncertainty is the sum of the log2 of the number of possible placement choices for each of the blocks in the design. Multiple designs were generated by rotating the seven prototypes

An unexplored subset of block designs, those with no coherent edges (PC = 0) and only half-red block elements (maximum U), may provide the means for identifying other parameters impacting performance. By holding PC and U constant, we would expect all mean performance differences between designs to disappear if the current set of parameters is adequate to fully explain stimulus demands of the task. However, if different designs with identical U and PC elicit different test performance, it would be reasonable to conclude that parameters other than PC and U influence task difficulty. Further, with PC minimized, cueing should have little impact on performance, unless cueing serves a role beyond that of assisting the visual analysis process (i.e., counteracting effects of PC). Therefore, if edge cueing elicits different performance where there are no coherent edges, it would be reasonable to conclude that cueing does something beyond the disambiguation of block edges.

Methods

Apparatus

To determine whether U, PC, and partial components comprehensively explain block design performance, we developed a set of 3 × 3 block designs that were identical with respect to U and PC, and precluded partial components (see Fig. 3). Seven nine-block designs were developed with high uncertainty (18 bits), low perceptual cohesiveness (no coherent edges), and, therefore, no partial components as defined by Rozencwajg (1991). Using all possible rotational variants of these initial seven designs, 28 designs were generated (see Fig. 3). To assess the impact of edge cueing, a duplicate set of these 28 designs was developed with a grid overlay, for a total of 56 designs. All designs were printed 3 × 3 in black and red on laminated cards, and presented in a binder. Blocks were 1-in. red and white plastic cubes, identical to those used in the Wechsler tests.
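Generating the rotational variants of a prototype can be sketched as follows. The sketch assumes a hypothetical encoding in which each half-red face is labeled by the corner its red triangle occupies; rotating a design 90° clockwise then moves every block to its new grid position and advances each red corner by one position. The prototype shown is invented for illustration and is not one of the seven study designs.

```python
# Rotating a face 90 degrees clockwise moves its red corner to the next
# corner; solid faces are unchanged. Labels are hypothetical.
ROTATE_FACE = {"UL": "UR", "UR": "LR", "LR": "LL", "LL": "UL", "R": "R", "W": "W"}


def rotate_cw(design):
    """Rotate the grid 90 degrees clockwise and turn each face with it."""
    # zip(*design[::-1]) yields the rows of the clockwise-rotated grid.
    return [[ROTATE_FACE[face] for face in new_row] for new_row in zip(*design[::-1])]


def rotational_variants(design):
    """Return the design at 0, 90, 180, and 270 degrees of clockwise rotation."""
    variants = [design]
    for _ in range(3):
        variants.append(rotate_cw(variants[-1]))
    return variants


# Invented prototype using only half-red faces (not a design from Fig. 3).
prototype = [
    ["UL", "UR", "LL"],
    ["LR", "UL", "UR"],
    ["LL", "LR", "UL"],
]

variants = rotational_variants(prototype)
print(len(variants))  # 4 variants per prototype; 7 prototypes give 28 designs
```

A design with rotational symmetry would produce duplicate variants under this procedure, so a count of four distinct variants per prototype holds only when no prototype maps onto itself under rotation.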

Participants

As part of a larger study of block-design performance, either cued or uncued designs were administered to 65 undergraduates at a large Midwestern university. Assignment to the cued or uncued condition was random. Participants were screened for history of cognitive impairment, history of neurological insult, and visual-motor impairments. Gender was balanced across groups.

The number of participants was considered more than adequate for detecting main effects of the repeated-measures factors of the analysis of variance (ANOVA) described below (i.e., seven different designs, four different rotations). G*Power 3 (https://download.cnet.com/G-Power/3000-2054_4-10647044.html) indicated that for an a priori effect of η² = .10 (Cohen’s f ≈ .33), 20 participants would be sufficient to reach or exceed power = .80 with a Type I error rate of .01, for both the seven-level and four-level repeated-measures factors.
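The correspondence between the two effect-size metrics quoted above follows from the standard conversion f = sqrt(η² / (1 − η²)). A minimal Python check of that arithmetic is given below; the function name is introduced here for illustration, and the sketch does not reproduce the G*Power sample-size computation itself.

```python
from math import sqrt


def cohens_f_from_eta_squared(eta_squared):
    """Convert eta-squared to Cohen's f using f = sqrt(eta2 / (1 - eta2))."""
    return sqrt(eta_squared / (1.0 - eta_squared))


print(round(cohens_f_from_eta_squared(0.10), 2))  # 0.33
```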

Procedure

Participants were randomly assigned to complete either the cued (N = 31) or uncued (N = 34) designs, which were presented in random order. Participants were tested individually by either the lead author or an undergraduate researcher trained by the lead author. Participants were introduced to the blocks and the printed design models, and instructed regarding construction. Each participant constructed one sample 3 × 3 block design to verify comprehension of the instructions. Participants were told to work as quickly and accurately as possible and to alert the examiner when they had completed a design. Latency for a correct assembly was recorded (i.e., the participant was asked to keep working if their finished construction was incorrect). Timing began as soon as the examiner turned the page to a new design, and stopped once the participant declared they had finished and the design was correctly constructed. These data were log-transformed to correct for significant skew (Tabachnick & Fidell, 2007).

Results

A mixed-design 2 × 7 × 4 ANOVA was performed with between-subjects factor of cueing condition (cued vs. uncued) and within-subjects factors of design (seven different designs displayed in columns of Fig. 3) and rotation (four rotational variants per design; rows of Fig. 3). The log-transformed latency scores served as DV. Results appear in Table 1.

Table 1 Source table for ANOVA used to assess effects of design appearance across seven high U, low PC designs with four rotational variants each (see Fig. 3)

Main effects of cueing condition were nonsignificant, consistent with the absence of same-colored abutting edges in the designs. Main effects of design type, however, were significant, F(6, 378) = 12.252, p < .001, while main effects of rotation were not, F(3, 162) = 0.682, p = .564. Though main effects of rotation were not significant, a significant Design × Rotation interaction was observed, F(18, 1134) = 1.777, p = .023. No other main effects or interactions were significant.

Post hoc Tukey HSD tests were performed, all with α = .01. Tests comparing design means (seven means, N = 65) revealed that the main effect of design was due to the significant (p < .01) difference between the means for Designs W and M. For the Design × Rotation interaction, Tukey tests (four means, N = 65) indicated that the interaction was due solely to the variation among the rotation means for Design X (see Fig. 4). That is, unlike the other six designs, Design X elicited significant (p < .01) performance differences when rotated. Best performance occurred at 0° and 180° for this design.

Fig. 4

Estimated marginal means from the model. Each line represents one of the seven designs. Markers, from left to right, represent mean performance at 0, 90, 180, and 270 degrees of rotation. Differences between rotational means are significant (p < .01) only for Design X, and mean latencies are significantly (p < .01) different only when Designs W and M are compared

Post hoc analyses

The prior analysis indicated that, despite the matching of the seven designs (and their rotational variants) for PC and U, performance differences were elicited on the basis of the designs’ appearance. We therefore hypothesized possible additional stimulus parameters which might account for the observed effects.

Inspection of the seven designs suggested two possible sources of performance difference: (a) the presence of local and global symmetries within some of the designs and (b) coherent regions that may have served to reduce overall uncertainty by introducing informational redundancy.

Local and global symmetry

Garner and Clement (1963) defined symmetry in visual dot-matrix patterns by quantifying their rotation and reflection set sizes, that is, the number of unique patterns that could be produced by a set of geometric transformations (e.g., rotation or reflection) of a pattern. This was refined by Palmer (1991), who identified symmetry subtypes (e.g., right diagonal axial symmetry, 90-degree rotational symmetry) by the type and number of geometric transformations that leave the original pattern unchanged. Palmer demonstrated that these symmetry subtypes, when measured at the local or global level, corresponded to raters’ mean assessments of the pattern’s “figural goodness,” defined as the extent to which the observed pattern is “regular, orderly, and simple.” Figural goodness is of interest because “good” figures are more easily matched, learned, remembered, and verbally described (Palmer, 1999).

While Royer (1977) explored the effects of symmetry in his block designs, there are two limitations to his approach. First, designs were constructed based on Garner and Clement’s (1963) dot stimuli, and therefore varied by their reflection and rotation subset size. As a result, only global symmetries varied, and local regions of symmetry were ignored. Second, groups of different rotation and reflection subset sizes also varied with respect to their mean perceptual cohesiveness, which may have accounted for the significant linear effects found. The designs used in the current study account for both local and global symmetries, and control for the potential effects of PC.

Palmer’s (1991) model for the additive contribution of various symmetry types at the local or global level to judgments of figural goodness is defined by:

$$S = k + \sum_{o=1}^{n} w_{o} \cdot r_{o},$$
(1)

where S is the individual’s subjective rating of figural goodness, k is a constant, w is the relative importance (weight) of a particular symmetry subtype, and r is the radius of the field of symmetry. When applied to a 3 × 3 block design (see Fig. 5), r is constrained by the circular radii possible with 2 × 2 or 3 × 3 block combinations (i.e., only square groupings of blocks accommodate a circular area, πr², and only two sizes of square groupings of blocks are possible). As a result, there are only five possible locations for the symmetry subtypes described by Palmer: upper left (r1), upper right (r2), lower left (r3), lower right (r4), and the entire display (rg). Because radius rg is 1.5 times as long as the other four radii, Equation 1 predicts that symmetries defined within this radius will be proportionately more influential.

Fig. 5

Loci of potential symmetry in 3 × 3 block designs

Palmer (1991) has also estimated the relative weights (w in Equation 1) of specific symmetry subtypes. It is therefore possible to identify local and global symmetries within each of the seven block designs (see Fig. 6) and code each location by the weight of symmetry at each location; the sum should provide a rough estimate of the subjective impact of symmetry in block designs.

Fig. 6

Examples of local symmetry subtypes appearing in designs and their mean figural goodness ratings (in parentheses) found by Palmer (1991) and adapted to the current block designs. R = right-diagonal axial symmetry; L = left diagonal axial symmetry; R-L = right and left diagonal axial symmetry; C4 = 90-degree, 180-degree, and 270-degree rotational symmetry; I = identity (no symmetry). Global symmetries were either I, R, or L

Coherent regions and uncertainty

Some block designs are trivially simple to construct. Take, for example, a 3 × 3 design with no PC (CE = 0) and high uncertainty (U = 18), but with all of the half-red constituent blocks identically oriented (e.g., the red triangle is in the upper right for all nine elements of the design). Based on the previous definition of U (i.e., summation of the uncertainty for each block type), this design should have a high uncertainty. However, the solution is trivial because once the orientation of one element has been identified, the orientation of the other elements is redundant. In effect, this design has U = 2, rather than U = 18, because the identity of all the blocks may be determined by choosing the orientation of one block.

While none of the designs depicted in Fig. 3 fit this description, many of them incorporate regions in which block elements are identical. For example, Design M2 has four identical elements at upper right. We might consider this coherent region of 2-bit elements as representing 2 bits overall, since knowing the identity of one of the four blocks renders the identity of the other three redundant, thereby reducing the overall uncertainty of the design. By defining coherent regions as two or more adjacent and identical blocks, three regions may be found for Design M2: the four-block region already identified, and two 2-block regions at lower right and upper left. Summing the reduced uncertainties, based on these coherent regions, yields a “Least-U” score of 2 (four blocks in the upper right) + 2 (two blocks at lower right) + 2 (two blocks at upper left) + 2 (the lone block in the lower left corner) = 8 bits (cf. the uncertainty as defined by Royer, 1977, which is 18 bits). The Least-U estimate reflects the relative simplicity of this design, given the informational redundancy of many of its parts.
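The Least-U idea can be sketched as a connected-region count. The Python below assumes that "adjacent" means sharing an edge (not a corner) and that each region of identical faces, including a lone block, contributes the bits of a single element. The face labels and the example grid are hypothetical: the grid only mirrors the verbal description above (a four-block region at upper right, two-block regions at lower right and upper left, and a lone block at lower left) and is not the actual Design M2.

```python
from math import log2

# Alternatives per hypothetical face label: 2 for solid faces, 4 for half-red.
ALTERNATIVES = {"R": 2, "W": 2, "UL": 4, "UR": 4, "LL": 4, "LR": 4}


def coherent_regions(design):
    """Group edge-adjacent cells with identical faces (flood fill)."""
    rows, cols = len(design), len(design[0])
    seen, regions = set(), []
    for r in range(rows):
        for c in range(cols):
            if (r, c) in seen:
                continue
            face, stack, cells = design[r][c], [(r, c)], []
            seen.add((r, c))
            while stack:
                y, x = stack.pop()
                cells.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and (ny, nx) not in seen and design[ny][nx] == face:
                        seen.add((ny, nx))
                        stack.append((ny, nx))
            regions.append((face, cells))
    return regions


def least_u_bits(design):
    """Count each region of identical faces once, at the bits of one element."""
    return sum(log2(ALTERNATIVES[face]) for face, _ in coherent_regions(design))


# Hypothetical grid matching the region sizes described in the text.
example = [
    ["UL", "UR", "UR"],
    ["UL", "UR", "UR"],
    ["LR", "LL", "LL"],
]

print(least_u_bits(example))  # four regions x 2 bits = 8.0
```

Applied to the trivial design in which all nine half-red blocks share one orientation, the same function returns 2 bits, in line with the reasoning in the preceding paragraph.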

Exploratory analysis of coherent region and local/global symmetry effects

Each of the seven designs and their rotational variants was recoded for (a) uncertainty corrected for coherent regions (“Least-U”), and (b) local/global symmetry as defined by Palmer. For symmetry scores, mean ratings of figural goodness (Palmer, 1991; see Fig. 6) were assigned to each of five locations within a design (rg, r1, r2, r3, r4), and summed. For symmetries at the rg location, the assigned score was multiplied by 1.5, reflecting the larger radius at the global level. The coded scores for Least-U and local-global symmetry are displayed in Table 2.
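The symmetry coding just described reduces to a weighted sum, sketched below in Python. The rating values in the usage line are placeholders invented for illustration; the actual mean figural goodness ratings are those reported in Fig. 6, and the function name and argument layout are assumptions of this sketch.

```python
def total_symmetry(local_ratings, global_rating, global_weight=1.5):
    """Sum the four local ratings (r1..r4) plus the weighted global rating (rg)."""
    if len(local_ratings) != 4:
        raise ValueError("expected ratings for the four local loci r1..r4")
    return sum(local_ratings) + global_weight * global_rating


# Placeholder ratings only; substitute the values from Fig. 6 for real coding.
print(total_symmetry([2.0, 3.0, 2.0, 3.0], 4.0))  # 10.0 + 1.5 * 4.0 = 16.0
```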

Table 2 Examples of calculated total symmetry and Least-U scores for the seven designs (scores for rotational variants not shown)

A hierarchical regression analysis was performed to explore the unique contribution of local and global symmetries to block design performance, relative to the unique contributions of coherent regions and cueing effects. Mean latency scores for each design served as the DV in the regression, while the total symmetry score and the Least-U score (uncertainty corrected for coherent regions) served as IVs, along with the cueing condition of the design. Cueing condition was entered in Step 1, with cueing and Least-U entered together in Step 2. To determine the effects of symmetry above and beyond those of cueing and Least-U, symmetry was entered last. Controlling for effects of Least-U was necessary given the observed correlation between symmetry and Least-U (r = .82). Cueing effects were included in the analysis because of the observed differences in mean block design performance elicited by cued and uncued designs, t(54) = 2.38, p = .022.
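The logic of entering predictors in steps and tracking the change in R² can be illustrated with a small, dependency-free Python sketch. It fits ordinary least squares through the normal equations and is not the statistical software used for the reported analysis. All data values below are synthetic toy numbers invented to make the example run; they are not the study's latencies, Least-U scores, or symmetry scores.

```python
def ols_r_squared(predictors, y):
    """R^2 of an OLS fit with intercept; predictors is a list of columns."""
    n = len(y)
    X = [[1.0] + [col[i] for col in predictors] for i in range(n)]
    k = len(X[0])
    # Augmented normal equations [X'X | X'y].
    A = [[sum(X[i][a] * X[i][b] for i in range(n)) for b in range(k)]
         + [sum(X[i][a] * y[i] for i in range(n))] for a in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda row: abs(A[row][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for row in range(k):
            if row != col:
                factor = A[row][col] / A[col][col]
                A[row] = [v - factor * p for v, p in zip(A[row], A[col])]
    beta = [A[i][k] / A[i][i] for i in range(k)]
    fitted = [sum(b * x for b, x in zip(beta, row)) for row in X]
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def hierarchical_r2(steps, y):
    """steps: ordered (label, column) pairs; predictors accumulate by step."""
    results, entered, previous = [], [], 0.0
    for label, column in steps:
        entered.append(column)
        r2 = ols_r_squared(entered, y)
        results.append((label, r2, r2 - previous))  # (label, R^2, delta R^2)
        previous = r2
    return results


# Synthetic toy data (NOT the study's data): 4 uncued and 4 cued designs.
cue = [0, 0, 0, 0, 1, 1, 1, 1]
least_u = [4, 8, 12, 18, 4, 8, 12, 18]
symmetry = [3, 10, 20, 30, 5, 9, 22, 28]
latency = [3.1, 4.9, 7.2, 9.8, 5.0, 7.1, 8.8, 12.1]

steps = [("cueing", cue), ("Least-U", least_u), ("symmetry", symmetry)]
for label, r2, delta in hierarchical_r2(steps, latency):
    print(f"{label}: R2 = {r2:.3f}, change = {delta:.3f}")
```

The quantity of interest in the final step is the change in R² when symmetry is added after cueing and Least-U; a change near zero would correspond to symmetry adding no predictive power beyond the earlier predictors.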

As indicated in Table 3, the model including both Least-U and cueing condition yielded good prediction of mean performance scores (R² = .449). However, the addition of symmetry scores to the model yielded no improvement in prediction. Further, symmetry was the only predictor yielding nonsignificant β weights, while all other predictors were significant (p < .05) in all models. These data suggest that while coherent regions contributed to performance differences, symmetry, based on previous estimates of figural goodness, did not add any predictive power distinct from the contribution of Least-U. Figure 7 displays best-fit lines for cued and uncued designs when Least-U is the only parameter in the model. Because cueing was a significant predictor in the regression analysis, separate fit lines are displayed for cued and uncued conditions (28 cases each).

Table 3 Comparison of three linear models; DV is the mean latency for each of 28 cued and 28 uncued designs with no PC and high U

Fig. 7

Two slopes, one for edge-cued designs (solid line) and one for uncued designs (dotted line), with Least-U, based on coherent regions, as IV and mean latency scores for 28 designs as DV. In the cued designs, lines were superimposed to indicate where block edges should be in the assembled design

Discussion

In the current study, we sought evidence of unidentified stimulus-based performance differences in block design testing. Performance differences were observed even when perceptual cohesiveness and uncertainty were held constant (main effects of design). Moreover, these performance differences cannot be explained by the presence of partial components, since perceptual cohesiveness was nil, and no partial components, as defined by Rozencwajg (1991), were possible.

Coherent regions

Post hoc analyses revealed that main effects of design type were due to the significant differences between Designs W and M. Referencing Table 2, Design W has the smallest Least-U score of all seven designs (4), while M has the highest (18). That is, when coherent regions are accounted for, Design M contains the most nonredundant information, while W contains the least. Similarly, the symmetry score for Design M is the highest among all designs (30.1), while the symmetry score for W (10.1) is the lowest, except for Design X (3.3).

The regression analysis indicated that symmetry score does not aid in prediction of performance above and beyond Least-U based on coherent regions. These analyses suggest the apparent performance differences are due to the presence of larger coherent regions in the “easiest” designs (e.g., W, X). The interaction between Design and Rotation, resulting from the effects of rotation on latency for Design X, also supports this conclusion. Design X’s coherent regions are defined by three horizontal bands when rotated 0° and 180°. However, when rotated 90° and 270°, these coherent bands are vertical (see Fig. 3). Prior research on solution styles and partial components (Rozencwajg, 1991) indicates that when partial components are absent, most solvers default to a left-to-right block placement sequence, an observation confirmed in the current study. In other words, solvers tend to place blocks in the same sequence in which they would read words on a page, starting at the left of the top row and finishing at the right side of the lowest row. It is possible that when the coherent regions of Design X were oriented parallel to the left-to-right placement sequence (i.e., in the 0° and 180° positions), solvers using this strategy would readily benefit from the redundancy within the bands. However, when the coherent bands were vertically oriented (in the 90° and 270° positions), the left-to-right strategy would be at odds with the coherent bands’ orientation. The significantly better performance on Design X at the 0° and 180° positions is consistent with this explanation, though further experimentation would be needed for confirmation.

While symmetry, defined by previous ratings of the figural goodness of various symmetry types, did not seem to relate to performance differences in the designs with no cohesiveness and maximum uncertainty, symmetry scores were derived from prior research using a different task. Further experiments, in which local and global symmetry are systematically varied, may more conclusively address the possible contribution of symmetry at the local and global level, if any.

Implications for the block design task

A complete understanding of the stimulus parameters underlying block design difficulty has several advantages. Test developers can more accurately determine the proper sequencing of block design items in multi-item presentations. Where block designs appear in published cognitive tests, they are typically ordered by difficulty to make administration efficient. Should, for example, 90-degree rotated versions of a design with obvious coherent regions yield different item difficulty, this should be accounted for when developing test items. On the other hand, if local and global symmetry do not, in fact, contribute to item difficulty, this gives test developers more flexibility in designing items that “look” different but produce identical difficulty.

The advantage of developing nonidentical block designs with identical difficulty is most apparent in the test–retest context. Often, neurocognitive tests are administered repeatedly to evaluate changes in abilities owing to development, degenerative disease, and so forth. Error due to practice effects could be minimized if the researcher, practitioner, and test developer were able to rapidly develop “novel” designs with identical effects on task performance (Miller et al., 2009), or more cheaply validate sets of nonidentical items, whose psychometric equivalence has been previously established on the basis of known parameters.

Given that the analyses of symmetry and coherent regions were post hoc, the effects of coherent regions, as defined in this study, should be experimentally verified. Other possible parameters should be rigorously pursued as well. The primary finding that designs with identical informational content (uncertainty) and identical numbers of coherent edges (perceptual cohesiveness) can produce measurably different task performance, even where no partial components are present, suggests that we do not yet fully understand what the block design stimulus contributes to item difficulty.