Working memory is often described as a system that actively maintains and operates on the information necessary for current goals and tasks (Baddeley, 1986; Baddeley & Hitch, 1974). One of the most important attributes of working memory is its limited capacity (Cowan, 2001; Miller, 1956). This seems to be true for visual working memory (VWM) as well, a subsystem holding visual information in the working state (Alvarez & Cavanagh, 2004; Luck & Vogel, 1997), though for real-world objects the capacity is not fixed (Asp et al., 2021; Brady et al., 2016; Brady & Störmer, 2021).

Apart from the limited capacity and the limited storage time, interactions between specific contents of VWM can also influence the ability to retrieve these contents. One of the strong determinants of these interactions, which act as sources of interference, is distinctiveness (Hunt, 2006), which refers to the ability of an item to produce a reliable memory relative to other items or the context. It is usually described and measured as the likelihood of successful recall or recognition of an item as a function of its similarity or dissimilarity to other memory items. In the current study, we address the effects of item distinctiveness/similarity on two different kinds of VWM contents, namely, memories for objects themselves and object-location memories, that is, memories for both the objects and where they were originally presented (which refers to an ability to store and retrieve richer information about objects and the context in which these objects are encoded).

Distinctiveness in memory for objects

Early research in this area focused on how the distinctiveness/similarity affects retrieval from long-term memory for semantically meaningful material. Indicative examples include the von Restorff isolation effect, a greater recall probability for semantically odd items studied among multiple semantically related items (Hunt, 1995), or the Deese-Roediger-McDermott (DRM) effect, a greater probability to false alarm a never presented foil if it is semantically related to a number of presented items from the same category (Deese, 1959; Roediger & McDermott, 1995). The role of picture distinctiveness was also shown in visual long-term memory (Hall et al., 2021; Konkle et al., 2010; Standing, 1973). Similarly, early studies in the field of working memory were focused on the distinctiveness effects on verbal memories (Conrad, 1964; Baddeley, 1966a; Baddeley, 1966b) and later research extended this focus to visual working memory for objects (Avons & Mason, 1999; Cohen et al., 2014; Jalbert et al., 2008). Retrieval decrements in VWM caused by the low distinctiveness of remembered objects, similar to those in long-term memory (Konkle et al., 2010), were documented in various tasks (Avons & Mason, 1999; Jalbert et al., 2008; Cohen et al., 2014; Jiang, Remington, Asaad, Lee, & Mikkalson, 2016b; Yang & Mo, 2017, etc). Another landmark study (Awh, Barton, & Vogel, 2007) also pointed to a critical role of categorical similarity between encoded sample items and test items in capacity estimates of VWM. The effects of item distinctiveness and similarity on VWM for these items can be linked to the degree of separation between neural representations within the higher levels of the visual cortex, such as the occipito-temporal cortex (Cohen et al., 2014). However, the existing data regarding object distinctiveness in VWM are controversial. 
There are studies reporting that low item distinctiveness can increase rather than decrease performance in VWM tasks (Jiang, Lee, et al., 2016a; Lin & Luck, 2009; Sims et al., 2012). Jiang, Lee, et al. (2016a) suggested that the low-distinctiveness advantage is usually observed for stimuli with feature variation along basic continuous feature dimensions (RGB for color, 360° for orientation, facial morphs, etc.), while the high-distinctiveness advantage could be observed for complex stimuli whose differences are categorical and cannot be presented on a continuum (for example, faces and scenes). Avital-Cohen and Gronau (2021) suggested that the high-distinctiveness advantage can be explained by an attentional bias that affects the allocation of VWM resources to certain categories, which is, perhaps, the reason why this advantage was not observed for all types of stimuli in Jiang et al.’s (2016b) study. Thus, the low-distinctiveness advantage in VWM appears to hold for stimuli whose differences are defined perceptually, whereas the high-distinctiveness advantage occurs in memory for categories and could be referred to as conceptual distinctiveness. The idea of conceptual and perceptual distinctiveness/similarity independently contributing to object memory is supported by studies using images of real-world objects. For example, Konkle et al. (2010) demonstrated that conceptual distinctiveness (how different remembered objects are in kind) rather than perceptual distinctiveness (how different the objects are in color, shape, etc.) mediates subsequent object recognition in an LTM test. Furthermore, Brady and Störmer (2020, 2021) demonstrated that the meaningfulness of to-be-remembered stimuli, that is, the ability of these stimuli to be recognized as everyday objects, human faces, etc., boosts VWM capacity and that this boost cannot be explained solely by perceptual differences between meaningful and meaningless stimuli (Asp et al., 2021).
Based on these findings, we suggest that conceptual distinctiveness, along with perceptual similarity (Hovhannisyan et al., 2021; Hu et al., 2020; Liu et al., 2020; Naspi et al., 2021; Otsuka et al., 2013; Xie & Zaghloul, 2021), can also be an important determinant of VWM for real-world objects.

Object-location memory

Apart from storing the information about objects themselves, VWM is often considered as a strongly spatially referenced system (Logie, 2003; Magen & Emmanouil, 2019). This is the case for some of the paradigms used for VWM research (for review, see Brady et al., 2011; Suchow et al., 2014). For example, in a change detection task, observers have to report whether there is a difference between two consecutively presented sets of items. The whole essence of this task is the idea that if an observer has a critical item in VWM, they will be clearly aware of its change in a certain location (Luck & Vogel, 1997; Pashler, 1988; Phillips, 1974). In a continuous report task, observers also memorize a set of spatially distributed items, and then the location of one of them is directly used as a cue to recall a particular item presented exactly in that location (e.g., Wilken & Ma, 2004; Zhang & Luck, 2008). Remembering certain objects at certain locations (what was where) in the VWM has received much attention and was recently formalized by theories of binding in working memory (Oberauer & Lin, 2017; Schneegans & Bays, 2017; Swan & Wyble, 2014).

Along with errors that occur at recognition tests for object memory, object-location memory is associated with a specific kind of error termed swaps. They refer to a failure to correctly retrieve both an object and the location of that object, or one given another (Dent & Smyth, 2005; Hollingworth & Rasmussen, 2010; Pertzov et al., 2012; Postma & De Haan, 1996; Toh et al., 2020; but see Pratte, 2019). For example, a report can be identified as a swap when an observer is asked to report an item presented at a cued location but instead reports an item presented at another location (Bays et al., 2009). Swaps can be observed in change detection (Donkin et al., 2015) and continuous report (Bays et al., 2009) tasks. They are perhaps the most frequent and introspectively obvious binding failures (Treisman, 1996) that can be found in everyday memory experience. For example, when a person places a smartphone and a wallet in their left and right pockets, respectively, the person can later recollect which items are in the pockets but can swap their locations in memory and look for the wallet in the left pocket. As we recently showed, object-location swaps are quite common in VWM tasks with everyday real-world objects (Markov et al., 2021). Due to the relevance of object-location memory for everyday performance, it is extensively studied from the individual differences (Cohen-Dallal et al., 2021; Liang et al., 2016; Lu et al., 2020; Pavisic et al., 2018; Pertzov et al., 2015; Pertzov et al., 2013; Zokaei et al., 2017) and clinical perspectives (Pavisic et al., 2020). From the fundamental perspective, some theories view location-based binding as a critically important mechanism to maintain and recall bindings between internal features (e.g., color and orientation) that make a whole object (Schneegans & Bays, 2017).

Object identity and object location appear to be appropriate representational “units” for object-location binding in VWM. The object-location binding problem has a solid neuronal background. The information about which objects we see and the information about where we see these objects are processed by separate pathways of the visual system (Haxby et al., 1991; Mishkin & Ungerleider, 1982). A number of behavioral (Lee & Chun, 2001; Li et al., 2015; Wood, 2011) and neurophysiological (Darling et al., 2006; Smith et al., 1995) studies showed that memory for objects and memory for locations have separate capacities. Moreover, there is also evidence that object-location memory can be a separate process from storing only objects or only locations (Postma & De Haan, 1996; Postma, Kessels, & van Asselen, 2008). Pertzov et al. (2012) suggested that object-location swaps play an important role in delay-related forgetting in VWM. They showed that while objects themselves are still in VWM, their binding to correct locations can suffer from a delay between encoding and retrieval.

How does distinctiveness/similarity come into play in object-location memory and swaps? Assuming both objects and locations are encoded and stored imprecisely, there can be some overlap between these representations. These overlaps can cause erroneous recall of an object given a cued location or vice versa. For example, even if an object and its location are successfully bound (which means the association between representations of that object and that location is stronger than between that object and any other location, Oberauer & Lin, 2017), an object from a different location can be reported because of overlap between the representation of the cued location and that of another location. The degree of overlap is a matter of distinctiveness. For example, Bays et al. (2009) found that the swap rate in a standard color continuous report task increases with memory set size. They suggested that the imprecision (representational indistinctiveness) of both colors and locations grows with set size, and this makes a location cue less efficient at retrieving the correct color. Similarly, Oberauer and Lin (2017) suggested that presenting a recall cue (e.g., cueing a certain location) probabilistically increases activation in the to-be-reported feature space (e.g., color), and the amount of this activation for each feature is modulated by the similarity between the cued location and that associated with a given color. Having said that, object-location swaps are not always considered to reflect binding itself. Alternatively, swaps can be thought of as a result of the way memory gets access to an object representation based on location (or vice versa). This leads to more frequent swaps between items presented at closer locations or having more similar features (Emrich & Ferber, 2012; Oberauer & Lin, 2017).

Our study

Our study aims to investigate how object distinctiveness affects both object recognition and object-location memory when observers have to remember a set of real-world objects and their spatial positions. Although some of the previous studies have reported observing some negative correlations between item distinctiveness and the probability of swap errors (e.g., Dent & Smyth, 2005; Postma & De Haan, 1996; Oberauer & Lin, 2017), they did not address the nature of their correlation as a major issue. Does poor distinctiveness cause less likely object retrieval and the inaccessibility of its location as a consequence? Or is it possible that location memories become noisier under the increased demands for object distinction, which makes the representations at these locations more penetrable to neighbor representations (Emrich & Ferber, 2012)? Or does low object distinctiveness somehow impair object-location memory itself, although object memories and location memories are intact? To answer these questions, all three components should be comparatively tested under low and high object distinctiveness. Moreover, specific “swap” errors should be thoroughly analyzed in terms of how likely they are caused by a failure associated with object recognition or object-location memory. Also importantly, previous studies that tested the role of item distinctiveness or similarity in various object-location tasks were based on either simple objects and continuous perceptual features (e.g., colors, Oberauer & Lin, 2017) or complex but unfamiliar objects whose differences are defined mostly visually (e.g., Dent & Smyth, 2005). However, it is less clear how object-location memory is affected by the distinctiveness of real-world objects that have a strong semantic component in them, given that object recall can behave quite differently for objects whose differences are defined by visual and conceptual features. We implemented this program in three experiments. 
They were designed to carefully test how the requirements to distinguish between real-world objects and remember them at certain locations affect memory for object-location conjunctions. In Experiment 1, we directly tested the influence of distinctiveness (which we operationalized as all studied objects being from the same or from different basic categories) on object recognition memory and object-location memory. In Experiment 2, we compared object-location memory (that is, when the task requires observers to actively remember object identities and which locations these identities belonged to) against “pure” location memory (when demands on object memory and object-location memory are diminished by the high consistency of object identities and their spatial order, but the locations themselves change unpredictably). To anticipate, Experiments 1 and 2 showed that observers commit more object-location swap errors when their memory set is low-distinctive, although their recognition memory for objects was not affected by item distinctiveness. In Experiment 3, we tested two plausible accounts of the distinctiveness effect on swaps. One account suggests that low distinctiveness impairs object-location memory in general, leading to more likely forgetting of object-location information in an all-or-none manner, which should increase the number of swap errors regardless of their similarity to the target. Another account suggests that observers can maintain reliable object-location information when the studied objects have low distinctiveness, but swaps occur more often because of a stronger competition between the target and nontargets when they are called by a retrieval cue.

Experiment 1

Experiment 1 aimed to investigate how object distinctiveness interacts with object-location memory. We presented sets of real-world objects to our participants, asking them to remember all of the objects and where each object was located. We subsequently tested object recognition and how precisely the object location was adjusted. This gave us estimates of all constituents of object-location memory: object memory, location memory, and object-location conjunction memory. We manipulated object distinctiveness via conceptual (categorical) homo/heterogeneity of object sets which is relevant for complex, real-world objects. In accordance with the previous findings (Konkle et al., 2010), objects belonging to the same basic category have low conceptual distinctiveness and are expected to be harder to recognize against a novel foil, whereas objects from all different categories have high conceptual distinctiveness and are expected to be easier to recognize. Therefore, we ask whether the need to remember low-distinctive objects from the same category also affects an ability to remember objects at correct locations.

Method

Participants

Twenty-two psychology students from the Higher School of Economics (19 female; age M = 19.5 years) took part in the experiment for extra course credits. All participants reported having normal color vision, normal or corrected-to-normal visual acuity, and no neurological problems. Before the beginning of the experiment, they signed an informed consent form. In this and subsequent experiments, sample sizes were determined based on similar studies addressing the issue of feature storage and binding in VWM and using a continuous report task (from 10 to 16; for example, Fougnie & Alvarez, 2011; Fougnie et al., 2010; Bays et al., 2009; Pertzov et al., 2012). The planned sample size also included a few extra participants to account for the possibility of technical problems or poor performance in some participants.

Apparatus and stimuli

Stimulation was developed and presented through PsychoPy (Peirce, 2007; Peirce et al., 2019) for Linux Ubuntu. Stimuli were presented on a standard VGA monitor with a refresh frequency of 75 Hz and 1024×768-pixel spatial resolution. Stimuli were presented on a homogeneous white field. Participants sat approximately 47 cm from the monitor. From that distance, the screen subtended approximately 42.4 × 32.5 degrees of visual angle.
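The relation between viewing distance, physical extent, and visual angle used in this section can be sketched as follows. This is only an illustration under stated assumptions: the physical screen dimensions are not reported in the text, so the sketch back-computes them from the reported angles and the 47-cm viewing distance, assuming the extent is centered on the line of sight. The function names are hypothetical.

```python
import math

def size_to_visual_angle_deg(size_cm, distance_cm):
    """Visual angle (degrees) subtended by an extent centered on the line of sight."""
    return math.degrees(2.0 * math.atan(size_cm / (2.0 * distance_cm)))

def visual_angle_to_size_cm(angle_deg, distance_cm):
    """Physical extent (cm) that subtends a given visual angle at a given distance."""
    return 2.0 * distance_cm * math.tan(math.radians(angle_deg) / 2.0)

VIEWING_DISTANCE_CM = 47.0  # reported in the text

# Back-computed screen extents (an inference, not a reported measurement).
width_cm = visual_angle_to_size_cm(42.4, VIEWING_DISTANCE_CM)
height_cm = visual_angle_to_size_cm(32.5, VIEWING_DISTANCE_CM)
print(f"implied screen size: {width_cm:.1f} x {height_cm:.1f} cm")
```

Under these assumptions, the reported angles imply a visible area of roughly 36 × 27 cm, which is consistent with the 4:3 aspect ratio of a 1024×768 display.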

Objects

We used the real-world object database created by Konkle and colleagues (Konkle, Brady, Alvarez, & Oliva, 2010). 200 unique categories and 16 unique exemplars from each category were chosen from the database. Two example categories (“apple” and “toy soldier”) with four exemplars are shown in Fig. 1. The objects were scaled to subtend approximately 4.34° of visual angle.

Fig. 1

Examples of two categories (“apple” and “toy soldier”) and four exemplars from each category (from the database originally developed by Konkle et al., 2010)

Spatial layout

Each sample screen contained three objects. The centers of the objects lay on an imaginary circle with a radius of 12.8°. The only parameter defining the position of each object was the rotational angle on the imaginary circle. These angles were chosen randomly for each object in each trial. The only restriction was that the minimum distance between the centers of any two objects was 60° of rotation. This was done to avoid overlap or clustering between the objects.
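The placement rule described above can be sketched as a simple rejection sampler. This is a minimal illustration, not the authors' PsychoPy code: the text specifies only that angles were random and at least 60° apart, so the rejection-sampling strategy and all names here are assumptions.

```python
import random

MIN_SEPARATION_DEG = 60.0  # minimum angular distance reported in the text
N_OBJECTS = 3              # objects per sample screen

def circular_distance_deg(a, b):
    """Shortest angular distance between two positions on a circle, in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)

def sample_object_angles(n=N_OBJECTS, min_sep=MIN_SEPARATION_DEG, rng=random):
    """Draw n random angles, redrawing the whole set until every pair
    is at least min_sep degrees apart (assumed sampling strategy)."""
    while True:
        angles = [rng.uniform(0.0, 360.0) for _ in range(n)]
        if all(circular_distance_deg(angles[i], angles[j]) >= min_sep
               for i in range(n) for j in range(i + 1, n)):
            return angles

print(sample_object_angles())
```

Redrawing the whole set, rather than placing objects one at a time, keeps the accepted configurations uniformly distributed over all layouts that satisfy the constraint.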

Object distinctiveness

The three sample objects could have low or high distinctiveness. In low-distinctiveness samples, all three items were different exemplars drawn from the same object category. In high-distinctiveness samples, the items were drawn from three different categories. 100 unique categories were used to make 100 sample trials in the low-distinctiveness condition, so that each category was presented only once per experiment. Another 100 unique categories were used to make 100 samples in the high-distinctiveness trials, so that each category could appear three times per experiment, but each time using a new exemplar. Overall, this warranted that each particular exemplar could appear only once in the entire experiment as a sample. To test recognition memory for object identity, one of the sample (“old”) objects was presented paired with a “new” object. Both old and new objects were always from the same category, regardless of sample distinctiveness. This allowed keeping the test difficulty fixed while manipulating the difficulty of encoding and storage in working memory (Awh et al., 2007). Target sample items for subsequent testing were randomly assigned across participants.

Our claim that sample sets consisting of objects from the same category represent low distinctiveness and sample sets consisting of objects from different categories represent high distinctiveness is not based only on a-priori assumptions but also empirically grounded. In their original study of massive memory, Konkle et al. (2010), whose stimuli we use in this study, showed that remembering several exemplars from the same category impairs subsequent recognition memory for any such exemplar compared to remembering unique exemplars.

Procedure

Figure 2 shows the organization of a trial in Experiment 1. At the beginning of each trial, three sample objects were presented for 2 seconds. Then, after a 1-second blank interval, participants were shown two objects, one “old” and one “new”. Participants had to choose the object that they thought to be “old”, that is, presented in the sample set (recognition task). The new and the old test objects were presented randomly either to the left or to the right of the center of the screen. Participants answered by pressing a left or a right arrow key indicating the location of the old object. The chosen object then remained on the screen and participants had to set its remembered location (localization task). To do that, the participants had to drag the object along a positional ring with a mouse (Fig. 2). The initial position of the probed item was at the center of the screen until the first mouse click, which moved the object to an imaginary sample location circle. When the location was set, the participants pressed “SPACE” to confirm the response and terminate the trial. After response confirmation, feedback informed the participants how close their response had been to the true object location and their accuracy regarding object recognition. The location set by the participant was shown by a red cross centered at the set angle; the true location was shown by the probed object presented at that location (Fig. 2).

Fig. 2

The time course of a trial in Experiment 1

100 low-distinctiveness and 100 high-distinctiveness trials were presented in total in random order. Each particular object (exemplar from a category) was presented only once during the experiment.

During the entire experiment, participants were instructed to repeat a syllable “ba” aloud at a rate of about 3 Hz to prevent verbal encoding of stimuli. The experiment was preceded by ten practice trials in order to familiarize participants with the task. The total time of the experiment was between 30 and 45 minutes.

Data analysis

For the object recognition task, the percentage of correct answers was calculated. For the localization task, localization errors were calculated in each trial. The error was estimated as the angular difference between a participant’s response and the true location of a probed object. The distribution of errors was then analyzed using the mixture model (Zhang & Luck, 2008) with a modification suggested by Bays and colleagues (Bays et al., 2009), the “swap” model. For modeling, we used MemToolbox for MATLAB (Suchow, Brady, Fougnie, & Alvarez, 2013). The model has three parameters derived from the three decomposed components of the error distribution. The first parameter is the standard deviation (SD) of the von Mises distribution built around the mean of 0, which is supposed to reflect responses made about the items whose locations are present in memory with some noise. In representational terms, SD estimates the precision of memory trace for the location of a probed item. The second parameter is the probability of random guess (Pguess) estimated as an area below the uniform component of the mixed distribution that reflects the random picking of locations in the absence of memory for the location of a probed item. The third parameter is the probability of a “swap” (Pswap), estimated as the area of a second von Mises distribution component. This second von Mises distribution is assumed to have a mean that equals the location of a distractor item (each of the sample objects that are memorized but not probed) and the same SD as the first von Mises distribution. Pswap accumulates the responses originating from a misreport of a distractor location instead of a probe location, which is the object-location binding error. Knowing Pguess and Pswap, it is easy to calculate the probability of correctly bound objects and locations held in VWM (Pmemory) using the following formula: Pmemory = 1 – (Pguess + Pswap).
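The three-component error distribution described above can be written out as a density function. The sketch below is an illustration only, not the MemToolbox implementation: it works in radians, parametrizes the von Mises components by the concentration kappa rather than by SD (the SD-to-kappa conversion is not shown), and splits Pswap equally among the distractors, which is an assumption in the spirit of Bays et al. (2009). All function names are hypothetical.

```python
import math

def bessel_i0(x):
    """Modified Bessel function of the first kind, order 0 (power series)."""
    term, total, k = 1.0, 1.0, 0
    while term > 1e-16 * total:
        k += 1
        term *= (x / 2.0) ** 2 / (k * k)
        total += term
    return total

def von_mises_pdf(x, mu, kappa):
    """Von Mises density at angle x (radians) with mean mu and concentration kappa."""
    return math.exp(kappa * math.cos(x - mu)) / (2.0 * math.pi * bessel_i0(kappa))

def swap_model_pdf(response, target, distractors, kappa, p_guess, p_swap):
    """Density of a response angle under the target + swap + guess mixture."""
    p_memory = 1.0 - (p_guess + p_swap)  # formula given in the text
    density = p_memory * von_mises_pdf(response, target, kappa)
    for d in distractors:
        # Assumption: swaps are equally likely to any distractor.
        density += (p_swap / len(distractors)) * von_mises_pdf(response, d, kappa)
    density += p_guess / (2.0 * math.pi)  # uniform guessing component
    return density

# Illustrative values only; kappa = 14.0 is an arbitrary choice.
print(swap_model_pdf(0.1, 0.0, [2.0, -2.0], 14.0, p_guess=0.108, p_swap=0.129))
```

Fitting the model amounts to finding the kappa, Pguess, and Pswap that maximize the summed log of this density over trials; Pmemory then follows from the other two probabilities.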

Since the quality of parameter estimation in a mixture model is sensitive to the amount of data, we included trials with both correctly and incorrectly recognized objects in this analysis. It is obvious that misrecognized items could differently contribute to the overall probabilities of the localization outcomes. For example, these misrecognized items could inflate the swap rate, as a failure to recognize an object might lead to a random choice among three locations remembered as being occupied by some object. Therefore, further analysis within the mixture model approach was designed to test how often various types of object-location outcomes occurred in trials where a tested object was recognized correctly. This required each trial to be classified in terms of two outcomes, object recognition (correct vs. incorrect) and object localization (correct location vs. swapped location vs. guessed location). To identify the latter type of the outcome, we transformed continuous localization errors into discrete labels indicating that a given pointed location is correct, swapped, or randomly guessed. The rule for labeling was taken from Fougnie & Alvarez (2011) and is based on the parameters of the mixture model. If a localization error fell within ±3 SD around a true target location, it was labeled as a likely correct localization response. If that error fell within ±3 SD around one of the distractor locations, it was labeled as a swap response. The rest of the errors falling beyond ±3 SD around both target and distractor locations were labeled as likely random guesses of a location.
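The ±3 SD labeling rule described above can be sketched as a small classifier. This is an illustration under assumptions rather than the original analysis code: angles are in degrees, the target window is checked before the distractor windows (the text does not say how a response falling inside both windows was handled), and the function names are hypothetical.

```python
def wrap_deg(angle):
    """Wrap an angular difference into the range [-180, 180)."""
    return (angle + 180.0) % 360.0 - 180.0

def classify_localization(response, target, distractors, sd_deg, criterion=3.0):
    """Label a localization response as 'correct', 'swap', or 'guess'."""
    limit = criterion * sd_deg
    # Assumption: the target window takes priority over distractor windows.
    if abs(wrap_deg(response - target)) <= limit:
        return "correct"
    if any(abs(wrap_deg(response - d)) <= limit for d in distractors):
        return "swap"
    return "guess"

# Illustrative layout: target at 0 deg, distractors at 120 and 240 deg, SD = 15 deg.
print(classify_localization(10.0, 0.0, [120.0, 240.0], 15.0))   # correct
print(classify_localization(130.0, 0.0, [120.0, 240.0], 15.0))  # swap
print(classify_localization(60.0, 0.0, [120.0, 240.0], 15.0))   # guess
```

The priority assumption matters when items are close together: with an SD near 15° and a minimum separation of 60°, the ±45° windows around neighboring items can overlap.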

To statistically estimate the effect of distinctiveness on object recognition and object localization, we applied the standard frequentist and Bayesian t-tests to the percentage of correct object recognition, as well as SD, Pmemory, Pswap, and Pguess of object localization. The Bayesian t-test is a direct way to estimate evidence for H1 against H0 (Rouder, Speckman, Sun, Morey, & Iverson, 2009). The Bayes factor (BF10), the ratio of the likelihoods of the observed data under H1 and H0, was calculated using JASP 0.8.2 (JASP Team, 2017; Wagenmakers et al., 2017). The Cauchy distribution with a default width of .707 was used as a prior distribution of effect sizes under H1 (JASP Team, 2017; Wagenmakers et al., 2017).

Results and discussion

In general, participants showed reasonably good memory both for object identities (percent correct recognition: M = 87% for high-distinctiveness trials, M = 88.6% for low-distinctiveness trials) and object locations (Pmemory: M = .764 for high-distinctiveness trials, M = .685 for low-distinctiveness trials). We found no evidence for the effect of distinctiveness on object recognition (t(21) = 1.550, p = .136, BF10 = .629, Cohen’s dz= .330, Fig. 3a). For localization, we found that low distinctiveness decreased Pmemory (M = .685) compared to high distinctiveness (M = .764; comparison: t(21) = 4.148, p <.001, BF10 = 71.738, dz = .884, Fig. 3b). By contrast, low distinctiveness increased Pswap (M = .203) compared to high distinctiveness (M = .129; comparison: t(21) = 5.257, p <.001, BF10 > 103, dz = 1.121, Fig. 3c). Finally, there was no evidence of any effect of distinctiveness on Pguess (M = .108 for high-distinctiveness trials, M = .112 for low distinctiveness trials; comparison: t(21) = .348, p = .731, BF10 = .236, dz = .074, Fig. 3d) and on the SD (M = 15 for high-distinctiveness trials, M = 14.8 for low distinctiveness trials; comparison: t(21) = .327, p = .747, BF10 = .234, dz = .070, Fig. 3e).
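For readers checking the reported effect sizes, Cohen's dz for a paired comparison can be recovered from the t statistic as dz = |t| / √n, where n is the number of participants. The short sketch below applies this standard identity to the values reported above; the function name is hypothetical.

```python
import math

def cohens_dz(t_value, n_participants):
    """Cohen's dz for a paired-samples t-test: |t| divided by sqrt(n)."""
    return abs(t_value) / math.sqrt(n_participants)

# t values reported in the text for Experiment 1 (n = 22, df = 21).
for label, t in [("recognition", 1.550), ("Pmemory", 4.148), ("Pswap", 5.257),
                 ("Pguess", 0.348), ("SD", 0.327)]:
    print(f"{label}: dz = {cohens_dz(t, 22):.3f}")
```

The resulting values match the reported dz for all five comparisons to three decimal places.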

Fig. 3

The results of Experiment 1: The effect of distinctiveness on (a) percentage of correct answers on the object recognition task, (b) Pmemory, (c) Pswap, (d) Pguess and (e) SD in the binding task. Error bars depict 95% CIs

Object-location memory of correctly recognized items

One participant was excluded from this analysis (their SD of localization error was too large to classify trials). Our analysis showed that the probability of swap errors in trials with correct recognition responses was greater in low-distinctive sets (M = .19) than in highly-distinctive sets (M = .13; comparison: t(20) = 4.634, p <.001, BF10 = 182, dz = 1.011). The proportions of location guesses (Pguess) did not differ between the conditions (low-distinctive sets: M = .023, highly-distinctive sets: M = .029; comparison: t(20) = 1.101, p =.284, BF10 = .388, dz = 0.240). The proportion of correctly localized objects was greater in the highly-distinctive sets (M = .71) than in the low-distinctive sets (M = .68; comparison: t(20) = 2.117, p =.047, BF10 = 1.434, dz = 0.462). These differences are quite similar to those observed using the overall mixture model for all trials. Therefore, we conclude that the distinctiveness effect on object-location memory is preserved for correctly reported items and cannot be ascribed merely to object forgetting.

The results of Experiment 1 showed that the distinctiveness of sample objects did not affect object recognition. At the same time, we observed a specific effect on localization. Object distinctiveness did not change memory for locations themselves in terms of precision (SD) and the probability of remembering or forgetting the location (Pguess). Rather, it affected the probability of ascribing an object to its true location or to a location occupied by another object (swap error). There were fewer correctly localized items and more swap errors when distinctiveness was low.

Experiment 2

In Experiment 1, we combined tests for object recognition and object-location memory. There are two potential limitations of the testing method we used. First, the order of tests was fixed: recognition always came first, and object-location report always came second (as it was logically not possible to ask observers to localize a previously seen object before asking which of the objects the observers had seen). Therefore, performing the object recognition task first could interfere with the subsequent object-location test. Second, whereas Experiment 1 allowed us to measure object memory as a necessary component of object-location memory, it did not allow us to measure another necessary component, spatial memory, in isolation. Although object distinctiveness did not affect the SD or Pguess, that is, an ability to remember which locations were occupied by any objects, we cannot say that the need to remember the objects and their locations did not affect memories for locations (e.g., by making them less precise). In Experiment 2, we tested object-location memory without the preceding object recognition task. We also compared object-location memory for low-distinctive and high-distinctive objects (as in Experiment 1) with location memory in a task where object memory and object-location binding are not particularly relevant. In this version of the task, we used a consistent set of objects and located them in a consistent spatial order. Therefore, observers did not actually need to remember object identities and their relative locations (“bindings”). Absolute locations were the only features that unpredictably changed between trials and had to be encoded into working memory.

Method

Participants

Twenty-one psychology students from the Higher School of Economics (19 female; age: 18-21 years, M = 20.07) took part in the experiment for extra course credits. All participants reported having normal color vision, normal or corrected-to-normal visual acuity, and no neurological problems. Before the beginning of the experiment, they signed an informed consent form. The sample size was determined based on the same rules as in Experiment 1.

Apparatus and stimuli

Apparatus and stimuli were the same as in Experiment 1. For the new task requiring localization without a strong need for remembering objects and object-location bindings, we made three pictures of hands depicting numbers from one to three. We used this kind of stimuli instead of showing regular Arabic or Roman numbers because they looked more like real-world objects and, thus, were more similar to Konkle et al.’s (2010) objects used in Experiment 1. On each sample display, all three “hand numbers” were located the same way as in the object-location version of the task. Additionally, their order was fixed: the “numbers” followed clockwise from “one”. We assumed that this sort of display reduced the demands on memory for objects and object-location binding. The reduced demands on object VWM were provided by the fact that the “hand numbers” repeated consistently across the experiment and could be easily learned during a practice session. Moreover, we used the straightforward association between the number of raised fingers and the well-trained numerical representation in long-term memory. The reduced demands on object-location binding were provided by the consistent order of the “hand numbers”. It allowed the recovery of the location of any given hand from memory for only one object-location combination. Therefore, by having reduced the uncertainty about object identities and about which locations they belong to, we kept the uncertainty regarding the locations themselves.

Procedure

The time course of a trial was the same as in Experiment 1, but no object recognition test was used after the retention interval. Instead, observers were shown a single object from the sample immediately after the retention interval, and they had to localize that object with a mouse click. The object-location block of the experiment had the same design as in Experiment 1 (low-distinctive samples vs. highly distinctive samples, 100 trials per condition). The location memory block (hereinafter, the “hand localization” task) consisted of 200 trials. The number of trials was equated with the object-location task to control for serial position effects. However, only 100 random trials were drawn for the subsequent analysis, which was equal to the number of trials per condition in the object-location blocks. The trials were organized the same way as in the object-location task: Observers were presented with three “hand numbers” and had to recall the original location of a single probed hand (Fig. 4). The serial order of the object-location and the hand localization blocks was counterbalanced across participants.

Fig. 4

The time course of a trial in the hand localization task

Data analysis

For both the object-location and the hand localization tasks, we analyzed localization errors using the swap model (Bays et al., 2009), as described in Experiment 1. For each dependent variable (SD, Pmemory, Pguess, Pswap), we ran the following planned comparisons. First, we compared localization performance between highly-distinctive and low-distinctive trials, which was a direct replication of the analysis for the same task in Experiment 1. Second, we compared object-location memory under each of the distinctiveness conditions with that obtained from the hand localization task. A Holm correction was made for multiple comparisons in calculating the statistical significance level. For Bayesian t-tests, the same prior, as in Experiment 1, was used.
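The structure of the swap model can be made concrete with a short sketch. It assumes the standard three-component form of the Bays et al. (2009) mixture (a von Mises component centered on the target, von Mises components centered on the nontargets, and a uniform guessing component) applied to angular position on the circle of possible locations. The function names, the concentration value `kappa`, and the example nontarget positions are illustrative assumptions and are not taken from the study; the mixture weights in the usage lines are the low-distinctiveness group means reported for this experiment.

```python
import math


def bessel_i0(x, terms=60):
    """Modified Bessel function of the first kind, order 0 (power series)."""
    total, term = 0.0, 1.0
    for k in range(terms):
        if k > 0:
            term *= (x / 2.0) ** 2 / (k * k)
        total += term
    return total


def von_mises_pdf(x, kappa):
    """Von Mises density on the circle, centered at 0, concentration kappa."""
    return math.exp(kappa * math.cos(x)) / (2.0 * math.pi * bessel_i0(kappa))


def swap_model_density(response, target, nontargets, kappa, p_swap, p_guess):
    """Density of a reported angle (radians) under the three-component mixture.

    p_memory = 1 - p_swap - p_guess is the weight of target-centered responses.
    Swap responses are split evenly across the nontarget locations.
    """
    p_memory = 1.0 - p_swap - p_guess
    target_part = p_memory * von_mises_pdf(response - target, kappa)
    swap_part = p_swap * sum(
        von_mises_pdf(response - nt, kappa) for nt in nontargets
    ) / len(nontargets)
    guess_part = p_guess / (2.0 * math.pi)
    return target_part + swap_part + guess_part


# Hypothetical trial: target at 0 rad, nontargets at +/- 2 rad, kappa = 8 (assumed).
# Weights: Pswap = .07 and Pguess = .04, hence Pmemory = .89.
density_at_target = swap_model_density(0.0, 0.0, [2.0, -2.0], 8.0, 0.07, 0.04)
density_at_nontarget = swap_model_density(2.0, 0.0, [2.0, -2.0], 8.0, 0.07, 0.04)
```

In an actual analysis, the parameters would be estimated per participant and condition by maximizing the summed log of this density over trials; precision is then commonly reported as the circular SD corresponding to the fitted concentration.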

Results and discussion

The participants were above chance at remembering the locations of the hands (Pmemory: M = .92) and of object-location conjunctions (Pmemory: M = .95 for high-distinctiveness trials, M = .89 for low-distinctiveness trials).

The results obtained in the object-location task of Experiment 1 were well replicated in Experiment 2. Low object distinctiveness decreased Pmemory compared to high distinctiveness (t(20) = 7.317, p < .001, Bonferroni-Holm corrected α = .017, BF10 > 10³, dz = 1.597, Fig. 5a). Symmetrically, low distinctiveness increased Pswap (M = .07) compared to high distinctiveness (M = .03; comparison: t(20) = 6.164, p < .001, Bonferroni-Holm corrected α = .017, BF10 > 10³, dz = 1.345, Fig. 5b). Distinctiveness had no effect on Pguess (for low-distinctive objects M = .04; for highly distinctive objects M = .02; comparison: t(20) = 2.030, p = .056, Bonferroni-Holm corrected α = .017, BF10 = 1.253, dz = .443, Fig. 5c) or SD (t(20) = .583, p = .566, Bonferroni-Holm corrected α = .05, BF10 = .265, dz = .127, Fig. 5d).

Fig. 5

The results of Experiment 2: The effect of distinctiveness on memory parameters in the binding task and parameters of hand localization task: (a) Pswap, (b) Pmemory, (c) Pguess, and (d) SD. Error bars depict 95% CIs

The comparison between the object-location and the hand localization tasks showed no differences for Pmemory (low-distinctive objects vs. “hand numbers”: t(20) = .944, p = .36, Bonferroni-Holm corrected α = .025, BF10 = .338, dz = .206; highly distinctive objects vs. “hand numbers”: t(20) = .677, p = .51, Bonferroni-Holm corrected α = .05, BF10 = .28, dz = .148; Fig. 5a), Pguess (low-distinctive objects vs. “hand numbers”: t(20) = .302, p = .77, Bonferroni-Holm corrected α = .05, BF10 = .237, dz = .066; highly distinctive objects vs. “hand numbers”: t(20) = .899, p = .38, Bonferroni-Holm corrected α = .025, BF10 = .326, dz = .196), or SD (low-distinctive objects vs. “hand numbers”: t(20) = .730, p = .38, Bonferroni-Holm corrected α = .017, BF10 = .318, dz = .159; highly distinctive objects vs. “hand numbers”: t(20) = .865, p = .47, Bonferroni-Holm corrected α = .025, BF10 = .289, dz = .189; Fig. 5c).

The proportion of swap errors (Pswap) in the hand localization task was very low (M = .03). This finding demonstrates that our hand localization task was easy in terms of remembering object-location bindings and can probably serve as a proper tool for measuring spatial memory alone. The proportion of swaps was greater than this baseline for low-distinctive objects (low-distinctive objects vs. “hand numbers”: t(20) = 3.316, p = .003, Bonferroni-Holm corrected α = .025, BF10 = 12.421, dz = .724; Fig. 5b), but not for highly distinctive objects (highly distinctive objects vs. “hand numbers”: t(20) = .333, p = .743, Bonferroni-Holm corrected α = .05, BF10 = .239, dz = .073; Fig. 5b).
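The Bonferroni-Holm corrected α levels reported for the three planned Pswap comparisons (.017, .025, and .05) follow from the step-down Holm procedure, which can be sketched as below. The function name is hypothetical, and the value .001 stands in as a placeholder upper bound for the comparison reported only as p < .001; the other two p-values are those reported above.

```python
def holm_thresholds(p_values, alpha=0.05):
    """Holm step-down procedure.

    Returns, in the original order of p_values, the per-comparison alpha
    threshold and whether each null hypothesis is rejected.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    thresholds = [0.0] * m
    reject = [False] * m
    still_rejecting = True
    for rank, idx in enumerate(order):
        # The smallest p is tested at alpha/m, the next at alpha/(m-1), and so on.
        thresholds[idx] = alpha / (m - rank)
        if still_rejecting and p_values[idx] <= thresholds[idx]:
            reject[idx] = True
        else:
            # Once one test fails, all larger p-values are retained as well.
            still_rejecting = False
    return thresholds, reject


# Pswap family: low vs. high distinctiveness (placeholder for p < .001),
# low-distinctive vs. hands (p = .003), highly distinctive vs. hands (p = .743).
pswap_p_values = [0.001, 0.003, 0.743]
pswap_thresholds, pswap_reject = holm_thresholds(pswap_p_values)
```

Rounded to three decimals, the thresholds come out as .017, .025, and .05, so the first two comparisons survive the correction and the third does not.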

Our results show that memory for locations is largely unaffected by the need to also remember which object goes with which location. Critically for this conclusion, the precision (SD) and overall capacity (the inverse of Pguess) parameters did not change across the tasks and conditions. Object distinctiveness also did not affect the precision and capacity of location representations in memory. These findings suggest that additional demands on memory associated with harder object distinction do not impair location memory per se. As the results from both Experiments 1 and 2 show, observers commit more swap errors (Pswap) for low-distinctive objects. These conclusions replicate the conclusions from Experiment 1. Going beyond Experiment 1, we ruled out the possibility that the effect of distinctiveness on the probability of swap errors was caused by the preceding object recognition test. We addressed the nature of the distinctiveness effect on object-location swaps more rigorously in Experiment 3.

Experiment 3A

To address the differential effects of object distinctiveness on object recognition and object-location memory observed in Experiment 1 and Experiment 2, we considered two plausible explanations. The first potential explanation is a non-specific impairment of memory caused by more difficult object distinction. To recall different object-location conjunctions, one should memorize the objects themselves, the locations, and, finally, which object goes with which location. Since low-distinctive objects are harder to discriminate (Konkle et al., 2010), the maintenance of a consistently high recognition rate may require more effort. In other words, this can cause a trade-off between remembering objects and object-location conjunctions biased in favor of the former (perhaps because it is hardly possible to store a conjunction if object memory fails). This kind of trade-off can yield a greater rate of swaps in the low-distinctive condition, where observers trying to better remember subtle differences between exemplars more often fail to remember where each of the exemplars had been located. The second potential explanation of the advantageous effect of high distinctiveness on object-location memory is that the retrieval cue acts more effectively to tease apart targets from distractors when the items are more distinct. That is, when a target object is probed, the observer checks their object-location associations for how familiar they seem given the target. If nontarget locations are associated with objects highly distinct from the probed target, then they provide a weaker familiarity signal (Oberauer & Lin, 2017; Schneegans & Bays, 2017; Swan & Wyble, 2014; Schurgin et al., 2020) and, thus, a swap is less likely. The critical difference between the two accounts is the representational “fate” of objects that are swapped at the report.
The first account suggests that object-location recall fails in an all-or-none fashion and that remembering low-distinctive objects increases the probability of such a failure non-specifically. That is, when the observers do not remember where exactly an object belonged, they randomly guess, choosing one of the locations (which they remember per se) as a “home” for the object. The second account suggests that when observers fail to recall the correct object-location conjunction, they still have some noisy representation that can be separated from nontarget conjunctions, and the degree of separation is a matter of distinctiveness. In this second case, we predict that the observers will more likely swap between more similar objects (same category) than between more distinct objects (different categories).

In Experiment 3A, we tested these two accounts against each other. We presented observers with sets of four objects at four different locations and then asked them to identify which of the four objects had been presented at one particular location. No object recognition and no continuous report were required in this task. As in Experiments 1 and 2, all objects could be highly distinctive (drawn from four different categories) or low-distinctive (drawn from one category). Critically, we added a third condition with two categories and two exemplars in each category. Here, the distribution of responses between test alternatives, in particular between incorrect alternatives (non-targets), was a sensitive measure to distinguish between the two accounts of the distinctiveness effect from Experiment 1. If distinctiveness affects an ability to store object-location information in an all-or-none fashion (either remember the exact conjunction, or pick it randomly), then we can expect that all of the nontargets would be chosen with the same probability. But if distinctiveness affects object-location memory via the relative familiarity of nontargets given a retrieval cue, then we can expect that observers would choose a nontarget from the same category as the target more frequently than the foils from the different category.

Method

Participants

Nineteen psychology students of the Higher School of Economics (19 female; M = 20.07) took part in the experiment for extra course credits. All participants reported having normal color vision, normal or corrected to normal visual acuity, and no neurological problems. The sample size was determined based on the same rules as in Experiment 1. Before the beginning of the experiment, they signed an informed consent form.

Apparatus and stimuli

Apparatus and stimuli were basically the same as in Experiment 1. An important difference was that sample sets consisted of four rather than three objects. Four objects were placed on an imaginary circle (radius of 12.8°) at the positions of 45°, 90°, 135° and 270° with a random jitter within ±2.17° along a radius for each object. The size of each object was 4.34°. The four objects presented in a sample set could be drawn from one category (all four objects were different exemplars of this category), from two categories (two exemplars from one category and two exemplars from another category), or from four categories (all exemplars belonged to different categories).

Procedure

Four objects were presented for 2 seconds. After a 1-second blank interval, all four sample objects were presented in a 2 × 2 square matrix around the center of the screen. Simultaneously with the matrix of four objects, a square box appeared, indicating one of the original locations of the sample set. Participants had to click on the object that had been presented at the cued location in the sample. Feedback appeared after each trial as to whether the object had been correctly ascribed to the cued location (Fig. 6).

Fig. 6

The time course of a trial in Experiment 3A

Data analysis

We analyzed the percentage of correct answers for the three conditions: one, two, or four categories. Importantly, in the condition with two categories, we also analyzed the percentages of choices broken down by all alternatives: the correct object, the incorrect object belonging to the same category as the correct one, and the two incorrect objects belonging to a different category. This analysis was critical to test whether observers prefer the correct category over the incorrect one, even when they fail to determine the exact identity of the object presented at a cued location. A Bonferroni correction was made for multiple comparisons in calculating the statistical significance level. For Bayesian t-tests, the same prior as in Experiment 1 was used.

Results

Accuracy. We found a significant effect of the number of categories on the overall accuracy (F(2,36) = 13.97, p < .001, η2 = .437, BF10 = 3.125). The percentage of correct answers was lower when participants had to remember all items from one category compared to two or four categories (one category, M = 60.05%, vs. two categories, M = 68.5%: t(18) = 5.056, p < .001, Bonferroni corrected α = .017, BF10 = 124.077, Cohen’s d = 1.16; one category, M = 60.05%, vs. four categories, M = 66.5%: t(18) = 3.863, p < .001, Bonferroni corrected α = .017, BF10 = 29.434, Cohen’s d = .886; two categories, M = 68.5%, vs. four categories, M = 66.5%: t(18) = 1.193, p = .241, Bonferroni corrected α = .017, BF10 = 0.548, Cohen’s d = .274; see Fig. 7a). This result replicates one of the principal findings from Experiment 1, where participants also committed more localization errors for objects belonging to the same category.

Fig. 7

The results of Experiment 3A: percentage of correct object localizations for three conditions (a); percent of correct answers for the condition with 2 categories (b)

Within-category vs. across-category localization errors

Our analysis of responses broken down by outcomes in the two-categories condition found a strong effect (F(2,36) = 136.444, p < .001, η2 = .883, BF10 > 10²²). The strength of the effect was predominantly provided by the high prevalence of correct answers (M = 68.5 vs. foil from the same category, M = 17.6: t(18) = 12.784, p < .001, Bonferroni corrected α = .0167, BF10 > 10⁶, Cohen’s d = 2.933; correct answer vs. foil from the different category, M = 6.95: t(18) = 15.452, p < .001, Bonferroni corrected α = .0167, BF10 > 10⁷, Cohen’s d = 3.545). More interestingly, the foils were also chosen with different probabilities: the foil from the same category as the target was chosen more often than any of the foils from the different category (foil from the same category vs. foil from the different category: t(18) = 2.67, p = .011, Bonferroni corrected α = .017, BF10 = 3746, Cohen’s d = .612; see Fig. 7b).

Discussion

The results of Experiment 3A replicated one of the major findings from Experiments 1 and 2 using a different version of the object-location memory test. We found that the ability to correctly localize a remembered object was impaired when all objects had low distinctiveness (all items belonged to the same object category). The novel finding from Experiment 3A was that incorrect localizations were distributed non-uniformly across the rest of the items. The nontarget from the same category as the target was chosen about 2.5 times as frequently as each of the two foils from the different category (17.6% vs. 6.95%). That is, even when participants failed to remember which particular object had been presented at a certain location and, thus, committed a swap, they still could rely on some memory about the category presented at that location. This can account for the greater swap rate in the displays consisting of objects from the same category compared to the displays consisting of objects from different categories. It can be easier to store multiple categories related to certain locations than multiple within-category variations provided by different exemplars of the same category.
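The contrast between the two accounts can be checked with simple arithmetic on the reported two-category means. Under the all-or-none account, errors are random guesses among the three nontargets, so each foil should attract an equal share of the error rate. The sketch below computes that uniform baseline and compares it with the observed choice rates; the variable names are illustrative, and the percentages are the group means reported in the Results.

```python
# Group means (percent of responses) in the two-category condition of Experiment 3A.
correct = 68.5
same_category_foil = 17.6
different_category_foil_each = 6.95  # per foil; there are two such foils

# All-or-none account: errors are spread evenly over the three nontargets.
uniform_share_per_foil = (100.0 - correct) / 3

# Observed within-category vs. between-category swap ratio.
observed_ratio = same_category_foil / different_category_foil_each

print(f"Uniform-guessing prediction per foil: {uniform_share_per_foil:.1f}%")
print(f"Observed same-category foil: {same_category_foil}%")
print(f"Observed different-category foil (each): {different_category_foil_each}%")
print(f"Same/different ratio: {observed_ratio:.2f}")
```

The same-category foil lies above the uniform baseline and each different-category foil lies below it, which is the pattern expected if nontargets compete according to their similarity to the target rather than being picked at random.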

Experiment 3B

In Experiment 1, we tested both object recognition memory and object localization for sample sets of three objects. In Experiment 3A, we tested only localization accuracy for sample sets of four objects, but we did not test recognition. Experiment 3B aimed to test object recognition alone, without any test of object-location memory, in the same setting as was used in Experiment 3A.

Method

Participants

Eighteen psychology students from the Higher School of Economics (17 female; age: 18-39, M = 20.88) took part in the experiment for extra course credits. All participants reported having normal color vision, normal or corrected to normal visual acuity, and no neurological problems. The sample size was determined based on the same rules as in Experiment 1. Before the beginning of the experiment, they signed an informed consent form.

Apparatus, stimuli, spatial layout and procedure

The sample stimuli were similar to those of Experiment 3A in terms of the number of objects, their layout, and distinctiveness. In Experiment 3B, spatial memory was not tested; observers were asked to remember only the objects. On the test display following the one-second retention interval, two objects were presented to the right and to the left of the center. As in the recognition task of Experiment 1, one of the objects was “old” and the other object was “new”. Both the old and the new objects were drawn from the same category. The spatial positions of these two objects were randomized across trials. The observers had to click on the object they considered to be old.

Data analysis

The percentage of correct recognition was calculated. In other aspects, the data analysis was similar to Experiment 1.

Results and discussion

We found no significant differences between the three conditions (4 categories: M = 75.3%; 2 categories: M = 76.2%; 1 category: M = 76.2%; comparison: F(2,34) = .251, p = .779, η2 = .015, BF10 = .169). This result replicates the finding of equal recognition accuracy in all distinctiveness conditions (Experiment 1). It also suggests that the effect of distinctiveness on localization accuracy found in Experiment 3A is not caused by object recognition decrement. Together with Experiment 3A, Experiment 3B also suggests that distinctiveness affected object-location reports, but not object recognition.

One might argue that the difference between the effects of distinctiveness on object recognition (Experiment 3B) and object-location memory (Experiment 3A) could be due to the difference in the ways distinctiveness was manipulated at test. Indeed, at the recognition test, a foil item was always an item from the same category as the target, which was necessary to keep test difficulty fixed (Awh et al., 2007). In contrast, object-location memory was tested using only items from the sample set as test options (otherwise, the object-location task would have turned into another version of the recognition task, in which observers would have just chosen familiar objects instead of trying to recall where these objects belonged). As a consequence, the options in the object-location test inherited the distinctiveness of the sample set. However, there is an important argument against the idea that the differential effects of distinctiveness were due to the difference between the distinctiveness manipulations used in the two tasks. The basic pattern replicates the pattern observed in Experiment 1, which we consider to be condition-invariant for both object and object-location memories. Recall that in Experiment 1 observers were tested for object memory using the same 2-AFC task as in Experiment 3B. As for object-location memory, the observers had to locate a single item remaining on the screen after they had performed the 2-AFC; so, there were no other test options that could provide contextual differences in terms of distinctiveness.

General Discussion

Our main goal was to test the effects of distinctiveness on object recognition and object-location memory. The question of principal interest was how the distinctiveness of real-world objects stored in VWM affects the ability to recognize the objects, remember locations, and report object-location conjunctions, and whether these effects are similar. To this end, we tested recognition memory for high-distinctive vs. low-distinctive objects in Experiments 1 and 3B. Our results from both experiments showed equally good performance regardless of distinctiveness. This suggests that our participants had reasonably good visual memory for different objects, even when they belonged to the same category and therefore were more similar and potentially more mutually interfering (Cohen et al., 2014; Konkle et al., 2010). Recall that previous research with the same stimulus set found a detrimental, though not dramatic, effect of within-category similarity on recognition in massive long-term memory (Konkle et al., 2010). Experiments 1 and 2 showed that spatial memory per se was good in all distinctiveness conditions, as shown by mixture modeling of localization errors (Bays et al., 2009; Zhang & Luck, 2008). From the consistently very low Pguess, we conclude that observers had no substantial problem with storing all three locations (Experiment 1), which is in line with the previous estimates of spatial VWM (Postma & De Haan, 1996) and VWM in general (Alvarez & Cavanagh, 2004; Cowan, 2001; Luck & Vogel, 1997). More importantly, we found no evidence that spatial memory suffered from the need to store categorically similar objects compared to distinct objects, as Pguess did not depend on object distinctiveness. This suggests that the requirement to store less distinct objects in VWM did not cause more location forgetting. Similarly, we found practically no effect of object distinctiveness on the precision of memory for locations.
Experiment 2 additionally showed that the representations of locations were not strongly affected by the need to remember object identities and their relative positions (“bindings”). This pattern is basically in line with a claim that object memory and location memory have separate capacities and appear to be independent, as has been shown in a number of previous studies (Lee & Chun, 2001; Li et al., 2015; Wood, 2011). However, although there was no effect of object distinctiveness on object recognition and location memory, object-location memory was affected by distinctiveness (Experiments 1 and 2), as we found more swap errors when the objects were low-distinctive.

To account for the differential effects of object distinctiveness on the different aspects of tested memories, we can turn to existing models of recognition and binding in VWM that rely on the idea of noisy representations or noisy familiarity judgments in continuous feature spaces (e.g., Oberauer & Lin, 2017; Schneegans & Bays, 2017; Swan & Wyble, 2014; Schurgin, Wixted, & Brady, 2020). From this perspective, the differential effects of distinctiveness on object recognition and object-location swaps can reflect some important differences in the accessibility and discriminability of object or location representations depending on the task. Since simple object recognition in a 2-AFC requires only a familiarity judgment (which of the alternatives looks more familiar), it should naturally depend on target-foil distinctiveness (Awh et al., 2007; Schurgin et al., 2020), which we objectively kept fixed across conditions, as we always used foils from the same category as the target. That is, the familiarity of the target compared to the foil was about the same. We should note, however, that when all studied items belong to the same category (low distinctiveness), the subjective target-foil distinctiveness still can decrease in theory (for example, by increasing the familiarity of the whole category including the foil), causing more false alarms to foils (as in the DRM effect). We did not observe this in our VWM recognition task. This finding is different from the existing (though not dramatic) distinctiveness effect on LTM for the same stimulus set (Konkle et al., 2010). This interesting difference between distinctiveness effects on two memory systems can be a subject of further research.

While object recognition is not affected by the distinctiveness due to the fixed familiarity ratio between the target and the foil, object-location memory is tested using a cued recall task where one of the studied “items” (object as in Experiments 1 and 2 or location as in Experiment 3A) is presented as a cue and another one is to be reported. Here, both the cues and the to-be-reported items are familiar. Hence, the critical discrimination here is not between more familiar and less familiar features but between equally familiar features that have to be correctly linked with the cued feature. Here, target distinctiveness plays a greater role. The probability that a certain item (target or nontarget) is recalled will depend on the similarity between the cued and uncued features and/or similarity between the items these cues address. This idea is most clearly illustrated by the two-category condition from Experiment 3A (Fig. 7b). When observers memorize four items from categories A and B in four locations, and then the location of one of the items from category A is cued, the observers will more frequently misreport the other exemplar from category A because it is more similar to the item associated with the cue. Figure 8 depicts this as probabilistic familiarity judgments using a signal detection model (Macmillan & Creelman, 2005; Schurgin et al., 2020). Here, a location cue (Fig. 8a) makes each item in a 4-AFC array produce a familiarity signal randomly drawn from a normal distribution whose mean is defined by the reliability of object-location binding. If this binding is reliable, then the target item distribution is the rightmost (having the strongest familiarity on average, Fig. 8b).
Target-nontarget similarity defines how much the nontarget distributions are shifted to the left relative to the target (a shift that we term Δd′, the difference between the means of the target and the nontarget distributions measured in units of standard deviation). This shift is greater if the nontarget is highly distinct from the target. On each trial, the observer chooses the item that produced the highest familiarity signal. The overlap between the distributions predicts that there will be trials when one or several distractors produce stronger signals than the target, in which case the observer will commit a swap error (Fig. 8c). The proportion of swaps depends on the distance between the distributions; therefore, it will be greater for more similar items. Our model fits showed that the percentages of swaps we observed in this condition of Experiment 3A (17.6% for the same category and 6.95% for the different category) are reproduced with Δd′ = 1.08 and 1.67, respectively (Fig. 8d). The same logic can be applied to other conditions, that is, when all objects are from the same or different categories.
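The maximum-familiarity decision rule described above can be illustrated with a small Monte Carlo sketch. It assumes equal-variance (unit SD) normal familiarity distributions, a target mean of 0, one same-category nontarget shifted by 1.08, and two different-category nontargets each shifted by 1.67. The function name, trial count, and seed are arbitrary choices for illustration, and this is not the fitting procedure used for Fig. 8d.

```python
import random


def simulate_max_rule(delta_same, delta_diff, n_trials=100_000, seed=1):
    """Simulate 4-AFC choices under a max-familiarity rule.

    Item 0 is the target (mean 0), item 1 the same-category nontarget,
    items 2 and 3 the different-category nontargets. All SDs are 1 (assumed).
    Returns (p_correct, p_same_foil, p_each_different_foil).
    """
    rng = random.Random(seed)
    means = [0.0, -delta_same, -delta_diff, -delta_diff]
    counts = [0, 0, 0, 0]
    for _ in range(n_trials):
        signals = [rng.gauss(mu, 1.0) for mu in means]
        # The observer reports the item with the strongest familiarity signal.
        counts[signals.index(max(signals))] += 1
    p_correct = counts[0] / n_trials
    p_same = counts[1] / n_trials
    p_diff_each = (counts[2] + counts[3]) / (2 * n_trials)
    return p_correct, p_same, p_diff_each


p_correct, p_same, p_diff_each = simulate_max_rule(1.08, 1.67)
print(f"correct: {p_correct:.3f}")
print(f"same-category swap: {p_same:.3f}")
print(f"different-category swap (each foil): {p_diff_each:.3f}")
```

Under these assumptions the simulated proportions should land in the neighborhood of the observed 68.5%, 17.6%, and 6.95%. Varying the two shifts shows the qualitative point of the model: the smaller the shift of a nontarget, the more often it wins the comparison and produces a swap.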

Existing quantitative models of binding in VWM also predict that feature similarity and distinctiveness should affect performance in an object-location task (Oberauer & Lin, 2017; Schneegans & Bays, 2017; Swan & Wyble, 2014). However, as these models are mostly built to account for recall of simple continuous features (such as continuous color reports based on item location), they should be applied with caution to our data on meaningful objects. Indeed, our data confirm some of the predictions from these models but are at odds with others. Specifically, the models predict that cue similarity increases competition between associated items, which should result in a greater proportion of swaps. This is exactly what was observed in Experiments 1 and 2 when the objects were used as cues and locations were to be reported. On the other hand, the models predict that similarity between to-be-reported features should decrease interference between them (Oberauer & Lin, 2017; Swan & Wyble, 2014). This seems not to be the case for our data, especially in Experiment 3A, where more similar to-be-reported objects yielded more swaps. A discrepancy between the directions of distinctiveness effects on simple visual features and semantically meaningful real-world objects has also been observed in the previous literature (e.g., Jiang et al., 2016b). A possible explanation for this discrepancy is the continuous vs. discrete nature of the objects (Jiang et al., 2016b, consider this to be one of the crucial factors mediating the direction of distinctiveness effects). In our Experiment 3A, targets and nontargets were discrete items, so reporting one instead of another was unambiguously interpreted as a swap. In contrast, the binding models (Oberauer & Lin, 2017; Swan & Wyble, 2014) make their predictions for continuous features for which error distributions can be built, and their precision (SD) can be estimated with the mixture model.
If targets and nontargets are similar and both influence current responses, observers usually show sharper error distributions compared to dissimilar targets and nontargets. This occurs because a similar nontarget will not pull the response away from the correct one as strongly as a distinct nontarget. Moreover, when the target and nontarget are similar, it is harder to decompose the corresponding error distribution into correct answers and swaps (Bays et al., 2009). Therefore, the discrepancy between our data and the predictions of the existing models can reflect a difference between approaches to data analysis and interpretation, whereas the true direction of the effect can be the same. Future research can focus on identifying what other mechanisms can cause the difference between simple features and real-world objects in terms of distinctiveness effects. This also calls for the existing quantitative binding models to provide a unified account of VWM for simple features and complex objects.

Up to this point, we have discussed the effects of object distinctiveness on object and object-location memory in terms of a single recognition mechanism that takes into account only quantitative differences in familiarity produced by targets and foils. However, since our stimuli were meaningful real-world objects, there is a possibility that object-location memories could be affected by the use of specific encoding and/or retrieval strategies working on the conceptual level. We suggest that when objects from different categories are presented, observers can rely on coarse category-location knowledge even when they fail to rely on the precise object-location knowledge. For example, looking at Fig. 6, observers can remember which two out of four locations contained backpacks even if they fail to remember which particular backpack was in which of these two “backpack locations”. As a result, when one of these “backpack locations” is probed, the observers would choose a random backpack (sometimes it will be a correct answer, and sometimes it will be a swap) more often than a random bottle. This will cause within-category swaps more often than between-category swaps. Overall, this coarse knowledge of category-location associations can be useful in all cases when different categories are shown in different places, which provides an advantage to object-location reports in all high-distinctiveness conditions of our experiments. Notably, this advantage of different categories does not necessarily mean that observers verbally label locations with category names, especially given the articulatory suppression task we employed in our experiments to prevent observers from verbal labeling. It can be a more abstract, conceptual labeling that is less dependent on encoding modality.
This is an intriguing possibility that can be tested in future research (for example, comparing effects of different categories with and without articulatory suppression, as in Dent & Smyth, 2005; Postma & De Haan, 1996).
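The coarse category-location account can be illustrated with a small worked calculation. The sketch below is not a model fitted in this study; it assumes a simple three-state mixture for a two-category trial with four items (two per category). The function name and the parameters `p_precise` and `p_category` are hypothetical and chosen only for illustration.

```python
def category_guess_outcomes(p_precise, p_category):
    """Outcome probabilities for a two-category, four-item trial.

    Assumed (hypothetical) retrieval states:
      - with probability p_precise the exact object-location binding is known;
      - otherwise, with probability p_category only the category at the probed
        location is known, and the observer guesses between its 2 members;
      - otherwise the observer guesses among all 4 objects.
    """
    p_cat_only = (1 - p_precise) * p_category
    p_no_info = (1 - p_precise) * (1 - p_category)

    correct = p_precise + p_cat_only * (1 / 2) + p_no_info * (1 / 4)
    # One same-category foil: chosen on half of category guesses
    # and on a quarter of uninformed guesses.
    within_swap = p_cat_only * (1 / 2) + p_no_info * (1 / 4)
    # Two other-category foils: chosen only on uninformed guesses.
    between_swap = p_no_info * (2 / 4)
    return {"correct": correct,
            "within_swap": within_swap,
            "between_swap": between_swap}


# Illustrative (made-up) parameter values.
outcomes = category_guess_outcomes(p_precise=0.5, p_category=0.6)
baseline = category_guess_outcomes(p_precise=0.5, p_category=0.0)
```

Under these assumptions, within-category swaps outnumber between-category swaps only when category-location knowledge is available often enough (here, when `p_category` exceeds 1/3). With no category knowledge, between-category swaps are the more frequent type simply because there are two between-category foils and only one within-category foil.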

Fig. 8

Object-location report as a noisy familiarity judgment. a An example two-category trial from Experiment 3A. b A signal-detection model of object retrieval given the location cue from (a). Each distribution corresponds to the overall internal representation of one of the four tested objects along the familiarity axis. Relative shifts between the distributions (Δd′) are a function of the similarity between an item and the target. c In each trial, each object produces a random familiarity value from its corresponding distribution. The item producing the maximum value is chosen as the answer. If the target produces the maximum familiarity, the response counts as correct; otherwise the response counts as a swap. d Best-fit predictions of different response outcomes from the two-category condition of Experiment 3A.
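The maximum-familiarity decision rule described in panels b and c can be sketched as a short simulation. This is only an illustration under stated assumptions: unit-variance Gaussian familiarity distributions, made-up mean values, and a hypothetical function name. It does not reproduce the fitted parameters shown in panel d.

```python
import random


def simulate_max_familiarity(means, n_trials=20000, seed=0):
    """Proportion of trials on which each item yields the maximum familiarity.

    means: dict mapping an item label to the mean of its familiarity
    distribution (unit-variance Gaussian, an assumption of this sketch).
    """
    rng = random.Random(seed)
    counts = {label: 0 for label in means}
    for _ in range(n_trials):
        # Each object produces one random familiarity value per trial.
        samples = {label: rng.gauss(mu, 1.0) for label, mu in means.items()}
        # The item with the maximum familiarity is reported.
        winner = max(samples, key=samples.get)
        counts[winner] += 1
    return {label: n / n_trials for label, n in counts.items()}


# Hypothetical means: the same-category foil is shifted less from the
# target than the two different-category foils.
means = {
    "target": 1.5,
    "within_category_foil": 1.0,
    "between_category_foil_1": 0.0,
    "between_category_foil_2": 0.0,
}
props = simulate_max_familiarity(means)
p_correct = props["target"]
p_within_swap = props["within_category_foil"]
p_between_swap = props["between_category_foil_1"] + props["between_category_foil_2"]
```

In this sketch a response is correct when the target wins the comparison, and any other winner counts as a swap. Because the same-category foil lies closer to the target on the familiarity axis, it wins more often than either different-category foil.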

In sum, in our experiments, we observed that simple object recognition in VWM does not suffer from the low distinctiveness of studied objects (although previous research has shown the opposite for LTM – e.g., Konkle et al., 2010). However, distinctiveness did have an effect on object-location memory: more swaps occurred when the objects were more similar. These differential effects may indicate important differences between two ways of accessing the contents of VWM. In simple recognition, the observer decides which items look more familiar (old) and which items look less familiar (new), whereas in an object-location task, the observer chooses which of several equally familiar representations better matches a retrieval cue (which is the essence of binding). Object-location retrieval, therefore, involves more competition between representations, which, as we assume, is mediated by distinctiveness (Oberauer & Lin, 2017; Schneegans & Bays, 2017; Schurgin et al., 2020; Swan & Wyble, 2014). In addition, the nature of distinctiveness in the objects we tested leaves open the possibility that observers use categorical labeling of locations to maintain coarse object-location information even when finer information about a particular object at a particular location fails. Further research is required to clarify the potential role of conceptual encoding of categories in object-location memory.