1 Introduction

With advances in technology, robots have progressed rapidly and are increasingly common in hospitality and tourism, providing social services for customers such as housekeeping and check-in/out in hotels and food ordering/preparation in restaurants [1]. In particular, the threat of COVID-19 has driven robotics adoption in hotels and restaurants to enhance sanitation and physical distancing [2,3,4], a development favored by customers [5].

However, among the various robot services in restaurants and hotels, cooking food is not perceived as an appropriate activity for robots [6], possibly because people hold low quality predictions for robot-cooked food [2, 7, 8]. Chefs thought that “during the COVID-19 outbreak, robots can be used partially to reduce workload, but they cannot be permanent” because “cooking is directly proportional to hand taste (the taste of some food keeping a trace of the loving hand that prepared it)” and “the dishes will turn into ordinary fabricated products without human touch” [2]; they also thought robotic chefs could not make food of as high a quality as human chefs, because robots lack tacit knowledge and cooking skills [8]. Similarly, customers perceived robot-cooked food as less tasty than human-cooked food, and this perceived inferiority was stronger in luxury than in non-luxury restaurants (e.g., fast food, casual restaurants) [7].

Given the benefits of robotic chefs for the hospitality and tourism industry, especially under the threat of COVID-19, it would be helpful to further understand people’s perception of the quality of robot-cooked food and the factors influencing that perception. Specifically, most potential customers have no experience with food made by robotic chefs; therefore, they might make food quality predictions based on extrinsic cues rather than the intrinsic qualities of the food. Among the various extrinsic cues influencing food quality prediction [9, 10], novelty/familiarity and production processes might be related to people’s quality prediction of robot-cooked food, because food cooked by robotic chefs is novel to most people, and its production process differs from the traditional one (i.e., cooking by robots rather than humans). However, the effects of these extrinsic cues on the quality prediction of robot-cooked food are inconsistent: they might be either negative or positive.

First, robots are essentially machines; therefore, people might regard robot-cooked food as a machine-made product, which is less attractive than human-cooked food. Previous studies have shown that people favor handmade products over machine-made ones: they rated machine-made grape juice as less natural [11] and machine-made products as less attractive [12]. In parallel, as noted above, an interview with human chefs revealed that they regarded robotic chefs as machines and predicted robot-cooked food to be inferior to human-cooked food, because robots would turn dishes “into ordinary fabricated products” [2].

Second, robotic chefs are high-tech creations, and their novel high-tech cues might help people accept the food they make. Studies on food expectation have shown that novel cues related to food can promote people’s preference for it [9, 10]. For example, calling a chocolate-flavored liquid “space food” dramatically promoted people’s acceptance of it [13]. Likewise, a qualitative case study of customer reviews showed that customers were fascinated by the “robotic cooking” of a fast-food restaurant [14].

Moreover, the anthropomorphic appearance of robotic chefs might also influence food quality prediction, but different theories suggest different patterns of influence. On the one hand, the ‘like-me’ hypothesis in human–robot interaction suggests that the more humanlike a robot is perceived to be, the more it is regarded as ‘like-me’ (i.e., a human) [15, 16]; therefore, a robotic chef with a more anthropomorphic appearance might be regarded as more like a human chef, which in turn might increase the perceived similarity between robot-cooked food and human-cooked food, and thus promote food expectation by increasing familiarity and reducing uncertainty [10, 17, 18]. On the other hand, the well-known but controversial uncanny valley hypothesis suggests that people’s attitude toward a humanlike robot shifts abruptly from empathy to revulsion when it approaches but fails to attain a lifelike appearance [19,20,21]; therefore, the uncanny valley might also appear for robotic chefs, and people’s affinity for robotic chefs might in turn affect their food quality prediction, producing an uncanny valley of food quality prediction.

However, little is known about the effect of robotic chefs’ anthropomorphism on food expectations. Current robotic chefs are mostly designed with low or non-anthropomorphic appearances, such as a machine with two humanoid arms (sometimes also with a mechanical head) [22, 23] or with a drum-like wok [24, 25]. An empirical study has shown that people predict food cooked by a low anthropomorphic robotic chef to be better than food cooked by a non-anthropomorphic robotic chef [26]. Nevertheless, it remains unknown whether further increasing robotic chefs’ anthropomorphic appearance would further promote or instead reduce people’s expectations of them and their food, as the ‘like-me’ hypothesis and the uncanny valley hypothesis respectively suggest.

Furthermore, when comparing people’s food quality prediction between human-cooked and robot-cooked dishes, the cooking difficulty level of dishes should be taken into account because (1) food with a low cooking difficulty level might be regarded as less tasty than food with a high cooking difficulty level, and (2) preparing food with a high or low cooking difficulty level requires advanced or basic cooking skills, respectively [27, 28], which might be expected to differ between human and robotic chefs. The effect of cooking difficulty level receives preliminary support from a study on people’s evaluations of robot- and human-cooked food in luxury and casual restaurants [7]: people predicted food in casual restaurants to be less tasty than food in luxury restaurants; they also predicted robot-cooked food to be less tasty than human-cooked food in both luxury and casual restaurants, and this inferiority was stronger in luxury than in casual restaurants. These results might be explained if people expected (1) food provided in luxury restaurants to be of a higher cooking difficulty level than food provided in casual restaurants, and (2) robotic chefs to be inferior to human chefs in both basic and advanced cooking skills [8], with this inferiority even larger for advanced cooking skills. The effect of cooking difficulty level might also explain the current discrepant attitudes towards robot-cooked food: customers liked “robotic cooking” in the fast-food restaurant [14], possibly because the cooking difficulty level was low (i.e., a pan cooking the meat for a bowl meal and humans then adding some vegetables and dressing); whereas people thought robots were not appropriate for cooking food in restaurants and hotels [2, 6, 8], possibly because they had in mind dishes with a high cooking difficulty level (e.g., the famous Chinese dish “Buddha Jumps Over the Wall”, which involves about 20 ingredients and 30 procedures and takes seven days to prepare).
However, to the best of our knowledge, no study directly compares robot-cooked and human-cooked food across different cooking difficulty levels; thus, empirical evidence is needed to verify the effect of cooking difficulty level.

2 The Current Study

As noted above, previous studies have shown that some customers were fascinated by the “robotic cooking” in a fast-food restaurant [14], but other customers thought robots were not appropriate for cooking food in restaurants and hotels [2, 6, 8]; these inconsistent findings might arise because (1) fast-food and luxury restaurants provide dishes with low and high cooking difficulty levels, respectively, and (2) despite being fascinated by “robotic cooking”, customers of the fast-food restaurant did not explicitly compare robot-cooked food with human-cooked food. Therefore, we establish the following hypotheses:

H1

Robot-cooked food will be predicted as above average quality but inferior to human-cooked food.

H2

Lower cooking-difficulty food will be predicted as inferior to higher cooking-difficulty food.

Moreover, because the novel cues related to food can promote people’s preference for it [9, 10, 13], the following hypothesis is proposed:

H3

Chinese participants will predict robotic chefs are better at Western food (i.e., novel food) than Chinese food.

Finally, the anthropomorphic appearance of robotic chefs might also influence food quality prediction. If familiarity affects the quality prediction of robot-cooked food, it may be possible to raise people’s expectations by increasing their familiarity with robotic chefs. According to the ‘like-me’ hypothesis [15, 16], the following hypothesis is proposed:

H4a

A robotic chef with a more anthropomorphic appearance might be regarded more like a human chef, which might increase people’s familiarity with it and thus promote food quality prediction.

However, according to the uncanny valley hypothesis [19,20,21], an alternative hypothesis is proposed:

H4b

A robotic chef with an overly anthropomorphic appearance (one that approaches but fails to attain lifelikeness) might induce revulsion, which might in turn decrease the quality prediction of robot-cooked food.

In this study, across three experiments, we compared people’s food quality predictions for human- and robot-cooked foods based on images of dishes that were described as cooked by a human or a robotic chef (H1), and investigated the effects of cooking difficulty, novel cues, and the anthropomorphism of robotic chefs. In Experiment 1, the effect of cooking difficulty level was examined (H2); participants rated both human-made and robot-made dishes, and the dishes’ cooking difficulty was varied across low, medium, and high levels. In Experiment 2, in order to examine the effect of novel cues in food and chefs (H3), Chinese participants rated Chinese and Western dishes cooked by Chinese, Western, and robotic chefs. Finally, in Experiment 3, the effect of anthropomorphism (H4a, H4b) was examined by varying robotic chefs’ appearances across low, medium, and high anthropomorphism.

3 Experiment 1: Cooking Difficulty Level

3.1 Participants

Ninety-one Chinese university students (73 females and 18 males, aged 18–27 years, M = 22.13, SD = 0.25) participated in this study for monetary compensation. None of them had engaged in chef-related work. Since most people have never tasted food made by robotic chefs, participants who had tasted robot-cooked food were excluded in this and the following experiments. Two participants were excluded from this experiment. A power analysis for ANOVA was conducted with G*Power 3.1 (effect size = 0.15, based on a pilot study, see Footnote 1; power = 0.8), indicating that the minimum sample size was 49. The present sample size exceeded this minimum.

3.2 Materials

3.2.1 Pictures of Chefs

As shown in Fig. 1, screenshots from cooking videos of four human chefs (two women and two men) and four robotic chefs were used. The robotic chefs were three different versions of Moley cooking robots and one Samsung kitchen robot, all of which had two humanoid arms. For each chef, three pictures were taken to represent different cooking stages.

Fig. 1

Example pictures of human and robotic chefs used in Experiment 1

3.2.2 Pictures of Dishes

Pictures of six Chinese dishes were used. As shown in Fig. 2, there were two dishes of low cooking difficulty (Sauteed cabbage with vinegar sauce, and Hot and sour potato shreds), two of medium difficulty (Braised eggplants, Kung Pao Chicken), and two of high difficulty (Sweet and sour pork ribs, Braised pork). Since there is no recognized standard for the cooking difficulty of Chinese dishes, we consulted various Chinese recipe websites and selected fifteen typical dishes. Then, a separate group of forty-four Chinese university students (28 females and 16 males, aged 18–31 years, M = 23.36, SD = 0.41; not engaged in cooking-related work) was invited to evaluate the cooking difficulty of the fifteen dishes (“I think this dish is difficult to cook”) on nine-point scales (1 = “strongly disagree” to 9 = “strongly agree”). Based on their evaluation scores, six dishes were selected, two for each cooking difficulty level (M = 2.93, 4.06, and 5.28 for low, medium, and high, respectively).

Fig. 2

Six Chinese dishes used in Experiment 1, with cooking difficulty levels as low, medium, or high

3.2.3 Questionnaire on Food Quality Prediction

Adapted from [26], the food quality prediction was evaluated from four aspects: presentation (“I think this dish is visually attractive”), smell (“I think this dish has enticing smell”), taste (“I think this dish is tasty”) and safety (“I think this dish is safe to eat”). All items were measured on nine-point scales (1 = “strongly disagree” to 9 = “strongly agree”).

3.2.4 Questionnaire on Chefs

Subjective perception of chef’s competence was measured with the item adapted from [26], that is, “Besides above dishes, I think this human/robotic chef can cook a variety of foods” (1 = “strongly disagree” to 9 = “strongly agree”). Moreover, the Godspeed questionnaire [29] was adapted to measure subjective perceptions of chefs’ anthropomorphism, animacy, likeability, and intelligence. Under each perception, participants rated on 9-point semantic differential scales between two bipolar words, such as “fake-natural”, “dead-alive”, “unfriendly-friendly”, or “unintelligent-intelligent”.

3.3 Procedure and Design

Each participant rated a human chef, a robotic chef, and three dishes cooked by each chef. Participants first viewed three pictures of a human (or a robotic) chef and then viewed pictures of three dishes (one for each cooking difficulty level). They were informed that the dishes were cooked by that chef and completed the food quality prediction questionnaire after viewing each dish. After rating the three dishes, they completed the questionnaire on chefs. Then, in the same way, they viewed and rated a robotic (or a human) chef and the other three dishes made by that chef. For each participant, the human and the robotic chefs were randomly selected from the eight candidate chefs (four human and four robotic). The order of human/robotic chefs and the order of dishes were counterbalanced across participants. The same pictures of dishes were presented to all participants, but whether each dish was described as cooked by the human or the robotic chef was counterbalanced across participants.
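
The counterbalancing described in this section can be sketched as follows. This is only an illustration under stated assumptions, not the authors' actual assignment procedure: the chef labels, the grouping of the six dishes into two sets (one dish per difficulty level in each), and the rule of alternating conditions by participant index are all hypothetical. Counterbalancing of dish order within a block is left out for brevity.

```python
import random

# Hypothetical labels; the paper does not name individual chefs.
HUMAN_CHEFS = ["human_1", "human_2", "human_3", "human_4"]
ROBOT_CHEFS = ["robot_1", "robot_2", "robot_3", "robot_4"]

# Assumed grouping: each set holds one dish per difficulty level
# (low, medium, high). The actual pairing used in the study is not reported.
DISH_SETS = {
    "A": ["sauteed_cabbage", "braised_eggplants", "sweet_sour_pork_ribs"],
    "B": ["potato_shreds", "kung_pao_chicken", "braised_pork"],
}


def assign(participant_index, rng):
    """Return the two rating blocks (in presentation order) for one participant."""
    # Alternate which chef type is rated first.
    human_first = participant_index % 2 == 0
    # Alternate which dish set is attributed to the human chef.
    human_gets_a = (participant_index // 2) % 2 == 0

    human_block = {
        "chef": rng.choice(HUMAN_CHEFS),  # random pick from four candidates
        "dishes": DISH_SETS["A" if human_gets_a else "B"],
    }
    robot_block = {
        "chef": rng.choice(ROBOT_CHEFS),
        "dishes": DISH_SETS["B" if human_gets_a else "A"],
    }
    if human_first:
        return [human_block, robot_block]
    return [robot_block, human_block]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the illustration is repeatable
    for i in range(4):
        print(i, assign(i, rng))
```

With this scheme, every run of four consecutive participants covers each combination of chef order and dish attribution once.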

3.4 Results

3.4.1 Food Quality Prediction

In general, participants predicted that the foods cooked by robotic and human chefs were of above-average quality. One-sample t-tests showed that the rating scores of dishes cooked by robotic chefs (low, medium, high cooking difficulty level: M = 6.03, 6.46, 6.85; SD = 1.45, 1.64, 1.58) and human chefs (M = 6.61, 7.19, 7.70; SD = 1.44, 1.36, 1.25) were significantly greater than the intermediate value (i.e., 5): robot-cooked food with low, medium, high cooking difficulty level, t(88) = 6.67, 8.40, 11.07, ps < 0.001; human-cooked food, t(88) = 10.55, 15.17, 20.34, ps < 0.001.
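
The one-sample t statistics can be approximately recomputed from the reported means and standard deviations. The sketch below assumes n = 89 (91 recruited minus 2 excluded, which matches the reported 88 degrees of freedom) and uses the standard formula t = (M - 5) / (SD / sqrt(n)). Because M and SD are rounded to two decimals, the recomputed values should agree with the reported ones only to within a few hundredths. The function and variable names are illustrative.

```python
import math


def one_sample_t(mean, sd, n, mu=5.0):
    """t statistic for H0: population mean == mu, from summary statistics."""
    return (mean - mu) / (sd / math.sqrt(n))


N = 89  # assumed: 91 recruited minus 2 excluded, giving df = 88

# (mean, SD, reported t) for low, medium, high difficulty, copied from the text.
ROBOT = [(6.03, 1.45, 6.67), (6.46, 1.64, 8.40), (6.85, 1.58, 11.07)]
HUMAN = [(6.61, 1.44, 10.55), (7.19, 1.36, 15.17), (7.70, 1.25, 20.34)]

if __name__ == "__main__":
    for label, rows in (("robot", ROBOT), ("human", HUMAN)):
        for mean, sd, reported in rows:
            t = one_sample_t(mean, sd, N)
            print(f"{label}: recomputed t({N - 1}) = {t:.2f}, reported {reported}")
```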

Then, the data were subjected to a 2 (Chef: human vs. robotic) × 3 (Cooking difficulty level: low vs. medium vs. high) repeated measures ANOVA. As shown in Fig. 3, there were three major findings. First, participants consistently predicted that foods made by human chefs were of higher quality than those made by robotic chefs, as the main effect of chef was significant, F(1, 88) = 29.55, p < 0.001, \(\eta_{p}^{2}\) = 0.25. Second, this difference was not significantly affected by the cooking difficulty level, as the interaction between chef and cooking difficulty level was not significant, F(2, 176) = 0.78, p = 0.46, \(\eta_{p}^{2}\) = 0.01. Third, the cooking difficulty level affected food quality prediction; dishes with a higher cooking difficulty level were predicted to be of higher quality, as evidenced by the significant main effect of cooking difficulty level, F(2, 176) = 32.63, p < 0.001, \(\eta_{p}^{2}\) = 0.27, and pairwise comparisons, Fs(1, 88) > 18.31, ps < 0.001.
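
For readers who want to check the effect sizes, partial eta squared can be recovered from an F ratio and its degrees of freedom via the standard identity \(\eta_{p}^{2} = \frac{F \cdot df_{\text{effect}}}{F \cdot df_{\text{effect}} + df_{\text{error}}}\). The sketch below applies it to the three effects reported for Experiment 1. The function and variable names are illustrative, and the paper does not state how its effect sizes were computed.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# effect name -> (F, df_effect, df_error, reported partial eta squared)
EXPERIMENT_1 = {
    "chef": (29.55, 1, 88, 0.25),
    "chef x difficulty": (0.78, 2, 176, 0.01),
    "difficulty": (32.63, 2, 176, 0.27),
}

if __name__ == "__main__":
    for name, (f_value, df1, df2, reported) in EXPERIMENT_1.items():
        eta = partial_eta_squared(f_value, df1, df2)
        print(f"{name}: recomputed {eta:.3f}, reported {reported}")
```

Rounded to two decimals, the recomputed values match the reported ones for all three effects.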

Fig. 3

Food quality prediction for Experiment 1, as a function of chef and cooking difficulty level. Error bars represent ± 1 SEM

3.4.2 Perceptions of Chefs

Consistent with the food quality prediction results, paired-samples t-tests showed that human chefs were regarded as more competent than robotic chefs in cooking a variety of foods, t(88) = 5.19, p < 0.001; human chefs were also perceived as superior to robotic chefs in anthropomorphism, animacy, likeability, and intelligence, ts(88) > 9.46, ps < 0.001 (Table 1).

Table 1 Means (SEM in parentheses) of competence, anthropomorphism, likeability, intelligence, and animacy perceptions of robotic and human chefs in the three experiments

3.5 Discussion

The results of Experiment 1 confirmed H1 and fit well with the seemingly contradictory previous findings: on the one hand, participants predicted that the foods cooked by typical robotic chefs were of above-average quality, which parallels the previous finding that some customers were fascinated by the “robotic cooking” in a fast-food restaurant [14]; on the other hand, participants predicted that robot-cooked food was of lower quality than human-cooked food, which is consistent with previous findings that other customers thought robots were not appropriate for cooking food in restaurants and hotels [2, 6, 8].

The effect of cooking difficulty level confirmed H2 and is partially consistent with the findings for luxury and non-luxury restaurants [7]. Participants predicted that dishes with a lower cooking difficulty level were inferior in quality to dishes with a higher cooking difficulty level, which is similar to the previous finding that people rated food in the non-luxury restaurant as inferior in quality to food in the luxury restaurant. However, unlike the previous study, there was no significant interaction between chef and cooking difficulty level, although the quality difference between human-cooked and robot-cooked food increased numerically as the cooking difficulty level increased (low 0.58, medium 0.73, and high 0.85). This discrepancy might result from (1) the dishes used in this study being of a lower cooking difficulty level than the food provided in luxury restaurants, and (2) other extrinsic cues present in luxury restaurants (e.g., service and ambiance).

The results of Experiment 1 seemed to suggest that the novel high-tech cue in robotic chefs was not enough to induce participants to predict the quality of robot-cooked food as high as human-cooked food. However, the Chinese dishes used in this experiment were very familiar to our Chinese participants, which might undermine the novelty of robot-cooked foods.

4 Experiment 2: Novel Cues in Food and Chefs

In Experiment 2, the effects of novelty on food quality prediction (H3) were further examined by including Western dishes. Western chefs were also included to investigate whether Chinese participants made similar predictions towards robotic and Western chefs.

4.1 Participants

Eighty-four Chinese university students (60 females and 24 males, aged 18–30 years, M = 22.06, SD = 0.24; not engaged in chef-related work) participated in this study for monetary compensation. Five participants were excluded from this experiment because they had tasted robot-cooked food. Power analysis for ANOVA was conducted with G*Power 3.1 (effect size = 0.15, based on a pilot study, power = 0.8), indicating that the minimum sample size was 33. The present sample size exceeded this minimum.

4.2 Materials

4.2.1 Pictures of Chef

The pictures of Chinese and robotic chefs from Experiment 1 were also used in this experiment. In addition, screenshots from cooking videos of four Western chefs (two males and two females) were included (Fig. 4). As in Experiment 1, three pictures were taken of each chef to represent different cooking stages.

Fig. 4

Example pictures of Western chefs used in Experiment 2

4.2.2 Pictures of Dishes

Pictures of six Chinese dishes and six typical Western dishes were used. The Chinese dishes were selected based on the evaluation of 15 dishes in Experiment 1, and included three dishes of low cooking difficulty (Sauteed cabbage with vinegar sauce, Hot and sour potato shreds, and Scrambled eggs with tomatoes) and three dishes of high cooking difficulty (Sweet and sour pork ribs, Braised pork, and Sweet and sour fillet). The Western dishes were selected based on evaluations of 18 typical Western dishes by 30 Chinese participants, who were asked to evaluate each dish’s cooking difficulty (“I think this dish is difficult to cook”) and popularity (“I think it is a common western dish”). The six dishes finally selected were well known to Chinese participants (Fig. 5): three of low cooking difficulty (Vegetable salad, Fish and chips, and Spaghetti) and three of high cooking difficulty (Cheese baked rice, Fried steak, and Fried foie gras).

Fig. 5

Six Western dishes used in Experiment 2, with cooking difficulty levels as low or high

4.2.3 Questionnaires on Food Quality Prediction and Chefs

The questionnaires in Experiment 1 were used in this experiment.

4.3 Procedure and Design

Each participant rated a Chinese chef, a Western chef, a robotic chef, and two dishes cooked by each chef. As in Experiment 1, participants first viewed three pictures of a chef and then viewed pictures of two dishes (one of low and one of high cooking difficulty level) cooked by that chef. Participants made food quality predictions after viewing each dish and then rated the chef. For each participant, the Chinese, Western, and robotic chefs were randomly selected from the twelve candidate chefs. The order of chefs and the order of dishes were counterbalanced across participants. The same pictures of dishes were presented to all participants, but whether each dish was described as cooked by the Chinese, the Western, or the robotic chef was counterbalanced across participants.

4.4 Results

4.4.1 Food Quality Prediction

Similar to Experiment 1, participants predicted that the foods cooked by robotic and human chefs were of above-average quality. One-sample t-tests showed that the rating scores for low and high cooking-difficulty-level Chinese and Western dishes cooked by robotic and human chefs were significantly greater than the intermediate value (i.e., 5), ts(78) > 4.43, ps < 0.001.

Then, the data were subjected to a 3 (Chef: Chinese vs. Western vs. robotic) × 2 (Food type: Chinese vs. Western) × 2 (Cooking difficulty level: low vs. high) repeated measures ANOVA. As shown in Fig. 6, there were three major findings. First, consistent with the results of Experiment 1, dishes with a higher cooking difficulty level were predicted to be of higher quality, as the main effect of cooking difficulty level was significant, F(2, 156) = 51.58, p < 0.001, \(\eta_{p}^{2}\) = 0.40. Second, Chinese participants generally predicted that food cooked by Chinese chefs was of the highest quality, followed by food cooked by Western chefs, and then food cooked by robotic chefs, as evidenced by the significant main effect of chef, F(2, 156) = 21.60, p < 0.001, \(\eta_{p}^{2}\) = 0.22, and pairwise comparisons, Fs(1, 78) > 9.23, ps < 0.005. Third, food type affected food quality prediction and interacted with both chef and cooking difficulty level. The main effect of food type was significant, F(1, 78) = 4.09, p < 0.05, \(\eta_{p}^{2}\) = 0.05; Chinese participants predicted that Chinese foods were of better quality than Western foods. The interaction between food type and cooking difficulty level was significant, F(1, 78) = 14.89, p < 0.001, \(\eta_{p}^{2}\) = 0.16. Planned contrasts showed that Chinese participants' quality predictions for low-cooking-difficulty Chinese and Western foods did not differ significantly, F(1, 78) = 0.91, p = 0.34, \(\eta_{p}^{2}\) = 0.01; but for high-cooking-difficulty foods, they predicted that Chinese dishes were better than Western ones, F(1, 78) = 18.94, p < 0.001, \(\eta_{p}^{2}\) = 0.20. The interaction between chef and food type was significant, F(2, 156) = 12.50, p < 0.001, \(\eta_{p}^{2}\) = 0.14.
Planned contrasts showed that Chinese participants predicted Chinese chefs were better at cooking Chinese dishes, F(1, 78) = 27.48, p < 0.001, \(\eta_{p}^{2}\) = 0.26; they also tended to predict Western chefs were better at cooking Western dishes, F(1, 78) = 3.55, p = 0.06, \(\eta_{p}^{2}\) = 0.04; however, they predicted robotic chefs cooked Chinese and Western foods in equal quality, F(1, 78) = 1.79, p = 0.19, \(\eta_{p}^{2}\) = 0.02.

Fig. 6

Food quality prediction for Experiment 2, as a function of chef, food type, and cooking difficulty level. Error bars represent ± 1 SEM

4.4.2 Perceptions of Chefs

Repeated measures ANOVAs with the factor of chef (Chinese vs. Western vs. robotic) and planned contrasts showed that (Table 1): (1) consistent with the results of Experiment 1, human chefs (both Chinese and Western) were perceived as superior to robotic chefs in competence, anthropomorphism, animacy, likeability, and intelligence, Fs(1, 78) > 20.98, ps < 0.001; (2) Chinese and Western chefs did not differ significantly in perceived anthropomorphism, animacy, and intelligence, Fs(1, 78) < 3.62, ps > 0.06; (3) however, participants rated Chinese chefs as more competent and more likeable than Western chefs, Fs(1, 78) > 6.47, ps < 0.05.

4.5 Discussion

The results of Experiment 2 showed that participants predicted robotic chefs could cook Chinese and Western foods of equal quality, but both were inferior to the foods cooked by Chinese and Western chefs. Although robotic and Western chefs were both novel to Chinese participants, foods made by robotic chefs were predicted to be inferior to those made by Western chefs. In addition, participants did not predict that robotic chefs were better at novel (i.e., Western) foods than at Chinese foods; thus, H3 was not supported. These results consistently suggest that the effects of novel cues in robotic chefs or Western foods were not strong enough to make people predict that the quality of robot-cooked food was equivalent to that of human-cooked food. Rather, participants seemed to make predictions based on familiarity [10, 17]: they rated Chinese foods and chefs as better than Western ones, possibly because they had more experience with Chinese than with Western chefs and foods. Therefore, it is also possible that participants held the lowest food quality expectations for robotic chefs and their food because they had little experience with them.

5 Experiment 3: The Anthropomorphism of Robotic Chefs

As in Experiment 1, participants rated dishes cooked by human chefs and low anthropomorphic robotic chefs. In addition, they rated food cooked by medium and high anthropomorphic robotic chefs. According to the ABOT (Anthropomorphic roBOT) database [30], the medium anthropomorphic robotic chefs were defined as robots anthropomorphized with mechanical humanlike heads, bodies, and arms, and the high anthropomorphic robotic chefs were defined as androids with human appearances.

5.1 Participants

Seventy-nine university students (44 females and 35 males, aged 18–28 years, M = 22.25, SD = 0.23; not engaged in chef-related work) participated in this study for monetary compensation. Four participants were excluded from this experiment because they had tasted robot-cooked food. Power analysis for ANOVA was conducted with G*Power 3.1 (effect size = 0.15, based on a pilot study, power = 0.8), indicating that the minimum sample size was 33. The present sample size exceeded this minimum.

5.2 Materials

5.2.1 Pictures of Chef

Pictures of human and low anthropomorphic robotic chefs were identical to those in Experiment 1. As shown in Fig. 7, pictures of medium anthropomorphic robotic chefs were screenshots from cooking videos of four robotic chefs; for each chef, three pictures were taken to represent different cooking stages. Pictures of high anthropomorphic robotic chefs were edited from the pictures of human chefs by modulating the brightness of the chefs’ skin. This method has been shown to be successful in creating high anthropomorphic robots [31].

Fig. 7

Example pictures of the low, medium, high anthropomorphic robotic chefs and human chefs used in Experiment 3

5.2.2 Pictures of Dishes

Based on the evaluation of 15 dishes in Experiment 1, pictures of twelve typical Chinese dishes were selected, four for each of the low, medium, and high cooking difficulty levels.

5.2.3 Questionnaires on Food Quality Prediction and Chefs

The questionnaires in Experiment 1 were used in this experiment.

5.3 Procedure and Design

Each participant rated a low, a medium, and a high anthropomorphic robotic chef, a human chef, and three dishes (one each of low, medium, and high cooking difficulty) cooked by each chef. The procedures were similar to those of Experiment 1. For each participant, the chefs were randomly selected from sixteen candidate chefs. The order of chefs and the order of dishes were counterbalanced across participants. The same pictures of dishes were presented to all participants, but whether each dish was described as cooked by the low, medium, or high anthropomorphic robotic chef or by the human chef was counterbalanced across participants.

5.4 Results

5.4.1 Food Quality Prediction

Similar to Experiment 1, participants predicted that the foods cooked by robotic and human chefs were of above-average quality. One-sample t-tests showed that the rating scores for dishes at each cooking difficulty level cooked by each type of robotic chef and by human chefs were significantly greater than the intermediate value (i.e., 5), ts(74) > 6.51, ps < 0.001.

Then, the data were subjected to a 4 (Chef: low vs. medium vs. high anthropomorphic robotic vs. human) × 3 (Cooking difficulty level: low vs. medium vs. high) repeated measures ANOVA. As shown in Fig. 8, the main effect of chef was significant, F(3, 222) = 14.97, p < 0.001, \(\eta_{p}^{2}\) = 0.17. Planned contrasts showed that: (1) foods made by human chefs were predicted as of higher quality than those made by low, medium or high anthropomorphic robotic chefs, Fs(1, 74) > 10.91, ps < 0.001. (2) foods made by high anthropomorphic robotic chefs were predicted as better than those made by low and medium anthropomorphic robotic chefs, Fs(1, 74) > 3.92, ps < 0.05. (3) quality predictions on foods made by low and medium anthropomorphic robotic chefs were not statistically different, F(1, 74) = 1.43, p = 0.24. The main effect of cooking difficulty level was significant, F(2, 148) = 16.23, p < 0.001, \(\eta_{p}^{2}\) = 0.18. Pairwise comparisons showed that low cooking difficulty level dishes were predicted as of lower quality than medium and high cooking difficulty level dishes, Fs(1, 74) > 14.77, ps < 0.001, and the latter two kinds of dishes were not statistically different, F(1, 74) = 2.48, p = 0.12. The interaction between chef and cooking difficulty level was not significant, F(6, 444) = 0.77, p = 0.60, \(\eta_{p}^{2}\) = 0.01.
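
The degrees of freedom reported for Experiment 3 follow from the design. A minimal sketch, assuming a fully within-subject two-factor design with no sphericity correction (the paper does not mention one) and 75 retained participants (79 recruited minus 4 excluded); the function name is illustrative:

```python
def rm_anova_df(n_participants, levels_a, levels_b):
    """(df_effect, df_error) for a two-way fully within-subject ANOVA.

    Assumes uncorrected degrees of freedom (no Greenhouse-Geisser adjustment).
    """
    df_subjects = n_participants - 1
    df_a = levels_a - 1
    df_b = levels_b - 1
    return {
        "A": (df_a, df_a * df_subjects),
        "B": (df_b, df_b * df_subjects),
        "A x B": (df_a * df_b, df_a * df_b * df_subjects),
    }


if __name__ == "__main__":
    # A = chef (4 levels), B = cooking difficulty level (3 levels)
    for effect, (df1, df2) in rm_anova_df(79 - 4, 4, 3).items():
        print(f"{effect}: F({df1}, {df2})")
```

Under these assumptions the function reproduces the reported F(3, 222), F(2, 148), and F(6, 444), and a single-df contrast has error df of 74.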

Fig. 8

Food quality prediction for Experiment 3, as a function of chef and cooking difficulty level. Error bars represent ± 1 SEM. (AR = anthropomorphic robotic)

5.4.2 Perceptions of Chefs

Repeated measures ANOVAs with the factor of chef (low vs. medium vs. high anthropomorphic robotic vs. human) and planned contrasts showed that (Table 1): (1) Competence, likeability, and intelligence: consistent with the results of food quality prediction, human chefs were perceived as superior to low, medium, and high anthropomorphic robotic chefs, Fs(1, 74) > 19.57, ps < 0.001, high anthropomorphic robotic chefs were perceived as superior to low and medium anthropomorphic robotic chefs, Fs(1, 74) > 6.34, ps < 0.05, and the latter two did not differ significantly, Fs(1, 74) < 2.43, ps > 0.12. (2) Anthropomorphism and animacy: participants rated human chefs the highest, followed by the high anthropomorphic robotic chefs, then the medium anthropomorphic robotic chefs, with the low anthropomorphic robotic chefs the lowest, Fs(1, 74) > 5.20, ps < 0.05.

5.5 Discussion

Consistent with the previous experiments, the results of Experiment 3 showed that participants predicted that human chefs and their food were better than robotic chefs and their food, regardless of the robotic chefs' degree of anthropomorphism. Increasing the anthropomorphic appearance of robotic chefs raised their perceived anthropomorphism and animacy. However, increasing anthropomorphism from low to medium did not significantly improve predicted food quality or the perceived competence, likeability, and intelligence of robotic chefs. These judgments improved only when the appearance of robotic chefs was raised to high anthropomorphism.

The high anthropomorphic robotic chefs were perceived as more anthropomorphic than the medium anthropomorphic robotic chefs, and their food was predicted to be of higher quality. These findings supported the ‘like-me’ hypothesis (H4a) [15, 16] rather than the uncanny valley hypothesis (H4b) [20, 21], suggesting that the familiarity cues in the high anthropomorphic robotic chefs might facilitate food quality prediction. However, it should be noted that the current finding did not disprove the uncanny valley hypothesis, because the current study used only three levels of anthropomorphism for robotic chefs, which might be too coarse to trace the fine-grained curve. The high anthropomorphic robotic chefs were also perceived as more likable than the low and medium anthropomorphic robotic chefs, suggesting that they were not perceived as so anthropomorphic that they induced revulsion.

6 General Discussion

With three experiments, this study revealed people’s perceptions of the products (i.e., food) provided by robotic chefs, a kind of social service robot that is promising but currently not favored in restaurants and hotels. The results showed that participants predicted the foods cooked by robotic chefs to be of above-average quality; however, they consistently held lower food quality predictions for robotic chefs than for human chefs, regardless of the dishes’ cooking difficulty level, novel cues in chefs and food, or the anthropomorphism level of robotic chefs. For both human- and robot-cooked food, dishes with a higher cooking difficulty level were predicted to be of higher quality than dishes with a lower cooking difficulty level. Unlike human chefs, who were expected to be better at cooking food of their own culture, robotic chefs were expected to cook Chinese and Western food of equal quality. Increasing the appearance of robotic chefs from low or medium to high anthropomorphism could improve food quality prediction, although it remained inferior to the prediction for human-cooked food.

6.1 Academic and Practical Implications

Across the three experiments, the relative food quality prediction between dishes made by human and robotic chefs was not affected by the dishes’ cooking difficulty level. Robotic chefs were regarded as less capable than human chefs in preparing both low and high cooking difficulty dishes, suggesting that people regarded robotic chefs as inferior to human chefs in both basic and advanced cooking skills. This prediction differs from the common-sense assumption we apply to humans, namely that people who have basic cooking skills are competent in cooking low difficulty dishes. Considering that cooking skills involve a number of varied components [28], people might have regarded robotic chefs as lacking some components of both basic and advanced cooking skills. Further investigation of these critical cooking skill components might help to improve people’s expectations of robotic chefs’ competence and their food quality predictions.

However, for a given robotic or human chef, participants consistently predicted that food with a higher cooking difficulty level was of higher quality. This might be because the lower cooking difficulty food mainly includes vegetables (e.g., cabbages and potatoes), whereas the higher cooking difficulty food involves ingredients with higher caloric density (e.g., pork and steak). Previous studies have shown that people have an evolutionary bias in detecting and remembering locations of high-calorie food [32,33,34,35], and rate high-calorie food as more palatable than low-calorie food [36]. Therefore, the bias towards high-calorie ingredients may lead people to predict that high cooking difficulty food is of higher quality than low cooking difficulty food, even though both are made by the same chef.

The results of Experiment 2 suggested that familiarity, rather than the novel cues in robotic chefs or food, played an important role in food quality prediction. On the one hand, these results were consistent with previous studies and theories on food expectation, which have underlined the role of familiarity [10, 17, 18], suggesting that increasing people’s familiarity with robotic chefs and their food might be a promising way to raise people’s expectations of them. On the other hand, the results of Experiment 2 did not exclude a possible effect of novel cues, as robot-cooked food might still be predicted to be of better quality than ordinary machine-made food.

Increasing the anthropomorphic appearance of robotic chefs is, to some extent, a way to increase people’s familiarity, and it proved effective in promoting food quality prediction in Experiment 3. These results are consistent with the ‘like-me’ hypothesis [15, 16], suggesting that increasing the perceived similarity between robotic chefs and human chefs could promote the perceived competence, likeability, intelligence, anthropomorphism, and animacy of robotic chefs. Further, the results showed that the ‘like-me’ anthropomorphism affected not only the perceptions of robots but also the predictions about the products (i.e., food) made by robots, suggesting that the familiarity cues in robotic chefs are critical to food quality prediction.

6.2 Limitations and Future Research

In the current study, photos rather than real robotic chefs and dishes were used to investigate the effects of extrinsic cues on food quality prediction, because most potential customers have no experience with food made by robotic chefs, and they might base their judgments on a glance at, or even an imagined picture of, robotic chefs and dishes. However, it is possible that customers will be fascinated by the experience of the dynamic robotic cooking process [14], which exposes them to more novel high-tech cues in robotic chefs and thus might facilitate their food quality prediction. Moreover, actually tasting the food rather than predicting its quality from viewing might adjust people’s prejudice toward robot-cooked food. Besides, the photos of dishes in this study were taken from recipe websites, where photos are generally made to “look good”; therefore, these good-looking photos might have shifted participants’ food quality predictions upward.

In this study, participants were simply asked to make food quality predictions on nine-point scales, without instructions or implications about the reference food, such as food cooked by themselves, their mothers, cooks in their university cafeteria, or cooks in luxury restaurants. The dishes used in this study are very common to our participants, who have much experience with these dishes as cooked by various people; therefore, participants may have made food quality predictions based on their general experience with the dishes. Further studies specifying whose food to compare with might help us better understand how people’s food quality predictions differ between robotic chefs and a specific cook, such as oneself, one’s mother, or a professional cook.

With the progress of the food industry, the foods we eat in restaurants are becoming more and more machine-prepared, with human cooks intervening only towards the end of the cooking process. In contrast, robotic chefs will turn this mixed machine-human cooking process into a fully machine-based one. Thus, it will be interesting to investigate whether people perceive food from the mixed machine-human process as superior in quality to robot-cooked food and, if so, what factors influence this perception.

As humanlike machines, anthropomorphic robot chefs have the attributes of both humans and machines. Previous studies on the perceptions between human-made and machine-made products suggested the perceived naturalness [11], love [12], and uniqueness [37] are the critical factors in people’s preference for human-made products over machine-made products. Moreover, this preference varied in different consumption contexts, such as buying gifts for close and distant friends, or buying a poster for decoration or demonstration. Therefore, people’s prediction and preference for foods cooked by robotic chefs might also be affected by the perceived naturalness, love, uniqueness, and other possible factors in foods, and vary in different consumption contexts. Furthermore, the anthropomorphism of robotic chefs might interact with these factors and consumption context, which is worthy of further investigation.

7 Conclusion

In summary, this study showed that people predicted the food made by robotic chefs to be of good quality, confirming the promise of adopting robotic chefs as social service robots in hotels and restaurants. However, further efforts are needed to raise the quality prediction for robot-cooked food, as it was consistently lower than that for human-cooked food. Increasing robotic chefs’ appearance to high anthropomorphism and enabling robotic chefs to cook foods of high cooking difficulty might be effective ways to promote food quality prediction. Unlike past studies and theories (e.g., the ‘like-me’ and uncanny valley hypotheses), which mainly focused on people’s perception of robots, the current study revealed factors that might affect people’s perception of robots’ products (i.e., food), which might be informative for understanding people’s perception of other products or social services provided by robots (e.g., arts and floral arrangements). Theoretically, the current study extended the ‘like-me’ hypothesis from the perception of robots to robots’ products, indicating that increasing robots’ anthropomorphism can also promote people’s predictions about robots’ products.