Introduction

Autism spectrum disorder (ASD) is a neurodevelopmental disorder (NDD) characterised by differences in social interaction and communication, intense and restrictive interests, and repetitive behaviours (Buescher et al., 2014; Leekam et al., 2011; Marotta et al., 2020). Autistic children typically present with differences in social reciprocity, aversion to change and unfamiliar environments, and social anxiety. This is often accompanied by speech disorders (e.g., language delay, echolalia, and lack of speech); limited ability to perform activities of daily living (e.g., learning, motor skills and memory); and other behavioural features (e.g., excitability and sensory hyper- or hyposensitivity) (Guthrie et al., 2013; Kim & Lord, 2013). Identifying and supporting these specific features early in life can augment a child’s ability to thrive.

A well-established characteristic of ASD is its heterogeneity in manifestations, severity, course, and therapeutic response of both diagnostic and associated psychiatric and behavioural presentations (Bryson et al., 2007; Zheng et al., 2020). Autistic children show clinical variability in behaviour, communicative functioning, rate of development, level of intellect, co-occurring conditions, and adaptive functioning (Masi et al., 2017; Zwaigenbaum et al., 2015). While heterogeneity is a recognised feature of the autism spectrum, its underlying aetiological mechanisms remain unclear (Uljarević et al., 2017; Vivanti et al., 2014). The proposed aetiological pathway of ASD is not singular and involves a combination of genetic and environmental factors (Masi et al., 2017). The lack of clarity about the underpinnings of clinical diversity in autism has impeded the development of clinically effective treatments and supports (Uljarević et al., 2017; Vivanti et al., 2014). Recognising distinct ASD subtypes, by contrast, could lead to education and therapy streamlined for specific needs and complexity levels (Charman, 2014). However, studies that use comprehensive data sampled from multiple behavioural and aetiological domains to identify overall subgroups in ASD are scarce.

This study aimed to identify meaningful subgroups of autistic children by assessing their behavioural, sensory, and perinatal factor profiles using the Australian Autism Biobank (AAB), the largest Australian national data repository of detailed biological and clinical information about children on the autism spectrum, created to further autism discovery and research (Alvares et al., 2018). Our research questions were: (1) Can children with ASD be subtyped based on their behavioural, sensory and perinatal factor profiles? and (2) Comparing the subtypes, are there differences in clinical profile vis-à-vis exposure to perinatal factors? We expected that identifying homogeneous subgroups with similar clinical profiles and exposure to perinatal determinants would ultimately help to bring targeted interventions and supports to distinct subgroups of autistic children, tailored to the unique profile of each subgroup. Further, finding precision medicine approaches for autistic children may result in improved course, treatment outcomes and quality of life in adulthood (Loth et al., 2017; Woolfenden et al., 2022).

Methods

Study Design and Participants

This study is a secondary data analysis of the AAB, a collection of phenotypic and biological data of children on the autism spectrum (aged 2–17 years) along with their siblings, parents, and unrelated controls without autism (Alvares et al., 2018). Participants were recruited between 2013 and 2018 at sites in four states of Australia: Telethon Kids Institute, Perth, Western Australia; Olga Tennison Autism Research Centre, La Trobe University, Victoria; University of New South Wales, Sydney, NSW; and Lady Cilento Children’s Hospital, Brisbane, Queensland (Alvares et al., 2018). All children with an ASD diagnosis and, when feasible, their family members were invited to participate in the AAB. Clinical data and biological samples were collected at clinical facilities at each site or, for remote families, through completed assessments and samples sent by mail (Alvares et al., 2018). For the purpose of this study, our analysis was restricted to the detailed history, clinical and behavioural data of two participant groups: (1) ASD probands, children with a clinically confirmed diagnosis of ASD, and (2) ASD queries, children suspected to have ASD who did not meet the Diagnostic and Statistical Manual of Mental Disorders-5® criteria (American Psychiatric Association, 2013). Two participant groups in the AAB, namely siblings of ASD probands and controls, were not included in this analysis. No exclusion criteria were applied with regard to the completeness of the materials submitted.

Data Collection

Clinical and behavioural data within the AAB was collected at entry, via standardised clinical assessments appraising each participant’s clinical features, cognitive functioning, and physical development. To achieve the aim of this study, the analysis was limited to data on the behavioural, sensory, and perinatal profiles of the children. This data came from questionnaires completed by each participant’s parents or caregivers, namely the Vineland Adaptive Behaviour Scale-II (VABS-II), measuring adaptive functioning and behavioural manifestations; the Short Sensory Profile-2 (SSP-2), for sensory processing abnormalities; and a custom family history questionnaire (FHQ), for sociodemographic data and detailed past child and family histories (Dunn, 2014; Sparrow et al., 2005). The data pertaining to each participant included in the primary analysis was entered only once.

Latent Variables

The primary analysis of this study was a latent class analysis (see Statistical Analyses section), which used the child’s age and twenty-six latent variables representing the behavioural (VABS-II), sensory (SSP-2), and perinatal (FHQ) domains to create meaningful autism subgroups from the AAB data (Dunn, 2014; Sparrow et al., 2005).

Behavioural Profiles

Five composite-based scores generated by the VABS-II questionnaire were used to represent five behavioural domains: communication, socialisation, daily living, motor skills, and maladaptive behaviours (including temper tantrums, aggression, or disobedience). The scores of each domain are based on a five-level scale: low, moderately low, adequate, moderately high and high. The communication category measures receptive, expressive and written communication skills while the socialisation domain covers interpersonal relationships and play or leisure time. The daily living skills domain represents domestic and community interaction skills and the motor skills level measures gross and fine motor abilities.

Sensory Profiles

Four composite-based quadrants generated by the SSP-2 questionnaire (seeking/seeker, avoiding/avoider, sensitivity/sensor, and registration/bystander) were employed to reflect the sensory profiles of participants. The scoring system of each quadrant has five tiers: much less than others, less than others, just like the majority of others, more than others and much more than others. Seeking/seeker tendencies involve children seeking additional sensory input (e.g., touching objects in their environment or additional movement) throughout the day, whereas children in the avoiding/avoider group are overwhelmed by and avoid sensory input, which is expressed through frustration precipitated by change and avoidance of large gatherings. High scores in the sensitivity/sensor classification suggest a higher rate of sensory detection and thus hyper-reactivity, manifesting as avoidance of mess and loud settings. Registration/bystander tendencies result from missed or reduced detection of sensory input and include needing instructions repeated multiple times or frequently bumping into obstacles. The classifications are four distinct measures of sensory hyper-reactivity, hypo-reactivity, and seeking behaviour, which are traits highly prevalent among autistic children (Ben-Sasson et al., 2009; Lane et al., 2014).

Perinatal Factors

Seventeen latent variables were chosen to represent the perinatal factors, segregated as antenatal, natal, and postnatal factors. These variables were selected following assessment of existing literature for significant and relevant perinatal determinants (Bilder et al., 2009; Yong et al., 2021).

The selected antenatal factors were mother’s age at birth (median age < 34 years or ≥ 34 years), mother’s body mass index (BMI) [BMI under 18.5 (Underweight), 18.5 to 24.9 (Healthy), 25.0 to 29.9 (Overweight) or 30.0 or higher (Obese)], raised blood pressure (BP) during pregnancy (yes/no), mother’s smoking history in the year before pregnancy (never, 1 to 5 daily or > 5 daily), mother’s prescription status during pregnancy (yes/no) and maternal history of physical and mental health illnesses. Physical health conditions were segregated into immune or inflammatory conditions (e.g., systemic lupus erythematosus, hepatitis, rheumatic fever) (yes/no) and non-communicable diseases (e.g., asthma, diabetes, heart disease, hypertension) (yes/no). Maternal mental illnesses (yes/no) investigated in the family history questionnaire included but were not limited to anxiety, depression, and schizophrenia.
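The maternal BMI and age categories above can be expressed as a small coding step. The Python sketch below is illustrative only and is not taken from the study's analysis code: the function names are hypothetical, and treating the cut-offs as half-open intervals (so that a BMI of 24.95 is coded as Healthy) is an assumption, since the text lists the bands to one decimal place.

```python
def bmi_category(bmi: float) -> str:
    """Map a maternal BMI value to the four bands named in the text.

    Assumption: bands are half-open intervals [18.5, 25.0) and [25.0, 30.0).
    """
    if bmi < 18.5:
        return "Underweight"
    if bmi < 25.0:
        return "Healthy"
    if bmi < 30.0:
        return "Overweight"
    return "Obese"


def maternal_age_group(age_years: float) -> str:
    """Dichotomise mother's age at birth at the 34-year cut-off from the text."""
    return "< 34 years" if age_years < 34 else ">= 34 years"


if __name__ == "__main__":
    # Hypothetical example values, not participant data.
    print(bmi_category(27.3))        # Overweight
    print(maternal_age_group(36.0))  # >= 34 years
```

Coding continuous measures into categories in this way is what allows them to enter a latent class analysis, which operates on categorical variables.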

Natal variables selected included birth factors while mothers were pregnant with the participating child. These included induction of labour (yes/no), type of delivery (caesarean or vaginal), length of term (preterm, early term, full term or late term), vaginal spotting or bleeding (yes/no) and complications during pregnancy (yes/no).

The predisposing postnatal variables selected were problems in child’s first week of life (e.g., jaundice, feeding problems, infections) (yes/no), unusual development during child’s first six months of life (e.g., maladaptive behaviours, feeding issues, behavioural abnormalities) (yes/no), postnatal conditions (yes/no) and baby’s birthweight (low or normal to high birthweight). Postnatal conditions that were investigated in the FHQ included but were not limited to intellectual disability, global developmental delay, cerebral palsy, and encephalitis.

Statistical Analyses

Latent class analysis (LCA) is an empirical approach to identifying underlying subtypes or subgroups within patterns of data consisting of categorically coded latent variables (Weller et al., 2020). The analysis involved creating various latent class models, starting with a one-class model and then creating subsequent models by increasing the number of classes. Finally, the model that best illustrated the latent structure of the data was selected. This was accomplished by appraising the goodness of fit statistics and interpretability of all the derived latent class models. The statistics included the log-likelihood ratio (LLR), for which a higher value is favoured, and the Bayesian Information Criterion (BIC) and Akaike Information Criterion (AIC), for which smaller values indicate a better fit and parsimony (Weller et al., 2020). The Lo-Mendell-Rubin Likelihood Ratio Test (LMR-LRT) provides a p value which, if statistically significant, indicates that the model better represents the latent structure of the data than the model with one fewer class (Nylund et al., 2007). The entropy is a value between 0 and 1, where a greater value indicates more precise classification of individuals into subgroups (Petersen et al., 2019). LCA uses the probability of each score in the categorical variables to assign individuals to their most likely class. The differing probabilities of the variables were compared between classes to find distinct and meaningful subgroups. Statistical analysis was performed with both JMP Pro 17 (“JMP® Version 17,” 1989–2023) and RStudio (RStudio 2023.03.1 + 446).
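As an illustration of the fit statistics described above, the following Python sketch computes AIC, BIC and entropy from a model's log-likelihood and posterior class probabilities, and assigns an individual to the most likely class. It is not the JMP Pro or R code used in this study. The function names are hypothetical, the parameter count assumes an unrestricted latent class model without covariates, and the entropy shown is the commonly used relative (normalised) entropy, which is assumed rather than confirmed to match the software output.

```python
import math


def lca_free_parameters(n_classes, categories_per_item):
    """Free parameters of an unrestricted LCA (assumed model form):
    (K - 1) class proportions plus K * sum(C_j - 1) response probabilities."""
    return (n_classes - 1) + n_classes * sum(c - 1 for c in categories_per_item)


def aic(log_lik, n_params):
    """Akaike Information Criterion; smaller is better."""
    return -2.0 * log_lik + 2.0 * n_params


def bic(log_lik, n_params, n_obs):
    """Bayesian Information Criterion; smaller is better."""
    return -2.0 * log_lik + n_params * math.log(n_obs)


def relative_entropy(posteriors):
    """Relative entropy in [0, 1]; values near 1 mean clear class separation.

    posteriors: one row per individual, each row holding K posterior
    class-membership probabilities that sum to 1.
    """
    n = len(posteriors)
    k = len(posteriors[0])
    if k == 1:
        return 1.0  # a single class classifies everyone with certainty
    uncertainty = 0.0
    for row in posteriors:
        for p in row:
            if p > 0.0:  # the limit of p * log(p) as p -> 0 is 0
                uncertainty -= p * math.log(p)
    return 1.0 - uncertainty / (n * math.log(k))


def modal_class(row):
    """Index of the class with the highest posterior probability."""
    return max(range(len(row)), key=row.__getitem__)
```

Under these definitions, a candidate model is refitted for each number of classes, and the criteria are compared across models alongside the interpretability of the resulting classes.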

Results

This study used a sample of 1168 participants, including ASD-probands and ASD-queries. The mean (SD) age of the sample was 7.6 (3.9) years, and the sample had a higher proportion of males (67.8%). The demographics of the study population are outlined in Table 1. The respective probabilities for each score under each latent variable are available in Supplementary Table 1.

Table 1 Demographics of the Australian Autism Biobank cohort

Latent Class Analysis

Table 2 outlines the goodness of fit statistics used to find the best-fitting latent class model. After comparison of the latent class models generated by LCA of the 26 latent variables, the four-class model was deemed the most appropriate to describe the identified subgroups. The LMR-LRT indicated that the five-class model was a statistically better fit than the four-class model (p < 0.001) (Table 2). However, the adjusted BIC (aBIC) and consistent AIC (cAIC) values decreased gradually with each added class up to the four-class model, and the four-class model had a lower BIC value than the five-class model (Fig. 1), indicating that the four-class model was the best fit to explain the latent class structure of the sample. Further, the four-class model had a high entropy value of 0.892, reflecting good separation between the classes.
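For readers unfamiliar with the adjusted criteria, a minimal Python sketch of their usual definitions follows. It assumes that the sample-size adjusted BIC replaces n with (n + 2) / 24 and that the consistent AIC penalises each parameter by ln(n) + 1; whether the statistical software applied exactly these forms is an assumption, and the function names are hypothetical.

```python
import math


def adjusted_bic(log_lik, n_params, n_obs):
    """Sample-size adjusted BIC (assumed form): penalty uses ln((n + 2) / 24)."""
    return -2.0 * log_lik + n_params * math.log((n_obs + 2) / 24.0)


def consistent_aic(log_lik, n_params, n_obs):
    """Consistent AIC (assumed form): penalty of ln(n) + 1 per parameter."""
    return -2.0 * log_lik + n_params * (math.log(n_obs) + 1.0)


def best_by_criterion(values_by_n_classes):
    """Return the number of classes with the smallest criterion value.

    values_by_n_classes: dict mapping number of classes -> criterion value.
    """
    return min(values_by_n_classes, key=values_by_n_classes.get)
```

For each criterion the class count with the smallest value is preferred, so `best_by_criterion` would be applied separately to the aBIC, cAIC and BIC values of the fitted models; the criteria need not agree, which is why interpretability also informs the final choice.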

The subgroups created by the four-class model suggested an association between the perinatal factors and the behavioural and sensory profiles in each group, which may explain the distinct clinical phenotype of each subgroup. Identified subgroups were labelled Class 1: “Most behavioural concerns with moderate sensory and adaptive behaviour skills concerns”, with mixed exposure to perinatal determinants and in particular high rates of maternal mental illness; Class 2: “Most behavioural skills concerns”, with significant exposure to natal and postnatal determinants; Class 3: “Least sensory and behavioural skills concerns”, with the least exposure to perinatal determinants; and Class 4: “Most sensory impairment”, with significant exposure to perinatal determinants. The variable probabilities for each class are listed in Table 3 and the main variations that distinguish the classes are described in Table 4.

Fig. 1 Line chart of goodness of fit statistics of latent class models of autism-probands and autism-queries in the Australian Autism Biobank

Table 2 Latent class fit statistics
Table 3 Variable probabilities by latent class for children with diagnosed or suspected autism in the Australian Autism Biobank – four-class model
Table 4 Summary of subtype variations for children with diagnosed or suspected autism in the Australian Autism Biobank, based on the four-class latent model

Class Descriptions

In this study, Class 1 (28.45%) was labelled “Most behavioural concerns with moderate sensory and adaptive behaviour skills concerns”, and showed mixed exposure to perinatal determinants and a particularly high probability of maternal mental illness. Compared to the overall group, Class 1 members scored “low” or “moderately low” across the behavioural adaptive skills items of the VABS-II questionnaire. Class 1 also had the highest probability, 0.7309, of exhibiting clinically significant or elevated maladaptive behaviours compared to the overall group. Children in Class 1 were more likely to score “much more than others” in the seeking/seeker, avoiding/avoider and sensitivity/sensor classifications, and most likely to score “much more than others” in the registration/bystander classification of the SSP-2, with probabilities of 0.6483 for seeking/seeker, 0.9251 for avoiding/avoider, 0.8886 for sensitivity/sensor, and 0.7249 for registration/bystander. This was coupled with a mixed exposure to perinatal factors. Relative to the overall group, mothers of individuals in Class 1 were more likely to report a history of non-communicable diseases, mental health illnesses, and an induced labour, with probabilities of 0.4491, 0.5942 and 0.4523 respectively. They were also less likely to have experienced raised blood pressure during pregnancy and a caesarean delivery, with probabilities of 0.0810 and 0.2663 respectively. Compared to the overall group, Class 1 had a higher probability of unusual development during the first six months of life (0.5212) but a lower probability of problems in the first week of life (0.3253).

Class 2 (33.19%) was characterised as having “Most behavioural skills concerns”, with significant exposure to natal and postnatal determinants. This subgroup had the highest likelihood of scoring ‘low’ or ‘moderately low’ in the communication, socialisation, daily living, and motor adaptive categories of the VABS-II questionnaire, with a mean probability of 0.9678 across all adaptive functioning items. Class 2 also presented with a higher probability of clinically significant maladaptive tendencies than the overall group, with a value of 0.4098. While the probabilities of exposure to the majority of antenatal factors were either similar or lower compared to the overall group, Class 2 had a higher likelihood of exposure to postnatal determinants, such as developmental concerns during the child’s first six months of life and postnatal conditions, with probabilities of 0.5227 and 0.6612 respectively. With respect to natal factors, mothers of this subgroup were most likely to have had an induced labour, with a probability of 0.4755, and more likely than the overall group to have had a caesarean delivery, vaginal spotting or bleeding and complications during pregnancy, with probabilities of 0.5330, 0.3450, and 0.4533 respectively.

Class 3 (29.74%) was labelled “Least sensory and behavioural skills concerns” and reported the least exposure to perinatal determinants. Class 3 presented with the lowest likelihood of scoring “much more than others” across all sensory classifications of the SSP-2, with a mean probability of 0.05838. This was paired with the lowest probabilities of scoring ‘low’ or ‘moderately low’ in the behavioural adaptive items, with 0.3718 for communication, 0.4359 for socialisation, 0.4092 for daily living, and 0.3813 for motor skills. Class 3 was also less likely than the overall group to have ‘clinically significant’ or ‘elevated’ maladaptive behaviours, with a probability of 0.6676. Class 3 presented with a collectively lower probability of exposure to maternal antenatal factors than the overall group, with probabilities of 0.4500 for maternal history of being overweight or obese, 0.1097 for immune or inflammatory conditions, 0.2644 for non-communicable diseases, 0.2062 for mental health illnesses, and 0.1121 for raised blood pressure during pregnancy. Mothers of children in Class 3 additionally had the lowest probability of natal vaginal spotting or bleeding. This was accompanied by the smallest probabilities of exposure to postnatal factors, with 0.2948 for problems in the first week of the child’s life, 0.2613 for unusual development in the first six months of life and 0.3203 for postnatal conditions.

Class 4 (8.62%) was identified as “Most sensory impairment”, with significant exposure to perinatal determinants. Class 4 had the highest percentage of individuals scoring “much more than others” across the seeking/seeker, avoiding/avoider and sensitivity/sensor classifications of the SSP-2, ranging from 79.0 to 100%. Participants in this subgroup were also more likely to have lower scores in the behavioural adaptive domains, including socialisation, communication, daily living, and motor skills, compared to the overall group. This was coupled with the greatest likelihood of exposure to antenatal factors, such as maternal obesity and hypertension during pregnancy, with probabilities of 0.6618 and 0.3970 respectively. Participants in this subgroup had the highest probability of a maternal history of immune or inflammatory conditions and mental illness, and a probability higher than the overall group for non-communicable diseases, with values of 0.2655, 0.7217 and 0.3912 respectively. This was associated with probabilities of natal and postnatal variables that were either higher than the overall group or the highest among the classes, with 0.7018 for caesarean delivery, 1.000 for complications during pregnancy, 0.5597 for vaginal spotting or bleeding, 0.8.74 for problems in the first week of the child’s life, 0.6665 for unusual development during the first six months of life, and 0.6480 for postnatal conditions.

Discussion

This study presents latent classes of children with autism differentiated by their behavioural, sensory, and perinatal profiles and the patterns between them. In doing so, we identified four distinct subgroups of autistic children. Class 1 presented with variable probabilities which were neither in the highest nor lowest categories, justifying its classification as moderate sensory and behavioural concerns with mixed exposure to perinatal determinants. Children in Class 2 were found to have the most adaptive behavioural skills concerns, with significant exposure to natal and postnatal determinants. Class 3 showed the least sensory and behavioural skills concerns with the lowest probability of exposure to perinatal determinants while Class 4 displayed the highest level of sensory impairment with the highest probability of exposure to most perinatal determinants. Each class showed a distinct behavioural and/or sensory profile directly corresponding to exposure to specific determinants in the antenatal, natal, and postnatal period.

Currently available studies differ substantially in sample size, statistical methods, and the number and type of latent variables used to describe latent structure, so directly comparable literature is scarce. The classes identified in our study showed a gradient of symptom severity across the different variables, ranging from least to most behaviourally and/or sensorily affected. Although multiple studies have identified subgroups predominantly differentiated by autism symptom severity (Cholemkery et al., 2016; Georgiades et al., 2014; Verté et al., 2006), studies have also found subtypes distinguished by qualitative differences (Hu & Steinberg, 2009; Pichitpunpong et al., 2019). This is likely due to the sample size needed to identify “well-separated” classes with qualitative variations versus those with simpler models and fewer latent variables (Nylund-Gibson & Choi, 2018). Class 1 displayed a higher probability of maladaptive tendencies and maternal history of mental illness compared to the overall group. Our findings and existing literature suggest a positive correlation between behavioural co-morbidities, including emotional dysregulation in the child, and a maternal history of mental illness such as anxiety and depression. A latent class analysis by Wiggins et al. (2019) of 672 autistic preschool children identified a similar trend, where maternal anxiety and depression were associated with 4.4 times higher odds of a child phenotype with mild language and motor delay with dysregulation. Further, autistic individuals who exhibited more teacher-reported internalising features and parent-reported behavioural problems were more likely to have a history of maternal recurrent depression (Cohen & Tsiouris, 2006).

Class 2 was characterised by significant concerns in all behavioural domains, associated with significant involvement of natal and postnatal determinants. Montgomery et al. (2023) performed a latent profile analysis of variables reflecting core autism traits and psychiatric and medical comorbidity in a sample of 754 children with autism, and identified a subgroup of individuals with significant social communication and cognitive difficulties as well as increased sensory seeking behaviours. Further, Momany et al. (2023) performed a neonatal latent class analysis with variables including birthweight, gestational age, and the diagnostic status of common neonatal morbidities, followed by analysis of covariance to examine eighteen-month neurodevelopmental scores by latent class, and identified five subgroups. These included complicated delivery, minor illness, and critically ill classes, which attained lower neurodevelopmental scores compared to the healthy class, analogous to the pattern suggested by Class 2 (Momany et al., 2023). A notable difference in methodology is that Momany et al. (2023) used eighteen-month neurodevelopmental scores, whereas in the current study, age at the time of behavioural and sensory scoring was not accounted for. Existing literature emphasises the shared aetiologies of natal and postnatal determinants such as pregnancy and birth complications, neonatal physiological stress and inflammation, which can lead to future neurodevelopmental and behavioural concerns (Cheong et al., 2020; O’Shea et al., 2014).

Class 3 represented a subgroup with the fewest concerns in all domains, including sensory and behavioural adaptive skills, relative to the overall group. Growth modelling studies indicate that core autism traits such as reduced adaptive functioning and communication difficulty in children improve over time and development (Baghdadli et al., 2012; Bal et al., 2015; Smith et al., 2012). Our findings suggest that Class 3, with the lowest probability of sensory and behavioural skills concerns, also had the lowest probability of exposure to perinatal determinants. This suggests that the cumulative impact of environmental and psychosocial determinants acting as ‘second hits’ may influence the severity of core autism traits, with those in Class 3 having the least exposure to perinatal issues and the fewest concerns. This is consistent with available evidence of a greater likelihood of a history of autoimmune diseases (such as rheumatoid arthritis, celiac disease, and type 1 diabetes) and of serum anti-fetal brain antibodies in mothers of autistic children (Atladóttir et al., 2009; Careaga et al., 2010).

Class 4 was characterised by the highest sensory impairment in the overall group, associated with the highest exposure to perinatal determinants. A model-based cluster analysis conducted by Lane et al. (2014) on a smaller sample of 228 autistic individuals, examining autism symptom severity, sensory differences, and non-verbal intelligence quotient, identified a distinct subgroup of individuals with generalised difficulty across all sensory domains, similar to Class 4. However, Lane et al. (2014) identified subgroups of cases with similar sensory disabilities using cluster analysis, while our study used latent class analysis to find convergence based on item scores of variables. Previous studies have suggested that the prevalence of sensory differences in autistic children is as high as 92%, and subgrouping based on sensory profile may have clinical implications for matching the right supports (Tomchek & Dunn, 2007). Class 4, much like the other subtypes, suggested a positive association between perinatal factors and clinical features of autism. Consistent with this finding, Traver et al. (2021) found that autistic children with a maternal history of pregnancy-induced hypertension and pregnancy complications such as placental pathology presented with greater behavioural and communication disability.

The study design had both strengths and limitations. In terms of strengths, the minimal set of inclusion criteria of the AAB allows for inclusion of children with a diverse range of clinical phenotypes and language and intellectual capacities, including minimally verbal children and children with a co-occurring cognitive impairment, who are frequently excluded from projects involving biological sample collection. In addition, the large sample size of 1168 provided adequate power to recognise well-distinguished groups. Latent class analyses with small sample sizes are subject to limitations such as failure to converge, poorly functioning fit indices and failure to reveal classes with low memberships (Nylund-Gibson & Choi, 2018).

We also note several limitations. The variables used in the analysis do not include medical or psychiatric comorbidities of the child, which evidence shows can be significant contributors to an autistic child’s phenotype and family functioning, and thus may be crucial to identifying distinct subgroups (Hyman et al., 2020; Montgomery et al., 2021). The data collected within the AAB is also reliant on parent-reporting and is therefore prone to recall bias. Moreover, comparison with existing literature is challenging because, to the authors’ knowledge, only a handful of studies have considered perinatal factors and investigated their association with behavioural and sensory manifestations. Direct comparison with existing subtyping studies is further limited by the significant diversity in the variables chosen for analysis.

Conclusion

This study identified four subgroups within the AAB dataset that were distinguished based on behavioural skills and sensory impairments, and exposure to perinatal determinants. Our findings emphasise the relevance of perinatal determinants in distinct autism phenotypes and highlight a positive association between perinatal determinants and impairment in the sensory processing and behavioural skills of autistic children. This association should be explored further in future research, as replication of our findings is warranted to validate the subtypes identified. The identification of clinically meaningful subgroups and patterns between perinatal factors and core autism traits can pave the way for matching the right intervention and supports. With validated subgroups in place, practitioners could classify children early in life based on perinatal determinants. This would provide opportunities for early and individualised management of children’s behavioural and sensory profiles, which may improve the overall quality of life and wellbeing of both children and their families. Such individually tailored interventions and supports are expected to result in the best possible outcomes and developmental trajectories for children, including better family functioning and societal participation.