
I am trying to study the volunteer trajectories of a group of individuals. My data looks something like this:

ID Program Area Impact Area Hours Served Organization Served
1  Tutoring     Education        2        org 1     
1  Hunger       Basic Needs     .25       org 2
1  Gardening    Beautification   1        org 3
2  Tutoring     Education        2        org 4
3  Hunger       Basic Needs      3        org 2
3  Hunger       Basic Needs      1        org 2
4  Tutoring     Education        1.5      org 1
4  Tutoring     Education        1.5      org 1 
4  Tutoring     Education        2        org 4
5  Hunger       Basic Needs      1        org 2
5  Hunger       Basic Needs      1        org 5

I was able to install the TraMineR package, but I am struggling to convert this data from SPELL to STS format.

This is the code I have been trying to use to convert my data:

mydata.sts <- seqformat(mydata, from = "SPELL", to = "STS",
  id = "ID", begin = "begin", end = "end", status = "states",
  process = FALSE)

I don't have start and end dates for these trajectories. Any insights or tips on how I might be able to address this?

  • What category you use depends on your theory/what you are trying to understand about differences and patterns in the sequences.
    – Elin
    Commented Feb 14, 2021 at 17:48

1 Answer


There are three issues we need to solve to be able to use these data with TraMineR.

  1. Time must be discrete because it is used to determine positions, or differences between positions, in a discrete sequence. Here, one solution is to convert hours into quarter hours.

  2. The only time information provided is Hours Served, i.e. durations. We need additional information (or assumptions) to transform these durations into start and end times. I will assume each individual (id) is observed from time 1 and the hours served are consecutive. Thus, begin time will be 1 for the first spell, 1 plus the duration of the first spell for the second spell, and so on. End time will be the duration of the spell for the first spell, and the previous end time plus the spell duration for the next spells.

  3. There are three categorical variables and it is not clear which should be used as the status variable. I will assume that the status is the interaction between the Program Area and the organization number (read into the column named x in the code below).

The code below illustrates these transformations:

library(TraMineR)

dat <- read.table(header=TRUE, text="
ID Program.Area Impact.Area Hours.Served Organization.Served x
1  Tutoring     Education        2        org 1     
1  Hunger       Basic.Needs     .25       org 2
1  Gardening    Beautification   1        org 3
2  Tutoring     Education        2        org 4
3  Hunger       Basic.Needs      3        org 2
3  Hunger       Basic.Needs      1        org 2
4  Tutoring     Education        1.5      org 1
4  Tutoring     Education        1.5      org 1 
4  Tutoring     Education        2        org 4
5  Hunger       Basic.Needs      1        org 2
5  Hunger       Basic.Needs      1        org 5
")

We need discrete time, so hours are converted into quarter hours:

dat[,4] <- 4*dat[,4]
names(dat)[4] <- "Quarter.Hours.Served"
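
Multiplying by 4 only gives discrete positions if every duration is a whole number of quarter hours. As a minimal sketch (the vector `hours` and its values are hypothetical, chosen to mirror the example data), a guard like the following could be run before the conversion:

```r
# Hypothetical durations in hours, mirroring the values in the example data
hours <- c(2, 0.25, 1, 1.5, 3)

# Convert to quarter hours
quarters <- hours * 4

# Guard: stop if any duration is not a whole number of quarter hours
stopifnot(isTRUE(all.equal(quarters, round(quarters))))

# Store as integers so that later begin/end positions are exact
quarters <- as.integer(round(quarters))
quarters
```

If the real data contain durations such as 0.1 hours, the guard fails and a finer time unit (or explicit rounding) would have to be chosen.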

Computing begin and end times assuming Hours.Served are consecutive and first spells start at 1.

k <- ncol(dat) + 1
dat[,k] <- 1
dat[,k+1] <- dat[,4]
names(dat)[k] <- "Begin"
names(dat)[k+1] <- "End" 
for (i in 2:nrow(dat)) {
  if (dat[i-1,1]==dat[i,1]) {
    dat[i,k] <- dat[i-1,k+1] + 1
    dat[i,k+1] <- dat[i,4] + dat[i-1,k+1]
  }
}
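
The same begin and end times can probably be obtained without an explicit loop. The sketch below is an alternative to the loop above, not part of the original approach. It uses base R's `ave()` with `cumsum` on a small hypothetical data frame `toy` (the column names `Dur`, `Begin` and `End` are illustrative) and assumes the rows are already ordered by spell within each ID:

```r
# Hypothetical spell data: durations already expressed in quarter hours
toy <- data.frame(
  ID  = c(1, 1, 1, 2, 3, 3),
  Dur = c(8, 1, 4, 8, 12, 4)
)

# End time = running total of the durations within each ID
toy$End <- ave(toy$Dur, toy$ID, FUN = cumsum)

# Begin time = end time minus the spell's own duration, plus 1
toy$Begin <- toy$End - toy$Dur + 1

toy
```

For the first three individuals this reproduces the begin and end values computed by the loop.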

Status as interaction between Program Area and org number

dat[,k+2] <- interaction(dat[,2],dat[,"x"])
names(dat)[k+2] <- "Status"
dat[,c(1,k,k+1,k+2)]

#    ID Begin End      Status
# 1   1     1   8  Tutoring.1
# 2   1     9   9    Hunger.2
# 3   1    10  13 Gardening.3
# 4   2     1   8  Tutoring.4
# 5   3     1  12    Hunger.2
# 6   3    13  16    Hunger.2
# 7   4     1   6  Tutoring.1
# 8   4     7  12  Tutoring.1
# 9   4    13  20  Tutoring.4
# 10  5     1   4    Hunger.2
# 11  5     5   8    Hunger.5
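
One detail worth knowing about `interaction()`: by default it keeps a level for every possible combination of its inputs, including combinations that never occur. Whether this matters for the later TraMineR steps is not something I have checked, so treat the following as a side note. The vectors `program` and `org` are hypothetical and only illustrate the `drop` argument:

```r
# Hypothetical factors mirroring part of the example data
program <- factor(c("Tutoring", "Hunger", "Gardening", "Tutoring"))
org     <- factor(c(1, 2, 3, 4))

# Default: one level per possible combination (3 programs x 4 orgs)
full <- interaction(program, org)

# drop = TRUE keeps only the combinations that actually occur
used <- interaction(program, org, drop = TRUE)

nlevels(full)
nlevels(used)
```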

Transforming spell data into STS form and creating the state sequence object

s.dat <- seqformat(dat[,c(1,k,k+1,k+2)], from="SPELL", to="STS", 
                   limit=max(dat[,k+1]))
seq <- seqdef(s.dat, cnames=1:max(dat[,k+1]))
print(seq, format="SPS")

#   Sequence                                   
# 1 (Tutoring.1,8)-(Hunger.2,1)-(Gardening.3,4)
# 2 (Tutoring.4,8)                             
# 3 (Hunger.2,16)                              
# 4 (Tutoring.1,12)-(Tutoring.4,8)             
# 5 (Hunger.2,4)-(Hunger.5,4)  

seqiplot(seq)

