Difference: InterruptionStudy (20 vs. 21)

Revision 21, 2011-03-22 - MatthewBrehmer


Interruption Lab Study

Methodology

It is hypothesized that the effect of interruptions may interact with the age of the test-taker and the type of task being interrupted. This mixed experimental design involves three factors: age, level of interruption demand, and type of main task. We will additionally rely on self-reports to acquire an understanding of task resumption strategies employed by participants in three age groups.

Apparatus

Half of the experimental sessions were conducted using a Lenovo ThinkPad T400 with a 2.26 GHz Core 2 Duo processor and 1.92 GB RAM, while the remaining sessions were conducted using an IBM ThinkPad T43 with a 2.0 GHz Pentium M processor and 1 GB + 512 MB RAM. Both laptop computers were running the Windows XP operating system. Both computers were connected to a 17-inch diagonal display with a resolution of 1024 by 768 pixels; the laptop displays were not used. A Logitech M110 optical mouse was used with both laptops; identical mouse gains and tracking speeds were used. For the experimental tasks, the screen was positioned at a comfortable viewing angle. The experimental software was an Adobe AIR application written using the Adobe Flex 4.0 SDK.

Participants

We are recruiting 12 participants from each of three age groups, for a total of 36 participants: Young (19-54), Pre-Old (55-69), and Old (70+). The naming of these age groups is based on accepted terminology in the ageing literature. [Rimkus, A., Melinchok, M. D., McEvoy, K., & Yeager, A. K. (Eds.) (2005). Thesaurus of Aging Terminology. Eighth ed. AARP.] The justification for these groupings rests on the age-related changes that occur in cognition, notably that higher cognitive function remains relatively stable up to about age 55, after which there is a small decline, followed by a much steeper one after 70. [Craik, F. I. M., & Salthouse, T. A. (Eds.) (1992). The Handbook of Aging and Cognition. 2nd ed. Hillsdale, NJ: Erlbaum.]

Participants receive $5 for each half hour of participation. We are recruiting younger participants through advertisements on campus and through word-of-mouth advertisement. Older participants were recruited through word-of-mouth advertisement and postings in the community. To participate, all participants must be free of any diagnosed cognitive impairments or motor impairments to their right hand, and have normal or corrected-to-normal eyesight.

We administered the Montreal Cognitive Assessment (MoCA) test [CITATION] to help ensure that participants are not cognitively impaired. A score of 26/30 or higher is considered normal. Additionally, we administered the North American Adult Reading Test (NAART) [B. Uttl. North American adult reading test: Age norms, reliability, and validity. Journal of Clinical and Experimental Neuropsychology, 24(8):1123–1137, 2002.] to help ensure participants had sufficient English fluency to follow our instructions. The NAART is a quick-to-administer test of verbal intelligence that requires participants to read a list of 59 words of increasing difficulty. Using a somewhat arbitrary threshold, we accept only participants who read at least 25% of the words correctly. Participants who do not have a normal MoCA score or who do not meet our NAART threshold complete a shorter version of the study, but their data are not included in our analysis.

Tasks

To gain a better understanding of how interruption demand and age interact with task type, two main tasks were used in this study. Both tasks were adapted from C-TOC tests: Sentence Comprehension and Square Puzzles.

Sentence Comprehension (Figure X) tests verbal memory, wherein each trial is comprised of an instruction step and an execution step. The instruction screen displays 1-2 sentences instructing the user to arrange coloured geometric figures. The user clicks a 'Continue' button to advance to the execution screen, where geometric figures are to be arranged as per the instruction.

Square Puzzles (Figure Y) tests non-verbal spatial reasoning, wherein each trial is comprised of a single screen. The user is instructed to move lines to create a certain number of complete squares in a specified number of moves, without leaving any incomplete squares.

A participant can proceed to the next trial at any time by clicking on the 'Next' button at the bottom-right corner of the screen.

Two interruption tasks were used in this study, corresponding to two levels of interruption demand. The interrupting tasks occupy the entire screen, occluding the main task. Interrupting tasks are preceded by an interruption notification, in which a red banner bearing the message 'Interruption Pending' flashes for 2 seconds at the top of the screen; during this time all interactivity is disabled. Once begun, both interruption tasks display an automated sequence of a dozen cartoon images inside a box outline in the middle of the display. There is no fixed order to the presentation of images; each image is selected semi-randomly from a bank of ten images. Each image is displayed for 1500ms; no image is displayed in the box for 100ms between successive images. The total length of the interrupting tasks is approximately 19 seconds. At the end of an interrupting task, the user is prompted to click in order to dismiss the interruption, returning to the interrupted main task.
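The stated total of approximately 19 seconds follows from the timing parameters above. A minimal sketch of that arithmetic, in which the constant and function names are illustrative and not taken from the experimental software:

```python
# Timing parameters taken from the description above; all names are illustrative.
NOTIFICATION_MS = 2000  # 'Interruption Pending' banner
IMAGE_COUNT = 12        # images shown per interrupting task
IMAGE_MS = 1500         # display time per image
GAP_MS = 100            # empty box between successive images


def interruption_duration_ms(image_count=IMAGE_COUNT, image_ms=IMAGE_MS, gap_ms=GAP_MS):
    """Length of the image sequence, excluding the notification and the dismissal click."""
    gaps = max(image_count - 1, 0)  # gaps occur only between successive images
    return image_count * image_ms + gaps * gap_ms


sequence_ms = interruption_duration_ms()
print(sequence_ms)                    # 19100
print(sequence_ms + NOTIFICATION_MS)  # 21100, if the 2-second notification is counted
```

The time a participant takes to dismiss the interruption is not fixed and is therefore left out of this estimate.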

Interruption types are visually differentiated by a coloured instruction at the top of the screen. In the low-demand interrupting task, the user is prompted, in red font, to watch the sequence of images passively. The high-demand interrupting task (Figure Z) is a visual n-back memory game [CITATION], which is known to place high demand on working memory. In this task, the user is prompted, in green font, to click inside the box whenever the current image repeats the image shown 2 images earlier. Feedback is displayed adjacent to the box outline in the form of a green check icon for correct responses (true positives), while a red 'x' icon is displayed for incorrect answers (false positives and false negatives). No feedback is shown for true negatives. The random sequence of images is weighted such that there is a 40% probability that any given image will repeat the image shown 2 images earlier; otherwise, the image is selected randomly.
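The weighted image sequence might be generated along the following lines. This is a sketch under stated assumptions, not the Adobe AIR implementation: it assumes the "otherwise" branch draws uniformly from the ten-image bank (so an unforced 2-back match can still occur by chance), and the function names are hypothetical.

```python
import random


def generate_sequence(length=12, bank_size=10, n=2, repeat_prob=0.4, rng=None):
    """Return a list of image indices with weighted n-back repeats."""
    rng = rng or random.Random()
    sequence = []
    for i in range(length):
        if i >= n and rng.random() < repeat_prob:
            # Forced repeat of the image shown n positions earlier.
            sequence.append(sequence[i - n])
        else:
            # Assumption: uniform draw from the bank; may match by chance.
            sequence.append(rng.randrange(bank_size))
    return sequence


def target_positions(sequence, n=2):
    """Indices at which the participant should click."""
    return [i for i in range(n, len(sequence)) if sequence[i] == sequence[i - n]]


seq = generate_sequence(rng=random.Random(2011))
print(seq, target_positions(seq))
```

Passing a seeded `random.Random` makes a sequence reproducible, which helps when comparing logged responses against the intended targets.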

Design

The experiment used a mixed design with two counterbalanced main tasks and three counterbalanced levels of interruption demand: 3 (age groups: young, pre-old, and old) x 2 (main tasks: Sentence Comprehension and Square Puzzles) x 3 (interruption demand: none, low, and high). Age was the only between-subjects factor. Each participant completed 30 trials in the Sentence Comprehension task, and 24 trials in the Square Puzzles task.

In both main tasks, three isomorphic trial blocks were used for the three conditions of interruption demand, comprised of trials increasing in difficulty. An equivalent level of difficulty between trials was not attempted, as the original C-TOC tests from which our tasks were adapted also increase in difficulty. For both main tasks, each participant was assigned one of the six possible trial block permutations at random.
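Three trial blocks can be ordered in 3! = 6 ways, which is where the six permutations come from. A minimal sketch of the random assignment, assuming the blocks are simply labelled A, B, and C (hypothetical labels) and that each permutation is equally likely:

```python
import itertools
import random

BLOCKS = ("A", "B", "C")  # hypothetical labels for the three isomorphic trial blocks
ORDERS = list(itertools.permutations(BLOCKS))  # all 3! = 6 orderings


def assign_order(rng=None):
    """Pick one of the six block permutations uniformly at random."""
    rng = rng or random.Random()
    return rng.choice(ORDERS)


print(len(ORDERS))  # 6
print(assign_order(random.Random(0)))
```

Purely random assignment does not guarantee that each permutation is used equally often within an age group; a balanced scheme would need an explicit rotation instead.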

Each Sentence Comprehension trial block contained 10 trials, wherein a subset of 4 trials were interrupted. This subset of trials was determined randomly with the restriction that interruptions could not occur on 4 successive trials and that no interruption may occur on the first trial in a block of trials. The same subset of trials received interruptions in conditions of low and high interruption demand. Interruption onsets were fixed in that they would occur shortly after beginning the execution phase of a trial, with a short lag corresponding to increasing trial difficulty.
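The constrained random selection can be sketched with rejection sampling. This assumes the restriction means the interrupted trials may not all fall on consecutive trials; the function name and 1-based trial numbering are illustrative.

```python
import random


def pick_interrupted_trials(block_size=10, n_interrupted=4, rng=None):
    """Choose which trials (1-based) in a block are interrupted."""
    rng = rng or random.Random()
    candidates = range(2, block_size + 1)  # trial 1 is never interrupted
    while True:
        chosen = sorted(rng.sample(candidates, n_interrupted))
        # Distinct sorted integers are all consecutive exactly when
        # last - first == count - 1; reject that case and redraw.
        if chosen[-1] - chosen[0] != n_interrupted - 1:
            return chosen


print(pick_interrupted_trials(rng=random.Random(7)))
```

Calling `pick_interrupted_trials(block_size=8, n_interrupted=3)` would apply the same rule to a block of 8 trials with 3 interruptions.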

Each Square Puzzles trial block contained 8 trials, wherein a subset of 3 trials were interrupted. This subset of trials was determined in the same fashion as in the Sentence Comprehension task. The same subset of trials received interruptions in conditions of low and high interruption demand. Interruption onsets were fixed in that they would occur after a 500ms lag upon the completion of the first move operation in a 2-move puzzle; in a 3-move puzzle, an interruption would occur after the completion of the first or second move operation, with equal likelihood.
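The onset rule for Square Puzzles can be expressed compactly. In this sketch the function name is hypothetical, and it is an assumption that the 500ms lag applies to 3-move puzzles as well as 2-move puzzles.

```python
import random

ONSET_LAG_MS = 500  # lag after the triggering move; assumed to apply to both puzzle sizes


def interruption_move(puzzle_moves, rng=None):
    """Return the move operation (1-based) after which the interruption begins."""
    rng = rng or random.Random()
    if puzzle_moves == 2:
        return 1  # always after the first move
    if puzzle_moves == 3:
        return rng.choice((1, 2))  # first or second move, equally likely
    raise ValueError("only 2-move and 3-move puzzles are described")


print(interruption_move(2))
print(interruption_move(3, random.Random(1)))
```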

Procedure

The experiment was designed to fit into a single 120-minute session. All participants finished in 60 to 100 minutes.

We began with the Montreal Cognitive Assessment and the North American Adult Reading Test.

Participants then received practice on both types of interrupting tasks: one practice trial of the low-demand interrupting task and two practice trials of the high-demand interrupting task. A third practice trial was offered if performance on the high-demand interrupting task was still poor after two practice trials.

We then presented the first task (either Sentence Comprehension or Square Puzzles). An example trial was provided to familiarize the participants with the mechanics of the task. Participants were instructed to perform each task both as quickly and as accurately as possible. For the Square Puzzles task, participants were additionally instructed to not perform more than the specified number of moves, and to not leave remaining incomplete squares. Following this, a practice block of three trials was completed; during this set, the second trial was interrupted with a low-demand interrupting task, while the third trial was interrupted with a high-demand interrupting task. Participants then completed 3 blocks of trials of the first task. After each block, participants completed a short workload and fatigue survey. They then completed their second task.

Upon completion of the experimental trials, participants took part in a brief interview regarding their perceptions of task difficulty and their strategies for task resumption following an interruption.

Measures

We included measures of speed and accuracy.

In Sentence Comprehension, we measured trial time as the uninterrupted time elapsed during the execution step of the trial until the participant clicked to proceed to the next trial; time spent in the instruction step was recorded separately. In Square Puzzles, we measured trial time as the total uninterrupted time elapsed between the start of the trial and the moment the participant clicked to proceed to the next trial. Additionally, the average time interval between valid move operations was determined for each trial. For interrupted trials, we also recorded the interruption dismissal lag time, the task resumption time following an interruption, and the average interval between valid move operations prior to and following an interruption.

We included several measures of accuracy in both main tasks. The scoring scheme for these tasks was adapted from clinical scoring schemes and from the scoring scheme used in the C-TOC validity testing project. Accuracy scores were determined by the experimenter. A screen capture video of the entire experimental session was recorded, which could be used to confirm scores. Each Sentence Comprehension trial has a possible total score based on the number of geometric figures the participant is instructed to arrange and the relative positioning of these figures upon completion. Each Square Puzzles trial has a possible total score based on the number of complete squares; points are deducted for each additional move operation and each incomplete square. For both main tasks, we additionally recorded the number of completed move operations, invalid moves, aborted moves, and clicks in each trial.

Performance accuracy on the high-demand interrupting task was also recorded, including the number of true positives, true negatives, false positives, and false negatives for each trial.
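The four outcome counts can be tallied from an image sequence and the set of positions at which the participant clicked. This is an illustrative sketch rather than the logging code used in the study; it assumes the first two images, which cannot be 2-back targets, count as true negatives when not clicked.

```python
def score_nback(sequence, clicked, n=2):
    """Count true/false positives and negatives for one interrupting task.

    sequence: list of image identifiers in presentation order
    clicked:  set of 0-based positions at which the participant clicked
    """
    counts = {"TP": 0, "TN": 0, "FP": 0, "FN": 0}
    for i, image in enumerate(sequence):
        is_target = i >= n and image == sequence[i - n]
        did_click = i in clicked
        if is_target and did_click:
            counts["TP"] += 1
        elif is_target:
            counts["FN"] += 1  # missed a repeat
        elif did_click:
            counts["FP"] += 1  # clicked on a non-repeat
        else:
            counts["TN"] += 1
    return counts


# Targets are at positions 2, 4, and 5; the participant clicked at 2 and 3.
print(score_nback([1, 2, 1, 3, 1, 3], {2, 3}))
# {'TP': 1, 'TN': 2, 'FP': 1, 'FN': 2}
```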

Hypotheses

We had the following hypotheses for this study:

H1. Age will interact with the presence of interruptions; older adults will perform proportionally worse on interrupted trials than on corresponding uninterrupted trials.

H2. High-demand interruptions will incur worse performance than low-demand interruptions; interruption demand may interact with age such that older adults perform proportionally worse on trials interrupted with a high-demand interrupting task than on corresponding trials interrupted with a low-demand interrupting task.

H3. Interruptions will incur worse performance on the Sentence Comprehension task than on the Square Puzzles task; a three-way interaction between level of interruption demand, main task type, and age is also expected.

H4. Self-reports regarding the disruptive effects of interruptions are expected to reinforce H1-H3.

Planned Analysis

Analysis 1: To determine the local effects of interruption disruption in terms of speed and accuracy, we plan to conduct a 3 (age) x 3 (interruption demand) x 4 or 3 (trial: 4 for Sentence Comprehension, 3 for Square Puzzles) mixed-factor ANOVA for each main task. We will only consider performance data from the subset of trials that are interrupted in the two interruption conditions, along with the corresponding subset of trials in the uninterrupted condition; results from the remainder of uninterrupted trials in each condition are discarded. Therefore, each subject will generate 12 Sentence Comprehension data points (4 in each interruption condition), and 9 Square Puzzles data points (3 in each interruption condition). The main effect of trial is not of interest, as the difficulty differences between trials are known to us. Similarly, the interactions of trial with age or interruption demand are also of little interest. This analysis will address H1 and H2, to determine if there are interactions between age and level of interruption demand.

Analysis 2: To determine the global effects of interruption disruption, we plan to conduct a 3 (age) x 3 (interruption demand) x 2 (main task) mixed-factor ANOVA. A main effect of task is expected and is not of interest. We are, however, interested in the presence of any interaction among task, age, and interruption demand, which will address H3. For this analysis, a normalized performance score and completion time are calculated for each block of trials, such that each participant will generate 6 data points (3 from each main task).
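The per-participant data-point counts in Analyses 1 and 2 follow directly from the design. A small sketch of that bookkeeping, with illustrative names; the study-wide totals assume all 36 planned participants are retained in the analysis.

```python
PARTICIPANTS = 36        # planned sample size; assumes no exclusions
INTERRUPTION_LEVELS = 3  # none, low, high
INTERRUPTED_PER_BLOCK = {"Sentence Comprehension": 4, "Square Puzzles": 3}


def analysis1_points(task):
    """Data points per participant: interrupted trials x interruption levels."""
    return INTERRUPTED_PER_BLOCK[task] * INTERRUPTION_LEVELS


def analysis2_points():
    """Data points per participant: one per block, across both main tasks."""
    return len(INTERRUPTED_PER_BLOCK) * INTERRUPTION_LEVELS


for task in INTERRUPTED_PER_BLOCK:
    per_person = analysis1_points(task)
    print(task, per_person, per_person * PARTICIPANTS)
print("Analysis 2", analysis2_points(), analysis2_points() * PARTICIPANTS)
```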

Preliminary Results

Preliminary findings indicate that there may be a three-way interaction between user age, level of interruption demand, and the requirements of the C-TOC test undertaken by the user. Research participants have reported that interruptions are more disruptive to their performance during a verbal Sentence Comprehension task than during the non-verbal Square Puzzles task. Our analyses will examine whether test performance is proportionally worse when high-demand interruptions are present, versus low-demand interruption and control conditions. The initial findings appear to suggest that technological interventions for mitigating the impact of interruptions and for ensuring the validity of the test scores could be developed to be specific to each type of test in the C-TOC battery.


Notes

One's performance on C-TOC and other similar applications will likely be affected by interruptions and distractions. This study examines the nature of disruptions caused by interruptions in the C-TOC test-taker's environment, and how these disruptions interact with the age of the test-taker. Previous research relating to interruptions in the field of human-computer interaction (HCI) focuses predominantly on younger adults in workplace settings. A body of HCI research on the design of information and communication technology for older adults is similarly well established, yet is disjoint from research pertaining to interruptions and distractions. This research project attempts to unify these research areas. A decline in higher cognitive functioning and prospective memory in old age has been well documented in the cognitive psychology literature. As such, it is hypothesized that the effect of interruptions may interact with the age of the test-taker. Our mixed experimental design involves 3 factors: a between-subjects factor of age and within-subjects factors of interruption demand and main task type. Our dependent measures include performance scores (i.e., completion time and accuracy) on a small subset of tests in the C-TOC battery. We will be recruiting an equal number of cognitively healthy participants from 3 age groups: 19-54, 55-69, and 70 years or older. Throughout each of the C-TOC tests used in this study, participants will experience three levels of interruption demand in the form of 20-second distraction tasks. Our aim is to learn how interruption demand interacts with age. We will additionally rely on self-reports to acquire an understanding of task resumption strategies employed by participants in the three age groups. This knowledge will hopefully guide the design of features in C-TOC pertaining to interruption mitigation and task resumption. Such knowledge would also be an important contribution to the broader HCI community, and especially valuable for those designing tools for older adults.
 