Anticipated, momentary, episodic, remembered: the many facets of User eXperience

User experience (UX) has been defined in several ways. In general terms, it refers to everything that is individually encountered, perceived, or lived through. The literature on UX reports studies mostly focused on specific interaction events, which may have an impact on the user's emotions and feelings. This paper provides a reflection on how UX evolves over time. We performed a medium-term study comparing four types of UX: Anticipated, Momentary, Episodic and Remembered (or Cumulative) experience. Anticipated UX refers to the period of time before first use, and focuses on the expectations a person has of the product, service or system. Momentary UX refers to any perceived change during the interaction in the very moment it occurs. Episodic UX is an appraisal of a specific usage episode extrapolated from a wider interaction event. Remembered UX is the memory the user has after having used the system for a while. The different facets of UX were analysed in a medium-term study spanning four weeks. The study compared the experience of ten users of a pedometer/fitness app that counts steps and burned calories all day long. The results show that the experience of use changed over time, decreasing significantly across the periods before, during and after the interaction. The evaluative judgment related to the overall satisfaction with the product was largely formed on the basis of an initial high expectation of pragmatic aspects (i.e. utility and usability) before and during the first encounters. After four weeks of use, the problems related to usability, reliability of data, and battery drain came to dominate how good the product was perceived to be. Hedonic qualities and Attractiveness were negatively impacted as well. The continuous reflection on the use, documented in online diaries, made the problematic aspects prevail in the overall UX, in particular in the evaluation of Episodic and Remembered UX. This prevented any change in behaviour in the participants.


I. INTRODUCTION
Roto et al. [1] edited the "User Experience White Paper", a document reporting the results of the Dagstuhl Seminar on Demarcating User Experience, held in September 2010, where 30 experts from academia and industry worked together to define the concept of UX. In this document the editors highlight the multidisciplinary nature of UX, which has led to several definitions of and perspectives on UX. Interestingly, they underline the importance of analysing time spans of user experience, stating that the actual experience of usage does not cover all relevant aspects of UX. Time spans matter in determining the UX. People have expectations of a certain product or system before the first encounter. Expectations are often generated by advertisements or others' opinions and have an impact on the way people approach the system and prepare to use it.
At the first encounter and during actual use, people may change their appraisal of the system. Pragmatic and hedonic qualities of the product play a fundamental role in determining visceral responses related to the momentary feelings perceived during usage [2]. Different episodes of momentary experience lead to a reflection on the experience itself. Reflection often determines a person's overall impression of a product, and many factors come into play when thinking back and reflecting upon the total appeal and experience of use. The outcome of episodic experience is not necessarily equal in value to the sum of momentary experiences. Over time the perception of usage might change again. In remembering the overall experience, people select only a few elements, positive or negative, which will determine the general opinion of the product and the chance that it will be recommended to others for later use (Fig. 1).
Barbara Fredrickson and Daniel Kahneman [3] proposed the model of remembered utility, which dictates that an event is not judged by the entirety of an experience, but by prototypical moments or "snapshots" that are considered representative of an event under uncertainty. The remembered value of snapshots determines the actual value of the entire experience. Fredrickson and Kahneman [3] explained this phenomenon by saying that the selected snapshots correspond to the average of the most affectively intense moments of an experience and are related to the resulting feeling experienced in the end. So the duration of the experience does not affect the final judgment ("duration neglect" effect), while the most intense moments experienced in the end do.
Kahneman [4] studied the differences in perception between the actual and remembered experience through a series of experiments. In 1996, Redelmeier and Kahneman [5] assessed patients' appraisals of a painful colonoscopy procedure. They found that patients evaluated the discomfort of the experience in relation to the intensity of pain occurring at the end of the procedure (peak-end rule). So a short peak of pain occurring at the end is remembered as more negative than a prolonged painful episode occurring at the beginning or in the middle of the experience. Length or variation in intensity of pain does not matter.
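The peak-end rule and duration neglect described above can be illustrated with a small numerical sketch. It assumes a common simplification in which the remembered intensity is the plain average of the most intense sample and the final sample; the function names and the discomfort ratings are hypothetical and are not taken from [3] or [5].

```python
def peak_end_score(intensities):
    """Remembered intensity as the mean of the peak and the final sample.

    This equal weighting is an assumption made for illustration only.
    """
    if not intensities:
        raise ValueError("at least one intensity sample is required")
    return (max(intensities) + intensities[-1]) / 2


def total_intensity(intensities):
    """Sum of all samples: a duration-sensitive measure, shown for contrast."""
    return sum(intensities)


# Hypothetical discomfort ratings (0 = none, 10 = worst), one per minute.
short_ending_at_peak = [2, 3, 4, 8]
long_ending_mildly = [2, 3, 4, 8, 5, 3, 2, 1]

for name, trace in [("short", short_ending_at_peak),
                    ("long", long_ending_mildly)]:
    print(name, peak_end_score(trace), total_intensity(trace))
```

In this sketch the longer episode accumulates more total discomfort, yet its peak-end score is lower because it ends mildly, which is the pattern the duration neglect effect predicts.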

II. STATE OF THE ART
In the field of Experience Design the evaluation of the experience of prolonged use of interactive products is becoming a critical issue. Until a few years ago, UX studies mostly focused on short-term evaluations and the aspects relating to the initial adoption of new product design.
Only recently, an increasing number of studies have started focusing on assessing the changes in a person's experience in interaction with a product over time [6], [7], [8].
Consequently new methods and models have been defined to understand how the relationship between the user and the product evolves over long periods of time.
Karapanos et al. [9] developed "UX Curve", a method which aims at assisting users in retrospectively reporting how and why their experience with a product has changed over time.
Mahlke and Thüring [10] developed a model which defines three components of user experience: perception of instrumental qualities (usability and usefulness), emotional reactions and perception of non-instrumental qualities (appeal and attractiveness). Applying this model, they provided evidence that instrumental and non-instrumental qualities influence emotional reactions in the use of interactive systems.
However, the majority of current UX evaluation methods still concentrate on single behavioural episodes and momentary evaluations. Vermeeren et al. [11] report that only 36% of methods focus on long-term periods of experience.
Whilst measuring first encounters and momentary experiences is important for collecting feedback from users, in particular in the early prototyping phases of the development process [11], recent research has demonstrated that different user experience aspects change over time [12], [9].
Marti and Iacono [13] compared the experience of use of two tablet applications for zooming in and out while taking photos. They compared two interaction modalities in the short and medium term: the classic "Slide to zoom" and the novel "Squeeze to zoom", a squeezable interface. Results obtained in the short-term evaluation revealed that "Squeeze to zoom" was awarded higher values than "Slide to zoom" in the hedonic quality-stimulation and attractiveness dimensions, whilst it obtained lower values in the pragmatic quality and hedonic quality-identity dimensions. However, in the longitudinal study, the usability of "Squeeze to zoom" improved whilst the attractiveness of "Slide to zoom" decreased significantly. Furthermore, "Squeeze to zoom" was significantly more appreciated for its hedonic qualities and the effect was maintained over time.
Karapanos et al. [9] evaluated the experience of use of six participants for 1 month after the purchase of an Apple iPhone. They found that the relevance of novelty quickly faded away, while over time different hedonic qualities of the iPhone emerged.
Fenko et al. [12] also found that the perceived importance of sensory modalities changes over time. At the moment of purchasing a mobile phone, vision was perceived as the most important modality. After 1 month, touch and audition became more important than vision.
In the following we report the results of a medium-term study assessing the experience of use of a fitness application for mobile phones, conducted with ten participants.
The study compares four types of experience as defined by Roto et al. [1]: anticipated experience, momentary experience, episodic experience and remembered experience. Anticipated UX refers to the expectations a person has before the first encounter with the product. Momentary UX refers to individual interaction episodes and the perceived change in use. Episodic UX refers to a usage episode extrapolated from a wider interaction event. Remembered UX refers to the memory of the user after having used the system for a while.

III. EXPERIMENTAL PROTOCOL
The study was conducted in Siena, Italy. Participants were asked to try out Pacer, a fitness application running on smartphones, over four weeks.
Pacer is a free app developed by Pacer Health, Inc. [14], running on the Android and iOS platforms (Fig. 2). It tracks steps whether the phone is in the hand, in a pocket, on a belt or in a bag. Pacer records steps, distance, active time and calories burned all day, every day, and gives reminders to keep the person going. It allows users to set health goals (e.g. an ideal weight) and to stay on target. Pre-defined programs like "from walking to slow ride" are also available. Through GPS, the app can track walking, running or cycling routes on a map. The ultimate goal is to bring people together based on common health goals and interests, with the objective of improving health behaviour change outcomes. Users can create groups, connect with friends via Facebook, motivate each other in physical activities, achieve and compare performances, and ultimately create competitions.

B) Methodology
Ten subjects (M = 5 and F = 5) with an average age of 25.90 were involved in the study for a period of four weeks on a voluntary basis. Five participants were students of the MA course in Experience Design (University of Siena). Five participants were invited to join the study from among their groups of friends.
As said above, the study aimed to analyse any change among the anticipated, momentary, episodic and remembered experience of use over a month.
The study was conducted using different methods of data collection: an ad-hoc questionnaire to appraise the anticipated and remembered experience, an online shared diary to assess the momentary experience, and AttrakDiff [15] to assess the episodic and remembered experience (Fig. 3). In more detail, the anticipated experience was assessed using an ad-hoc questionnaire focused on the main functionalities of Pacer: Step counter, Burned calories, Community, Reminders/Notifications, Settings and use of GPS. A 5-point Likert scale with values from -2 to 2 was associated with each of these functionalities. The values were represented in the form of emoticons (-2 = very negative, -1 = negative, 0 = neutral, 1 = positive, 2 = very positive). To evaluate the anticipated experience, the questionnaire was administered at baseline, before the app was installed on the participants' smartphones. None of the participants had ever used Pacer. At the beginning of the study all of them received a brief description of the six assessed functionalities.
The momentary experience was evaluated using self-reporting. A closed group on Facebook was created to keep a shared diary. All participants joined the group.
The subjects were asked to take note of their experience of the everyday use of the application. The aim of keeping a diary was to express in a narrative form the impressions resulting from the use of the app in the very moment they were experienced by the subject (Momentary UX). The diary entries could be expressed in a free format, using text or images (e.g. screenshots of the app). However, participants were asked to associate an emoticon with each diary entry, the same used to assess the anticipated and remembered experience (-2 = very negative, -1 = negative, 0 = neutral, 1 = positive, 2 = very positive).
The episodic experience was evaluated using AttrakDiff, a questionnaire administered 5 times over a period of four weeks: T0 = first encounter, T1 = after 1 week, T2 = after 2 weeks, T3 = after 3 weeks, T4 = after 4 weeks, at the end of the study.
AttrakDiff [15] is a method developed to assess the user's experience and feelings in relation to interactive products and therefore a product's overall attractiveness. The questionnaire uses the technique of the semantic differential on pairs of opposite adjectives to evaluate the user experience. Users are asked to assess their experience and their perception of the product by responding to pairs of opposite adjectives. The adjectives are assessed on a seven-point Likert scale, from -3 to 3, in which 0 indicates neutrality. The questionnaire was developed in German and then translated into many languages, including English. It consists of 28 items, broken down into four dimensions: pragmatic quality (PQ), hedonic quality - identity (HQ-I), hedonic quality - stimulation (HQ-S) and attractiveness (ATT).
• Pragmatic quality or PQ: describes a product's usability. It indicates how successfully the user can achieve his or her goals using the product. A product need not be particularly beautiful or well-designed to satisfy this quality.
For the present study we used an Italian version of the questionnaire translated by the authors. The same version was used in a previous study [16].
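As an illustration of how questionnaire responses of this kind can be turned into one score per dimension, the following is a minimal sketch. It assumes that the 28 items split evenly into seven per dimension and that a participant's responses arrive already grouped in the order PQ, HQ-I, HQ-S, ATT with any reversed items already recoded; the function name and the sample responses are hypothetical, and the real questionnaire layout may differ.

```python
from statistics import mean

DIMENSIONS = ("PQ", "HQ-I", "HQ-S", "ATT")
ITEMS_PER_DIMENSION = 7  # assumption: 28 items split evenly over 4 dimensions


def score_attrakdiff(responses):
    """Return the mean rating per dimension for one participant.

    `responses` is a list of 28 integers in [-3, 3], grouped by dimension
    in the order given by DIMENSIONS (an assumption of this sketch).
    """
    expected = len(DIMENSIONS) * ITEMS_PER_DIMENSION
    if len(responses) != expected:
        raise ValueError(f"expected {expected} responses, got {len(responses)}")
    if any(not -3 <= r <= 3 for r in responses):
        raise ValueError("responses must lie on the -3..3 scale")
    scores = {}
    for index, dimension in enumerate(DIMENSIONS):
        start = index * ITEMS_PER_DIMENSION
        scores[dimension] = mean(responses[start:start + ITEMS_PER_DIMENSION])
    return scores


# Hypothetical answers from one participant at one measurement point.
example = [1] * 7 + [0] * 7 + [-1] * 7 + [2] * 7
print(score_attrakdiff(example))
```

Averaging these per-participant scores over all participants at each measurement point would give mean values of the kind plotted per dimension over time.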
The remembered experience was assessed in two different ways: 1) using the same ad hoc questionnaire used to evaluate the anticipated experience, in order to compare what was expected with what was remembered; 2) conducting a paired-samples t-test to compare the four UX dimensions of AttrakDiff at time T0 (first encounter) and T4 (end of the study).
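For readers unfamiliar with the paired-samples t-test, the computation can be sketched as follows using only the Python standard library. The scores below are hypothetical and are not the study data; the function name is illustrative, and the critical value 2.262 is the standard two-tailed threshold for 9 degrees of freedom at the 0.05 level.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(before, after):
    """Return (t statistic, degrees of freedom) for paired samples."""
    if len(before) != len(after) or len(before) < 2:
        raise ValueError("need two samples of equal length, at least 2 pairs")
    diffs = [b - a for b, a in zip(before, after)]
    spread = stdev(diffs)  # sample standard deviation of the differences
    if spread == 0:
        raise ValueError("all differences are identical; t is undefined")
    n = len(diffs)
    return mean(diffs) / (spread / sqrt(n)), n - 1


# Hypothetical scores of ten participants on one dimension at T0 and T4.
scores_t0 = [1.4, 0.9, 1.7, 1.1, 0.6, 1.3, 1.9, 0.8, 1.0, 1.5]
scores_t4 = [0.3, -0.2, 0.9, 0.1, -0.4, 0.6, 0.7, -0.1, 0.2, 0.4]

CRITICAL_T_DF9 = 2.262  # two-tailed, alpha = 0.05, 9 degrees of freedom
t_value, dof = paired_t(scores_t0, scores_t4)
print(f"t({dof}) = {t_value:.2f}, significant: {abs(t_value) > CRITICAL_T_DF9}")
```

The test works on the per-participant differences, so it is appropriate here because the same ten people were measured at both time points. A statistics package would normally also report the exact p-value.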

IV. RESULTS
A) Anticipated and Remembered UX (ad hoc questionnaires)
The data collected on the anticipated UX are reported in Fig. 4.
The 10 participants had high expectations of the Step counter, Community, Settings and GPS functionalities. They did not expect a similarly positive experience with the Calories and Notifications functionalities. The data related to the remembered UX after 4 weeks are reported in Fig. 5. Apparently, participants had negative memories of the use of the app. After four weeks all functionalities were rated between 0 and -1, except for Community, which maintained the same evaluation as in the Anticipated UX.

B) Momentary UX
The 10 participants kept their diaries regularly, recording events as soon as they occurred. In total, the corpus of 10 diaries contained 59 entries, of which 29 were negative, 6 were neutral and 24 were positive. The negative comments were mostly related to poor usability of the app and to improper functioning, which did not comply with the users' expectations. These outcomes are consistent with the data collected through the questionnaire on the Anticipated UX.
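The proportions behind the diary counts can be made explicit with a short sketch. The three counts are those reported above; the function name and the rounding to one decimal place are choices made for this illustration.

```python
def valence_shares(counts):
    """Return each category's share of all entries, as a percentage."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no diary entries to summarise")
    return {label: round(100 * n / total, 1) for label, n in counts.items()}


# Diary entry counts as reported in the text.
diary_counts = {"negative": 29, "neutral": 6, "positive": 24}

shares = valence_shares(diary_counts)
print(sum(diary_counts.values()), shares)
```

Roughly half of the entries were negative and about two in five were positive, so the diaries were mixed rather than uniformly critical.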
After the first week of use, a 22-year-old boy wrote the following comment: "The app drains the battery. Therefore if you value more saving battery rather than being sporty and burning calories, you have to start and stop it continuously...".
A similar comment was entered by a 25-year-old girl: "I receive continuous alerts on the battery draining. This is annoying. I'll try to find a way to stop this." (Fig. 6). Another negative comment on the usability was reported by a 33-year-old girl who wrote "not very positive ... I would say daunting: I tried a challenge but I failed and the app did not tell me why". A 26-year-old boy wrote: "After one hour walking the app marks only 223 steps. It is unreliable. I would like to uninstall it".
After two weeks of use, a 25-year-old boy wrote "I received a notification asking me if I would recommend the app to a friend, based on my experience of use. I discovered also the possibility to send feedback to developers. Honestly I would not recommend this version to a friend, and I'm tempted to send a full list of negative remarks to the developer" (Fig. 7). Some comments explicitly referred to the hedonic-stimulation quality. A 33-year-old girl wrote "After the initial excitement, after four weeks I find it not really useful. I have not been very active in the past days and Pacer does not motivate me in doing more".
The most positive comments relate to the Community, that is, the possibility to connect to others and share performance and the achievement of common goals. A 22-year-old boy wrote "I've just beaten my record of 15.000 steps today!". This is the notification from Pacer!!! (Fig. 9).
A 30-year-old boy wrote "I can see the percentage of people who walk less than me ... It is an interesting information since it relates to all people using the app and not only my friends (the numbers wouldn't have been meaningful)". A 23-year-old girl reported: "As soon as I woke up this morning, I received a notification of yesterday activity. I discovered I walked more than 32% of users… it is a small but meaningful achievement for me… very positive" (Fig. 10). An entire world of opportunity disclosed to me".
To summarise, the negative diary entries were mostly related to the low usability of the app, to the untimely use of notifications, and to the poor accuracy of data (e.g. the burned calories or steps). Some participants compared the data obtained with Pacer with those provided by other step counters, realising that Pacer was not reliable. The app was not really motivating for participants and the majority of them uninstalled it at the end of the study. Furthermore, Pacer does not seem to meet the requirements of runners. It displays the pace stat as the overall pace for the whole run, rather than a lap-by-lap breakdown of the pace, which is what the typical runner's app shows. Runners want to know whether the first mile was as fast as the third, for example, and Pacer doesn't tell this.
The positive comments were mainly associated with the Community, Social sharing and Security aspects. In fact, to use the product, it is not necessary to create an account and provide an email address and personal data. For those who are concerned about security, that is a plus.
Overall, Pacer did not offer much to explore beyond the basics and this caused a drop of interest after four weeks. The diary entries followed a negative trend over four weeks. Negative comments increased at T3 and T4.

C) Episodic UX
As said before, the episodic experience was evaluated using AttrakDiff. The graph presented in Fig. 11 shows the mean values obtained for the 4 dimensions of analysis (PQ, HQ-I, HQ-S and ATT) at times T0, T1, T2, T3 and T4. Table I provides the mean values obtained for the four dimensions and the corresponding standard deviations.
A closer look at the evaluation of specific items in the PQ dimension (Impractical-Practical; Cumbersome-Straightforward) shows clearly how the judgement on pragmatic qualities decreased over time (Fig. 12). At the end of the study, the product was considered impractical to use and cumbersome. The assessment of the item Impractical-Practical changed significantly, since over four weeks the pragmatic aspects were evaluated in real contexts of use (e.g. battery drain affected the entire use of the smartphone, the step counter stops when the person receives a call) and compared with other products considered more effective. The items related to HQ-I (Isolating-Connective; Alienating-Integrating; Separates me-Brings me closer) also decreased over time (Fig. 13), even though social features like the possibility to form or join groups and to create personal goals were generally appreciated in the diaries, and judged positively in the questionnaire on Anticipated and Remembered UX. The app was unsuccessful in motivating participants. The items relating to the ability of the app to create an enjoyable and captivating experience (HQ-S, ATT) decreased significantly over time (Fig. 14, Fig. 15). There were no differences between male and female participants (Fig. 16 and Fig. 17).

D) Remembered UX (AttrakDiff)
As said above, the remembered experience was assessed using an ad hoc questionnaire and AttrakDiff. The results of the ad hoc questionnaire are reported in section A above. The results of AttrakDiff related to the remembered experience were analysed by conducting a paired-samples t-test to compare the four UX dimensions (PQ, HQ-I, HQ-S, ATT) at the first encounter (T0) and after four weeks (T4).
On the contrary, there is no statistically significant difference for PQ.
Apparently, after four weeks the participants remembered far better their dissatisfaction related to the hedonic qualities and the overall attractiveness of Pacer. These qualities were predominant with respect to the pragmatic qualities of the app. In fact, other fitness apps offer similar functionalities, therefore the overall attractiveness makes the difference for a memorable UX. The participants clearly reported this in their qualitative comments.
The paired-samples t-test confirms that the user experience of using Pacer decreased over time and that the difference is statistically significant.

V. DISCUSSION AND CONCLUSIONS
The data obtained from the questionnaire on the Anticipated UX were consistent with those collected with AttrakDiff at T1 (Episodic UX) and the diaries after the first encounter (Momentary UX). The same consistency can be noted between the data related to the Remembered UX and those collected with AttrakDiff at T4.
Time seems to have an impact on the importance people attribute to different qualities of the experience with interactive products, as confirmed by previous studies [9]. Despite the crucial importance of usability in the product's initial acceptance, aspects of reliability, motivation, comparison with other products, change in behaviour and touch points (how the product communicates with the user, for example by notifications and alerts) are even more crucial for a user to resonate with a product and value it in the long term. That is why UX evaluation in the long term is crucial.
Furthermore, even if it is not possible to draw general conclusions from a study involving a limited number (10) of subjects, the present study offers an original contribution that we hope could stimulate additional studies taking time systematically into account and using different methods for the evaluation. Longitudinal studies on UX evaluation reported in the literature assessed users' perceptions focusing on specific times (Episodic UX) rather than assessing how their perceptions changed over time (Momentary and Episodic UX) and what memories people form in the long term that are crucial in stimulating later use. The diaries combined with questionnaires allowed us to reduce concerns about the reliability of the absolute measures collected with AttrakDiff, where judgments were taken at predefined times and without reference points to single functionalities. On the one hand, diaries allowed us to assess the qualities of UX in context, that is, in specific moments of interaction that were meaningful for participants. On the other hand, the questionnaires on Anticipated and Remembered UX allowed us to associate the evaluation of expectations and memories with specific functionalities of the product. The importance of such judgement was also recognised by Jordan and Persson [17], who suggested a hierarchical structure of qualities that contribute to positive experience, having the functionality of the product as a baseline. In addition to Jordan and Persson [17], we assumed the importance of UX qualities to vary with several personal and contextual factors, including time as a fundamental source of diversity in UX, considered in its many facets: Anticipated, Momentary, Episodic and Remembered.

Fig. 1 Time spans of user experience, adapted from Roto et al. 2011
Patrizia Marti, University of Siena, Department of Social, Political and Cognitive Science, and Eindhoven University of Technology, Department of Industrial Design, Via Roma 56, 53100 Siena, Italy. Email: patrizia.marti@unisi.it
Iolanda Iacono, University of Siena, Department of Social, Political and Cognitive Science, Via Roma 56, 53100 Siena, Italy. Email: iolanda.iacono@unisi.it

Fig. 3 Overview of the methods used to assess the four types of UX


• Hedonic quality - Identity or HQ-I: indicates to what extent the product allows the user to identify with it in a certain social context. It relates to what we communicate socially when we use a product. Identification with a brand, for example a certain type of mobile phone, defines our inclinations and preferences of use of that product. Some products are preferred by certain categories of users because they are seen as cool, and not necessarily for the features they offer.
• Hedonic quality - Stimulation or HQ-S: indicates to what extent the product can support users' needs in terms of novelty, content, stimulating interaction and presentation of style. It is defined by attributes that encourage users to improve their skills of use of the product. Examples of hedonic stimulation are those features of software applications that are usually little used, and the shortcuts for some commands. Some products offer the user flexibility of use, and the person feels gratified to learn or to find alternative or more effective and efficient modes of use of the product.
• Attractiveness or ATT: describes the product's overall value on the basis of perceived quality. Hedonic and pragmatic qualities are independent of one another, but together contribute to determining attractiveness.

Fig. 4 Mean value of the Questionnaire Anticipated UX

Fig. 5 Mean value of the Questionnaire on Remembered UX
PROCEEDINGS OF THE FEDCSIS. GDAŃSK, 2016

Fig. 6 This app could drain the battery.

Fig. 7 "Based on your experience with this version of Pacer, would you recommend it to a friend? Certainly yes, probably, maybe or maybe not, probably no, definitely no."
A 23-year-old girl commented the following: "I have received this notification for the third time today ... I find it annoying, especially if you have been walking the whole day. It pops up when having a meal or when sitting for more than an hour. I wish to turn the notifications off" (Fig. 8).
Fig. 9 Goal achievement
A 33-year-old girl wrote: "My friend invited me to join her network ... it is nice, I can send her messages and see how many steps she does during the day ... I didn't think this functionality would be so engaging for me".

Fig. 10 Goal achievement
After the first week of use, a 25-year-old boy wrote "I discovered a fantastic functionality!!! I set the program 'Sleep eight hours a day and exercise the abdominal muscles'. Just after, a weekly calendar appeared on the screen associated to a chat where it was possible to share

Fig. 11 Mean values for the four AttrakDiff dimensions over time
Data show a decreasing trend for all dimensions of the analysis, from an initial positive attribution to all dimensions to a progressively decreasing assessment over the four weeks. The HQ-S at T4 scored below zero. The dimensions that obtained the highest values during the

Fig. 16 Mean values for the four AttrakDiff dimensions over time for male participants

TABLE I. MEAN AND STANDARD DEVIATION OVER THE TIME