DECSYS – Discrete and Ellipse-based response Capture SYStem

Data-driven techniques that capture uncertainty through intervals or fuzzy sets can substantially improve systematic reasoning about uncertain information. Recent years have seen renewed interest in the capture of intervals from a variety of sources, including experts and general survey participants. This approach avoids the more cumbersome batteries of questions that are otherwise required to capture individual uncertainty, and which may not obtain the same degree of fidelity. It also enables respondents to effectively communicate any range (e.g. vagueness) inherent in their response, allowing generation of models that represent this additional information. However, manual methods of obtaining and processing interval-valued data, such as through paper-based questionnaires, are labour and time intensive. This has posed a practical barrier to adoption of interval-valued response-formats in the wider community, from research to industry (e.g. marketing). We argue that establishing an effective and accessible method for interval-valued data-capture will greatly encourage research in and application of uncertainty-aware models of data. Thus, we present DECSYS, a newly developed open-source software tool, which enables the creation and administration of digital surveys that elicit both conventional and interval-valued responses. DECSYS incorporates a range of features, and is designed to maximise versatility for experimenters and usability for participants. Surveys can be conducted either locally or online, and results easily exported. We welcome community feedback, including on how to best tailor the tool in the future to maximise value and support multidisciplinary adoption of uncertainty-aware data collection.


A. Why are intervals valuable for survey responses?
The quantitative elicitation of responses is fundamental in a variety of domains, including perceptual and categorical judgement, attitudinal measurement, and public opinion. A broad body of discussion and empirical research has examined the most effective methods of obtaining these responses and sought to determine best practice, both in implementing these methods and in interpreting the resulting data as accurately and with as little bias as is reasonably possible, cf. [1]-[9]. However, while this literature is comprehensive in many respects, we believe that methods of capturing interval-valued data have so far remained relatively overlooked.
Interval-valued responses provide a powerful and natural means for respondents to communicate any inherent range that may be present in the appropriate response. They also enable the capture and better quantification of variable degrees of uncertainty at an individual-response level, going beyond more conventional between-subject or between-item measures. Here, the position of the interval captures the strength of the response, e.g. level of agreement, whilst the width of the interval captures the uncertainty in the same response (see Fig. 1). This uncertainty may reflect a limitation of the participant's knowledge, or a lack of specificity in the question, and is often present when a response is made, even if it goes unreported by traditional measures.
This work was part funded by the UK's National Cyber Security Centre (NCSC) and the EPSRC EP/P011918/1 grant.
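As a minimal illustration of how position and width summarise an interval-valued response, the sketch below represents one response by its two endpoints. The `IntervalResponse` class and the 0 to 100 scale are assumptions made for this example only; they are not part of DECSYS.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IntervalResponse:
    """One interval-valued answer on a continuous scale (hypothetical type)."""
    left: float
    right: float

    def __post_init__(self):
        if self.left > self.right:
            raise ValueError("left endpoint must not exceed right endpoint")

    @property
    def position(self) -> float:
        # Midpoint: where on the scale the response sits (e.g. level of agreement).
        return (self.left + self.right) / 2

    @property
    def width(self) -> float:
        # Width: how much range or uncertainty the respondent expressed.
        return self.right - self.left


# Two responses with the same position but different uncertainty.
confident = IntervalResponse(70, 80)
unsure = IntervalResponse(50, 100)
print(confident.position, confident.width)  # 75.0 10
print(unsure.position, unsure.width)        # 75.0 50
```

A conventional point response of 75 would record both of these answers identically; the interval format keeps them apart.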
Although it is possible to collect information about uncertainty without using intervals, for example by using a traditional ordinal scale paired with further associated questions, adopting the direct capture of intervals avoids the need for additional questions. We argue that an interval-valued response-format can therefore substantially streamline the process of capturing individual response-uncertainty, by reducing the burden on respondents in terms of both length and complexity of surveys. It may also improve the fidelity of vague or uncertain responses, by offering respondents a more natural and cohesive method of representing these aspects of a response.
Once an effective paradigm for capturing interval-valued response-data becomes established, this could provide complementary benefits of both a quantitative and qualitative nature. Using interval-valued data can improve the accuracy of model outputs. This is achieved by virtue of retaining potentially valuable information about individual response uncertainty, range, or vagueness, which traditional response-formats would otherwise discard. For example, interval-valued responses can clearly distinguish between ambivalent and uncertain responses, which cannot be similarly discriminated using conventional measures [4], [10], [11]. This is particularly useful in the application of fuzzy data analysis [12]-[16] and Computing with Words [17], [18], where interval values are aggregated and modelled through fuzzy sets (see Fig. 2, for example). Additionally, intervals can not only improve the precision of model outputs, but also better inform their interpretation, through better quantification of the highly variable degrees of uncertainty that surround them.

B. Why aren't intervals used widely already?
Development and validation of the most suitable approaches to collect, process and interpret interval-valued data in practice is an area of ongoing research across a variety of disciplines. The inherent complexity of modelling interval-valued data, paired with a lack of approaches to support this modelling and associated data interpretation, has until recently provided a barrier to general adoption of this data format. However, research involving computational intelligence approaches, such as fuzzy sets, is now providing increasingly sophisticated tools, fostering strong motivation for a renewed focus on capturing intervals [16]-[26].
A large number of online survey tools are already available (e.g., Qualtrics, SurveyMonkey, Typeform, Google Forms, SurveyGizmo), many of which allow some degree of question and response-format customisability. With these tools, responses may be discrete along an ordinal or interval scale, cf. [27], or a point response along a continuous sliding or visual analogue scale, cf. [28]-[31]. However, we are unaware of any openly available software tools that have been purposely developed for efficient and effective capture (cf. Section I-D) of interval-valued survey responses.
Manual methods of collecting, collating and processing interval-valued data require substantially greater commitments of time and effort than comparable conventional approaches. It is therefore perhaps of little surprise that the adoption of interval-valued response-formats has yet to become mainstream across the wider research community, despite the advantages that this could offer. In light of this, we believe that easily accessible software for the collection of interval-valued data could expand its use (and the associated use of data-driven fuzzy sets) to a range of new domains, particularly within the social sciences.

C. What have we done to solve this problem?
In this paper, we present newly developed, open-source, platform-independent software that enables capture of interval-valued responses. The software permits both creation and administration of digital surveys, and can be run both over local networks and online. Distinctively, it provides researchers with the capability to elicit not only conventional responses, but also interval-valued data, using an ellipse-based response-format. We call this tool the Discrete and Ellipse-based response Capture SYStem (DECSYS). This software is available under a multi-license model, offering a free, open-source license (details online) for academic use, while other licensing options are available on request. It is available to download at: http://www.lucidresearch.org/software.html.
We are currently working to demonstrate and explain the added value provided by interval-valued survey responses of the type obtained through our ellipse response-format. We are also examining the usability of this response-format across a variety of domains, by soliciting subjective user feedback.

D. Why ellipses?
The ellipse-based response-format leverages the recent proliferation of touchscreen and stylus-compatible devices to obtain interval-valued responses in a quick and intuitive manner, with minimal divergence from how such a response would be made on paper.
Related work has been conducted into developing a 'Fuzzy Graphic Rating Scale' (FRS) [32] to capture 'fuzzified' questionnaire responses [33]-[35]. By comparison with this, and with other possible mechanisms of obtaining interval-valued data, e.g. soliciting estimates of each endpoint independently, the ellipse-based response-format allows each response to be made in a single, natural and cohesive action. This represents a substantial practical advantage, reducing questionnaire size, and therefore the workload of both the survey respondent and administrator. Further, this method should be easier for respondents to intuitively understand, reducing initial training requirements and engendering higher response fidelity. In addition, the ellipse approach is assumption-free, in that it bypasses a fundamental requirement of the FRS to assume the underlying distribution of the membership function representing each response. By contrast, the ellipse method treats all points on the specified interval equally. Such a response can then be combined with others, either within or between subjects, to construct a fuzzy set from multiple responses, the distribution of which will be derived entirely from the input data (cf. Fig. 2).
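One simple way to combine such responses can be sketched as follows. The sketch assumes that the membership of a point on the scale is the proportion of collected intervals that contain it; this particular aggregation rule, the function name and the example data are illustrative assumptions rather than a method prescribed here.

```python
from typing import Sequence, Tuple

Interval = Tuple[float, float]


def membership(x: float, intervals: Sequence[Interval]) -> float:
    """Proportion of intervals that contain x (endpoints inclusive)."""
    if not intervals:
        raise ValueError("at least one interval is required")
    covering = sum(1 for left, right in intervals if left <= x <= right)
    return covering / len(intervals)


# Four hypothetical responses to the same question on a 0-100 scale.
responses = [(20, 60), (40, 80), (50, 70), (45, 55)]
for x in (10, 30, 50, 75):
    print(x, membership(x, responses))
```

Because every point inside an interval counts equally, the shape of the resulting set comes only from how the responses overlap, not from an assumed per-response distribution.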

E. Paper overview
The remainder of this paper will document the key features of the DECSYS tool. Section II-A will specify technical aspects of the software. Section II-B will present the range of options available when creating and adapting surveys. Section II-C will discuss features relevant to administering surveys, which can be done either locally or online. Section II-D will then address mechanisms by which surveys are accessed by the respondent. Section III will provide a brief summary of the key aspects of the paper, including justification, features and aims of DECSYS.

II. FEATURES AND APPLICATION
The DECSYS survey platform provides experimenters with the capacity to both create and administer surveys according to their particular specifications, in terms of content, general appearance and response-format. The core and unique component of DECSYS, which sets it apart from already available survey tools, is the capacity to effectively and efficiently capture interval-valued responses. However, DECSYS also incorporates a range of further features, designed to make it a well-rounded and versatile standalone tool for the purposes of both survey design and data-collection. This section will document these features.

A. Technical specification
DECSYS is a web application written in C#. It uses the ASP.NET Core Web Framework and targets .NET Core, and can therefore run cross-platform. In addition, as it is a web application, much of it is written with HTML, CSS and JavaScript. Notably, the question components, enabling Likert-type, ellipse-based and free-text responses (discussed in Section II-B) are written entirely in JavaScript, using the React framework.

B. Creating surveys
An administrator homepage (see Fig. 3) enables new surveys to be created. In addition, this page shows an overview of all previously built surveys, with the option to preview, export, duplicate or delete any existing survey, as well as to edit any survey that has not yet been launched.
When constructing the main body of a survey, questions are added and modified in the form of 'Pages', which each comprise a collection of components (see Fig. 4). Selectable components include headings, secondary text, images, and a range of categories of response-component. Once a component has been selected, multiple aspects of it can be customised (see Fig. 5). The current version of DECSYS includes four principal types of response-component, as follows:
'Confirm' provides the option to include a simple checkbox. This can be used to obtain explicit acknowledgement of survey requirements, to indicate understanding of instructions, and to record consent for terms and conditions of how data will be used.
'Free-text' (see Fig. 6(a)) allows respondents to write their own open-ended response in a text-box. It is possible to specify a character limit, which can be useful when asking demographic questions with a particular desired response format, e.g. Age or Gender (M/F).
'Likert' (see Fig. 6(b)) provides a conventional response-format, in which respondents select a single discrete value from a given set of options along an ordinal scale. When designing this component it is possible to specify the number of response-options, along with how each option is denoted, which can be in either text or numeric format. The option is also provided to add secondary labels to the ends of the scale, allowing a primarily numeric scale to be combined with linguistic markers that indicate directionality, e.g. More - Less, Disagree - Agree. It is also possible to customise the position in which the response options are presented, along with the font, size and colour of the text in which the labels are shown.
'Ellipse' (see Fig. 6(c)) provides a novel method of capturing interval-valued responses, namely through drawing an ellipse along a continuous scale. This response-component is touchscreen compatible and was designed specifically to support stylus-based input, thus providing an intuitive way for participants to give interval-valued responses through a single, cohesive and familiar action (see also Section I-D). Many aspects of an ellipse component are also configurable. Labels may be added underneath the response-scale, to denote minimum, midpoint and maximum values, with specified text content, font, size and colour. Vertical markers can optionally be included, to visually divide the scale. The position, width and colours of both the scale and the drawn response may also be customised. When a participant draws an ellipse response, range markers appear (shown in black in Fig. 5 and in blue in Fig. 6(c)), to indicate the left-most and right-most bounds of each response, according to where the extremities of the ellipse cross the horizontal response bar. This feature is valuable because it allows participants to clearly view how any given response will be recorded before it is finalised. Respondents then have the option to either confirm or re-draw the response.
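The mapping from a drawn ellipse to a recorded interval could be sketched along the following lines. This is not the DECSYS implementation (the response components are written in JavaScript); it is a Python sketch that assumes the stroke arrives as a closed sequence of (x, y) pixel points, that the response bar is a horizontal line at `bar_y`, and that pixel positions map linearly onto the scale. All function and parameter names are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, float]


def bar_crossings(stroke: Sequence[Point], bar_y: float) -> List[float]:
    """Return x-positions where the closed stroke meets the line y = bar_y."""
    xs: List[float] = []
    n = len(stroke)
    for i in range(n):
        # Pair each point with the next one, wrapping round to close the loop.
        (x0, y0), (x1, y1) = stroke[i], stroke[(i + 1) % n]
        d0, d1 = y0 - bar_y, y1 - bar_y
        if d0 == 0:
            xs.append(x0)  # the sampled point lies exactly on the bar
        elif d0 * d1 < 0:
            # The segment straddles the bar: interpolate linearly to the crossing.
            t = d0 / (d0 - d1)
            xs.append(x0 + t * (x1 - x0))
    return xs


def ellipse_to_interval(
    stroke: Sequence[Point],
    bar_y: float,
    bar_x_min: float,
    bar_x_max: float,
    scale_min: float,
    scale_max: float,
) -> Optional[Tuple[float, float]]:
    """Map a drawn stroke to (left, right) scale values, or None if unusable."""
    xs = bar_crossings(stroke, bar_y)
    if len(xs) < 2:
        return None  # the stroke does not enclose part of the bar; ask for a re-draw

    def to_scale(x: float) -> float:
        x = min(max(x, bar_x_min), bar_x_max)  # clamp to the ends of the bar
        fraction = (x - bar_x_min) / (bar_x_max - bar_x_min)
        return scale_min + fraction * (scale_max - scale_min)

    # The outermost crossings give the left and right range markers.
    return to_scale(min(xs)), to_scale(max(xs))


# A coarse hand-drawn loop around a bar at y = 100 spanning x = 100..500.
stroke = [(190, 110), (210, 90), (300, 80), (390, 90), (410, 110), (300, 120)]
print(ellipse_to_interval(stroke, 100, 100, 500, 0, 100))  # (25.0, 75.0)
```

Taking the outermost crossings, rather than the extreme x-coordinates of the stroke itself, follows the description above of bounds being set where the ellipse crosses the response bar.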
The broad range of customisable options provides a high level of flexibility to tailor survey content and formats. This facilitates accurate replication of previous questionnaire studies. Another convenient feature of DECSYS is that existing questions can be duplicated and then adapted. This substantially streamlines the process of maintaining a cohesive format and style throughout the questionnaire, as settings do not have to be re-specified from defaults for each individual question.
Further notable features include the capability to show an image alongside a question, allowing DECSYS to be used for stimulus-based judgements. In addition, the experimenter can specify whether questions will be presented in a fixed or random order. This feature is designed such that individual questions can be toggled to a fixed or random position (see Figs. 4 & 5). This provides the flexibility to either begin or end a survey with a block of specific questions (e.g. demographic or feedback related), while the remainder of the questionnaire comprises randomly ordered items. Finally, whilst editing a component, a small-scale visual preview is shown of how it will appear to the respondent (see Fig. 5). Similarly, it is possible for the experimenter to preview a whole survey before launch. This option allows progression through the entire survey, without recording responses and with the option to exit at any point.
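The effect of mixing fixed and randomised positions can be sketched as follows: pages flagged as fixed keep their slot, and only the remaining pages are shuffled among the remaining slots. The `(name, is_random)` representation and the function name are assumptions made for illustration; the internal data model of DECSYS is not described here.

```python
import random
from typing import List, Optional, Sequence, Tuple


def presentation_order(
    pages: Sequence[Tuple[str, bool]],
    rng: Optional[random.Random] = None,
) -> List[str]:
    """Shuffle only the pages marked as random, leaving fixed pages in place."""
    rng = rng or random.Random()
    # Slots that may receive a different page for each participant.
    random_slots = [i for i, (_, is_random) in enumerate(pages) if is_random]
    shuffled = [pages[i][0] for i in random_slots]
    rng.shuffle(shuffled)

    order = [name for name, _ in pages]
    for slot, name in zip(random_slots, shuffled):
        order[slot] = name
    return order


# Demographics first, feedback last, main items randomised in between.
pages = [
    ("consent", False),
    ("age", False),
    ("item-1", True),
    ("item-2", True),
    ("item-3", True),
    ("feedback", False),
]
print(presentation_order(pages, random.Random(0)))
```

Recording the resulting order alongside each response, as the results view does, allows order effects to be checked afterwards.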

C. Conducting surveys
DECSYS has the functionality to run surveys over local networks, in 'Workshop Mode'. This allows a high level of intrinsic data security, through the use of a closed network. It is also designed for use over the internet, in 'Online Mode' (presently still in development). This will provide greater flexibility in obtaining participants and broaden the potential user base. The appearance, functions and user interface of the online version are designed to be generally equivalent to the local version, but with the addition of further security measures to control who can access the survey, as either an administrator or participant.
When running in 'Workshop Mode', the survey is managed on a host computer and respondents connect to the survey from client computers using a web-browser. The administrator accesses the survey homepage on the host computer, also through a web-browser. All surveys that have been created on the host computer are visible and accessible from the administrator homepage (see Fig. 3), and each survey is indicated here as either currently active or inactive. From this page, any inactive survey can be activated, and an active survey can be closed. In Workshop Mode, only a single survey can be active at any one time, as a result of how surveys are hosted and accessed over the local network (see also Section II-D).
The majority of options for survey management (preview, export, duplication or deletion) remain available from the admin page irrespective of a survey's current activity status. However, although any survey can be edited prior to its initial launch, this is not permitted afterwards, even if it is subsequently made inactive again. This feature ensures that all results stored with a given survey pertain to the same questions and format (i.e. data and survey remain consistent), even if these were obtained over multiple sessions. If required, the 'Duplicate' function can be used to copy an entire existing survey following its initial launch. This can then be adapted and saved as an updated version of the same survey.
Once a survey has been made active, the administrator can view the real-time progress of each participant (see Fig. 7). This informs the experimenter about the number of participants that have signed up to take part, allows estimation of time until completion, and also indicates whether any respondents may be having trouble completing the questionnaire.
After launching a survey it is also possible to view results, even as responses come in (see Fig. 8). These are timestamped and linked to a unique identifier for each respondent, and show core response content, e.g. text input, discrete response value, or left and right ellipse endpoints, alongside question number and the order in which each question was presented (important in the case of randomisation). If the survey has been run over multiple sessions, results from each session will be accessible here under different tabs. Results from an individual session can be exported for external processing and analysis (in .json format). Alternatively, results from all sessions can be exported in a single file, together with survey parameters.
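To give a sense of how exported results might be processed externally, the sketch below loads a small JSON document and computes the width of each ellipse response. The export schema is not specified here, so every field name (`participant`, `question`, `order`, `type`, `left`, `right`, `value`) is invented for illustration and should be replaced with those of a real export.

```python
import json

# Hypothetical export: all field names below are assumptions, not the DECSYS schema.
exported = """
[
  {"participant": "p01", "question": 1, "order": 2, "type": "ellipse", "left": 32.5, "right": 61.0},
  {"participant": "p01", "question": 2, "order": 1, "type": "likert", "value": 4},
  {"participant": "p02", "question": 1, "order": 1, "type": "ellipse", "left": 40.0, "right": 45.5}
]
"""

records = json.loads(exported)

# Keep only interval-valued responses and compute the width of each one.
ellipse_rows = [r for r in records if r["type"] == "ellipse"]
widths = {
    (r["participant"], r["question"]): r["right"] - r["left"]
    for r in ellipse_rows
}
print(widths)
```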

D. Accessing and completing surveys
Surveys can be distributed in multiple ways. Responses can be entirely anonymous, in the sense that anyone can sign up to the survey. This offers the broadest potential uptake, but provides minimal guarantees concerning the quality of the data. For instance, the same respondent could potentially complete the survey multiple times, leading to bias and entailing that data is non-independent. Alternatively, responses could be intrinsically linked to a specific individual or account. This approach requires more safeguards regarding considerations of privacy and data security, but offers the option of comparing and cross-validating responses across multiple independent surveys. There is also a middle-ground, by which responses can be made on a semi-anonymous basis, in that only a select group of respondents are invited to take part, and can only do so a single time, but each set of responses remains separate from the identity of the individual that provided them. DECSYS is designed to provide the survey administrator with the capability to collect data according to each of these approaches, depending upon what is most appropriate for their specific use case. Potential respondents will therefore have multiple ways of accessing surveys, dependent upon survey specifications.
In Workshop Mode, respondents join a survey via a pre-set local network address (the IP address and port number of the host computer). This is input into the web browser of a client computer that is connected to the same local network. Doing so immediately takes the participant to the survey welcome page, from which they can then progress through whichever survey is currently active.
In Online Mode, different methods will be used to control access to the survey platform. For survey administrators, this will take the form of an account-based log-in. However, as in other facets of the software, DECSYS is designed to provide a high degree of flexibility with respect to authentication of survey participants. Respondents will be able to access a survey using a single-use or limited-use authentication token sent to them via email, through an account-based log-in, or, where appropriate, entirely anonymously, by simply entering an openly available URL into a web-browser. This will depend upon the prior specification of the survey administrator.
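The idea of a single-use or limited-use token can be illustrated generically as below. Online Mode is still in development, so this is not a description of the DECSYS mechanism: the `TokenStore` class, its in-memory storage and its method names are assumptions for the purpose of the example.

```python
import secrets
from typing import Dict


class TokenStore:
    """In-memory record of how many uses each issued token has left (hypothetical)."""

    def __init__(self) -> None:
        self._remaining: Dict[str, int] = {}

    def issue(self, max_uses: int = 1) -> str:
        """Create an unguessable token that may be redeemed max_uses times."""
        if max_uses < 1:
            raise ValueError("max_uses must be at least 1")
        token = secrets.token_urlsafe(16)
        self._remaining[token] = max_uses
        return token

    def redeem(self, token: str) -> bool:
        """Consume one use; return False for unknown or exhausted tokens."""
        remaining = self._remaining.get(token, 0)
        if remaining < 1:
            return False
        self._remaining[token] = remaining - 1
        return True


store = TokenStore()
invite = store.issue()       # single-use token, e.g. to be emailed to a respondent
print(store.redeem(invite))  # True: first use is accepted
print(store.redeem(invite))  # False: the token is now exhausted
```

Because the store holds only tokens and use counts, responses collected this way can stay separate from the identity of the invited respondent, matching the semi-anonymous option described above.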

III. CONCLUSIONS AND FUTURE WORK
We propose that collecting interval-valued survey responses is a useful method of obtaining information about vagueness, uncertainty or fuzziness in participants' answers, where the interval captures a range of possibly correct responses. Parallel research studies are ongoing, which aim to empirically establish the added-value offered by interval-valued responses, alongside the reported usability and acceptance of our novel ellipse-based response-format by survey respondents. These span a variety of domains, including Cyber-Security, user experience, food rating, personality inventory, and judgement under uncertainty. The present paper aims to address the lack of currently available software that is purposely designed for the efficient and intuitive capture of interval-valued data within surveys. Without such a digital tool, there is a substantial disincentive for the wider adoption of interval-valued survey response-formats, as paper-based versions require increased workload compared to more traditional response-formats.
In this paper we present DECSYS, an open-source and platform-independent software tool that enables the collection of interval-valued responses, as well as traditional free-text and Likert response-formats. DECSYS incorporates a broad range of features, which provide a high level of versatility for the survey administrator, in terms of both function and design, and which also maximise ease-of-use for the respondent. The innovative ellipse-based method used to capture interval-valued data is designed to be both quick and intuitive. This allows interval-valued responses to be provided in a single, cohesive and familiar action, which diverges minimally from how an equivalent paper-based response would be made.
While an initial version of DECSYS has been completed, development of the software is ongoing; look, feel and functionality are continuously being refined. Both provision of additional features and improvements to existing features are planned, including the expansion of security and authentication protocols to provide flexible and secure access control in an online setting. We also aim to facilitate future third-party development, and thereby encourage the wider community to take full advantage of the potential for adaptation of the tool for a broad range of applications.