
End-User Software Engineering: Empirical Findings

    Here is a collection of empirical findings about end-user software engineering: its effectiveness, what works and what doesn't, the end users who do it, and the factors involved. We hope these findings will assist you in your research or development of environments for end-user software development.

    Effectiveness and performance of algorithms

      Testing, Debugging, and Fault Localization

  • For unit inference, we have found that classifying cells into different cell types and determining the spatial relationships between them is an effective way of identifying header information in spreadsheets. Our initial tests indicate that header inference is almost always correct. The cases in which incorrect headers are inferred seem to be identifiable by further spatial analysis techniques. Accordingly, unit inference based on the header information performs very accurately in practice.
    • Header and Unit Inference for Spreadsheets Through Spatial Analyses, Robin Abraham and Martin Erwig. IEEE Symp. on Visual Languages and Human-Centric Computing (VL/HCC'04), 165-172, 2004. abstract and download

  • Our region identification algorithms have been able to signal the presence of anomalies in formulas on real commercial spreadsheets.
    • Header and Unit Inference for Spreadsheets Through Spatial Analyses, Robin Abraham and Martin Erwig. IEEE Symp. on Visual Languages and Human-Centric Computing (VL/HCC'04), 165-172, 2004. abstract and download

  • Fault localization algorithms for interactive environments must be designed to expect a certain number of oracle mistakes (sometimes as high as 20% of oracle judgments) and to be as robust as possible to such mistakes. Our results showed that one way to develop this robustness is to weight negative judgments (judgments that a value is wrong) more heavily than positive judgments, because users are far less likely to make oracle mistakes on negative judgments. Our results also strongly suggested that a second way to enhance robustness is to never assume that a variable (cell) is completely error-free if it has ever contributed to an incorrect value. Finally, our results empirically showed that weighting the local impacts of marks much more heavily than the global impacts further enhances robustness.
    • Phalgune, A., Kissinger, C., Burnett, M., Cook, C., Beckwith, L., Ruthruff, J.R., Garbage In, Garbage Out? An Empirical Look at Oracle Mistakes by End-User Programmers, VL/HCC'05: IEEE Symposium on Visual Languages and Human-Centric Computing, September 2005. pdf

  • Fault localization algorithms need to be decomposed into two separate attributes for empirical purposes: the behind-the-scenes information base used, and the mapping from the information content to a presentation device. Each of these factors can independently ‘cancel out’ any gains made by the other, thereby interfering with proper empirical collection of evidence about the advantages of an algorithm.
    • Ruthruff, J., Burnett, M., Rothermel, Gregg., An Empirical Study of Fault Localization for End-User Programmers, ICSE'05: International Conference on Software Engineering, St. Louis, MO, USA, May 15-21, 2005. pdf

  • Automatic debugging of spreadsheets based on user input is feasible. Even though the number of possible error sources is very high, a backward-directed constraint propagation approach coupled with suitable heuristics can provide very good change suggestions that can automatically remove spreadsheet errors.
    • Goal-Directed Debugging of Spreadsheets, Robin Abraham and Martin Erwig. IEEE Symp. on Visual Languages and Human-Centric Computing (VL/HCC'05), 37-44, 2005. abstract and download
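
The three robustness heuristics reported for fault localization in this list (negative judgments outweigh positive ones, a cell that has contributed to a wrong value is never treated as fault-free, and local marks outweigh global ones) can be illustrated with a small sketch. This is not the algorithm from the cited paper: the weights, the floor value, and the names `Mark`, `fault_likelihood`, and `producers` are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass

# Hypothetical weights. The finding only states the ordering
# (negative > positive, local > global), not concrete values.
NEG_WEIGHT = 2.0
POS_WEIGHT = 1.0
LOCAL_FACTOR = 1.0
GLOBAL_FACTOR = 0.25
MIN_FAULT = 0.1  # floor for any cell that ever fed a value judged wrong


@dataclass
class Mark:
    cell: str      # cell whose value the user judged
    correct: bool  # True = "value is right", False = "value is wrong"


def fault_likelihood(cell, marks, producers):
    """Score how suspicious `cell` is, given the user's marks.

    `producers` maps each marked cell to the set of cells that
    contribute to its value (including the marked cell itself).
    """
    score = 0.0
    blamed = False
    for mark in marks:
        if cell not in producers[mark.cell]:
            continue  # this judgment says nothing about `cell`
        # A mark placed directly on the cell counts as local impact.
        factor = LOCAL_FACTOR if mark.cell == cell else GLOBAL_FACTOR
        if mark.correct:
            score -= POS_WEIGHT * factor
        else:
            score += NEG_WEIGHT * factor
            blamed = True
    # Never report a blamed cell as completely fault-free.
    return max(score, MIN_FAULT) if blamed else max(score, 0.0)


producers = {"C1": {"A1", "B1", "C1"}, "D1": {"A1", "D1"}}
marks = [Mark("C1", False), Mark("D1", True), Mark("D1", True)]
for name in ("A1", "B1", "C1", "D1"):
    print(name, fault_likelihood(name, marks, producers))
```

In this toy run, A1 feeds one value judged wrong and one value judged right twice. The positive judgments cancel its raw score, but the floor keeps it slightly suspicious, so a mistaken positive judgment cannot fully clear it.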

      Type Inference

  • Two different kinds of unit errors can be distinguished: primary errors and dependent errors, which depend on primary errors. Dependent errors will disappear when primary errors are removed. The kind of error is communicated to the end user by a different cell background color.
    • Header and Unit Inference for Spreadsheets Through Spatial Analyses, Robin Abraham and Martin Erwig. IEEE Symp. on Visual Languages and Human-Centric Computing (VL/HCC'04), 165-172, 2004. abstract and download

  • The investigation of the structure of non-well-formed units, which indicate unit errors, has revealed that the unit structure not only provides in some cases detailed information about the nature and source of the unit error, but also carries in some cases enough information to offer corrective actions to the end user. This is a surprising and extremely valuable finding, since it shows that unit inference can perform not only automatic error detection but also, to some degree, automatic error removal.
    • How to Communicate Unit Error Messages in Spreadsheets, Robin Abraham and Martin Erwig. 1st Workshop on End-User Software Engineering (WEUSE'05), 52-56, 2005. abstract and download
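
The distinction between primary and dependent unit errors described in this list can be sketched as a simple classification over cell references. This is a simplification under stated assumptions, not the cited system's method: it assumes that a flagged cell counts as a dependent error whenever it directly refers to another flagged cell. The function name `classify_unit_errors` and the input format are hypothetical.

```python
def classify_unit_errors(error_cells, references):
    """Split cells flagged with unit errors into primary and dependent.

    error_cells: set of cell names flagged with a unit error.
    references:  dict mapping a cell to the cells its formula refers to.

    Assumption: a flagged cell that refers to another flagged cell is
    treated as a dependent error; all other flagged cells are primary.
    """
    primary, dependent = set(), set()
    for cell in error_cells:
        refs = references.get(cell, ())
        if any(ref in error_cells for ref in refs):
            dependent.add(cell)
        else:
            primary.add(cell)
    return primary, dependent


errors = {"B5", "C5", "D5"}
refs = {"B5": ["B2", "B3"], "C5": ["B5"], "D5": ["C5", "A1"]}
primary, dependent = classify_unit_errors(errors, refs)
print(sorted(primary))    # ['B5']
print(sorted(dependent))  # ['C5', 'D5']
```

An interface could then assign each group its own cell background color, so that the user starts with the primary errors and expects the dependent ones to disappear once those are fixed.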

      Specifications

  • Spreadsheet templates can effectively prevent reference, range, and type errors in spreadsheets. Based on a mathematical semantic definition, a corresponding theorem has been proved.
    • Gencel: A Program Generator for Correct Spreadsheets, Martin Erwig and Robin Abraham and Irene Cooperstein and Steve Kollmansberger. Journal of Functional Programming, 2006, to appear. abstract and download
    • Automatic Generation and Maintenance of Correct Spreadsheets, Martin Erwig and Robin Abraham and Irene Cooperstein and Steve Kollmansberger. 27th IEEE Int. Conf. on Software Engineering (ICSE'05), 136-145, 2005. abstract and download
    • ClassSheets: Automatic Generation of Spreadsheet Applications from Object-Oriented Specifications, Gregor Engels and Martin Erwig. 20th IEEE/ACM Int. Conf. on Automated Software Engineering (ASE'05), 124-133, 2005. abstract and download

  • Our automatic engine to dynamically support end-user web requests can make inferences on the expected behavior of a remote site. For example, we are able to distinguish mandatory from optional variables.
    • Sebastian Elbaum, Kalyan-Ram Chilakamarri, Marc Fisher II, and Gregg Rothermel. Web application characterization through directed requests. In Proceedings of the 4th International Workshop on Dynamic Analysis, Shanghai, China, May 2006. pdf

    Improving effectiveness of end-user programmers

      Characterizing current practices of end user programmers

  • We estimate that 80 million people use computers at American workplaces in 2005, and this number will grow to 90 million by 2012. Of these, over 55 million will use spreadsheets and databases, and 13 million will describe themselves as programmers. In comparison, the number of professional programmers will remain between 2 and 3 million.
    • C. Scaffidi, M. Shaw, and B. Myers. Estimating the Numbers of End Users and End User Programmers. VL/HCC'05: Proceedings of the 2005 IEEE Symposium on Visual Languages and Human-Centric Computing, pp. 207-214, 2005. pdf

  • When asked how software has interfered with doing their work, 25% of highly-skilled end users (mostly managers) mentioned problems with data reuse, especially data incompatibility. In comparison, only 15% mentioned software reliability problems.
    • C. Scaffidi, M. Shaw, B. Myers. Games Programs Play: Obstacles to Data Reuse, 2nd Workshop on End User Software Engineering (WEUSE), at the Conference on Human Factors in Computing Systems (CHI), 2006. pdf

  • Factor analysis of users' feature usage within several popular end user programming tools reveals three clusters of features (macro features, linked structure features, and imperative features) such that information workers inclined to use one feature in a cluster were also inclined to use other features in that cluster, even though each cluster spans several applications.
    • C. Scaffidi, A. Ko, B. Myers, M. Shaw. Dimensions Characterizing Programming Feature Usage by Information Workers, accepted to VL/HCC'06. pdf
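
To put the population estimates reported in this list in proportion, the quoted 2012 projections can be turned into ratios. The snippet below uses only the figures stated in the finding; the variable names are illustrative.

```python
# Projected 2012 figures quoted in the finding, in millions of people.
workplace_users = 90
spreadsheet_db_users = 55          # "over 55 million", so a lower bound
self_described_programmers = 13
professional_programmers = (2, 3)  # reported range

share_spreadsheet_db = spreadsheet_db_users / workplace_users
ratio_low = self_described_programmers / professional_programmers[1]
ratio_high = self_described_programmers / professional_programmers[0]

print(f"{share_spreadsheet_db:.0%} of workplace users use spreadsheets or databases")
print(f"{ratio_low:.1f}x to {ratio_high:.1f}x as many self-described programmers as professionals")
```

By these figures, at least about 61% of workplace computer users would use spreadsheets or databases, and self-described programmers would outnumber professional programmers by roughly 4.3 to 6.5 times.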

      Motivators and Barriers in End-User Software Engineering

  • Affective rewards (such as progress bars) that are not actually functional rewards (such as pointing out a fault) have a surprisingly large effect on end users’ effectiveness as debuggers and on their understanding of how behind-the-scenes reasoning relates to what they should do to find the faults in their programs.
    • Ruthruff, J., Phalgune, A., Beckwith, L., Burnett, M., Rewarding Good Behavior: End-User Debugging and Rewards, VL/HCC'04: IEEE Symposium on Visual Languages and Human-Centric Computing, Rome, Italy, September 26-29, 2004. pdf

  • K-12 teachers who engage in end-user programming are facilitated by their domain knowledge, their high motivation to create instructional materials, collaborative programming with fellow teachers, modifying existing programs rather than creating programs from scratch, and using cut-and-paste extensively.
    • Wiedenbeck, S., Facilitators and inhibitors of end-user development by teachers in a school environment, IEEE Symposium on Visual Languages and Human-Centric Computing, Dallas, TX, Sept. 20-24, 2005.

  • In our verbal protocols, end users express that they are somewhat comfortable working with small programs and are more likely to modify an existing program for their own use than to create a new program.
    • Wiedenbeck, S. (2005). Facilitators and inhibitors of end-user development by teachers in a school environment. IEEE Symposium on Visual Languages and Human-Centric Computing, Dallas, TX, Sept. 20-24, 2005, 215-222.

      Improving User Effectiveness at End-User Software Engineering

  • End users with little or no previous programming experience have little concrete basis for judging their efficacy. Misjudgments may lead to a failure to persist in programming when initial self-efficacy judgments are too low.
    • Wiedenbeck, S., LaBelle, D., Kain, V.N.R., Factors Affecting Course Outcomes in Introductory Programming, Psychology of Programming Interest Group (PPIG) 2004, Carlow, Ireland, April 5-7, 2004. pdf

  • Users sometimes make oracle “mistakes” as a result of deliberately using features in the environment in unintended ways, so that they are not actually mistakes from the users’ perspective, but they can lead the system into mistaken reasoning.
    • Phalgune, A., Kissinger, C., Burnett, M., Cook, C., Beckwith, L., Ruthruff, J.R., Garbage In, Garbage Out? An Empirical Look at Oracle Mistakes by End-User Programmers, VL/HCC'05: IEEE Symposium on Visual Languages and Human-Centric Computing, September 2005. pdf

  • Programmers take advantage of very little of the flexibility offered by text-based interaction techniques for editing code, suggesting the feasibility of trading some flexibility offered by text for the increased support offered by structured editors.
    • Ko, A.J., Aung, H.H., Myers, B.A. (2005) Design Requirements for More Flexible Structured Editors from a Study of Programmers' Text Editing, ACM Conference on Human Factors in Computer Systems, Portland OR, April 2-7, 2005, 1557-1560. pdf

  • Over one third of programmers’ time on corrective and perfective maintenance tasks is spent with the interactive mechanics of navigating code, suggesting that new representations for code that avoid this interactive overhead could significantly increase productivity and help with understanding.
    • Ko, A.J., Aung, H.H., Myers, B.A., Eliciting Design Requirements for Maintenance-Oriented IDEs: A Detailed Study of Corrective and Perfective Maintenance Tasks, ICSE'05: International Conference on Software Engineering, St. Louis, MO, USA, May 15-21, 2005. pdf

  • Statistical models of programmers’ interruptibility based on software-based sensors in a programming environment can predict programmers’ interruptibility with significantly higher accuracy than current environments, which always assume programmers to be interruptible.
    • Fogarty, J., Ko, A.J., Aung, H.H., Golden, E., Tang, K.P. and Hudson, S.E. (2005). Examining Task Engagement in Sensor-Based Statistical Models of Human Interruptibility. ACM Conference on Human Factors in Computing Systems, Portland OR, April 2-7, 2005, 331-340. pdf

  • K-12 teachers who engage in end-user programming are inhibited by their low programming knowledge, the very limited time they have to devote to programming, the complexity of programming environments, and the lack of support for programming activities by school administration.
    • Wiedenbeck, S., Facilitators and inhibitors of end-user development by teachers in a school environment, IEEE Symposium on Visual Languages and Human-Centric Computing, Dallas, TX, Sept. 20-24, 2005.

  • K-12 teachers emphasize the need for dependability in the programs they create for students, but do not employ effective procedures for achieving program correctness.
    • Wiedenbeck, S., Facilitators and inhibitors of end-user development by teachers in a school environment, IEEE Symposium on Visual Languages and Human-Centric Computing, Dallas, TX, Sept. 20-24, 2005.

  • Our studies have shown that with the Whyline, expert, novice, and non-programmers were 8 times faster at debugging and completed 40% more tasks.
    • Ko, A.J., Myers, B.A., "Designing the Whyline: A Debugging Interface for Asking Questions about Program Behavior," Proceedings CHI'2004: Human Factors in Computing Systems. Vienna, Austria, April 24-29, 2004. pdf

  • Negotiated-style interruptions are superior to immediate-style interruptions in the domain of end-user debugging, in terms of end users successfully finding and fixing faults, in terms of end users successfully learning about debugging features available to them in the environment, and in terms of end users’ ability to judge when debugging is finished.
    • Robertson, T.J., Prabhakararao, S., Burnett, M., Cook, C., Ruthruff, J.R., Beckwith, L., Phalgune, A., Impact of Interruption Style on End-User Debugging, ACM Conference on Human Factors in Computing Systems, Vienna, Austria, April 2004. pdf

  • Our analyses of debugging behavior have shown that the majority of programmers’ errors and difficulties with debugging are due to false assumptions about what their code will do, or what it has done at runtime. (The key reason for the Whyline’s success is that it reveals these false assumptions to programmers.)
    • Ko, A.J., Myers, B.A., Development and Evaluation of a Model of Programming Errors, IEEE Symposium on End-User and Domain-Specific Programming (EUP'03), October 28-31, 2003, Auckland, New Zealand, p. 7-14. pdf

  • Our study of end users shows that some end users study parts of the program in detail by close reading and simulation with values. They study code in detail when their expectations are violated or their curiosity is aroused, as a result of running test cases in the interface. Other end users do not carefully evaluate the results of their test cases and, as a result, are unlikely to study segments of code in detail.
    • Wiedenbeck, S., Engebretson, A., Comprehension Strategies of End-User Programmers in an Event-Driven Application, ITiCSE'04: The 9th Annual Conference on Innovation and Technology in Computer Science Education, Leeds, U.K., June 28-30, 2004. pdf

  • As in our previous experiments with end-user programmers in spreadsheets, the testing behaviors of experienced professional software developers in an industrial software development environment seem to be significantly modified by interactive visualizations of code coverage. Whether the resulting change in testing behavior is desirable depends heavily on the test adequacy criterion supported by the tool.
    • Lawrance, J., Clarke, S., Burnett, M., Rothermel, G., How Well Do Professional Developers Test with Code Coverage Visualizations?, VL/HCC'05: IEEE Symposium on Visual Languages and Human-Centric Computing, September 2005. pdf

      Helping End-User Programmers Learn

  • Our empirical and observational research has also led to a more complete understanding of the learning barriers in programming systems. In particular, we have found and described 6 types of learning barriers that prevent end-user programmers from adopting a programming system. These barriers have helped analyze existing programming systems for difficulties and have also helped us conceive of several programming system features that we hope will either eliminate learning barriers for novice programmers, or at least lower them.
    • Ko, A.J., Myers, B.A., Aung, H.H., Six Learning Barriers in End-User Programming Systems, VL/HCC'04: IEEE Symposium on Visual Languages and Human-Centric Computing, Rome, Italy, 26-29 September 2004. pdf

  • The modeling studies of end users confirm the parallel importance of good mental models and strong self-efficacy on success in learning to program.
    • Ramalingam, V., LaBelle, D., and Wiedenbeck, S. (2004). Self-Efficacy and Mental Models in Learning to Program. ITiCSE2004: Proceedings of the 9th Annual SIGCSE Conference on Innovation and Technology in Computer Science Education, 171-175. pdf

  • End users taking a first university programming course have low previous experience with programming compared to CS majors taking a first programming course.
    • Wiedenbeck, S., LaBelle, D., Kain, V.N.R., Factors Affecting Course Outcomes in Introductory Programming, Psychology of Programming Interest Group (PPIG) 2004, Carlow, Ireland, April 5-7, 2004. pdf

      Gender Differences' Impact on End-User Software Engineering

  • Females had lower self-efficacy for debugging spreadsheets than males. Furthermore, females’ self-efficacy was a predictor of their final effectiveness in debugging, while males’ self-efficacy did not predict their final effectiveness.
    • Beckwith, L., Kissinger, C., Burnett, M., Wiedenbeck, S., Lawrance, J., Blackwell, A., Cook, C., Tinkering and Gender in End-User Programmers' Debugging, Proceedings CHI'2006: Human Factors in Computing Systems, Montreal, Canada, April 22-23, 2006. To appear. pdf

  • Females were quicker to begin debugging using formula edits than were males, but were slower than males to begin using new debugging features provided in the prototype spreadsheet environment.
    • Beckwith, L., Kissinger, C., Burnett, M., Wiedenbeck, S., Lawrance, J., Blackwell, A., Cook, C., Tinkering and Gender in End-User Programmers' Debugging, Proceedings CHI'2006: Human Factors in Computing Systems, Montreal, Canada, April 22-23, 2006. To appear. pdf

  • Females showed continued, genuine usage of formula edits while debugging but did not show continued, genuine usage of the new features. At the end of the study, females more than males thought that using a new feature might take too long to learn, even though their actual understanding of the new feature was not different from males’.
    • Beckwith, L., Kissinger, C., Burnett, M., Wiedenbeck, S., Lawrance, J., Blackwell, A., Cook, C., Tinkering and Gender in End-User Programmers' Debugging, Proceedings CHI'2006: Human Factors in Computing Systems, Montreal, Canada, April 22-23, 2006. To appear. pdf

  • Prototype changes made to encourage use of new debugging features by individuals with low self-efficacy showed mixed results in an early, small-scale evaluation. Few females or males chose to use a new feature for expressing levels of confidence in their debugging decisions. However, there was some evidence that those who did use the feature were more willing to make decisions on the correctness of cell values than in earlier studies.
    • Beckwith, L., Burnett, M., Gender: An Important Factor in End-User Programming Environments?, VL/HCC'04: IEEE Symposium on Visual Languages and Human-Centric Computing, Rome, Italy, p. 107-114, 26-29 September 2004. pdf

  • In the first round of our informal web developer survey, approximately 25% of the 320 respondents were women. We found that these women were significantly less likely to identify themselves as programmers, even though they were just as likely to be carrying out their web development in a work context (versus volunteer/hobby/civic). Interestingly, there were no associated differences in reported frequency or strategies with respect to testing. Because the sample size is still relatively small (fewer than 80 women overall), we are expanding our sample, attempting to increase its diversity by deliberately over-sampling among women and non-white populations. This will enable a second round of analyses that examines in more detail the web development attitudes and strategies of different subpopulations.
    • Rosson, M.B. 2004. Web development by nonprogrammers. Proceedings of the NSF's ITWF & ITR/EWF Principal Investigator Conference (pp.89-95). Philadelphia, PA, 24-26 October 2004.

Effectiveness issues in end-user web application development

  • The population of web application programmers is diverse, as are their motivations for doing development, ranging from individuals who have no training at all and use pre-fab tools for development, to individuals with long-lived personal interests and experiences in using computers.
    • Rosson, M.B., Ballin, J., & Nash, H. 2004. Everyday programming: Challenges and opportunities for informal web development. Visual Languages and Human-Centric Computing 2004 (pp. 123-130). New York: IEEE. pdf

  • Tools used by web application programmers typically included either FrontPage or Dreamweaver. Interestingly, choice of tool was usually not made on the basis of preference but rather on external factors such as what others in the organization are using, or simple cost.
    • Rosson, M.B., Ballin, J., & Nash, H. 2004. Everyday programming: Challenges and opportunities for informal web development. Visual Languages and Human-Centric Computing 2004 (pp. 123-130). New York: IEEE. pdf

  • Very few people reported actual copy-and-pasting of example code (e.g. scavenged from the internet). In the few cases that were reported, it led to insurmountable problems (the code was too advanced and could not be debugged). However, virtually all of the community webmasters interviewed report that they commonly find examples to use as learning aids or models, both for specific goals like getting a web service to work and more general goals like website organization and visual design.
    • Rosson, M.B., Ballin, J., & Nash, H. 2004. Everyday programming: Challenges and opportunities for informal web development. Visual Languages and Human-Centric Computing 2004 (pp. 123-130). New York: IEEE. pdf

  • Community webmasters for the most part were not sensitized to testing as part of their everyday development. Depending on their work environment, they might test that their pages “look the same” when they move from development tool to uploaded page. A few reported concerns about different user populations but rarely had a systematic strategy for dealing with this. Friends or colleagues were often used as an informal testing process.
    • Rosson, M.B., Ballin, J., & Nash, H. 2004. Everyday programming: Challenges and opportunities for informal web development. Visual Languages and Human-Centric Computing 2004 (pp. 123-130). New York: IEEE. pdf

  • Virtually all web app developers had extensive dependencies with others in their organization with respect to their work. Commonly this was content collection, cited as a major stumbling block in timely and effective development. Other dependencies arose in the need to convert content from one format to another, or reliance on an “expert” for certain sorts of debugging situations.
    • Rosson, M.B., Ballin, J., & Nash, H. 2004. Everyday programming: Challenges and opportunities for informal web development. Visual Languages and Human-Centric Computing 2004 (pp. 123-130). New York: IEEE. pdf

  • Approximately half of the community webmasters interviewed work largely on maintenance activities, that is, updating pages with new information. In several cases, the developer has only participated in this fashion, “inheriting” the website from a more expert colleague. The need to maintain a complex website was seen to provide a number of interesting opportunities for informal learning, such as tracking down and developing error-handling procedures for an interactive form developed by someone else.
    • Rosson, M.B., Ballin, J., & Nash, H. 2004. Everyday programming: Challenges and opportunities for informal web development. Visual Languages and Human-Centric Computing 2004 (pp. 123-130). New York: IEEE. pdf

  • Our analysis of the first round of informal web developer survey data (320 respondents) indicated that the population was approximately evenly divided between individuals who self-identify as programmers and those who do not. For the most part, the nonprogrammers report needs similar to those of the programmers, but their needs are not yet met as well by current tools (e.g., relative to programmers, they report much bigger differences between the perceived value of a feature like user authentication and success in implementing it). Interestingly, the nonprogrammers report similar concerns with respect to testing their web sites (suggesting they have correctness as a goal) but a less systematic development process.
    • Rosson, M.B., Ballin, J., & Rode, J. 2005. Who, what and why? A survey of informal and professional web developers. Proceedings of Visual Languages and Human-Centric Computing 2005 (pp. 199-206). New York: IEEE. pdf

If you have empirical findings related to end-user software development, feel free to send them in. This page is open to findings from anyone. Send us your finding and a link to your source at eusesconsortium@eecs.oregonstate.edu.

 

©2006 EUSES
Last Updated on June 1st, 2006
Site Designed and Developed by Htet Htet Aung
Oregon State University University of Nebraska Lincoln University of Cambridge Penn State University HCII of Carnegie Mellon University Drexel University