mirror of
https://github.com/amyjko/cooperative-software-development
synced 2024-12-25 21:58:15 +01:00
Fixed #52, incorporating issues of racism, sexism, and ableism through the book.
This commit is contained in:
parent
420ba055e2
commit
df24a62146
14 changed files with 114 additions and 24 deletions
|
@ -51,7 +51,15 @@
|
|||
|
||||
<p>Even with carefully selected architectures, systems can still be difficult to put together, leading to <b>architectural mismatch</b> (<a href="#garlan">Garlan et al. 1995</a>). When mismatch occurs, connecting two styles can require dramatic amounts of code, imposing significant risk of defects and cost of maintenance. One common example of mismatch occurs with the ubiquitous use of database schemas with client/server web applications. A single change in a database schema can often result in dramatic changes in an application, as every line of code that uses that part of the schema either directly or indirectly must be updated (<a href="#qiu">Qiu et al. 2013</a>). This kind of mismatch occurs because the component that manages data (the database) and the component that renders data (the user interface) are highly "coupled" through the database schema: the user interface needs to know <em>a lot</em> about the data, its meaning, and its structure in order to render it meaningfully.</p>
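<p>To make this coupling concrete, here is a minimal Python sketch. It is not taken from any system discussed in this chapter: the column names (<code>first_name</code>, <code>last_name</code>, <code>full_name</code>) and the functions <code>render_coupled</code>, <code>person_from_row</code>, and <code>render_decoupled</code> are hypothetical. It contrasts rendering code that names schema columns directly with rendering code that goes through a single mapping function.</p>

```python
from dataclasses import dataclass

# Hypothetical query result; the column names are assumptions for illustration.
old_row = {"first_name": "Ada", "last_name": "Lovelace"}


def render_coupled(row: dict) -> str:
    # Coupled: the user interface names schema columns directly, so every
    # function written this way must change when the schema changes.
    return f"<h1>{row['first_name']} {row['last_name']}</h1>"


@dataclass
class Person:
    # What the user interface needs, independent of how it is stored.
    display_name: str


def person_from_row(row: dict) -> Person:
    # The one place that knows column names. It accepts both the old
    # two-column schema and a hypothetical new single-column schema.
    if "full_name" in row:
        return Person(display_name=row["full_name"])
    return Person(display_name=f"{row['first_name']} {row['last_name']}")


def render_decoupled(person: Person) -> str:
    # Depends only on Person, not on the database schema.
    return f"<h1>{person.display_name}</h1>"


# A hypothetical schema change that merges the two name columns.
new_row = {"full_name": "Ada Lovelace"}
```

<p>Under these assumptions, <code>render_coupled(new_row)</code> fails with a <code>KeyError</code>, while <code>render_decoupled(person_from_row(new_row))</code> keeps working because only the mapping function had to learn about the new column.</p>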
|
||||
|
||||
<p>The most common approach to dealing with both architectural mismatch and the changing of requirements over time is <b>refactoring</b>, which means changing the <em>architecture</em> of an implementation without changing its behavior. Refactoring is something most developers do as part of changing a system (<a href="#murphyhill">Murphy-Hill et al 2009</a>, <a href="#silva">Silva et al. 2016</a>). Refactoring code to eliminate mismatch and technical debt can simplify change in the future, saving time (<a href="#ng">Ng et al. 2006</a>) and prevent future defects (<a href="#kim">Kim et al. 2012</a>).
|
||||
<p>
|
||||
The most common approach to dealing with both architectural mismatch and the changing of requirements over time is <b>refactoring</b>, which means changing the <em>architecture</em> of an implementation without changing its behavior.
|
||||
Refactoring is something most developers do as part of changing a system (<a href="#murphyhill">Murphy-Hill et al. 2009</a>, <a href="#silva">Silva et al. 2016</a>).
|
||||
Refactoring code to eliminate mismatch and technical debt can simplify change in the future, saving time (<a href="#ng">Ng et al. 2006</a>) and preventing future defects (<a href="#kim">Kim et al. 2012</a>).
|
||||
However, because refactoring remains challenging, the difficulty of changing an architecture is often used as a rationale for rejecting demands for change from users.
|
||||
For example, Google does not allow one to change their Gmail address, which greatly harms people who have changed their name (such as this author when she came out as a trans woman), forcing them to either live with an address that includes their old name, or abandon their Google account, with no ability to transfer documents or settings.
|
||||
The rationale for this has nothing to do with policy and everything to do with the fact that the original architecture of Gmail treats the email address as a stable, unique identifier for an account.
|
||||
Changing this basic assumption throughout Gmail's implementation would be an immense refactoring task.
|
||||
</p>
|
||||
|
||||
<p>Research on the actual activity of software architecture is somewhat sparse. One of the more recent syntheses of this work is Petre et al.'s book, <em>Software Design Decoded</em> (<a href="#petre2">Petre et al. 2016</a>), which distills many of the practices and skills of software design into a set of succinct ideas. For example, the book states, "<em>Every design problem has multiple, if not infinite, ways of solving it. Experts strongly prefer simpler solutions over complex ones, for they know that such solutions are easier to understand and change in the future.</em>" And yet, in practice, studies of how projects use APIs often show that developers do the exact opposite, building projects with dependencies on large numbers of sometimes trivial APIs. Some behavior suggests that while software <em>architects</em> like simplicity of implementation, software <em>developers</em> are often choosing whatever is easiest to build, rather than whatever is least risky to maintain over time (<a href="#abdalkareem">Abdalkareem 2017</a>).</p>
|
||||
|
||||
|
|
|
@ -57,8 +57,8 @@
|
|||
|
||||
<p>Communication is not always effective. In fact, there are many kinds of communication that are highly problematic in software engineering teams. For example, Perlow (<a href="#perlow">1999</a>) conducted an <a href="https://en.wikipedia.org/wiki/Ethnography" target="_blank">ethnography</a> of one team and found a highly dysfunctional use of interruptions in which the most expert members of a team were constantly interrupted to “fight fires” (immediately address critical problems) in other parts of the organization, and then the organization rewarded them for their heroics. This not only made the most expert engineers less productive, but it also disincentivized the rest of the organization from finding effective ways of <em>preventing</em> the disasters from occurring in the first place. Not all interruptions are bad, and they can increase productivity, but they do increase stress (<a href="#mark">Mark et al. 2008</a>).</p>
|
||||
|
||||
<p>Communication isn't just about transmitting information; it's also about relationships and identity. For example, the dominant culture of many software engineering work environments—and even the <em>perceived</em> culture—is one that can deter many people from even pursuing careers in computer science. Modern work environments are still dominated by men, who speak loudly, out of turn, and disrespectfully, with <a href="https://www.susanjfowler.com/blog/2017/2/19/reflecting-on-one-very-strange-year-at-uber">some even bordering on sexual harassment</a>. Similarly, software developers often have to work with people in other domains such as artists, content developers, data scientists, design researchers, designers, electrical engineers, mechanical engineers, product planners, program managers, and service engineers. One study found that developers' cross-disciplinary collaborations with people in these other domains required open-mindedness about the input of others, proactively informing everyone about code-related constraints, and ultimately seeing the broader picture of how pieces from different disciplines fit together; when developers didn't do these things, collaborations failed, and therefore projects failed (<a href="#li">Li et al. 2017</a>). These are not the conditions for trusting, effective communication.</p>
|
||||
|
||||
<p>Communication isn't just about transmitting information; it's also about relationships and identity. For example, the dominant culture of many software engineering work environments, and even the <em>perceived</em> culture, is one that can deter many people from even pursuing careers in computer science. Modern work environments are still dominated by men, who speak loudly, out of turn, and disrespectfully, with <a href="https://www.susanjfowler.com/blog/2017/2/19/reflecting-on-one-very-strange-year-at-uber">some even bordering on sexual harassment</a>. Computer science as a discipline, and the software industry that it shapes, has only just begun to consider the urgent need for <em>cultural competence</em> (the ability of individuals and organizations to work effectively when their employees' thoughts, communications, actions, customs, beliefs, values, religions, and social groups vary) (<a href="#washington">Washington, 2020</a>). Similarly, software developers often have to work with people in other domains such as artists, content developers, data scientists, design researchers, designers, electrical engineers, mechanical engineers, product planners, program managers, and service engineers. One study found that developers' cross-disciplinary collaborations with people in these other domains required open-mindedness about the input of others, proactively informing everyone about code-related constraints, and ultimately seeing the broader picture of how pieces from different disciplines fit together; when developers didn't do these things, collaborations failed, and therefore projects failed (<a href="#li">Li et al. 2017</a>). These are not the conditions for trusting, effective communication.</p>
|
||||
|
||||
<p>
|
||||
When communication is effective, it still takes time.
|
||||
One of the key strategies for reducing the amount of communication necessary is the use of <em>knowledge sharing</em> tools, a term that broadly refers to any information system that stores facts that developers would normally have to retrieve from a person.
|
||||
|
@ -101,6 +101,7 @@
|
|||
<p id="treudestory1">Christoph Treude and Margaret-Anne Storey. 2011. <a href="http://dx.doi.org/10.1145/2025113.2025129" target="_blank">Effective communication of software development knowledge through community portals</a>. In Proceedings of the 19th ACM SIGSOFT symposium and the 13th European conference on Foundations of software engineering (ESEC/FSE '11). ACM, New York, NY, USA, 91-101.</p>
|
||||
<p>Christoph Treude and Margaret-Anne Storey. 2009. <a href="http://dx.doi.org/10.1109/ICSE.2009.5070504" target="_blank">How tagging helps bridge the gap between social and technical aspects in software development</a>. In Proceedings of the 31st International Conference on Software Engineering (ICSE '09). IEEE Computer Society, Washington, DC, USA, 12-22.</p>
|
||||
<p>Keiji Uemura and Miki Ohori. 1984. <a href="http://dl.acm.org/citation.cfm?id=801955" target="_blank">A cooperative approach to software development by application engineers and software engineers</a>. In Proceedings of the 7th international conference on Software engineering (ICSE '84). IEEE Press, Piscataway, NJ, USA, 86-96.</p>
|
||||
<p id="washington">Alicia Nicki Washington. 2020. <a href="https://doi.org/10.1145/3328778.3366792">When Twice as Good Isn't Enough: The Case for Cultural Competence in Computing</a>. Proceedings of the 51st ACM Technical Symposium on Computer Science Education. 2020.</p>
|
||||
<p id="zhou">Minghui Zhou and Audris Mockus. 2011. <a href="https://doi.org/10.1145/1985793.1985831" target="_blank">Does the initial environment impact the future of developers?</a> In Proceedings of the 33rd International Conference on Software Engineering (ICSE '11). ACM, New York, NY, USA, 271-280.</p>
|
||||
|
||||
</small>
|
||||
|
|
|
@ -110,10 +110,28 @@
|
|||
|
||||
<p>If you think about the diversity of questions in this list, you can see why program comprehension requires expertise. You not only need to understand programming languages quite well, but you also need to have strategies for answering all of the questions above (and more) quickly, effectively, and accurately.</p>
|
||||
|
||||
<p>So how do developers go about answering these questions? Studies comparing experts and novices show that experts use prior knowledge that they have about architecture, design patterns, and the problem domain a program is built for to know what questions to ask and how to answer them, whereas novices use surface features of code, which leads them to spend considerable time reading code that is irrelevant to a question (<a href="#vonmay">von Mayrhauser & Vans 1994</a>, <a href="latoza2">LaToza et al. 2007</a>). Reading and comprehending source code is fundamentally different from those of reading and comprehending natural language (<a href="#binkley">Binkley et al. 2013</a>); what experts are doing is ultimately reasoning about <strong>dependencies</strong> between code (<a href="#weiser">Weiser 1981</a>). Dependencies include things like <strong>data dependencies</strong> (where a variable is used to compute something, what modifies a data structure, how data flows through a program, etc.) and <strong>control dependencies</strong> (which components call which functions, which events can trigger a function to be called, how a function is reached, etc.). All of the questions above fundamentally get at different types of data and control dependencies. In fact, theories of how developers navigate code by following these dependencies are highly predictive of what information a developer will seek next (<a href="#fleming">Fleming et al. 2013</a>), suggesting that expert behavior is highly procedural. This work, and work explicitly investigating the role of identifier names (<a href="#lawrie">Lawrie et al. 2006</a>), finds that names are actually critical to facilitating higher level comprehension of program behavior.</p>
|
||||
<p>
|
||||
So how do developers go about answering these questions?
|
||||
Studies comparing experts and novices show that experts use prior knowledge that they have about architecture, design patterns, and the problem domain a program is built for to know what questions to ask and how to answer them, whereas novices use surface features of code, which leads them to spend considerable time reading code that is irrelevant to a question (<a href="#vonmay">von Mayrhauser & Vans 1994</a>, <a href="#latoza2">LaToza et al. 2007</a>).
|
||||
Reading and comprehending source code is fundamentally different from reading and comprehending natural language (<a href="#binkley">Binkley et al. 2013</a>); what experts are doing is ultimately reasoning about <strong>dependencies</strong> between code (<a href="#weiser">Weiser 1981</a>).
|
||||
Dependencies include things like <strong>data dependencies</strong> (where a variable is used to compute something, what modifies a data structure, how data flows through a program, etc.) and <strong>control dependencies</strong> (which components call which functions, which events can trigger a function to be called, how a function is reached, etc.).
|
||||
All of the questions above fundamentally get at different types of data and control dependencies.
|
||||
In fact, theories of how developers navigate code by following these dependencies are highly predictive of what information a developer will seek next (<a href="#fleming">Fleming et al. 2013</a>), suggesting that expert behavior is highly procedural.
|
||||
This work, and work explicitly investigating the role of identifier names (<a href="#lawrie">Lawrie et al. 2006</a>), finds that names are actually critical to facilitating higher level comprehension of program behavior.
|
||||
</p>
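<p>As a small illustration of the two kinds of dependencies described above, consider the following Python sketch. The functions <code>apply_discount</code> and <code>checkout</code> are hypothetical and exist only to give the comments something to point at.</p>

```python
def apply_discount(price: float, rate: float) -> float:
    # Data dependency: `discounted` is computed from `price` and `rate`,
    # so a wrong value here leads back to whoever supplied those two values.
    discounted = price * (1 - rate)
    return discounted


def checkout(prices: list, is_member: bool) -> float:
    total = 0.0
    for price in prices:
        # Control dependency: whether apply_discount is reached at all
        # depends on the value of `is_member`.
        if is_member:
            # Data dependency: `total` flows from apply_discount's return value.
            total += apply_discount(price, 0.10)
        else:
            total += price
    return total
```

<p>A question such as "why is this total too low?" becomes a question about data dependencies (which values flowed into <code>total</code>), while "why was the discount applied?" is a question about control dependencies (which condition allowed <code>apply_discount</code> to be called).</p>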
|
||||
|
||||
<p>
|
||||
Of course, program comprehension is not an inherently individual process either.
|
||||
Expert developers are resourceful, and frequently ask others for explanations of program behavior.
|
||||
Some of this might happen between coworkers, where someone seeking insight asks other engineers for summaries of program behavior, to accelerate their learning (<a href="#koinfo">Ko et al. 2007</a>).
|
||||
Others might rely on public forums, such as Stack Overflow, for explanations of API behavior (<a href="#mamykina">Mamykina et al. 2011</a>).
|
||||
These social help-seeking strategies are strongly mediated by a developer's willingness to express to more expert teammates that they need help.
|
||||
Some research, for example, has found that junior developers are reluctant to ask for help out of fear of looking incompetent, even when everyone on a team is willing to offer help and their manager prefers that the developer prioritize productivity over fear of stigma (<a href="#begel">Begel and Simon, 2008</a>).
|
||||
These findings suggest the critical importance of teams ensuring that newcomers view them as psychologically safe places, where vulnerable actions like expressing a need for help will not be punished, ridiculed, or shamed, but rather validated, celebrated, and encouraged.
|
||||
</p>
|
||||
|
||||
<p>
|
||||
While much of program comprehension is skill, some of it is determined by design.
|
||||
While much of program comprehension is individual and social skill, some aspects of program comprehension are determined by the design of programming languages.
|
||||
For example, some programming languages result in programs that are more comprehensible.
|
||||
One framework called the <em>Cognitive Dimensions of Notations</em> (<a href="#green">Green 1989</a>) lays out some of the tradeoffs in programming language design that result in these differences in comprehensibility.
|
||||
For example, one of the dimensions in the framework is <strong>consistency</strong>, which refers to how much of a notation can be <em>guessed</em> based on an initial understanding of a language.
|
||||
|
@ -148,6 +166,7 @@
|
|||
<small>
|
||||
|
||||
<p id="baecker">R. Baecker. 1988. <a href="http://ieeexplore.ieee.org/abstract/document/93716/" target="_blank">Enhancing program readability and comprehensibility with tools for program visualization</a>. In Proceedings of the 10th international conference on Software engineering (ICSE '88). IEEE Computer Society Press, Los Alamitos, CA, USA, 356-366.</p>
|
||||
<p id="begel">Begel, A., & Simon, B. (2008, September). <a href="https://doi.org/10.1145/1404520.1404522">Novice software developers, all over again</a>. In Proceedings of the fourth international workshop on computing education research (pp. 3-14).</p>
|
||||
<p id="bhattacharya">Pamela Bhattacharya and Iulian Neamtiu. 2011. <a href="https://doi.org/10.1145/1985793.1985817" target="_blank">Assessing programming language impact on development and maintenance: a study on C and C++</a>. In Proceedings of the 33rd International Conference on Software Engineering (ICSE '11). ACM, New York, NY, USA, 171-180.</p>
|
||||
<p id="binkley">Binkley, D., Davis, M., Lawrie, D., Maletic, J. I., Morrell, C., & Sharif, B. (2013). <a href="https://link.springer.com/article/10.1007/s10664-012-9201-4" target="_blank">The impact of identifier style on effort and comprehension</a>. Empirical Software Engineering, 18(2), 219-276.</p>
|
||||
<p id="callau">Callaú, O., Robbes, R., Tanter, É., & Röthlisberger, D. (2013). <a href="https://doi.org/10.1145/1985441.1985448" target="_blank">How (and why) developers use the dynamic features of programming languages: the case of Smalltalk</a>. Empirical Software Engineering, 18(6), 1156-1194.
|
||||
|
@ -155,11 +174,13 @@
|
|||
<p id="endrikat">Stefan Endrikat, Stefan Hanenberg, Romain Robbes, and Andreas Stefik. 2014. <a href="https://doi.org/10.1145/2568225.2568299" target="_blank">How do API documentation and static typing affect API usability?</a> In Proceedings of the 36th International Conference on Software Engineering (ICSE 2014). ACM, New York, NY, USA, 632-642.</p> <p id="green">Green, T. R. (1989). Cognitive dimensions of notations. People and computers V, 443-460.</p>
|
||||
<p id="fleming">Fleming, S. D., Scaffidi, C., Piorkowski, D., Burnett, M., Bellamy, R., Lawrance, J., & Kwan, I. (2013). <a href="https://doi.org/10.1145/2430545.2430551" target="_blank">An information foraging theory perspective on tools for debugging, refactoring, and reuse tasks</a>. ACM Transactions on Software Engineering and Methodology (TOSEM), 22(2), 14.</p>
|
||||
<p id="hanenberg">Stefan Hanenberg, Sebastian Kleinschmager, Romain Robbes, Éric Tanter, Andreas Stefik. <a href="https://doi.org/10.1007/s10664-013-9289-1" target="_blank">An empirical study on the impact of static typing on software maintainability</a>. Empirical Software Engineering. 2013.</p>
|
||||
<p id="ko">Ko, A. J., & Myers, B. A. (2009, April). <a href="https://doi.org/10.1145/1518701.1518942" target="_blank">Finding causes of program output with the Java Whyline</a>. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (pp. 1569-1578).</p>
|
||||
<p id="koinfo">Amy J. Ko, Rob DeLine, and Gina Venolia (2007). <a href="https://doi.org/10.1109/ICSE.2007.45">Information needs in collocated software development teams</a>. In 29th International Conference on Software Engineering, 344-353.</p>
|
||||
<p id="ko">Amy J. Ko and Brad A. Myers (2009, April). <a href="https://doi.org/10.1145/1518701.1518942" target="_blank">Finding causes of program output with the Java Whyline</a>. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (pp. 1569-1578).</p>
|
||||
<p id="latoza">Thomas D. LaToza and Brad A. Myers. 2010. <a href="http://dx.doi.org/10.1145/1806799.1806829" target="_blank">Developers ask reachability questions</a>. In Proceedings of the 32nd ACM/IEEE International Conference on Software Engineering - Volume 1 (ICSE '10), Vol. 1. ACM, New York, NY, USA, 185-194.</p>
|
||||
<p id="latoza2">Thomas D. LaToza, David Garlan, James D. Herbsleb, and Brad A. Myers. 2007. <a href="http://dx.doi.org/10.1145/1287624.1287675" target="_blank">Program comprehension as fact finding</a>. In Proceedings of the the 6th joint meeting of the European software engineering conference and the ACM SIGSOFT symposium on The foundations of software engineering (ESEC-FSE '07). ACM, New York, NY, USA, 361-370.</p>
|
||||
<p id="lawrie">Lawrie, D., Morrell, C., Feild, H., & Binkley, D. (2006, June). What's in a name? a study of identifiers. IEEE International Conference on Program Comprehension, 3-12.</p>
|
||||
<p id="maalej">Walid Maalej, Rebecca Tiarks, Tobias Roehm, and Rainer Koschke. 2014. <a href="http://dx.doi.org/10.1145/2622669" target="_blank">On the Comprehension of Program Comprehension</a>. ACM Transactions on Software Engineering and Methodology. 23, 4, Article 31 (September 2014), 37 pages.</p>
|
||||
<p id="mamykina">Mamykina, L., Manoim, B., Mittal, M., Hripcsak, G., & Hartmann, B. (2011, May). <a href="https://doi.org/10.1145/1978942.1979366">Design lessons from the fastest q&a site in the west</a>. In Proceedings of the SIGCHI conference on Human factors in computing systems, 2857-2866.</p>
|
||||
<p id="vonmay">A. von Mayrhauser and A. M. Vans. 1994. <a href="http://dl.acm.org/citation.cfm?id=257741" target="_blank">Comprehension processes during large scale maintenance</a>. In Proceedings of the 16th international conference on Software engineering (ICSE '94). IEEE Computer Society Press, Los Alamitos, CA, USA, 39-48.</p>
|
||||
<p id="ray">Baishakhi Ray, Daryl Posnett, Vladimir Filkov, and Premkumar Devanbu. 2014. <a href="http://dx.doi.org/10.1145/2635868.2635922" target="_blank">A large scale study of programming languages and code quality in GitHub</a>. In Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering (FSE 2014). ACM, New York, NY, USA, 155-165.</p>
|
||||
<p id="roehm">Tobias Roehm, Rebecca Tiarks, Rainer Koschke, and Walid Maalej. 2012. <a href="http://dl.acm.org/citation.cfm?id=2337254" target="_blank">How do professional developers comprehend software?</a> In Proceedings of the 34th International Conference on Software Engineering (ICSE '12). IEEE Press, Piscataway, NJ, USA, 255-265.</p>
|
||||
|
|
|
@ -110,6 +110,8 @@
|
|||
This can involve analyzing dependencies that are affected by a bug fix, re-running manual and automated tests, and perhaps even running user tests to ensure that the way in which you fixed a bug does not inadvertently introduce problems with usability or workflow.
|
||||
Debugging is therefore like surgery: slow, methodical, purposeful, and risk-averse.
|
||||
</p>
|
||||
|
||||
<center class="lead"><a href="index.html">Back to table of contents</a></center>
|
||||
|
||||
<h2>Further reading</h2>
|
||||
|
||||
|
|
|
@ -31,13 +31,13 @@
|
|||
Programs change.
|
||||
You find bugs, you fix them.
|
||||
You discover a new requirement, you add a feature.
|
||||
A requirement changes because the world changes, you revise a feature.
|
||||
The simple fact about programs are that they're rarely stable, but rather constantly changing, living documents that shift as much as the world around them shift.
|
||||
A requirement changes because users demand it, you revise a feature.
|
||||
The simple fact about programs is that they're rarely stable, but rather constantly changing, living artifacts that shift as much as our social worlds shift.
|
||||
</p>
|
||||
|
||||
<p>Nowhere is this constant evolution more apparent than in our daily encounters with software updates. The apps on our phones are constantly being updated to improve our experiences, while the web sites we visit potentially change every time we visit them, without us noticing. These different models have different notions of who controls changes to user experience: should software companies control when your experience changes or should you? And with systems with significant backend dependencies, is it even possible to give users control over when things change?</p>
|
||||
|
||||
<p>To manage change, developers use all kinds of tools and practices.</p>
|
||||
<p>To manage change, developers use many kinds of tools and practices.</p>
|
||||
|
||||
<p>One of the most common ways of managing change is to <strong>refactor</strong> code. Refactoring helps developers modify the <em>architecture</em> of a program while keeping its behavior the same, enabling them to implement or modify functionality more easily. For example, one of the most common and simple refactorings is to rename a variable (renaming its definition and all of its uses). This doesn't change the architecture of a program at all, but does improve its readability. Other refactors can be more complex. For example, consider adding a new parameter to a function: all calls to that function need to pass that new parameter, which means you need to go through each call and decide on a value to send from that call site. Studies of refactoring in practice have found that refactorings can be big and small, that they don't always preserve the behavior of a program, and that developers perceive them as involving substantial costs and risks (<a href="#kim">Kim et al. 2012</a>).</p>
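<p>The add-a-parameter refactoring described above can be sketched in Python as follows. The function <code>format_price</code> and its two callers are hypothetical names chosen for illustration, and the "before" version appears only as a comment for comparison.</p>

```python
# Before the refactoring, the function and its call site looked like this:
#
#     def format_price(amount):
#         return f"${amount:.2f}"
#
#     receipt = "Total: " + format_price(total)


def format_price(amount: float, symbol: str) -> str:
    # After: the currency symbol is a new required parameter.
    return f"{symbol}{amount:.2f}"


def us_receipt(total: float) -> str:
    # Existing call site: passing "$" keeps its output exactly as before.
    return "Total: " + format_price(total, "$")


def eu_receipt(total: float) -> str:
    # New call site that motivated the parameter in the first place.
    return "Total: " + format_price(total, "EUR ")
```

<p>Each call site has to decide what to pass. Here the existing caller passes <code>"$"</code> so that its behavior is preserved, which is the property that distinguishes a refactoring from an ordinary change.</p>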
|
||||
|
||||
|
|
|
@ -78,6 +78,7 @@
|
|||
<li>When people are working in parallel, how do you prevent them from clobbering each other's work?</li>
|
||||
<li>If software engineering is about more than coding, what skills does a good coder need to have?</li>
|
||||
<li>What kinds of tools and languages can accelerate a programmer's work and help them prevent mistakes?</li>
|
||||
<li>How can projects not lose sight of the immense complexity of human needs, values, ethics, and policy that interact with engineering decisions?</li>
|
||||
</ul>
|
||||
|
||||
<p>
|
||||
|
@ -98,8 +99,9 @@
|
|||
<p>
|
||||
Other social aspects of software engineering have received considerably less treatment.
|
||||
For example, despite the central role of women in programming the first digital computers, and the central role of women like Margaret Hamilton and Grace Hopper leading the formation of software engineering as a field in research and government, these histories are often forgotten, erased, and overshadowed by the gradual shift from software development being a field dominated by women to a field dominated by men.
|
||||
Many texts are beginning to document the central role of sexism that was at the heart of causing this culture shift (e.g., <a href="">Abbate 2012</a>).
|
||||
These histories show that, just like any other human activity, there are strong cultural forces that shape how people engineer software together.
|
||||
Many texts are beginning to document the central role of sexism that was at the heart of causing this culture shift (e.g., <a href="#abbate">Abbate 2012</a>).
|
||||
Similarly, software engineering research and practice have largely ignored the way that software can encode, amplify, and reinforce discrimination through data, algorithms, and software architectures (e.g., <a href="#benjamin">Benjamin, 2019</a>).
|
||||
These histories show that, just like any other human activity, there are strong cultural forces that shape how people engineer software together, what they engineer, and what effect that has on society.
|
||||
</p>
|
||||
|
||||
<p>
|
||||
|
@ -128,7 +130,8 @@
|
|||
|
||||
<h2>Further reading</h2>
|
||||
|
||||
<p>Abbate, Janet (2012). <a href="https://mitpress.mit.edu/books/recoding-gender">Recoding Gender: Women's Changing Participation in Computing</a>. The MIT Press.</a>
|
||||
<p id="abbate">Abbate, Janet (2012). <a href="https://mitpress.mit.edu/books/recoding-gender">Recoding Gender: Women's Changing Participation in Computing</a>. The MIT Press.</p>
|
||||
<p id="benjamin">Benjamin, R. (2019). Race after Technology: Abolitionist Tools for the New Jim Code. Social Forces.</p>
|
||||
<p>Brooks Jr, F. P. (1995). <a href="https://books.google.com/books?id=Yq35BY5Fk3gC" target="_blank">The Mythical Man-Month (anniversary ed.)</a>. Chicago</p>
|
||||
<p>Gleick, James (2011). <a href="https://books.google.com/books?id=617JSFW0D2kC" target="_blank">The Information: A History, A Theory, A Flood</a>. Pantheon Books.</p>
|
||||
<p>Grudin, Jonathan (2017). <a href="https://books.google.com/books?id=Wc3hDQAAQBAJ" target="_blank">From Tool to Partner: The Evolution of Human-Computer Interaction</a>.</p>
|
||||
|
|
|
@ -72,7 +72,13 @@
|
|||
<li>"How well does test coverage correspond to actual code usage by our customers?"</li>
|
||||
</ul>
|
||||
|
||||
<p>The most mature data science roles in software engineering teams even have multiple distinct roles, including <em>Insight Providers</em>, who gather and analyze data to inform decisions, <em>Modeling Specialists</em>, who use their machine learning expertise to build predictive models, <em>Platform Builders</em>, who create the infrastructure necessary for gathering data (<a href="#kim">Kim et al. 2016</a>). Of course, smaller organizations may have individuals who take on all of these roles.</p>
|
||||
<p>
|
||||
The most mature data science practices in software engineering teams even have multiple distinct roles, including <em>Insight Providers</em>, who gather and analyze data to inform decisions, <em>Modeling Specialists</em>, who use their machine learning expertise to build predictive models, and <em>Platform Builders</em>, who create the infrastructure necessary for gathering data (<a href="#kim">Kim et al. 2016</a>).
|
||||
Of course, smaller organizations may have individuals who take on all of these roles.
|
||||
Moreover, not all ways of discovering missing requirements are data science roles.
|
||||
Many companies, for example, have customer experience specialists and community managers, who are less interested in data about experiences and more interested in directly communicating with customers about their experiences.
|
||||
These relational forms of monitoring can be much more effective at revealing software quality issues that aren't as easily observed, such as issues of racial or sexual bias in software or other forms of structural injustices built into the architecture of software.
|
||||
</p>
|
||||
|
||||
<p>All of this effort to capture and maintain user feedback can be messy to analyze because it usually comes in the form of natural language text. Services like <a href="http://answerdash.com">AnswerDash</a> (a company I co-founded) structure this data by organizing requests around frequently asked questions. AnswerDash imposes a little widget on every page in a web application, making it easy for users to submit questions and find answers to previously asked questions. This generates data about the features and use cases that are leading to the most confusion, which types of users are having this confusion, and where in an application the confusion is happening most frequently. This product was based on several years of research in my lab (<a href="#chilana">Chilana et al. 2013</a>).</p>
|
||||
|
||||
|
|
|
@ -54,9 +54,11 @@
|
|||
|
||||
<ul>
|
||||
<li><strong>Engineering managers</strong> exist in all roles when teams get to a certain size, helping to move information between higher and lower parts of an organization. Even <em>engineering</em> managers are primarily focused on organizing and prioritizing work, and not doing engineering (<a href="#kalliamvakou">Kalliamvakou et al. 2018</a>). Much of their time is also spent ensuring every engineer has what they need to be productive, while also managing coordination and interpersonal conflict between engineers.</li>
|
||||
<li><b>Data scientists</b>, although a new role, typically <em>facilitate</em> decision making on the part of any of the roles above <a href="#begel">(Begel & Zimmermann 2014)</a>. They might help engineers find bugs, marketers analyze data, track sales targets, mine support data, or inform design decisions. They're experts at using data to accelerate and improve the decisions made by the roles above.</li>
|
||||
<li><b>Researchers</b>, also called user researchers, also help people in a software organization make decisions, but usually <em>product</em> decisions, helping marketers, sales, and product managers decide what products to make and who would want them. In many cases, they can complement the work of data scientists, <a href="https://www.linkedin.com/pulse/ux-research-analytics-yann-riche?trk=prof-post" target="_blank">providing qualitative work to triangulate quantitative data</a>.</li>
|
||||
<li><strong>Data scientists</strong>, although a new role, typically <em>facilitate</em> decision making on the part of any of the roles above <a href="#begel">(Begel & Zimmermann 2014)</a>. They might help engineers find bugs, marketers analyze data, track sales targets, mine support data, or inform design decisions. They're experts at using data to accelerate and improve the decisions made by the roles above.</li>
|
||||
<li><strong>Researchers</strong>, also called user researchers, also help people in a software organization make decisions, but usually <em>product</em> decisions, helping marketers, sales, and product managers decide what products to make and who would want them. In many cases, they can complement the work of data scientists, <a href="https://www.linkedin.com/pulse/ux-research-analytics-yann-riche?trk=prof-post" target="_blank">providing qualitative work to triangulate quantitative data</a>.</li>
|
||||
<li><strong>Ethics and policy specialists</strong>, who might come with backgrounds in law, policy, or social science, might shape terms of service, software licenses, algorithmic bias audits, privacy policy compliance, and processes for engaging with stakeholders affected by the software being engineered. Any company that works with data, especially those that work with data at large scales or in contexts with great potential for harm, hate, and abuse, needs significant expertise to anticipate and prevent harm from engineering and design decisions.</li>
|
||||
</ul>
|
||||
|
||||
<p>Every decision made in a software team is under uncertainty, and so another important concept in organizations is <strong>risk</strong> <a href="#boehm">(Boehm 1991)</a>. It's rarely possible to predict the future, and so organizations must take risks. Much of an organization's function is to mitigate the consequences of risks. Data scientists and researchers mitigate risk by increasing confidence in an organization's understanding of the market and its consumers. Engineers manage risk by trying to avoid defects. Of course, as many popular outlets on software engineering have begun to discover, when software fails, it usually "did exactly what it was told to do. The reason it failed is that it was told to do the wrong thing." (<a href="https://www.theatlantic.com/technology/archive/2017/09/saving-the-world-from-code/540393/">Somers 2017</a>).</p>
|
||||
|
||||
<p>Open source communities are organizations too. The core activities of design, engineering, and support still exist in these, but how much a community is engaged in marketing and sales depends entirely on the purpose of the community. Big, established open source projects like <a href="https://mozilla.org" target="_blank">Mozilla</a> have revenue, buildings, and a CEO, and while they don't sell anything, they do market. Others like Linux <a href="#lee">(Lee & Cole 2013)</a> rely heavily on contributions both from volunteers <a href="#ye">(Ye & Kishida 2003)</a> and from paid employees of companies that depend on Linux, like IBM, Google, and others. In these settings, there are still all of the challenges that come with software engineering, but fewer of the constraints that come from a for-profit or non-profit motive.</p>
|
||||
|
|
11
process.html
|
@ -45,7 +45,16 @@
|
|||
|
||||
<p><strong>Pace</strong> is another factor that affects quality. Clearly, there's a tradeoff between how fast a team works and the quality of the product it can build. In fact, interview studies of engineers at Google, Facebook, Microsoft, Intel, and other large companies found that the pressure to reduce "time to market" harmed nearly every aspect of teamwork: the availability and discoverability of information, clear communication, planning, integration with others' work, and code ownership (<a href="#rubin">Rubin & Rinard 2016</a>). Not only did a fast pace reduce quality, but it also reduced engineers' personal satisfaction with their job and their work. I encountered similar issues as CTO of my startup: while racing to market, I was often asked to meet impossible deadlines with zero defects and had to constantly communicate to the other executives in the company why this was not possible (<a href="#ko2">Ko 2017</a>).</p>
|
||||
|
||||
<p>Because of the importance of awareness and communication, the <strong>distance</strong> between teammates is also a critical factor. This is most visible in companies that hire remote developers, building distributed teams. The primary motivation for doing this is to reduce costs or gain access to engineering talent that is distant from a team's geographical center, but over time, companies have found that doing so necessitates significant investments in travel and socialization to ensure quality, minimizing geographical, temporal and cultural separation (<a href="#smite">Smite 2010</a>). Researchers have found that there appear to be fundamental tradeoffs between productivity, quality, and/or profits in these settings (<a href="#ramasubbu">Ramasubbu et al. 2011</a>). For example, more distance appears to lead to slower communication (<a href="#wagstrom">Wagstrom & Datta 2014</a>). Despite these tradeoffs, most rigorous studies of the cost of distributed development have found that when companies work hard to minimize temporal and cultural separation, the actual impact on defects was small (<a href="#kocaguneli">Kocaguneli et al. 2013</a>). Some researchers have begun to explore even more extreme models of distributed development, hiring contract developers to complete microtasks over a few days without hiring them as employees; early studies suggest that these models have the worst of outcomes, with greater costs, poor scalability, and more significant quality issues (<a href="#stol">Stol & Fitzgerald 2014</a>).</p>
|
||||
<p>
|
||||
Because of the importance of awareness and communication, the <strong>distance</strong> between teammates is also a critical factor.
|
||||
This is most visible in companies that hire remote developers, building distributed teams, or when teams are fully distributed (such as when there is a pandemic requiring social distancing).
|
||||
One motivation for doing this is to reduce costs or gain access to engineering talent that is distant from a team's geographical center, but over time, companies have found that doing so necessitates significant investments in socialization to ensure quality, minimizing geographical, temporal and cultural separation (<a href="#smite">Smite 2010</a>).
|
||||
Researchers have found that there appear to be fundamental tradeoffs between productivity, quality, and/or profits in these settings (<a href="#ramasubbu">Ramasubbu et al. 2011</a>).
|
||||
For example, more distance appears to lead to slower communication (<a href="#wagstrom">Wagstrom & Datta 2014</a>).
|
||||
Despite these tradeoffs, most rigorous studies of the cost of distributed development have found that when companies work hard to minimize temporal and cultural separation, the actual impact on defects was small (<a href="#kocaguneli">Kocaguneli et al. 2013</a>).
|
||||
These efforts to minimize separation include more structured onboarding practices, more structured communication, and more structured processes, as well as systematic efforts to build and maintain trusting social relationships.
|
||||
Some researchers have begun to explore even more extreme models of distributed development, hiring contract developers to complete microtasks over a few days without hiring them as employees; early studies suggest that these models have the worst of outcomes, with greater costs, poor scalability, and more significant quality issues (<a href="#stol">Stol & Fitzgerald 2014</a>).
|
||||
</p>
|
||||
|
||||
<p>
|
||||
A critical part of ensuring that a team is successful is having someone responsible for managing these factors of distance, pace, ownership, awareness, and overall process.
|
||||
|
|
|
@ -75,7 +75,16 @@
|
|||
</ul>
|
||||
|
||||
<p>One could imagine using these concepts to refine processes and practices in a team, helping both developers and managers be more aware of sources of waste that harm productivity.</p>
|
||||
|
||||
|
||||
<p>
|
||||
Of course, productivity is not only shaped by professional and organizational factors, but personal ones as well.
|
||||
Consider, for example, an engineer that has friends, wealth, health care, health, stable housing, sufficient pay, and safety: they likely have everything they need to bring their full attention to their work.
|
||||
In contrast, imagine an engineer that is isolated, has immense debt, has no health care, has a chronic disease like diabetes, is being displaced from an apartment by gentrification, has lower pay than their peers, or does not feel safe in public.
|
||||
Any one of these factors might limit an engineer's ability to be productive at work; some people might experience multiple, or even all of these factors, especially if they are a person of color in the United States, who has faced a lifetime of racist inequities in school, health care, and housing.
|
||||
Because of the potential for such inequities to influence someone's ability to work, managers and organizations need to make space for surfacing these inequities at work, so that teams can acknowledge them, plan around them, and ideally address them through targeted supports.
|
||||
Anything less tends to make engineers feel unsupported, which will only decrease their motivation to contribute to a team.
|
||||
</p>
|
||||
|
||||
<p>
|
||||
These widely varying conceptions of productivity reveal that programming in a software engineering context is about far more than just writing a lot of code.
|
||||
It's about coordinating productively with a team, synchronizing your work with an organization's goals, and most importantly, reflecting on ways to change work to achieve those goals more effectively.
|
||||
|
|
|
@ -113,6 +113,10 @@
|
|||
<td>Usability</td>
|
||||
<td>This quality encompasses all of the qualities above. We use it as a holistic term to represent any quality that affects someone's ability to use a system.</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>Bias</td>
|
||||
<td>The multiple ways in which software can discriminate, exclude, or amplify or reinforce discriminatory or exclusionary structures in society. For example, a classifier might be trained on racially biased data, algorithms might use sexist assumptions about gender, web forms might systematically exclude non-Western names and language, and applications might be only accessible to people who can see or use a mouse.</td>
|
||||
</tr>
|
||||
</table>
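<p>To make the web form example of bias concrete, here is a minimal sketch of how a validation rule can encode exclusionary assumptions. The function names and the sample names are hypothetical, chosen only for illustration; real forms would need to consider many more cases.</p>

```javascript
// Hypothetical validator that encodes a narrow assumption: a name is
// exactly two words made only of unaccented Latin letters.
function isValidNameRestrictive(name) {
  return /^[A-Za-z]+ [A-Za-z]+$/.test(name);
}

// A more inclusive sketch: accept any string with at least one
// non-whitespace character, and leave its structure up to the person.
function isValidNameInclusive(name) {
  return typeof name === "string" && name.trim().length > 0;
}

// Sample inputs: accented letters, non-Latin scripts, single names,
// and apostrophes all fail the restrictive rule.
const names = ["Ada Lovelace", "José Álvarez", "李小龙", "Sukarno", "Seán O'Casey"];

// Names the restrictive rule would turn away.
const rejected = names.filter((name) => !isValidNameRestrictive(name));
```

<p>Under the restrictive rule, only the first sample name is accepted; the inclusive rule accepts all of them. The point of the sketch is that the exclusion is a design decision embedded in one line of code, not an accident of the data.</p>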
|
||||
|
||||
<p>Although the lists above are not complete, you might have already noticed some tradeoffs between different qualities. A secure system is necessarily going to be less learnable, because there will be more to learn to operate it. A robust system will likely be less maintainable because it will likely have more code to account for its diverse operating environments. Because one cannot achieve all software qualities, and achieving each quality takes significant time, it is necessary to prioritize qualities for each project.</p>
|
||||
|
|
|
@ -59,7 +59,9 @@
|
|||
|
||||
<p>
|
||||
And yet, like design, requirements come from the world and the people in it and not from software (<a href="#jackson">Jackson 2001</a>).
|
||||
Sometimes requirements even come from law, as is the case of the European Union's General Data Protection Regulation (<a href="https://eugdpr.org/">GDPR</a>), which specifies a set of data privacy requirements that all software systems used by EU citizens must meet.
|
||||
Because they come from the world, requirements are rarely objective or unambiguous.
|
||||
For example, some requirements come from law, such as the European Union's General Data Protection Regulation (<a href="https://eugdpr.org/">GDPR</a>), which specifies a set of data privacy requirements that all software systems used by EU citizens must meet.
|
||||
Other requirements might come from public pressure for change, as in Twitter's decision to label particular tweets as having false information or hate speech.
|
||||
Therefore, the methods that people use to do requirements engineering are quite diverse.
|
||||
Requirements engineers may work with lawyers to interpret policy.
|
||||
They might work with regulators to negotiate requirements.
|
||||
|
|
|
@ -75,7 +75,7 @@ function min(a, b) {
|
|||
<p>Assertions are related to the broader category of <strong>error handling</strong> language features. Error handling includes assertions, but also programming language features like exceptions and exception handlers. Error handling is a form of specification in that <em>checking</em> for errors usually entails explicitly specifying the conditions that determine an error. For example, in the code above, the condition <code>Number.isInteger(a)</code> specifies that the parameter <code>a</code> must be an integer. Other exception handling code such as the Java <code>throws</code> clause indicates the cases in which errors can occur and the corresponding <code>catch</code> block indicates what is to be done about errors. It is difficult to implement good exception handling that provides granular, clear ways of recovering from errors (<a href="#chen">Chen et al. 2009</a>). Evidence shows that modern developers are still exceptionally bad at designing for errors; one study found that errors are not designed for, few errors are tested for, and exception handling is often overly general, providing little ability for users to understand errors or for developers to debug them (<a href="#ebert">Ebert et al. 2015</a>). These difficulties appear to be because it is difficult to imagine the vast range of errors that can occur (<a href="#maxion">Maxion & Olszewski 2000</a>).</p>
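<p>As a minimal sketch of the difference between overly general and granular error handling, consider parsing a quantity typed by a user. Everything here (<code>InvalidQuantityError</code>, <code>parseQuantity</code>, and the two <code>describe</code> functions) is hypothetical and written only for illustration; it is not drawn from the studies cited above.</p>

```javascript
// Hypothetical error type that records which condition was violated.
class InvalidQuantityError extends Error {
  constructor(input, reason) {
    super(`Invalid quantity "${String(input)}": ${reason}`);
    this.name = "InvalidQuantityError";
    this.reason = reason;
  }
}

// Each check explicitly specifies one condition that determines an error.
function parseQuantity(input) {
  if (typeof input !== "string" || input.trim() === "" || Number.isNaN(Number(input))) {
    throw new InvalidQuantityError(input, "not a numeric string");
  }
  const value = Number(input);
  if (!Number.isInteger(value)) {
    throw new InvalidQuantityError(input, "not an integer");
  }
  if (value < 1) {
    throw new InvalidQuantityError(input, "less than 1");
  }
  return value;
}

// Overly general handling: every failure collapses into one opaque message.
function describeGenerally(input) {
  try {
    return `ok: ${parseQuantity(input)}`;
  } catch (error) {
    return "Something went wrong.";
  }
}

// Granular handling: anticipated errors are explained to the user, and
// anything unanticipated is rethrown so that developers can debug it.
function describeGranularly(input) {
  try {
    return `ok: ${parseQuantity(input)}`;
  } catch (error) {
    if (error instanceof InvalidQuantityError) {
      return `Please enter a whole number of 1 or more (${error.reason}).`;
    }
    throw error;
  }
}
```

<p>The general handler gives neither users nor developers anything to act on, while the granular one reports which specified condition failed. The hard part, as the studies above suggest, is anticipating the conditions in the first place.</p>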
|
||||
|
||||
<p>Researchers have invented many forms of specification that require more work and more thought to write, but can be used to make stronger guarantees about correctness (<a href="#woodcock">Woodcock et al. 2009</a>). For example, many languages support the expression of formal <strong>pre-conditions</strong> and <strong>post-conditions</strong> that represent contracts that must be kept for the program to be correct. (<strong>Formal</strong> means mathematical, facilitating mathematical proofs that these conditions are met). Because these contracts are essentially mathematical promises, we can build tools that automatically read a function's code and verify that what it computes exhibits those mathematical properties using automated theorem proving systems. For example, suppose we wrote some formal specifications for our example above to replace our assertions (using a fictional notation for illustration purposes):</p>
|
||||
|
||||
|
||||
<pre>
|
||||
// pre-conditions: a in Integers, b in Integers
|
||||
// post-conditions: result <= a and result <= b
|
||||
|
@ -86,11 +86,23 @@ function min(a, b) {
|
|||
|
||||
<p>The annotations above require that, no matter what, the inputs have to be integers and the output has to be less than or equal to both values. The automatic theorem prover can then start with the claim that result is always less than or equal to both and begin searching for a counterexample. Can you find a counterexample? Really try. Think about what you're doing while you try: you're probably experimenting with different inputs to identify arguments that violate the contract. That's similar to what automatic theorem provers do, but they use many tricks to explore large possible spaces of inputs all at once, and they do it very quickly.</p>
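<p>The manual search described above can be sketched in code. This is only a brute-force illustration over a small range of integers, not how a theorem prover works internally: provers reason symbolically about all inputs rather than enumerating them. The <code>min</code> implementation below is an assumption made for the sketch and may differ from the one in this chapter; <code>brokenMin</code>, <code>postConditionHolds</code>, and <code>findCounterexample</code> are hypothetical names.</p>

```javascript
// Assumed implementation under test; the chapter's own min may differ.
function min(a, b) {
  return a < b ? a : b;
}

// A deliberately broken variant, to show what a counterexample looks like.
function brokenMin(a, b) {
  return a > b ? a : b;
}

// The post-condition from the annotation: result <= a and result <= b.
function postConditionHolds(a, b, result) {
  return result <= a && result <= b;
}

// Try every pair of integers in [low, high], all of which satisfy the
// pre-condition, and return the first input that violates the
// post-condition, or null if no violation is found in that range.
function findCounterexample(fn, low, high) {
  for (let a = low; a <= high; a++) {
    for (let b = low; b <= high; b++) {
      const result = fn(a, b);
      if (!postConditionHolds(a, b, result)) {
        return { a, b, result };
      }
    }
  }
  return null;
}

console.log(findCounterexample(min, -50, 50));
console.log(findCounterexample(brokenMin, -50, 50));
```

<p>Finding no counterexample in a finite range is evidence, not proof. The value of an automated theorem prover is that it aims to cover the entire space of inputs allowed by the pre-conditions.</p>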
|
||||
|
||||
<p>There are definite tradeoffs with writing detailed, formal specifications. The benefits are clear: many companies have written formal functional specifications in order to make <em>completely</em> unambiguous the required behavior of their code, particularly systems that are capable of killing people or losing money, such as flight automation software, banking systems, and even compilers that create executables from code (<a href="#woodcock">Woodcock et al. 2009</a>). In these settings, it's worth the effort of being 100% certain that the program is correct because if it's not, people can die.</p>
|
||||
<p>
|
||||
There are definite tradeoffs with writing detailed, formal specifications.
|
||||
The benefits are clear: many companies have written formal functional specifications in order to make <em>completely</em> unambiguous the required behavior of their code, particularly systems that are capable of killing people or losing money, such as flight automation software, banking systems, and even compilers that create executables from code (<a href="#woodcock">Woodcock et al. 2009</a>).
|
||||
In these settings, it's worth the effort of being 100% certain that the program is correct because if it's not, people can die.
|
||||
Specifications can have other benefits.
|
||||
The very act of writing down what you expect a function to do in the form of test cases can slow developers down, causing them to reflect more carefully and systematically about exactly what they expect a function to do (<a href="#fucci">Fucci et al. 2016</a>).
|
||||
Perhaps if this is true in general, there's value in simply stepping back before you write a function, mapping out pre-conditions and post-conditions in the form of simple natural language comments, and <em>then</em> writing the function to match your intentions.
|
||||
</p>
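<p>As a small sketch of that habit, here is a function whose pre-conditions and post-conditions were written as natural language comments before the body. The <code>clamp</code> function is a hypothetical example chosen for illustration, and the comments are informal notes rather than a checked specification.</p>

```javascript
// Pre-conditions: value, low, and high are finite numbers, and low <= high.
// Post-conditions: low <= result <= high, and result equals value whenever
//   value is already within [low, high].
function clamp(value, low, high) {
  // Raise value to at least low, then lower it to at most high.
  return Math.min(Math.max(value, low), high);
}
```

<p>Writing the comments first surfaces questions that the code alone would hide, such as what should happen when <code>low</code> is greater than <code>high</code>.</p>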
|
||||
|
||||
<p>When the consequences aren't so high, other factors dominate: writing functional specifications is very hard and very time consuming, you need tools to verify the annotations themselves, and you have to maintain annotations. These barriers deter many developers from writing them (<a href="#schiller">Schiller et al. 2014</a>). Some forms of specifications, like the UML diagrams we described when discussing architecture, lack the benefits of formal specifications <em>and</em> require a lot of work to create, leading many practitioners to find them not worth the effort (<a href="#petre">Petre 2013</a>).</p>
|
||||
|
||||
<p>Specifications can have other benefits. The very act of writing down what you expect a function to do in the form of test cases can slow developers down, causing them to reflect more carefully and systematically about exactly what they expect a function to do (<a href="#fucci">Fucci et al. 2016</a>). Perhaps if this is true in general, there's value in simply stepping back before you write a function, mapping out pre-conditions and post-conditions in the form of simple natural language comments, and <em>then</em> writing the function to match your intentions.</p>
|
||||
<p>
|
||||
Writing formal specifications can also have downsides.
|
||||
When the consequences of software failure aren't so high, the difficulty and time required to write and maintain functional specifications may not be worth the effort (<a href="#petre">Petre 2013</a>).
|
||||
These barriers deter many developers from writing them (<a href="#schiller">Schiller et al. 2014</a>).
|
||||
Formal specifications can also warp the types of data that developers work with.
|
||||
For example, it is much easier to write formal specifications about Boolean values and integers than string values.
|
||||
This can lead engineers to be overly reductive in how they model data (e.g., settling for binary models of gender, when gender is inherently non-binary and multidimensional).
|
||||
</p>
|
||||
|
||||
<center class="lead"><a href="process.html">Next chapter: Process</a></center>
|
||||
|
||||
|
|
|
@ -106,7 +106,18 @@ function count(input) {
|
|||
|
||||
<p>Not all analytical techniques rely entirely on logic. In fact, one of the most popular methods of verification in industry is the <strong>code review</strong>, also known as an <em>inspection</em>. The basic idea of an inspection is to read the program analytically, following the control and data flow inside the code to look for defects. This can be done alone, in groups, and even included as part of the process of integrating changes, to verify them before they are committed to a branch. Modern code reviews, while informal, help find defects, stimulate knowledge transfer between developers, increase team awareness, and help identify alternative implementations that can improve quality (<a href="#bacchelli">Bacchelli & Bird 2013</a>). One study found that measures of how much a developer knows about an architecture can increase 66% to 150% depending on the project (<a href="#rigby2">Rigby & Bird 2013</a>). That said, not all reviews are created equal: the best ones are thorough and conducted by a reviewer with strong familiarity with the code (<a href="#kononenko">Kononenko et al. 2016</a>); including reviewers that do not know each other or do not know the code can result in longer reviews, especially when run as meetings (<a href="#seaman">Seaman & Basili 1997</a>). Soliciting reviews asynchronously by allowing developers to request reviews from their peers is generally much more scalable (<a href="#rigby">Rigby & Storey 2011</a>), but this requires developers to be careful about which reviews they invest in.</p>
|
||||
|
||||
<p>Beyond these more technical considerations around verifying a program's correctness are organizational issues around different software qualities. For example, different organizations have different sensitivities to defects. If a $0.99 game on the app store has a defect, that might not hurt its sales much, unless that defect prevents a player from completing the game. If Boeing's flight automation software has a defect, hundreds of people might die. The game developer might do a little manual play testing, release, and see if anyone reports a defect. Boeing will spend years proving mathematically with automatic program analysis that every line of code does what is intended, and repeating this verification every time a line of code changes. What type of verification is right for your team depends entirely on what you're building, who's using it, and how they're depending on it.</p>
|
||||
<p>
|
||||
Beyond these more technical considerations around verifying a program's correctness are organizational issues around different software qualities.
|
||||
For example, different organizations have different sensitivities to defects.
|
||||
If a $0.99 game on the app store has a defect, that might not hurt its sales much, unless that defect prevents a player from completing the game.
|
||||
If Boeing's flight automation software has a defect, hundreds of people might die.
|
||||
The game developer might do a little manual play testing, release, and see if anyone reports a defect.
|
||||
Boeing will spend years proving mathematically with automatic program analysis that every line of code does what is intended, and repeating this verification every time a line of code changes.
|
||||
Moreover, requirements may change differently in different domains.
|
||||
For example, a game company might finally recognize the sexist stereotypes amplified in its game mechanics and have to change requirements, resulting in changed definitions of correctness, and the incorporation of new software qualities such as bias into testing plans.
|
||||
Similarly, Boeing might have to respond to pandemic fears by shifting resources away from verifying flight crash safety to verifying public health safety.
|
||||
What type of verification is right for your team depends entirely on what a team is building, who's using it, and how they're depending on it.
|
||||
</p>
|
||||
|
||||
<center class="lead"><a href="monitoring.html">Next chapter: Monitoring</a></center>
|
||||
|
||||
|
|