Programs change. You find bugs, you fix them. You discover a new requirement, you add a feature. A requirement changes because users demand it, you revise a feature. The simple fact about programs is that they're rarely stable; rather, they are constantly changing, living artifacts that shift as much as our social worlds shift.
Nowhere is this constant evolution more apparent than in our daily encounters with software updates. The apps on our phones are constantly being updated to improve our experiences, while the web sites we visit potentially change every time we visit them, without us noticing. These different models have different notions of who controls changes to user experience: should software companies control when your experience changes or should you? And for systems with significant backend dependencies, is it even possible to give users control over when things change?
To manage change, developers use many kinds of tools and practices.
One of the most common ways of managing change is to *refactor* code. Refactoring helps developers modify the _architecture_ of a program while keeping its behavior the same, enabling them to implement or modify functionality more easily. For example, one of the most common and simple refactorings is to rename a variable (renaming its definition and all of its uses). This doesn't change the architecture of a program at all, but does improve its readability. Other refactors can be more complex. For example, consider adding a new parameter to a function: all calls to that function need to pass that new parameter, which means you need to go through each call and decide on a value to send from that call site. Studies of refactoring in practice have found that refactorings can be big and small, that they don't always preserve the behavior of a program, and that developers perceive them as involving substantial costs and risks<kim12>.
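To make the add-a-parameter refactoring concrete, here is a minimal sketch in Python. The functions `format_price`, `receipt_line`, and `invoice_total` are hypothetical, invented only for illustration. The sketch shows the function before and after the change, and two call sites that each had to decide what value to pass.

```python
# Before the refactoring: the currency symbol is hard-coded.
def format_price_before(amount):
    return "$" + format(amount, ".2f")


# After the refactoring: the symbol is a new required parameter,
# so every call site has to decide what value to pass.
def format_price(amount, symbol):
    return symbol + format(amount, ".2f")


# Hypothetical call sites, each updated to pass the new argument.
def receipt_line(item, amount):
    # This call site keeps the old behavior by passing "$".
    return item + ": " + format_price(amount, "$")


def invoice_total(amounts, symbol):
    # This call site forwards a value it now has to receive itself,
    # which shows how one added parameter can ripple outward.
    return format_price(sum(amounts), symbol)
```

Note that `receipt_line` behaves exactly as it did before, while `invoice_total` gained a parameter of its own. That ripple is one reason a seemingly small refactoring can carry real cost and risk.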
Another fundamental way that developers manage change is with *version control* systems. As you know, they help developers track changes to code, allowing them to revert, merge, fork, and clone projects in a way that is traceable and reliable. Version control systems also help developers identify merge conflicts, so that they don't accidentally overwrite each other's work<nelson19>. While today the most popular version control system is Git, there are actually many types. Some are _centralized_, representing one single ground truth of a project's code, usually stored on a server. Commits to centralized repositories become immediately available to everyone else on a project. Other version control systems are _distributed_, such as Git, keeping a full copy of a repository on every local machine. Commits to these local copies don't automatically go to everyone else; rather, they are pushed to some central copy, from which others can pull updates.
Research comparing centralized and distributed revision control systems mostly reveals tradeoffs rather than a clear winner. Distributed version control, for example, appears to lead to commits that are smaller and more scoped to single changes, since developers can manage their own history of commits to their local repository<brindescu14>. Google uses one big centralized version control repository for all of its projects, however, because it offers one source of truth, simplified dependency management, large-scale refactoring, and flexible team boundaries<potvin16>.
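The merge conflicts mentioned above can be illustrated with a small three-way merge. This is only a sketch under a strong simplifying assumption: all three versions of the file have the same number of lines, so lines can be compared by position. Real version control systems first align lines with a diff, and this is not how Git or any particular tool implements merging. The function name `merge_lines` is hypothetical.

```python
def merge_lines(base, ours, theirs):
    """Merge two edited copies of a file against their common base version.

    Simplifying assumption: all three versions have the same number of
    lines, so lines are compared by position rather than aligned by a diff.
    Returns the merged lines (None marks a conflict) and conflict indices.
    """
    if not (len(base) == len(ours) == len(theirs)):
        raise ValueError("this sketch only handles equal-length versions")
    merged, conflicts = [], []
    for i, (b, o, t) in enumerate(zip(base, ours, theirs)):
        if o == t:
            merged.append(o)      # both sides agree (or neither changed)
        elif o == b:
            merged.append(t)      # only "theirs" changed this line
        elif t == b:
            merged.append(o)      # only "ours" changed this line
        else:
            merged.append(None)   # both changed it differently: conflict
            conflicts.append(i)
    return merged, conflicts
```

When two developers edit different lines, the merge succeeds silently. When both edit the same line in different ways, the sketch reports the line index instead of picking a winner, which is the behavior that keeps one developer's work from being silently lost.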
When code changes, you need to test it, which often means you need to *build* it, compiling source, data, and other resources into an executable format suitable for testing (and possibly release). Build systems can be as simple as nothing (e.g., loading an HTML file in a web browser interprets the HTML and displays it, requiring no special preparation) and as complex as hundreds or thousands of lines of build script code, compiling, linking, and managing files in a manner that prepares a system for testing, such as those used to build operating systems like Windows or Linux. To write these complex build procedures, developers use build automation tools like `make`, `ant`, `gulp`, and dozens of others, each helping to automate builds. In large companies, there are whole teams that maintain build automation scripts to ensure that developers can always quickly build and test. In these teams, most of the challenges are social and not technical: teams need to resolve role ambiguity and manage knowledge sharing, communication, trust, and conflict in order to be productive, just like other software engineering teams<phillips14>.
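One idea at the core of tools like `make` is to rebuild a target only when it is missing or older than the files it is built from. The sketch below illustrates that timestamp check in Python. It is an illustration of the idea, not a reimplementation of any real tool, and the function names `needs_rebuild` and `build` are hypothetical.

```python
import os


def needs_rebuild(target, sources):
    """Return True if target is missing or older than any of its sources."""
    if not os.path.exists(target):
        return True
    target_time = os.path.getmtime(target)
    return any(os.path.getmtime(src) > target_time for src in sources)


def build(target, sources, action):
    """Run action(target, sources) only when the target is out of date.

    Returns True if the action ran and False if the build was skipped.
    """
    if needs_rebuild(target, sources):
        action(target, sources)
        return True
    return False
```

Skipping up-to-date targets is what keeps repeated builds fast: after a small change, only the outputs that depend on the changed files need to be regenerated.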
Perhaps the most modern form of build practice is *continuous integration* (CI). This is the idea of completely automating not only builds, but also the running of a collection of tests, every time a bundle of changes is pushed to a central version control repository. The claimed benefit of CI is that every major change is quickly built, tested, and ready for deployment, shortening the time between a change and the discovery of failures. Research shows this is true: CI helps projects release more often and is widely adopted in open source<hilton16>. Of course, these benefits occur only if builds are fast.
For example, some large projects like Windows can take a whole day to build, making continuous integration of the whole operating system infeasible. When builds and tests are fast, continuous integration can accelerate development, especially in projects with large numbers of contributors<vasilescu15>. Some teams even go further than continuous integration, building continuous _delivery_ systems that ensure that complete builds are readily available for release (potentially multiple times per day for software on the web). Having a repeatable, automated deployment process is key for such processes<chen15>.
One last problem with changes in software is managing the *releases* of software. Good release management should archive new versions of software, automatically post the version online, make the version accessible to users, keep a history of who accesses the new version, and provide clear release notes describing changes from the previous version<vanderhoek97>. By default, all of this is quite manual, but many of these steps can be automated, streamlining how teams release changes to the world. You've probably encountered these most in the form of software updates to applications and operating systems.
With so many ways that software can change, and so many tools for managing that change, it also becomes important to manage the risk of change. One approach to managing this risk is *impact analysis*<arnold96>, an activity of systematically and precisely analyzing the consequences of a change _before_ making the change. This can involve analyzing dependencies that are affected by a bug fix, running unit tests on smaller parts of an implementation<runeson06>, running regression tests on previously encountered failures<rothermel96>, and running user tests to ensure that the way in which you fixed a bug does not inadvertently introduce problems with usability, usefulness, or other qualities critical to meeting requirements.
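The dependency analysis part of impact analysis can be sketched as a graph traversal: given which modules depend on which, find everything that directly or indirectly depends on the module you plan to change. The module names and the `impacted_modules` function below are hypothetical, and a dependency graph is only a coarse approximation of real impact, since it says nothing about whether behavior actually changes.

```python
from collections import deque


def impacted_modules(depends_on, changed):
    """Return every module that directly or indirectly depends on `changed`."""
    # Invert the graph: for each module, record who depends on it.
    dependents = {}
    for module, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, set()).add(module)

    # Breadth-first search outward from the changed module.
    impacted, queue = set(), deque([changed])
    while queue:
        current = queue.popleft()
        for user in dependents.get(current, ()):
            if user not in impacted:
                impacted.add(user)
                queue.append(user)
    return impacted


# Hypothetical project: each key depends on the modules listed as its value.
depends_on = {
    "checkout": ["cart", "payments"],
    "cart": ["pricing"],
    "payments": ["pricing"],
    "search": ["catalog"],
}
```

In this made-up project, a change to `pricing` reaches `cart`, `payments`, and `checkout`, while `search` is unaffected. The modules in the result are natural candidates for the unit and regression tests described above.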
Impact analysis, and software evolution in general, is therefore ultimately a process of managing change. Change in requirements, change in code, change in data, and change in how software is situated in the world. And like any change management, it must be done cautiously, both to avoid breaking critical functionality and to ensure that whatever new changes are being brought to the world achieve their goals.