diff --git a/chapters/architecture.bd b/chapters/architecture.bd index 831ebd7..6b32015 100644 --- a/chapters/architecture.bd +++ b/chapters/architecture.bd @@ -1,6 +1,6 @@ Once you have a sense of what your design must do (in the form of requirements or other less formal specifications), the next big problem is one of organization. How will you order all of the different data, algorithms, and control implied by your requirements? With a small program of a few hundred lines, you can get away without much organization, but as programs scale, they quickly become impossible to manage alone, let alone with multiple developers. Much of this challenge occurs because requirements _change_, and every time they do, code has to change to accommodate. The more code there is and the more entangled it is, the harder it is to change and more likely you are to break things. -This is where *architecture* comes in. Architecture is a way of organizing code, just like building architecture is a way of organizing space. The idea of software architecture has at its foundation a principle of *information hiding*: the less a part of a program knows about other parts of a program, the easier it is to change. The most popular information hiding strategy is *encapsulation*: this is the idea of designing self-contained abstractions with well-defined interfaces that separate different concerns in a program. Programming languages offer encapsulation support through things like *functions* and *classes*, which encapsulate data and functionality together. Another programming language encapsulation method is *scoping*, which hides variables and other names from other parts of program outside a scope. All of these strategies attempt to encourage developers to maximize information hiding and separation of concerns. If you get your encapsulation right, you should be able to easily make changes to a program's behavior without having to change _everything_ about it's implementation. 
+This is where *architecture* comes in. Architecture is a way of organizing code, just like building architecture is a way of organizing space. The idea of software architecture has at its foundation a principle of *information hiding*: the less a part of a program knows about other parts of a program, the easier it is to change. The most popular information hiding strategy is *encapsulation*: this is the idea of designing self-contained abstractions with well-defined interfaces that separate different concerns in a program. Programming languages offer encapsulation support through things like *functions* and *classes*, which encapsulate data and functionality together. Another programming language encapsulation method is *scoping*, which hides variables and other names from other parts of a program outside a scope. All of these strategies attempt to encourage developers to maximize information hiding and separation of concerns. If you get your encapsulation right, you should be able to easily make changes to a program's behavior without having to change _everything_ about its implementation. When encapsulation strategies fail, one can end up with what some affectionately call a "ball of mud" architecture or "spaghetti code". Ball of mud architectures have no apparent organization, which makes it difficult to comprehend how parts of its implementation interact. A more precise concept that can help explain this disorder is *cross-cutting concerns*, which are things like features and functionality that span multiple different components of a system, or even an entire system. There is some evidence that cross-cutting concerns can lead to difficulties in program comprehension and long-term design degradation, all of which reduce productivity and increase the risk of defects. 
As long-lived systems get harder to change, they can take on _technical debt_, which is the degree to which an implementation is out of sync with a team's understanding of what a product is intended to be. Many developers view such debt as emerging from primarily from poor architectural decisions. Over time, this debt can further result in organizational challenges, making change even more difficult. diff --git a/chapters/debugging.bd b/chapters/debugging.bd index b27293d..73f1b49 100644 --- a/chapters/debugging.bd +++ b/chapters/debugging.bd @@ -39,7 +39,7 @@ This process, while straightforward, is the slowest, requiring a long, vigilant 4. Analyzing the data from your observations 5. If you've identified the defect, move on to the repair phase; if not, return to step 2. -The problems with the strategy above are numerous. First, what if you can't generate a hypothesis? What if you can, but testing the hypothesis is slow or impossible? You could spend _hours_ generating hypotheses that are completely off-base, effectively analyzing all of your code and it's executions before finding the defect. +The problems with the strategy above are numerous. First, what if you can't generate a hypothesis? What if you can, but testing the hypothesis is slow or impossible? You could spend _hours_ generating hypotheses that are completely off-base, effectively analyzing all of your code and its executions before finding the defect. Another strategy is working backwards. diff --git a/chapters/organizations.bd b/chapters/organizations.bd index ccc2d23..eb4e501 100644 --- a/chapters/organizations.bd +++ b/chapters/organizations.bd @@ -13,7 +13,7 @@ The individuals in a software organization take on different roles to achieve th * *Designers* decide _how_ software will provide value. This isn't about code or really even about software; it's about envisioning solutions to problems that people have. 
* *Software engineers* write code with other engineers to implement requirements envisioned by designers. If they fail to meet requirements, the design won't be implemented correctly, which will prevent the software from providing value. * *Sales* takes the product that's been built and try to sell it to the audiences that marketers have identified. They also try to refine an organization's understanding of what the customer wants and needs, providing feedback to marketing, product, and design, which engineers then address. -* *Support* helps the people using the product to use it successfully and, like sales, provides feedback to product, design, and engineering about the product's value (or lack thereof) and it's defects. +* *Support* helps the people using the product to use it successfully and, like sales, provides feedback to product, design, and engineering about the product's value (or lack thereof) and its defects. As I noted above, sometimes the roles above get merged into individuals. When I was CTO at AnswerDash, I had software engineering roles, design roles, product roles, sales roles, _and_ support roles. This was partly because it was a small company when I was there. As organizations grow, these roles tend to be divided into smaller pieces. This division often means that different parts of the organization don't share knowledge, even when it would be advantageous. diff --git a/chapters/process.bd b/chapters/process.bd index 0082d5a..407c1a2 100644 --- a/chapters/process.bd +++ b/chapters/process.bd @@ -1,6 +1,6 @@ So you know what you're going to build and how you're going to build it. What process should you go about building it? Who's going to build what? What order should you build it in? How do you make sure everyone is in sync while you're building it? And most importantly, how to do you make sure you build well and on time? These are fundamental questions in software engineering with many potential answers. 
Unfortunately, we still don't know which of those answers are right. -At the foundation of all of these questions are basic matters of [project management|https://en.wikipedia.org/wiki/Project_management]: plan, execute, and monitor. But developers in the 1970's and on found that traditional project management ideas didn't seem to work. The earliest process ideas followed a "waterfall" model, in which a project begins by identifying requirements, writing specifications, implementing, testing, and releasing, all under the assumption that every stage could be fully tested and verified. (Recognize this? It's the order of topics we're discussing in this book!). Many managers seemed to like the waterfall model because it seemed structured and predictable; however, because most managers were originally software developers, they preferred a structured approach to project management. The reality, however, was that no matter how much verification one did of each of these steps, there always seemed to be more information in later steps that caused a team to reconsider it's earlier decision (e.g., imagine a customer liked a requirement when it was described in the abstract, but when it was actually built, they rejected it, because they finally saw what the requirement really meant). +At the foundation of all of these questions are basic matters of [project management|https://en.wikipedia.org/wiki/Project_management]: plan, execute, and monitor. But developers in the 1970's and on found that traditional project management ideas didn't seem to work. The earliest process ideas followed a "waterfall" model, in which a project begins by identifying requirements, writing specifications, implementing, testing, and releasing, all under the assumption that every stage could be fully tested and verified. (Recognize this? It's the order of topics we're discussing in this class!). 
Many managers seemed to like the waterfall model because it seemed structured and predictable; and because most managers were originally software developers, they preferred a structured approach to project management. The reality, however, was that no matter how much verification one did of each of these steps, there always seemed to be more information in later steps that caused a team to reconsider its earlier decision (e.g., imagine a customer liked a requirement when it was described in the abstract, but when it was actually built, they rejected it, because they finally saw what the requirement really meant). In 1988, Barry Boehm proposed an alternative to waterfall called the *Spiral model*: rather than trying to verify every step before proceeding to the next level of detail, _prototype_ every step along the way, getting partial validation, iteratively converging through a series of prototypes toward both an acceptable set of requirements _and_ an acceptable product. Throughout, risk assessment is key, encouraging a team to reflect and revise process based on what they are learning. What was important about these ideas were not the particulars of Boehm's proposed process, but the disruptive idea that iteration and process improvement are critical to engineering great software. @@ -12,7 +12,7 @@ These early ideas in software project management led to a wide variety of other Beyond process improvement, other factors emerged. For example, researchers discovered that critical to team productivity was *awareness* of teammates' work. Teams need tools like dashboards to help make them aware of changing priorities and tools like feeds to coordinate short term work. Moreover, researchers found that engineers tended to favor non-social sources such as documentation for factual information, but social sources for information to support problem solving. 
Decades ago, developers used tools like email and IRC for awareness; now they use tools like [Slack|https://slack.com], [Trello|https://trello.com/], [GitHub|http://github.com], and [JIRA|https://www.atlassian.com/software/jira], which have the same basic functionality, but are much more polished, streamlined, and customizable. -In addition to awareness, *ownership* is a critical idea in process. This is the idea that for every line of code, someone is responsible for it's quality. The owner _might_ be the person who originally wrote the code, but it could also shift to new team members. Studies of code ownership on Windows Vista and Windows 7 found that less a component had a clear owner, the more pre-release defects it had and the more post-release failures were reported by users. This means that in addition to getting code written, having clear ownership and clear processes for transfer of ownership are key to functional correctness. +In addition to awareness, *ownership* is a critical idea in process. This is the idea that for every line of code, someone is responsible for its quality. The owner _might_ be the person who originally wrote the code, but it could also shift to new team members. Studies of code ownership on Windows Vista and Windows 7 found that the less a component had a clear owner, the more pre-release defects it had and the more post-release failures were reported by users. This means that in addition to getting code written, having clear ownership and clear processes for transfer of ownership are key to functional correctness. *Pace* is another factor that affects quality. Clearly, there's a tradeoff between how fast a team works and the quality of the product it can build. 
In fact, interview studies of engineers at Google, Facebook, Microsoft, Intel, and other large companies found that the pressure to reduce "time to market" harmed nearly every aspect of teamwork: the availability and discoverability of information, clear communication, planning, integration with others' work, and code ownership. Not only did a fast pace reduce quality, but it also reduced engineers' personal satisfaction with their job and their work. I encountered similar issues as CTO of my startup: while racing to market, I was often asked to meet impossible deadlines with zero defects and had to constantly communicate to the other executives in the company why this was not possible. diff --git a/chapters/quality.bd b/chapters/quality.bd index 46b6a4d..f07432e 100644 --- a/chapters/quality.bd +++ b/chapters/quality.bd @@ -10,7 +10,7 @@ There are a surprisingly large number of software qualities. Many conce * *Robustness* is the extent to which a program can recover from errors or unexpected input. For example, a login form that crashes if an email is formatted improperly isn't very robust. A login form that handles _any_ text input is optimally robust. One can make a system more robust by breadth of errors and inputs it can handle in a reasonable way. -* *Performance* is the extent to which a program uses computing resources economically. Synonymous with "fast" and "zippy". Performance is directly determined by how many instructions a program has to execute to accomplish it's operations, but it is difficult to measure because operations, inputs, and the operating environment can vary widely. +* *Performance* is the extent to which a program uses computing resources economically. Synonymous with "fast" and "zippy". Performance is directly determined by how many instructions a program has to execute to accomplish its operations, but it is difficult to measure because operations, inputs, and the operating environment can vary widely. 
* *Portability* is the extent to which an implementation can run on different platforms without being modified. For example, "universal" applications in the Apple ecosystem that can run on iPhones, iPads, and Mac OS without being modified or recompiled are highly portable. diff --git a/chapters/requirements.bd b/chapters/requirements.bd index e008c7a..eeb6ac0 100644 --- a/chapters/requirements.bd +++ b/chapters/requirements.bd @@ -1,8 +1,8 @@ Once you have a problem, a solution, and a design specification, it's entirely reasonable to start thinking about code. What libraries should we use? What platform is best? Who will build what? After all, there's no better way to test the feasibility of an idea than to build it, deploy it, and find out if it works. Right? -It depends. This mentality towards product design works fine if building and deploying something is cheap and getting feedback has no consequences. Simple consumer applications often benefit from this simplicity, especially early stage ones, because there's little to lose. For example, if you are starting a company, and do not even know if there is a market opportuniity yet, it may be worth quickly prototyping an idea, seeing if there's interesting, and then later thinking about how to carefully architect a product that meets that opportunity. This is [how products such as Facebook started|https://en.wikipedia.org/wiki/History_of_Facebook], with a poorly implemented prototype that revealed an opportunity, which was only later translated into a functional, reliable software service. +It depends. This mentality towards product design works fine if building and deploying something is cheap and getting feedback has no consequences. Simple consumer applications often benefit from this simplicity, especially early stage ones, because there's little to lose. 
For example, if you are starting a company, and do not even know if there is a market opportunity yet, it may be worth quickly prototyping an idea, seeing if there's interest, and then later thinking about how to carefully architect a product that meets that opportunity. This is [how products such as Facebook started|https://en.wikipedia.org/wiki/History_of_Facebook], with a poorly implemented prototype that revealed an opportunity, which was only later translated into a functional, reliable software service. -However, what if prototyping a beta _isn't_ cheap to build? What if your product only has one shot at adoption? What if you're building something for a client and they want to define success? Worse yet, what if your product could _kill_ people if it's not built properly? Consider the [U.S. HealthCare.gov launch|https://en.wikipedia.org/wiki/HealthCare.gov], for example, which was lambasted for is countless defects and poor scalability at launch, only working for 1,100 simultaneous users, when 50,000 were exected and 250,000 actually arrived. To prevent disastrous launches like this, software teams have to be more careful about translating a design specification into a specific explicit set of goals that must be satisfied in order for the implementation to be complete. We call these goals *requirements* and we call this process of *requirements engineering*. +However, what if prototyping a beta _isn't_ cheap to build? What if your product only has one shot at adoption? What if you're building something for a client and they want to define success? Worse yet, what if your product could _kill_ people if it's not built properly? Consider the [U.S. HealthCare.gov launch|https://en.wikipedia.org/wiki/HealthCare.gov], for example, which was lambasted for its countless defects and poor scalability at launch, only working for 1,100 simultaneous users, when 50,000 were expected and 250,000 actually arrived. 
To prevent disastrous launches like this, software teams have to be more careful about translating a design specification into a specific explicit set of goals that must be satisfied in order for the implementation to be complete. We call these goals *requirements* and we call this process *requirements engineering*. In principle, requirements are a relatively simple concept. They are simply statements of what must be true about a system to make the system acceptable. For example, suppose you were designing an interactive mobile game. You might want to write the requirement _The frame rate must never drop below 60 frames per second._ This could be important for any number of reasons: the game may rely on interactive speeds, your company's reputation may be for high fidelity graphics, or perhaps that high frame rate is key to creating a sense of realism. Or, imagine your game company has a reputation for high performance, high fidelity graphics, high frame rate graphics, and achieving any less would erode your company's brand. Whatever the reasons, expressing it as a requirement makes it explicit that any version of the software that doesn't meet that requirement is unacceptable, and sets a clear goal for engineering to meet. @@ -14,7 +14,7 @@ And yet, like design, requirements come from the world and the people in it and There are some approaches to specifying requirements _formally_. These techniques allow requirements engineers to automatically identify _conflicting_ requirements, so they don't end up proposing a design that can't possibly exist. Some even use systems to make requirements _traceable_, meaning the high level requirement can be linked directly to the code that meets that requirement. All of this formality has tradeoffs: not only does it take more time to be so precise, but it can negatively effect creativity in concept generation as well. -Expressing requirements in natural language can mitigate these effects, at the expense of precision. 
They just have to be *complete*, *precise*, *non-conflicting*, and *verifiable*. For example, consider a design for a simple to do list application. It's requirements might be something like the following: +Expressing requirements in natural language can mitigate these effects, at the expense of precision. They just have to be *complete*, *precise*, *non-conflicting*, and *verifiable*. For example, consider a design for a simple to do list application. Its requirements might be something like the following: * Users must be able to add to do list items with a single action. * To do list items must consist of text and a binary completed state. @@ -26,7 +26,7 @@ Expressing requirements in natural language can mitigate these effects, at the e Let's review these requirements against the criteria for good requirements that I listed above: * Is it *complete*? I can think of a few more requirements: is the list ordered? How long does state persist? Are there user accounts? Where is data stored? What does it look like? What kinds of user actions must be supported? Is delete undoable? Even just on these completeness dimension, you can see how even a very simple application can become quite complex. When you're generating requirements, your job is to make sure you haven't forgotten important requirements. -* Is the list *precise*? Not really. When you add a to do list item, is it added at the beginning? The end? Wherever a user request it be added? How long can the to do list item text be? Clearly the requirement above is imprecise. And imprecise requirements lead to imprecise goals, which means that engineers might not meet them. Is this to do list team okay with not meeting it's goals? +* Is the list *precise*? Not really. When you add a to do list item, is it added at the beginning? The end? Wherever a user requests it be added? How long can the to do list item text be? Clearly the requirement above is imprecise. 
And imprecise requirements lead to imprecise goals, which means that engineers might not meet them. Is this to do list team okay with not meeting its goals? * Are the requirements *non-conflicting*? I _think_ they are since they all seem to be satisfiable together. But some of the missing requirements might conflict. For example, suppose we clarified the imprecise requirement about where a to do list item is added. If the requirement was that it was added to the end, is there also a requirement that the window scroll to make the newly added to do item visible? If not, would the first requirement of making it possible for users to add an item with a single action be achieveable? They could add it, but they wouldn't know they had added it because of this usability problem, so is this requirement met? This example shows that reasoning through requirements is ultimately about interpreting words, finding source of ambiguity, and trying to eliminate them with more words. * Finally, are they *verifiable*? Some more than others. For example, is there a way to guarantee that the state saves successfully all the time? That may be difficult to prove given the vast number of ways the operating environment might prevent saving, such as a failing hard drive or an interrupted internet connection. This requirement might need to be revised to allow for failures to save, which itself might have implications for other requirements in the list.