Merge pull request #163 from HeidiEllis/patch-1

Update verification.bd
Amy J. Ko 2023-11-15 13:18:08 -08:00 committed by GitHub
commit 62d1828768


@@ -5,7 +5,7 @@ Part of this is being clear about what you intended (by writing [specifications|
There are many ways to verify code. A reasonable first instinct is to simply run your program. After all, what better way to check whether you expressed your intents than to see with your own eyes what your program does? This empirical approach is called *testing*. Some testing is _manual_, in that a human executes a program and verifies that it does what was intended. Some testing is _automated_, in that the test is run automatically by a computer. Another way to verify code is to *analyze* it, using logic to verify its correct operation. As with testing, some analysis is _manual_, since humans do it. We call this manual analysis _inspection_, whereas other analysis is _automated_, since computers do it. We call this _program analysis_. This leads to a nice complementary set of verification techniques along two axes: degree of automation and type of verification:
* Manual techniques include *manual testing* (which is empirical) and *inspections* (which is analytical)
* Automated techniques include *automated testing* (which is empirical) and *program analysis* (which is analytical); a minimal sketch of an automated test follows this list
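
To make the automated-testing quadrant concrete, here is a minimal sketch in TypeScript. The `median` function and its checks are hypothetical examples, not from the text; the sketch uses Node's built-in `assert` module rather than any particular test framework.

```ts
import assert from "node:assert";

// Hypothetical function under verification: the code whose
// correctness we want to check against our intent.
function median(values: number[]): number {
  if (values.length === 0) {
    throw new Error("median of an empty list is undefined");
  }
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1
    ? sorted[mid]
    : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Each assertion encodes one intent as an executable check. Because
// a computer re-runs these after every change, this is automated
// testing; a human running the program and eyeballing the output
// would be manual testing of the same intents.
assert.strictEqual(median([3, 1, 2]), 2);      // odd length
assert.strictEqual(median([4, 1, 2, 3]), 2.5); // even length averages the middle pair
assert.throws(() => median([]));               // the empty case is specified, too
console.log("all tests passed");
```

Note that passing these three checks does not prove the function correct for all inputs; tests can only reveal defects on the inputs actually tried, which is part of the theory we turn to next.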
To discuss each of these and their tradeoffs, first we have to cover some theory about verification. The first and simplest ideas are some terminology:
@@ -59,4 +59,4 @@ One popular type of automatic program analysis tools is a *static analysis* tool
Not all analytical techniques rely entirely on logic. In fact, one of the most popular methods of verification in industry is *code reviews*, also known as _inspections_. The basic idea of an inspection is to read the program analytically, following the control and data flow inside the code to look for defects. This can be done alone, in groups, and even included as part of the process of integrating changes, to verify them before they are committed to a branch. Modern code reviews, while informal, help find defects, stimulate knowledge transfer between developers, increase team awareness, and help identify alternative implementations that can improve quality<bacchelli13>. One study found that measures of how much a developer knows about an architecture can increase by 66% to 150% depending on the project<rigby13>. That said, not all reviews are created equal: the best ones are thorough and conducted by a reviewer with strong familiarity with the code<kononenko16>; including reviewers that do not know each other or do not know the code can result in longer reviews, especially when run as meetings<seaman97>. Soliciting reviews asynchronously by allowing developers to request reviews from their peers is generally much more scalable<rigby11>, but this requires developers to be careful about which reviews they invest in. These choices about where to put reviewing attention can result in great disparities in what is reviewed, especially in open source: the more work a review is perceived to be, the less likely it is to be reviewed at all and the longer the delays in receiving a review<thongtanunam16>.
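
To illustrate what following control and data flow in an inspection looks like, here is a small, hypothetical TypeScript snippet of my own (not from any of the studies cited above) containing the kind of defect a careful reviewer would catch:

```ts
// A hypothetical snippet a reviewer might trace through during an
// inspection. Following the data flow of `total` reveals a defect.
function averageOrderValue(
  orders: { amount: number; refunded: boolean }[]
): number {
  let total = 0;
  for (const order of orders) {
    if (order.refunded) continue; // refunded orders excluded here...
    total += order.amount;
  }
  // ...but still counted here, so refunds silently drag the average
  // toward zero. A reviewer would flag the mismatched denominator
  // (and ask what should happen when every order is refunded).
  return total / orders.length;
}
```

Notice that nothing needs to be executed for this defect to be found; the reviewer verifies the code analytically, by reasoning about it, which is what distinguishes inspection from testing.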
Beyond these more technical considerations around verifying a program's correctness are organizational issues around different software qualities. For example, different organizations have different sensitivities to defects. If a $0.99 game on the app store has a defect, that might not hurt its sales much, unless that defect prevents a player from completing the game. If Boeing's flight automation software has a defect, hundreds of people might die. The game developer might do a little manual play testing, release, and see if anyone reports a defect. Boeing will spend years proving mathematically with automatic program analysis that every line of code does what is intended, and repeating this verification every time a line of code changes. Moreover, requirements may change differently in different domains. For example, a game company might finally recognize the sexist stereotypes amplified in its game mechanics and have to change requirements, resulting in changed definitions of correctness, and the incorporation of new software qualities such as bias into testing plans. Similarly, Boeing might have to respond to pandemic fears by shifting resources away from verifying flight crash safety to verifying public health safety. What type of verification is right for your team depends entirely on what a team is building, who's using it, and how they're depending on it.