<p>Once you have a sense of what your design must do (in the form of requirements or other less formal specifications), the next big problem is one of organization. How will you order all of the different data, algorithms, and control implied by your requirements? With a small program of a few hundred lines, you can get away without much organization, but as programs scale, they quickly become impossible to manage alone, let alone with multiple developers. Much of this challenge occurs because requirements <em>change</em>, and every time they do, code has to change to accommodate them. The more code there is and the more entangled it is, the harder it is to change and the more likely you are to break things.</p>
<p>This is where <b>architecture</b> comes in. Architecture is a way of organizing code, just like building architecture is a way of organizing space. The idea of software architecture has at its foundation a principle of <b>information hiding</b>: the less a part of a program knows about other parts of a program, the easier it is to change. The most popular information hiding strategy is <b>encapsulation</b>: this is the idea of designing self-contained abstractions with well-defined interfaces that separate different concerns in a program. Programming languages offer encapsulation support through things like <b>functions</b> and <b>classes</b>, which encapsulate data and functionality together. Another programming language encapsulation method is <b>scoping</b>, which hides variables and other names from parts of a program outside a scope. All of these strategies attempt to encourage developers to maximize information hiding and separation of concerns. If you get your encapsulation right, you should be able to easily make changes to a program's behavior without having to change <em>everything</em> about its implementation.</p>
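<p>As a minimal sketch of encapsulation, consider the Python class below. The <code>Inventory</code> class and its methods are hypothetical, invented only for illustration. It hides how it stores its data behind a small interface of two methods.</p>

```python
class Inventory:
    """A self-contained abstraction: callers see only add() and count()."""

    def __init__(self):
        # Hidden detail: counts live in a dict. The leading underscore is
        # Python's convention for "not part of the public interface".
        self._counts = {}

    def add(self, item, quantity=1):
        """Record that `quantity` more units of `item` are in stock."""
        self._counts[item] = self._counts.get(item, 0) + quantity

    def count(self, item):
        """Return how many units of `item` are in stock (0 if unknown)."""
        return self._counts.get(item, 0)


inventory = Inventory()
inventory.add("widget", 3)
inventory.add("widget")
print(inventory.count("widget"))  # prints 4
```

<p>Because callers never touch <code>_counts</code> directly, the dictionary could later be swapped for some other storage without changing any code that uses <code>add</code> and <code>count</code>.</p>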
<p>When encapsulation strategies fail, one can end up with what some affectionately call a "ball of mud" architecture or "spaghetti code". Ball of mud architectures have no apparent organization, which makes it difficult to comprehend how parts of their implementations interact. A more precise concept that can help explain this disorder is <b>cross-cutting concerns</b>, which are things like features and functionality that span multiple different components of a system, or even an entire system. There is some evidence that cross-cutting concerns can lead to difficulties in program comprehension and long-term design degradation (<a href="#walker">Walker et al. 2012</a>), all of which reduce productivity and increase the risk of defects. As long-lived systems get harder to change, they can take on <em>technical debt</em>, which is the degree to which an implementation is out of sync with a team's understanding of what a product is intended to be. Many developers view such debt as emerging primarily from poor architectural decisions (<a href="#ernst">Ernst et al. 2015</a>). Over time, this debt can further result in organizational challenges (<a href="#khadka">Khadka et al. 2014</a>), making change even more difficult.</p>
<p>
The preventative solution to these problems is to try to design architecture up front, mitigating the various risks that come from cross-cutting concerns (defects, low modifiability, etc.) (<a href="#fairbanks">Fairbanks 2010</a>).
A popular method in the 1990s was the <a href="https://en.wikipedia.org/wiki/Unified_Modeling_Language">Unified Modeling Language</a> (UML), which was a series of notations for expressing the architectural design of a system before implementing it.
Recent studies show that UML is generally not used and generally not universal (<a href="#petre">Petre 2013</a>).
While these formal representations have generally not been adopted, informal, natural language architectural specifications are still widely used.
For example, <a href="https://www.industrialempathy.com/posts/design-docs-at-google/">Google engineers write design specifications</a> to sort through ambiguities, consider alternatives, and clarify the volume of work required.
</p>
<p>
More recently, developers have investigated ideas of <b>architectural styles</b>, which are patterns of interactions and information exchange between encapsulated components. Some common architectural styles include:
</p>
<ul>
<li><strong>Client/server</strong>, in which data is transacted in response to requests. This is the basis of the Internet and cloud computing (<a href="#cito">Cito et al. 2015</a>).</li>
<li><strong>Pipe and filter</strong>, in which data is passed from component to component, and transformed and filtered along the way. Command lines, compilers, and machine learned programs are examples of pipe and filter architectures.</li>
<li><strong>Model-view-controller (MVC)</strong>, in which data is separated from views of the data and from manipulations of data. Nearly all user interface toolkits use MVC, including popular modern frameworks such as React.</li>
<li><strong>Peer to peer (P2P)</strong>, in which components transact data through a distributed standard interface. Examples include Bitcoin, Spotify, and Gnutella.</li>
<li><strong>Event-driven</strong>, in which some components "broadcast" events and others "subscribe" to notifications of these events. Examples include most model-view-controller-based user interface frameworks, which have models broadcast change events to views, so that the views may update themselves to render new model state.</li>
</ul>
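<p>To make one of these styles concrete, here is a small sketch of pipe and filter in Python. The function names are hypothetical and chosen only for illustration. Each filter receives a stream of lines, transforms or drops some of them, and passes the rest along.</p>

```python
def read_lines(text):
    """Source: yield each line of the input text."""
    for line in text.splitlines():
        yield line


def strip_blank(lines):
    """Filter: drop lines that are empty or only whitespace."""
    for line in lines:
        if line.strip():
            yield line


def to_upper(lines):
    """Filter: transform each line to upper case."""
    for line in lines:
        yield line.upper()


def run_pipeline(text):
    """Connect the filters; each one knows nothing about its neighbors."""
    return list(to_upper(strip_blank(read_lines(text))))


print(run_pipeline("alpha\n\n  \nbeta"))  # prints ['ALPHA', 'BETA']
```

<p>Because every filter only consumes and produces lines, filters can be reordered, removed, or added without modifying the others, which is the appeal of this style.</p>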
<p>Architectural styles come in all shapes and sizes. Some are smaller design patterns of information sharing (<a href="#beck">Beck et al. 1996</a>), whereas others are ubiquitous but specialized patterns such as the architectures required to support undo and cancel in user interfaces (<a href="#bass">Bass and John 2003</a>).</p>
<p>One fundamental unit of which an architecture is composed is a <b>component</b>. This is basically a word that refers to any abstraction (any code, really) that attempts to <em>encapsulate</em> some well-defined functionality or behavior separate from other functionality and behavior. For example, consider the Java class <em>Math</em>: it encapsulates a wide range of related mathematical functions. This class has an interface that decides how it can communicate with other components (sending arguments to a math function and getting a return value). Components can be more than classes though: they might be a data structure, a set of functions, a library, an API, or even something like a web service. All of these are abstractions that encapsulate interrelated computation and state for some well-defined purpose.</p>
<p>The second fundamental unit of architecture is <b>connectors</b>. Connectors are code that transmits information <em>between</em> components. They're brokers that connect components, but do not necessarily have meaningful behaviors or states of their own. Connectors can be things like function calls, web service API calls, events, requests, and so on. None of these mechanisms store state or functionality themselves; instead, they are the things that tie components' functionality and state together.</p>
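<p>The distinction between components and connectors can be sketched in a few lines of Python. The two classes below are hypothetical components invented for illustration: a model that holds state and a view that renders it. The connector is the event notification between them, which carries information but holds no state itself.</p>

```python
class CounterModel:
    """Component: holds state and broadcasts change events."""

    def __init__(self):
        self._value = 0
        self._listeners = []

    def subscribe(self, listener):
        """Register a callable to be notified of each new value."""
        self._listeners.append(listener)

    def increment(self):
        self._value += 1
        # Connector: the event notification carries the new value
        # to every subscriber; it stores no state of its own.
        for listener in self._listeners:
            listener(self._value)


class CounterView:
    """Component: renders whatever value it is told about."""

    def __init__(self):
        self.rendered = []

    def on_change(self, value):
        self.rendered.append(f"count = {value}")


model = CounterModel()
view = CounterView()
model.subscribe(view.on_change)
model.increment()
model.increment()
print(view.rendered)  # prints ['count = 1', 'count = 2']
```

<p>The model never refers to <code>CounterView</code> by name; it only knows that subscribers are callable. That is what allows a different view to be attached without changing the model.</p>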
<p>Even with carefully selected architectures, systems can still be difficult to put together, leading to <b>architectural mismatch</b> (<a href="#garlan">Garlan et al. 1995</a>). When mismatch occurs, connecting two styles can require dramatic amounts of code, imposing significant risk of defects and cost of maintenance. One common example of mismatch occurs with the ubiquitous use of database schemas in client/server web applications. A single change in a database schema can often result in dramatic changes in an application, as every line of code that uses that part of the schema, either directly or indirectly, must be updated (<a href="#qiu">Qiu et al. 2013</a>). This kind of mismatch occurs because the component that manages data (the database) and the component that renders data (the user interface) are highly "coupled" through the database schema: the user interface needs to know <em>a lot</em> about the data, its meaning, and its structure in order to render it meaningfully.</p>
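<p>One way to limit this kind of coupling, sketched below under assumed names, is to confine knowledge of the schema to a single function. The <code>users</code> table, its columns, and the <code>User</code> class are all hypothetical. This reduces how much code a schema change touches; it does not remove the underlying mismatch.</p>

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class User:
    """What the user interface is allowed to know about a user."""
    display_name: str


def load_user(conn, user_id):
    """The only function that knows the (hypothetical) schema."""
    row = conn.execute(
        "SELECT first_name, last_name FROM users WHERE id = ?", (user_id,)
    ).fetchone()
    return User(display_name=f"{row[0]} {row[1]}")


def render_greeting(user):
    """User interface code: depends on User, not on column names."""
    return f"Hello, {user.display_name}!"


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, first_name TEXT, last_name TEXT)")
conn.execute("INSERT INTO users VALUES (1, 'Ada', 'Lovelace')")
print(render_greeting(load_user(conn, 1)))  # prints Hello, Ada Lovelace!
```

<p>If the two name columns were later merged into one, only <code>load_user</code> would need to change; <code>render_greeting</code> would be unaffected.</p>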
<p>
The most common approach to dealing with both architectural mismatch and the changing of requirements over time is <b>refactoring</b>, which means changing the <em>architecture</em> of an implementation without changing its behavior.
Refactoring is something most developers do as part of changing a system (<a href="#murphyhill">Murphy-Hill et al. 2009</a>, <a href="#silva">Silva et al. 2016</a>).
Refactoring code to eliminate mismatch and technical debt can simplify change in the future, saving time (<a href="#ng">Ng et al. 2006</a>) and preventing future defects (<a href="#kim">Kim et al. 2012</a>).
However, because refactoring remains challenging, the difficulty of changing an architecture is often used as a rationale for rejecting demands for change from users.
For example, Google does not allow one to change their Gmail address, which greatly harms people who have changed their name (such as this author when she came out as a trans woman), forcing them to either live with an address that includes their old name, or abandon their Google account, with no ability to transfer documents or settings.
The rationale for this has nothing to do with policy and everything to do with the fact that the original architecture of Gmail treats the email address as a stable, unique identifier for an account.
Changing this basic assumption throughout Gmail's implementation would be an immense refactoring task.
</p>
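<p>To illustrate why the choice of identifier matters so much, here is a small hypothetical sketch. It is not a description of Gmail's actual implementation; the <code>AccountStore</code> class and its methods are invented for illustration. When accounts are keyed by an opaque ID rather than by email address, changing the address is a single update and everything attached to the account stays in place.</p>

```python
import itertools


class AccountStore:
    """Accounts keyed by an opaque ID rather than by email address."""

    def __init__(self):
        self._next_id = itertools.count(1)
        self._email_by_id = {}
        self._documents_by_id = {}

    def create(self, email):
        """Create an account and return its opaque ID."""
        account_id = next(self._next_id)
        self._email_by_id[account_id] = email
        self._documents_by_id[account_id] = []
        return account_id

    def add_document(self, account_id, title):
        self._documents_by_id[account_id].append(title)

    def change_email(self, account_id, new_email):
        # Only one field changes; documents stay attached to the ID.
        self._email_by_id[account_id] = new_email

    def email(self, account_id):
        return self._email_by_id[account_id]

    def documents(self, account_id):
        return list(self._documents_by_id[account_id])


store = AccountStore()
account = store.create("old.name@example.com")
store.add_document(account, "Thesis draft")
store.change_email(account, "new.name@example.com")
print(store.email(account), store.documents(account))
```

<p>If the email address were instead used as the key in every dictionary, table, and log across a large system, each of those places would have to change together, which is the kind of refactoring burden described above.</p>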
<p>Research on the actual activity of software architecture is somewhat sparse. One of the more recent syntheses of this work is Petre et al.'s book, <em>Software Design Decoded</em> (<a href="#petre2">Petre et al. 2016</a>), which distills many of the practices and skills of software design into a set of succinct ideas. For example, the book states, "<em>Every design problem has multiple, if not infinite, ways of solving it. Experts strongly prefer simpler solutions over complex ones, for they know that such solutions are easier to understand and change in the future.</em>" And yet, in practice, studies of how projects use APIs often show that developers do the exact opposite, building projects with dependencies on large numbers of sometimes trivial APIs. This behavior suggests that while software <em>architects</em> like simplicity of implementation, software <em>developers</em> are often choosing whatever is easiest to build, rather than whatever is least risky to maintain over time (<a href="#abdalkareem">Abdalkareem et al. 2017</a>).</p>
<small>
<p id="abdalkareem">Rabe Abdalkareem, Olivier Nourry, Sultan Wehaibi, Suhaib Mujahid, and Emad Shihab. 2017. <a href="https://doi.org/10.1145/3106237.3106267">Why do developers use trivial packages? An empirical case study on npm</a>. In Proceedings of the 2017 11th Joint Meeting on Foundations of Software Engineering (ESEC/FSE 2017). ACM, New York, NY, USA, 385-395.</p>
<p id="bass">Len Bass and Bonnie E. John. 2003. <a href="http://www.sciencedirect.com/science/article/pii/S0164121202000766" target="_blank">Linking usability to software architecture patterns through general scenarios</a>. Journal of Systems and Software, Volume 66, Issue 3, 187-197.</p>
<p id="beck">Kent Beck, Ron Crocker, Gerard Meszaros, John Vlissides, James O. Coplien, Lutz Dominick, and Frances Paulisch. 1996. <a href="https://doi.org/10.1109/ICSE.1996.493406" target="_blank">Industrial experience with design patterns</a>. In Proceedings of the 18th International Conference on Software Engineering (ICSE '96). IEEE Computer Society, Washington, DC, USA, 103-114.</p>
<p id="cito">Jürgen Cito, Philipp Leitner, Thomas Fritz, and Harald C. Gall. 2015. <a href="https://doi.org/10.1145/2786805.2786826" target="_blank">The making of cloud applications: an empirical study on software development for the cloud</a>. In Proceedings of the 2015 10th Joint Meeting on Foundations of Software Engineering (ESEC/FSE 2015). ACM, New York, NY, USA, 393-403.</p>
<p id="ernst">Neil A. Ernst, Stephany Bellomo, Ipek Ozkaya, Robert L. Nord, and Ian Gorton. 2015. <a href="https://doi.org/10.1145/2786805.2786848" target="_blank">Measure it? Manage it? Ignore it? Software practitioners and technical debt</a>. In Proceedings of the 2015 10th Joint Meeting on Foundations of Software Engineering (ESEC/FSE 2015). ACM, New York, NY, USA, 50-60.</p>
<p id="fairbanks">George Fairbanks. 2010. <a href="https://www.amazon.com/Just-Enough-Software-Architecture-Risk-Driven/dp/0984618104" target="_blank">Just enough software architecture: a risk-driven approach</a>. Marshall &amp; Brainerd.</p>
<p id="garlan">David Garlan, Robert Allen, and John Ockerbloom. 1995. <a href="https://doi.org/10.1145/225014.225031" target="_blank">Architectural mismatch or why it's hard to build systems out of existing parts</a>. In Proceedings of the 17th International Conference on Software Engineering (ICSE '95), 179-185.</p>
<p id="khadka">Ravi Khadka, Belfrit V. Batlajery, Amir M. Saeidi, Slinger Jansen, and Jurriaan Hage. 2014. <a href="http://dx.doi.org/10.1145/2568225.2568318" target="_blank">How do professionals perceive legacy systems and software modernization?</a> In Proceedings of the 36th International Conference on Software Engineering (ICSE 2014). ACM, New York, NY, USA, 36-47.</p>
<p id="kim">Miryung Kim, Thomas Zimmermann, and Nachiappan Nagappan. 2012. <a href="http://dx.doi.org/10.1145/2393596.2393655" target="_blank">A field study of refactoring challenges and benefits</a>. In Proceedings of the ACM SIGSOFT 20th International Symposium on the Foundations of Software Engineering (FSE '12). ACM, New York, NY, USA, Article 50, 11 pages.</p>
<p id="murphyhill">Emerson Murphy-Hill, Chris Parnin, and Andrew P. Black. 2009. <a href="http://dx.doi.org/10.1109/ICSE.2009.5070529" target="_blank">How we refactor, and how we know it</a>. In Proceedings of the 31st International Conference on Software Engineering (ICSE '09). IEEE Computer Society, Washington, DC, USA, 287-297.</p>
<p id="ng">T. H. Ng, S. C. Cheung, W. K. Chan, and Y. T. Yu. 2006. <a href="http://dx.doi.org/10.1145/1181775.1181778" target="_blank">Work experience versus refactoring to design patterns: a controlled experiment</a>. In Proceedings of the 14th ACM SIGSOFT International Symposium on Foundations of Software Engineering (SIGSOFT '06/FSE-14). ACM, New York, NY, USA, 12-22.</p>
<p id="petre">Marian Petre. 2013. <a href="https://ieeexplore.ieee.org/document/6606618/" target="_blank">UML in practice</a>. In Proceedings of the 2013 International Conference on Software Engineering (ICSE '13). IEEE Press, Piscataway, NJ, USA, 722-731.</p>
<p id="petre2">Marian Petre, André van der Hoek, and Yen Quach. 2016. <a href="https://books.google.com/books?id=EVE4DQAAQBAJ&lpg=PT17&ots=Tk-8QiRQnP&dq=%22software%20design%20decoded%22&lr&pg=PT17#v=onepage&q&f=false" target="_blank">Software Design Decoded: 66 Ways Experts Think</a>. MIT Press.</p>
<p id="qiu">Dong Qiu, Bixin Li, and Zhendong Su. 2013. <a href="http://dx.doi.org/10.1145/2491411.2491431" target="_blank">An empirical analysis of the co-evolution of schema and code in database applications</a>. In Proceedings of the 2013 9th Joint Meeting on Foundations of Software Engineering (ESEC/FSE 2013). ACM, New York, NY, USA, 125-135.</p>
<p id="silva">Danilo Silva, Nikolaos Tsantalis, and Marco Tulio Valente. 2016. <a href="https://doi.org/10.1145/2950290.2950305" target="_blank">Why we refactor? Confessions of GitHub contributors</a>. In Proceedings of the 2016 24th ACM SIGSOFT International Symposium on Foundations of Software Engineering (FSE 2016). ACM, New York, NY, USA, 858-870.</p>
<p id="walker">Robert J. Walker, Shreya Rawal, and Jonathan Sillito. 2012. <a href="http://dx.doi.org/10.1145/2393596.2393654" target="_blank">Do crosscutting concerns cause modularity problems?</a> In Proceedings of the ACM SIGSOFT 20th International Symposium on the Foundations of Software Engineering (FSE '12). ACM, New York, NY, USA, Article 49, 11 pages.</p>
</small>
<h2>Podcasts</h2>
<small>
<p>Software Engineering Daily, <a href="https://softwareengineeringdaily.com/2015/07/27/react-js-with-sebastian-markbage-and-christopher-chedeau/">React JS with Sebastian Markbage and Christopher Chedeau</a></p>