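"A Philosophy of Software Design" is a book by Stanford professor John Ousterhout about how to reduce software complexity. Complexity inevitably grows as software is iterated on; it raises the learning cost for anyone reading the code and makes it more likely that later developers introduce bugs. Simple software design slows that growth, so design has to be applied throughout the whole software lifecycle. Design skill is not innate; it can be learned.
It is important for developers to be able to recognize complexity. So what is software complexity? It is anything in the structure of a software system that makes the system hard to read, hard to understand, or hard to modify. The author of a program usually cannot judge its complexity accurately, but readers can, so the readers' judgment is the one that counts.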
dependencies -> change amplification and cognitive load
obscurity -> unknown unknowns and cognitive load
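Complexity is incremental and cumulative. Each change adds only a little, so developers tend not to take it seriously, but adding complexity calls for a “zero tolerance” attitude.
Strategic approach -> better design -> slower at first -> requires an investment mindset -> proactive investment in good documentation -> the real hero -> take a little extra time to fix a design problem when you discover one
Tactical approach -> ship as fast as possible (new features or bug fixes) -> complexity accumulates -> refactoring is needed
A large, one-time, up-front global design is not efficient -> the ideal design emerges bit by bit as the system is developed -> invest 10%-20% of development time in thinking about better design -> the investment pays off in the future
Startups lean toward short-term tactical programming, planning to refactor later. But once the code has become as tangled as spaghetti, refactoring is very hard.
The quality of engineers matters too: a small number of excellent engineers who can design are more productive than a large number of mediocre ones, and a clean system design also makes it easier to attract excellent engineers to the team later.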
One of the most important techniques for managing software complexity is to design systems so that developers only need to face a small fraction of the overall complexity at any given time. Modules are relatively independent, so several people can develop them in parallel.
In modular design, a software system is decomposed into a collection of modules that are relatively independent. Modules can take many forms, such as classes, subsystems, or services. However, modules may call one another and need to know something about one another, so absolute independence between modules does not exist; there will always be some dependencies.
Starting points for managing dependencies
The best modules are those whose interfaces are much simpler than their implementations.
抽象 An abstraction is a simplified view of an entity, which omits unimportant details.
The key to designing abstractions is to understand what is important, and to look for designs that minimize the amount of information that is important.
The best modules are those that provide powerful functionality yet have simple interfaces. I use the term deep to describe such modules.
Unfortunately, the value of deep classes is not widely appreciated today. The conventional wisdom in programming is that classes should be small, not deep.
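A complaint about the design of Java I/O: buffering is not the default and has to be requested explicitly every time, so programmers easily forget it. Unbuffered I/O is slow, and there is hardly any use case that calls for it.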
FileInputStream fileStream =
        new FileInputStream(fileName);
BufferedInputStream bufferedStream =
        new BufferedInputStream(fileStream);
ObjectInputStream objectStream =
        new ObjectInputStream(bufferedStream);
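One way to picture the alternative is a small helper that makes buffering the default, so callers cannot forget it. This is only a sketch: the class name ObjectFiles and its open method are hypothetical and not part of the JDK; the stream classes it wraps are the standard java.io ones.

```java
import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.ObjectInputStream;

/** Hypothetical helper: opens object files with buffering always enabled. */
final class ObjectFiles {
    private ObjectFiles() { }

    /** Returns a buffered ObjectInputStream for fileName; the caller closes it. */
    static ObjectInputStream open(String fileName) throws IOException {
        FileInputStream fileStream = new FileInputStream(fileName);
        try {
            // The ObjectInputStream constructor reads the stream header,
            // so it can fail after the file has already been opened.
            return new ObjectInputStream(new BufferedInputStream(fileStream));
        } catch (IOException e) {
            fileStream.close();
            throw e;
        }
    }
}
```

A caller then writes ObjectFiles.open(fileName) and never sees the buffering layer, which is the "deep module" shape: one simple call hiding three constructors.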
The knowledge (for example, data structures and algorithms) is embedded in the module’s implementation but doesn’t appear in its interface.
Benefits of information hiding
Hiding variables and methods in a class by declaring them private isn’t the same thing as information hiding
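Conversely, a design decision that is reflected in multiple modules is information leakage: it creates a dependency between them, so any change to that decision requires changes to all of the modules involved.
For example, two modules operate on the same file, one reading it and one writing it. Even though neither interface states the dependency, it exists, and a dependency like this is hard to notice.
How should information shared by multiple modules be handled?
One common cause of information leakage is a design style called temporal decomposition, in which the structure of the design follows the time order of operations. For example, an operation on a file is designed as three steps and, without much thought, split into three classes: read the file, write the file, save the file.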
When designing modules, focus on the knowledge that’s needed to perform each task, not the order in which tasks occur.
If the API for a commonly used feature forces users to learn about other features that are rarely used, this increases the cognitive load on users who don’t need the rarely used features.
For example, buffering in Java I/O.
The phrase “somewhat general-purpose” means that the module’s functionality should reflect your current needs, but its interface should not. Instead, the interface should be general enough to support multiple uses.
Example: Building a GUI text editor
void backspace(Cursor cursor);
void delete(Cursor cursor);
void deleteSelection(Selection selection);
void insert(Position position, String newText);
void delete(Position start, Position end);
Position changePosition(Position position, int numChars);
text.delete(cursor, text.changePosition(cursor, 1)); // delete
text.delete(text.changePosition(cursor, -1), cursor); //backspace
The calling code is a bit longer but more obvious, and the general-purpose approach has less code overall than the specialized approach.
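The general-purpose interface above can be sketched in a few lines. This is an illustration under stated assumptions, not the book's implementation: the class name SimpleText is hypothetical, a position is simplified to a plain character offset (the Position type in the notes is not defined here), and the text is held in a StringBuilder.

```java
/** Hypothetical text model; positions are plain character offsets (an assumption). */
class SimpleText {
    private final StringBuilder chars = new StringBuilder();

    /** Inserts newText so that its first character lands at position. */
    void insert(int position, String newText) {
        chars.insert(clamp(position), newText);
    }

    /** Deletes the characters in [start, end); an empty or inverted range deletes nothing. */
    void delete(int start, int end) {
        int from = clamp(start);
        int to = clamp(end);
        if (from < to) {
            chars.delete(from, to);
        }
    }

    /** Returns position moved by numChars, kept inside the text. */
    int changePosition(int position, int numChars) {
        return clamp(position + numChars);
    }

    @Override
    public String toString() {
        return chars.toString();
    }

    /** Forces a position into the range 0..length. */
    private int clamp(int position) {
        return Math.max(0, Math.min(position, chars.length()));
    }
}
```

With this class, the delete key is text.delete(cursor, text.changePosition(cursor, 1)) and backspace is text.delete(text.changePosition(cursor, -1), cursor); neither needs its own method on the text class.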
A pass-through method is one that does nothing except pass its arguments to another method, usually with the same API as the pass-through method. This typically indicates that there is not a clean division of responsibility between the classes.
One example where it’s useful for a method to call another method with the same signature is a dispatcher. the dispatcher provides useful functionality: it chooses which of several other methods should carry out each task.
Examples:
The decorator design pattern (also known as a “wrapper”) is one that encourages API duplication across layers. A decorator object takes an existing object and extends its functionality
Examples:
The motivation for decorators is to separate special-purpose extensions of a class from a more generic core.
Before creating a decorator class, consider alternatives such as the following:
The interface of a class should normally be different from its implementation
Pass-through variables add complexity because they force all of the intermediate methods to be aware of their existence, even though the methods have no use for the variables.
Eliminating pass-through variables can be challenging.
Contexts are far from an ideal solution:
It is more important for a module to have a simple interface than a simple implementation. Most modules have more users than developers, so it is better for the developers to suffer than the users.
Developers are often happy to push complexity onto users, but that does nothing to reduce the software’s complexity:
When deciding whether to combine or separate, the goal is to reduce the complexity of the system as a whole and improve its modularity.
The disadvantages of splitting code apart:
Indications that two pieces of code are related (better together):
In general, the lower layers of a system tend to be more general-purpose and the upper layers more special-purpose.
Some of the student projects implemented the entire undo mechanism as part of the text class. The text class maintained a list of all the undoable changes.
These problems can be solved by extracting the general-purpose core of the undo/redo mechanism and placing it in a separate class
public class History {
public interface Action {
public void redo();
public void undo();
}
History() {...}
void addAction(Action action) {...}
void addFence() {...}
void undo() {...}
void redo() {...}
}
The History class knows nothing about the information stored in the actions or how they implement their undo and redo methods.
History.Actions are special-purpose objects
There are a number of ways to group actions; the History class uses fences
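The History skeleton above can be filled in roughly as follows. This is a sketch under assumptions of my own rather than the book's code: fences are stored as a marker entry in the same list as the actions, undo reverts actions back to the previous fence, and adding a new entry discards anything that had been undone. Access modifiers are dropped to keep the example in one file.

```java
import java.util.ArrayList;
import java.util.List;

class History {
    interface Action {
        void redo();
        void undo();
    }

    /** Marker entry that separates groups of actions (an assumed representation). */
    private static final Action FENCE = new Action() {
        @Override public void redo() { }
        @Override public void undo() { }
    };

    private final List<Action> entries = new ArrayList<>();
    /** Entries before this index are applied; entries from it onward were undone. */
    private int applied = 0;

    /** Records an action the caller has already performed. */
    void addAction(Action action) {
        entries.subList(applied, entries.size()).clear(); // drop the redo tail
        entries.add(action);
        applied = entries.size();
    }

    void addFence() {
        addAction(FENCE);
    }

    /** Undoes actions back to the previous fence (or the start of history). */
    void undo() {
        while (applied > 0 && entries.get(applied - 1) == FENCE) {
            applied--;
        }
        while (applied > 0 && entries.get(applied - 1) != FENCE) {
            applied--;
            entries.get(applied).undo();
        }
    }

    /** Redoes actions forward to the next fence (or the end of history). */
    void redo() {
        while (applied < entries.size() && entries.get(applied) == FENCE) {
            applied++;
        }
        while (applied < entries.size() && entries.get(applied) != FENCE) {
            entries.get(applied).redo();
            applied++;
        }
    }
}
```

Nothing in this class mentions text, selections, or cursors; a text class would supply its own Action objects and call addFence at the points where a user-visible undo step should end.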
You shouldn’t break up a method unless it makes the overall system simpler. Methods containing hundreds of lines of code are fine if they have a simple signature and are easy to read. Each method should do one thing and do it completely.
Conjoined Methods: If you can’t understand the implementation of one method without also understanding the implementation of another, that’s a red flag.
Exception handling is one of the worst sources of complexity in software systems. The key overall lesson from this chapter is to reduce the number of places where exceptions must be handled; in many cases the semantics of operations can be modified so that the normal behavior handles all situations and there is no exceptional condition to report. Changing the specification of a method or interface can fold the exceptional case into the normal code path.
How to deal with the exceptions
The exception handling code must restore consistency, such as by unwinding any changes made before the exception occurred. Exception handling code creates opportunities for more exceptions. To prevent an unending cascade of exceptions, the developer must eventually find a way to handle exceptions without introducing more exceptions.
Bugs in the exception handling code itself are the most damaging: When exception handling code fails, it’s difficult to debug the problem, since it occurs so infrequently.
Tcl contains an unset command that can be used to remove a variable. I defined unset so that it throws an error if the variable doesn’t exist. However, one of the most common uses of unset is to clean up temporary state created by some previous operation.
Classes with lots of exceptions have complex interfaces, and they are shallower than classes with fewer exceptions.
The best way to reduce the complexity damage caused by exception handling is to reduce the number of places where exceptions have to be handled.
I should have changed the definition of unset slightly: rather than deleting a variable, unset should ensure that a variable no longer exists. There is no longer an error case to report.
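The redefined unset can be illustrated in Java rather than Tcl. The sketch below is a stand-in, not Tcl's implementation: VariableTable and its methods are hypothetical names, and the variables are kept in a plain HashMap.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical variable table used only to illustrate the redefined unset. */
class VariableTable {
    private final Map<String, String> variables = new HashMap<>();

    void set(String name, String value) {
        variables.put(name, value);
    }

    boolean exists(String name) {
        return variables.containsKey(name);
    }

    /**
     * Ensures that no variable called name exists afterwards.
     * A variable that was never set already satisfies this, so there is no error case.
     */
    void unset(String name) {
        variables.remove(name); // Map.remove does nothing when the key is absent
    }
}
```

Cleanup code can now call unset unconditionally, without first checking exists or wrapping the call in a try block.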
The Windows operating system does not permit a file to be deleted if it is open in a process
In Unix, if a file is open when it is deleted, Unix does not delete the file immediately. Instead, it marks the file for deletion, then the delete operation returns successfully. The file name has been removed from its directory, so no other processes can open the old file and a new file with the same name can be created, but the existing file data persists. Processes that already have the file open can continue to read it and write it normally. Once the file has been closed by all of the accessing processes, its data is freed.
If either index is outside the range of the string, then substring throws IndexOutOfBoundsException.
The Java substring method would be easier to use if it performed this adjustment automatically, so that it implemented the following API: “returns the characters of the string (if any) with index greater than or equal to beginIndex and less than endIndex.”
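A forgiving substring along these lines could look like the sketch below. It is a wrapper written for illustration, not a JDK method: the names ForgivingStrings and substringClamped are hypothetical, and clamping both indexes into the string is my reading of "this adjustment".

```java
/** Hypothetical utility: a substring that never throws for out-of-range indexes. */
final class ForgivingStrings {
    private ForgivingStrings() { }

    /**
     * Returns the characters of s (if any) with index greater than or equal
     * to beginIndex and less than endIndex.
     */
    static String substringClamped(String s, int beginIndex, int endIndex) {
        int begin = Math.max(0, Math.min(beginIndex, s.length()));
        int end = Math.max(0, Math.min(endIndex, s.length()));
        // An empty or inverted range simply selects no characters.
        return begin < end ? s.substring(begin, end) : "";
    }
}
```

Callers no longer need to pre-check the indexes or catch an exception; an out-of-range request just yields fewer characters.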
The second technique for reducing the number of places where exceptions must be handled is exception masking. With this approach, an exceptional condition is detected and handled at a low level in the system, so that higher levels of software need not be aware of the condition.
TCP masks packet loss by resending lost packets within its implementation, so all data eventually gets through and clients are unaware of the dropped packets.
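The same masking idea can be shown at a much smaller scale than TCP. The sketch below is not how TCP is implemented; it is a hypothetical RetryingChannel that resends over an unreliable Link (also hypothetical) and only surfaces a failure once every attempt has been used up.

```java
import java.io.IOException;

/** Hypothetical channel that masks transient send failures by retransmitting. */
class RetryingChannel {
    /** Hypothetical low-level link that may lose a packet. */
    interface Link {
        void transmit(String packet) throws IOException;
    }

    private final Link link;
    private final int maxAttempts;

    RetryingChannel(Link link, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        this.link = link;
        this.maxAttempts = maxAttempts;
    }

    /** Sends packet; callers never see an individual lost transmission. */
    void send(String packet) {
        IOException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                link.transmit(packet);
                return; // delivered, so earlier losses stay hidden
            } catch (IOException e) {
                last = e; // remember the failure and retransmit
            }
        }
        // Every attempt failed: this is no longer a transient condition.
        throw new IllegalStateException("link appears to be down", last);
    }
}
```

The checked IOException stops at this layer, so code above it has no handler to write for ordinary packet loss.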
The third technique for reducing complexity related to exceptions is exception aggregation. The idea behind exception aggregation is to handle many exceptions with a single piece of code;
Instead of catching the exceptions in the individual service methods, let them propagate up to the top-level dispatch method for the Web server.
This is the opposite of exception masking: masking usually works best if an exception is handled in a low-level method. For masking, the low-level method is typically a library method used by many other methods, so allowing the exception to propagate would increase the number of places where it is handled.
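Exception aggregation in a dispatcher can be sketched as below. All names here (NoSuchParameterException, Request, WebDispatcher) and the status strings are hypothetical and chosen for illustration; an unchecked exception is used so that service methods need no throws clauses.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical exception thrown when a request lacks a required parameter. */
class NoSuchParameterException extends RuntimeException {
    NoSuchParameterException(String name) {
        super("missing parameter: " + name);
    }
}

/** Hypothetical request holding named parameters. */
class Request {
    private final Map<String, String> params;

    Request(Map<String, String> params) {
        this.params = new HashMap<>(params);
    }

    String getParameter(String name) {
        String value = params.get(name);
        if (value == null) {
            throw new NoSuchParameterException(name);
        }
        return value;
    }
}

/** Hypothetical top-level dispatcher: the only place the exception is caught. */
class WebDispatcher {
    private final Map<String, Function<Request, String>> services = new HashMap<>();

    void register(String url, Function<Request, String> service) {
        services.put(url, service);
    }

    String dispatch(String url, Request request) {
        Function<Request, String> service = services.get(url);
        if (service == null) {
            return "404 unknown URL " + url;
        }
        try {
            return "200 " + service.apply(request);
        } catch (NoSuchParameterException e) {
            // One handler covers every service method and every parameter.
            return "400 " + e.getMessage();
        }
    }
}
```

A service method is then just request -> "hello " + request.getParameter("name"), with no error handling of its own.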
Drawback: One disadvantage of promoting a corrupted object into a server crash is that it increases the cost of recovery considerably. Error promotion may not make sense for errors that happen frequently.
How to judge when to use exception aggregation: One way of thinking about exception aggregation is that it replaces several special-purpose mechanisms, each tailored for a particular situation, with a single general-purpose mechanism that can handle multiple situations.
The fourth technique for reducing complexity related to exception handling is to crash the application. This applies to errors that are difficult or impossible to handle and that don’t occur very often. The simplest thing to do in response to these errors is to print diagnostic information and then abort the application.
Example:
Whether or not it is acceptable to crash on a particular error depends on the application. For a replicated storage system, it isn’t appropriate to abort on an I/O error. Instead, the system must use replicated data to recover any information that was lost.
Special cases can result in code that is riddled with if statements, which make the code hard to understand and lead to bugs. Thus, special cases should be eliminated wherever possible.
The best way to do this is by designing the normal case in a way that automatically handles the special cases without any extra code.
Example:
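A common illustration is a text-editor selection: instead of a separate "no selection" state that every operation must test for, the selection always exists and may simply be empty. The sketch below assumes that framing; TextSelection and its methods are hypothetical names, and the text is a plain String for brevity.

```java
/** Hypothetical selection over a String; an empty selection replaces "no selection". */
class TextSelection {
    private final int start;
    private final int end;

    TextSelection(int start, int end) {
        if (start < 0 || end < start) {
            throw new IllegalArgumentException("need 0 <= start <= end");
        }
        this.start = start;
        this.end = end;
    }

    /** A selection with no characters in it, positioned at the cursor. */
    static TextSelection emptyAt(int position) {
        return new TextSelection(position, position);
    }

    /** Returns the selected characters; an empty selection yields "". */
    String copyFrom(String text) {
        return text.substring(start, end);
    }

    /** Returns text without the selected characters; an empty selection changes nothing. */
    String deleteFrom(String text) {
        return text.substring(0, start) + text.substring(end);
    }
}
```

Neither method contains an if statement for the empty case: the normal range logic already produces the right answer when start equals end.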
Designing software is hard, so it’s unlikely that your first thoughts about how to structure a module or system will produce the best design.
Example: GUI text editor: line-oriented -> character-oriented -> string-oriented -> range-oriented
Try to pick approaches that are radically different from each other; you’ll learn more that way. Even if you are certain that there is only one reasonable approach, consider a second design anyway, no matter how bad you think it will be.
Make a list of the pros and cons of each one.
Make a decision.
However, there is still a significant amount of design information that can’t be represented in code. The informal aspects of an interface, such as a high-level description of what each method does or the meaning of its result, can only be described in comments. If users must read the code of a method in order to use it, then there is no abstraction
The overall idea behind comments is to capture information that was in the mind of the designer but couldn’t be represented in the code.
There is a risk of bugs if the new developer misunderstands the original designer’s intentions. If it has been more than a few weeks since you last worked in a piece of code, you will have forgotten many of the details of the original design.
Developers should be able to understand the abstraction provided by a module without reading any code other than its externally visible declarations. (obvious)
Javadoc for Java, Doxygen for C++, or godoc for Go
consistency
If the information in a comment is already obvious from the code next to the comment, then the comment isn’t helpful.
A first step towards writing good comments is to use different words in the comment from those in the name of the entity being described.
/*
* The amount of blank space to leave on the left and
* right sides of each line of text, in pixels.
*/
private static final int textHorizontalPadding = 4;
Some comments provide information at a lower, more detailed, level than the code; these comments add precision by clarifying the exact meaning of the code.
Other comments provide information at a higher, more abstract, level than the code; these comments offer intuition, such as the reasoning behind the code, or a simpler and more abstract way of thinking about the code.
Precise comments can fill in missing details such as:
When documenting a variable, think nouns, not verbs. In other words, focus on what the variable represents, not how it is manipulated.
Higher-level comments omit details and help the reader to understand the overall intent and structure of the code.
Higher-level comments are more difficult to write than lower-level comments
being able to ignore the low-level details and think about the system only in terms of its most fundamental characteristics.
The first step in documenting abstractions is to separate interface comments from implementation comments.
Interface comments provide information that someone needs to know in order to use a class or method; they define the abstraction.
Implementation comments describe how a class or method works internally in order to implement the abstraction.
The interface comment:
The main goal of implementation comments is to help readers understand what the code is doing (not how it does it).
In addition to describing what the code is doing, implementation comments are also useful to explain why.
// See "Zombies" in designNotes.
Use Comments As Part Of The Design Process
The best time to write comments is at the beginning of the process, as you write the code. Writing the comments first makes documentation part of the design process. Not only does this produce better documentation, but it also produces better designs and it makes the process of writing documentation more enjoyable.
Benefits of writing the comments at the beginning:
The file system code used the variable name block for two different purposes.
In some situations, block referred to a physical block number on disk;
in other situations, block referred to a logical block number within a file
Unfortunately, at one point in the code there was a block variable containing a logical block number, but it was accidentally used in a context where a physical block number was needed; as a result, an unrelated block on disk got overwritten with zeroes.
block -> fileBlock and diskBlock
Take a bit of extra time to choose great names, which are precise, unambiguous, and intuitive.
Names become unwieldy if they contain more than two or three words. Thus, the challenge is to find just a few words that capture the most important aspects of the entity.
Good names have two properties: precision and consistency.
Vague Name: If a variable or method name is broad enough to refer to many different things, then it doesn’t convey much information to the developer and the underlying entity is more likely to be misused.
It’s fine to use generic names like i and j as loop iteration variables, as long as the loops only span a few lines of code.
Consistent naming reduces cognitive load
The design of a mature system is determined more by changes made during the system’s evolution than by any initial conception.
If you want to have a system that is easy to maintain and enhance, then “working” isn’t a high enough standard; you have to prioritize design and think strategically.
Unfortunately, when developers go into existing code to make changes such as bug fixes or new features, they don’t usually think strategically. A typical mindset is “what is the smallest possible change I can make that does what I need?”
Developers must resist the temptation to make a quick fix.
If you’re not making the design better, you are probably making it worse.
When is it reasonable not to refactor and improve the design?
It’s easy to forget to update comments when you modify code, which results in comments that are no longer accurate.
The best way to ensure that comments get updated is to position them close to the code they describe, so developers will see them when they change the code.
Example:
If documentation is duplicated, it is more difficult for developers to find and update all of the relevant copies. Instead, find the most obvious single place to put the documentation. In addition, add short comments in the other places that refer to the central location: “See the comment in xyz for an explanation of the code below.”
If information is already documented someplace outside your program, don’t repeat the documentation inside the program; just reference the external documentation.
Once you have learned how something is done in one place, you can use that knowledge to immediately understand other places that use the same approach. Consistency allows developers to work more quickly with fewer mistakes.
Consistency can be applied at many levels in a system:
The best way to determine the obviousness of code is through code reviews.
private List<Message> incomingMessageList;
...
incomingMessageList = new ArrayList<Message>();
One of the risks of agile development is that it can lead to tactical programming. Developing incrementally is generally a good idea, but the increments of development should be abstractions, not features.
When creating a new class, the developer first writes unit tests for the class, based on its expected behavior. Then the developer works through the tests one at a time, writing enough code for that test to pass. When all of the tests pass, the class is finished.
The problem with test-driven development is that it focuses attention on getting specific features working, rather than finding the best design.
One place where it makes sense to write the tests first is when fixing bugs. Before fixing a bug, write a unit test that fails because of the bug. Then fix the bug and make sure that the unit test now passes.
If a design pattern works well in a particular situation, it will probably be hard for you to come up with a different approach that is better.
The greatest risk with design patterns is over-application.
Getters and setters are shallow methods
The most important idea is still simplicity: not only does simplicity improve a system’s design, but it usually makes systems faster.
The best approach is something between these extremes, where you use basic knowledge of performance to choose design alternatives that are “naturally efficient” yet also clean and simple.
A few examples of operations that are relatively expensive today:
When efficiency and complexity conflict: If the faster design adds a lot of implementation complexity, or if it results in more complicated interfaces, then it may be better to start off with the simpler approach and optimize later if performance turns out to be a problem.
If you start making changes based on intuition, you’ll waste time on things that don’t actually improve performance, and you’ll probably make the system more complicated in the process.
Before making any changes, measure the system’s existing behavior.
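A crude way to take such a baseline measurement is to time the existing code path before touching it. The helper below is a minimal sketch with hypothetical names (Timing, elapsedNanos); it uses only System.nanoTime and ignores JIT warm-up and other effects, so it is no substitute for a dedicated benchmark harness such as JMH.

```java
/** Hypothetical helper for rough before/after timing of a code path. */
final class Timing {
    private Timing() { }

    /** Runs work the given number of times and returns the total elapsed nanoseconds. */
    static long elapsedNanos(Runnable work, int repetitions) {
        long start = System.nanoTime();
        for (int i = 0; i < repetitions; i++) {
            work.run();
        }
        return System.nanoTime() - start;
    }
}
```

Recording the number for the current implementation first gives the later change something concrete to be compared against.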