
While reading 深入理解Java虚拟机:JVM高级特性与最佳实践 a while back, I saw this book mentioned in it. Out of curiosity I picked it up, found it quite good, and so I am leaving some reading notes here.
The Mythical Man-Month: Essays on Software Engineering is a classic and highly influential book in the field of software project management, written by Frederick P. Brooks, Jr. First published in 1975, it draws upon Brooks’s experiences managing the development of the OS/360 operating system at IBM.
The central and most famous argument of the book is encapsulated in its title. Brooks posits that the “man-month,” a hypothetical unit of work representing what one person can do in one month, is a myth when applied to software development. The book’s core thesis, often called Brooks’s Law, is that “adding manpower to a late software project makes it later.”
He argues this is because software projects are complex systems, and adding new people introduces significant overhead. New members require time to become productive (ramp-up time), and they drain time from experienced members who must train them. Furthermore, as the size of a team increases, the number of communication paths between members grows quadratically (roughly n(n-1)/2), leading to more time spent on coordination and less on productive work.
Beyond this central idea, the book explores other timeless concepts in software engineering, such as the “second-system effect” (the tendency to over-engineer the successor to a successful system), the critical importance of “conceptual integrity” in design, and the argument from his later essay, “No Silver Bullet,” which states that there is no single tool or technique that will yield an order-of-magnitude improvement in software development productivity. Despite its age, The Mythical Man-Month remains essential reading for its profound insights into the human challenges of building complex software.
On Lack of Time
The author identifies the following reasons why software projects so often run out of calendar time.
Optimism
All programmers are optimists: they assume all will go well and that each task will take only as long as it “ought” to take.
According to the author, this assumption is false. Creative activity consists of three stages:
- idea
- implementation
- interaction (between creation and users)
The medium of software is unusually tractable, which makes programmers over-optimistic about implementation difficulties; yet because our ideas are never flawless, difficulties such as bugs inevitably appear.
Not Taking Communication Efforts into Account
In perfectly partitionable work that requires no communication among workers, such as reaping wheat or picking cotton, more manpower does mean faster overall progress. In tasks that can be partitioned but require communication among the workers on the subtasks, however, the effort of communication must be added to the amount of work to be done.
Such communication consists of two parts:
- Training: Each worker must be trained in the technology, the goals of the effort, the overall strategy, and the plan of work.
- Intercommunication: more workers means more intercommunication, which eats into each worker's task time. When the amount of intercommunication brought on by further division of labor grows past a threshold, it backfires and lengthens the project instead of shortening it.
Given that, the man-day or man-month is not an appropriate unit for measuring the size of a job, because it does not take communication effort into account.
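To make the backfire concrete, here is a back-of-the-envelope sketch with entirely hypothetical numbers (not from the book): the partitionable work shrinks as 1/n, but if every pair of workers must coordinate, communication adds an effort term that grows as n(n-1)/2, so past some team size the calendar time starts climbing again.

```java
public class ManMonthMyth {
    public static void main(String[] args) {
        double taskMonths = 12.0;    // perfectly partitionable work, in person-months (hypothetical)
        double monthsPerPath = 0.1;  // coordination cost per pair of workers (hypothetical)

        for (int n = 1; n <= 30; n++) {
            long paths = (long) n * (n - 1) / 2;                      // n(n-1)/2 pairwise communication paths
            double totalEffort = taskMonths + monthsPerPath * paths;  // communication adds to the work to be done
            double calendarMonths = totalEffort / n;                  // naive "man-month" arithmetic on the total
            System.out.printf("workers=%2d  paths=%3d  calendar=%.2f months%n",
                    n, paths, calendarMonths);
        }
        // With these numbers the schedule bottoms out around 15 workers
        // and then gets *longer* as more people are added.
    }
}
```

The exact threshold depends on the made-up constants; the point is the shape of the curve, not the numbers.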
Underestimating the Time Needed for Writing Automated Tests and Debugging
Optimism falsely assumes there will be few bugs, resulting in much less scheduled time for testing.
Scheduling a Software Task
- planning
- coding
- writing automated tests with debugging
- interaction, which may entail extra automated tests and debugging
The last part, interaction, leads to a terrible user experience if not done properly.
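For reference, Brooks's rule of thumb in the book allots roughly 1/3 of the schedule to planning, 1/6 to coding, 1/4 to component test, and 1/4 to system test. A toy sketch of that split applied to a hypothetical 12-month schedule:

```java
public class ScheduleSplit {
    public static void main(String[] args) {
        double totalMonths = 12.0;  // hypothetical overall schedule

        // Brooks's rule of thumb: half the schedule goes to testing and debugging,
        // which is exactly the part optimism tends to squeeze out.
        System.out.printf("Planning:       %.1f months%n", totalMonths / 3.0);
        System.out.printf("Coding:         %.1f months%n", totalMonths / 6.0);
        System.out.printf("Component test: %.1f months%n", totalMonths / 4.0);
        System.out.printf("System test:    %.1f months%n", totalMonths / 4.0);
    }
}
```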
Unjustified Estimation
Managers almost always promise ridiculously optimistic completion dates under pressure from the patron's urgency. This is partly due to the lack of a sound basis for estimation.
Two remedies:
- Data-driven estimation: publish and share productivity figures, bug-incidence figures, estimation rules, and so on, so that decisions are driven by data (see the sketch after this list)
- Managers need to stand behind and defend their estimates. This is a matter of scheduling scientifically and of professional conduct
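Here is that sketch: a minimal, entirely hypothetical illustration of estimating from an organization's own productivity and bug-incidence records rather than from the patron's desired date. All figures below are invented stand-ins.

```java
public class DataDrivenEstimate {
    public static void main(String[] args) {
        // All figures are hypothetical, standing in for an organization's own historical records.
        double estimatedKloc = 20.0;              // size of the new work, in thousands of lines
        double measuredKlocPerPersonMonth = 0.8;  // historical productivity figure
        double defectsPerKloc = 15.0;             // historical bug-incidence figure
        double personMonthsPerDefect = 0.05;      // historical cost to find and fix one defect

        double buildEffort = estimatedKloc / measuredKlocPerPersonMonth;
        double debugEffort = estimatedKloc * defectsPerKloc * personMonthsPerDefect;
        double total = buildEffort + debugEffort;

        System.out.printf("Build: %.1f person-months, debug: %.1f, total: %.1f%n",
                buildEffort, debugEffort, total);
    }
}
```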
Adding Manpower to a Late Software Project Makes it Later
The maximum number of men depends upon the number of independent subtasks. From these two quantities one can derive schedules using fewer men and more months. One simply cannot get workable schedules using more men and fewer months. For example, a job of six independent subtasks of two months each can usefully employ at most six people and cannot finish in under two months, however many more are assigned.
But Too Few Men Are Too Slow for a Really Big System
The author, in the last chapter, suggests that one wants the system to be built by as few minds as possible. Yet for large systems one wants a way to bring considerable manpower to bear, so that the product can make a timely appearance before the technology becomes obsolete. There is a trade-off here. How can these two needs be reconciled?
Mills's Proposal
CAUTION: While the trade-off above is timeless, the solution presented in this section was proposed in the 1970s, when the world of computing was, by today's standards, "ancient", so it has some limitations today. This section simply states the solution for the record, followed by a discussion in the context of modern computing.
Mills proposes that each segment of a large job be tackled by a team, but that the team be organized like a surgical team:
- The surgeon: the chief programmer, who designs, writes, tests, and documents the code. Must be a great talent with ten years' worth of experience
- The copilot: interfaces with other teams, knows all the code intimately, and serves as an "extra hand" for the surgeon
- The administrator: interfaces with the administrative machinery of the rest of the organization, such as money, people, space, etc.
- The editor: technical writer who enhances surgeon’s documentation
- Two secretaries: one for the administrator and one for the editor
- The program clerk: maintains the technical records and files of the team. All computer input goes to the clerk, who logs everything about the computing results
- The toolsmith: ensures the health of the various tools used by the team
- The tester
- The language lawyer: gives advice on how to write programs in a given language elegantly
All the roles except the surgeon act as auxiliary personnel. We need to be a little careful with this scheme of teamwork, though. The Mythical Man-Month came out in 1975, and what we need to understand is the difference in how software was developed then versus now.

Back in the day, pretty much all coding was done on paper first, then keypunched onto punched cards, read in, compiled, linked, and executed; the results were collected and the process repeated. CPU time was an expensive and limited resource nobody wanted to waste, and disk space, tape-drive time, and the like were just as scarce. Burning CPU time on a compile that ended in errors was simply a waste. And that was in 1975; at the time Fred Brooks was developing these ideas, in the mid-to-late 1960s, CPU time was even more expensive and memory, disk, and everything else even more limited.

The idea behind the surgical team was to ensure that the One Super Great Rockstar Developer did not have to waste their time on mundane tasks like desk-checking code, keypunching, submitting jobs, and waiting around (sometimes for hours) for results. Rockstar Dude Developer Man was to write code; their legion of groupies/clerks/junior developers was supposed to do the mundane stuff.
Conceptual Integrity in System Design
Using the metaphor of European cathedrals, each the work of several generations of builders, the author argues that conceptual integrity is preserved only when successive generations of designers subordinate their own ideas to one coherent design for the system.
How is Conceptual Integrity Achieved?
The author states that conceptual integrity is made possible by the following:
- Maximizing function while minimizing complexity
- Having the concept come from one or a few minds, while the implementation is carried out by many hands
Function vs. Complexity
A programming system, as distinct from a bare computer, exists to make the computer easier to use; in modern terms, the Spring framework is one example. Ease of use is enhanced only if the time gained in function exceeds the time lost in learning, remembering, and searching manuals. For instance, learning how to write a Spring Boot application may take about ten minutes yet save ten hours of starting from scratch, which makes web-service development easier.
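As a concrete illustration (my example, not the book's), a minimal Spring Boot web service is only a handful of lines, assuming the standard spring-boot-starter-web dependency is on the classpath; the class and endpoint names here are made up:

```java
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;

// A minimal web service: the framework supplies the embedded server, routing,
// and serialization, so the function gained far outweighs the few minutes
// spent learning the annotations.
@SpringBootApplication
@RestController
public class HelloApplication {

    @GetMapping("/hello")
    public String hello() {
        return "Hello, world";
    }

    public static void main(String[] args) {
        SpringApplication.run(HelloApplication.class, args);
    }
}
```

Running the class and requesting /hello returns the string; everything else is function inherited from the framework.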
Quantitatively, the ratio of function gained to conceptual complexity incurred determines the effectiveness of a system design. It should be noted, therefore, that neither function alone nor simplicity alone defines a good design. Moreover, simplicity is not enough; it is the combination of simplicity and straightforwardness that reflects conceptual integrity.
One Mind vs. Many Hands
Conceptual integrity requires that the design must proceed from one mind, or from a very small number of agreeing resonant minds. Scheduling pressures, however, dictate that system building needs many hands. These two conflicting agendas can be reconciled by 2 strategies:
- Labor division between architecture and implementation (discussed below)
- A new way of structuring implementation teams (discussed above)
What is the architecture of a system? The complete and detailed specification of the user interface, such as:
- programming manual of a computer
- language manual of a compiler or control program
A system architect is the user's agent; the architect's primary stakeholder is the user, not the salesman. Where architecture tells what happens, implementation tells how it is made to happen.
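As a loose modern analogy (mine, not Brooks's), a Java interface plays the role of architecture by stating what happens, while the implementing class decides how; all names below are hypothetical:

```java
// The interface is the "architecture": it specifies what the user gets.
interface TemperatureSensor {
    double readCelsius();   // the contract the user programs against
}

// One possible "how"; swapping in a different implementation does not change the contract.
class FixedTemperatureSensor implements TemperatureSensor {
    @Override
    public double readCelsius() {
        return 21.5;        // a fixed reading, good enough for a sketch or a test
    }
}

public class ArchitectureVsImplementation {
    public static void main(String[] args) {
        TemperatureSensor sensor = new FixedTemperatureSensor();
        System.out.println("Current temperature: " + sensor.readCelsius() + " C");
    }
}
```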
Conceptual integrity of a system determines its ease of use. Good features and ideas that do not integrate with a system’s basic concepts are best left out. If there appear many such important but incompatible ideas, one scraps the whole system and starts again on an integrated system with different basic concepts.
Since conceptual integrity determines a system's ease of use, and conceptual integrity comes from one or a few minds, the job of creatively designing an architecture must sit in the hands of the few, i.e. the architects, while the many implementers, in turn, exercise their own creativity in implementing that architecture.
TIP: Recall from the beginning that the total creative effort involves three distinct phases: architecture, implementation, and realization. The three phases can in many ways proceed in parallel and together speed up the entire project.
The Second-System Effect
Frederick P. Brooks, in his seminal work The Mythical Man-Month, described the “second-system effect” as the tendency of a small, elegant, and successful system to be succeeded by a bloated, over-engineered, and often late successor. Having been forced into disciplined pragmatism to create the first system, the architects of the second, now flush with success and resources, try to implement every feature and correct every perceived flaw from the original. The result is often a catastrophic loss of the very conceptual integrity that made the first system great.
While Brooks wrote this in the age of mainframe operating systems, the underlying human tendencies are timeless. The second-system effect is not a relic of the 1970s; it thrives today, merely cloaked in the language of microservices, cloud-native architecture, and modern frameworks. It’s the ghost in the machine of every ambitious “v2.0” project.
The Great Rewrite Trap
The most common modern manifestation is the “Great Rewrite.” A team has a successful, revenue-generating monolithic application. It may have some technical debt and use an “unfashionable” tech stack, but it works. Emboldened by its success, the team decides that to move forward, they must start again from scratch. The plan is to build a new system using the latest architectural patterns—often microservices—and the trendiest frameworks.
This is the second-system effect in its purest form. The team vastly underestimates the years of nuanced business logic and subtle bug fixes that are embedded in the “legacy” system. They are seduced by the promise of a clean slate, forgetting that the first system’s constraints were a gift, forcing them to focus on delivering value. The rewrite project becomes a multi-year death march, constantly trying to catch up to the feature set of the original system, which, ironically, must still be maintained and updated to keep the business running. The new, “perfect” architecture introduces immense operational complexity that the team was unprepared for, and the project becomes an exercise in managing technology rather than solving user problems.
The Microservice Mirage and Framework Frenzy
Closely related is the temptation to over-engineer with patterns that, while powerful, are not universally necessary. A team might take a simple, coherent application and decide its successor must be broken into dozens of microservices. They trade the simplicity of a single codebase for the immense cognitive overhead of a distributed system, complete with network latency, complex deployments, and cascading failures. They are building a solution for Netflix’s scale when they have a fraction of the traffic.
This is often driven by “résumé-driven development,” where engineers choose technologies not because they are the best fit for the problem, but because they are valuable on the job market. The second system becomes a showcase for every library, framework, and pattern the team wants to learn. What was once a straightforward problem solved with simple tools becomes a tangled web of abstractions, state management libraries, and build configurations, all in the name of being “modern.”
TIP: The architects of the second system are often driven by a vision of a "perfect" system, an idealized form that exists purely in the realm of technical theory. They apply principles of "good design" without sufficient regard for the evidence from the real world: user needs, business timelines, and the hidden complexity of the existing system.
The success of the first system grants the team freedom. They are no longer bound by the tight constraints of a new project. Yet, this newfound freedom, unguided by the discipline that birthed the initial success, becomes a trap. They exercise their freedom to choose a path of unrestrained ambition that ultimately leads to a loss of practical freedom - the ability to ship a product. True freedom, in this context, is not the ability to choose any technology, but the wisdom to choose the right constraints. The discipline that was once imposed externally must now be cultivated internally.
The lesson of the second-system effect is that the enemy is not legacy code or old technology. The enemy is hubris. It’s the belief that technology can solve human problems of focus and discipline. The cure is the same as it was in Brooks’s day: humility, a ruthless focus on user value, and an incremental, evolutionary approach to change rather than a revolutionary rewrite. The wisdom gained from the first system should not be a license for unbridled complexity, but a guide to preserving simplicity in the face of success.