What is a Ph.D. Dissertation?

[I wrote this in 1993 as a letter to a student concerning a draft of his dissertation. In 2003 I edited it to remove some specific references to the student and present it as a small increment to the information available to my grad students. In 2023 I made small edits for grammar and to expand coverage.--spaf]

Let me start by reviewing some things that may seem obvious:

Your dissertation is part of the requirements for a Ph.D. The research, theory, experimentation, and so on also contribute. One does not attempt to capture everything in one's dissertation.
The dissertation is a technical work that documents and proves one's thesis. It is intended for a technical audience and must be clear and complete but not necessarily exhaustively comprehensive.
Also note -- experimental data, if used, is not the proof -- it is evidence. The proof is presented as an analysis and critical presentation. Generally, every statement in your dissertation must be common knowledge, supported by citation to technical literature, or original results proved by the candidate (you). Each of those statements must directly relate to the proof of the thesis, or else they are unnecessary.
The dissertation is not the thesis. One's thesis is a claim -- a hypothesis. The dissertation describes, in detail, how one proves the hypothesis (or, rarely, disproves the claim and shows other significant results).

Let's revisit the idea of the thesis itself. It is a hypothesis, a conjecture, or a theorem. The dissertation is a formal, stylized document used to argue your thesis. The thesis must be significant and original (no one has yet demonstrated it to be true), and it must extend the state of scientific knowledge.

The first thing you need to do is to come up with no more than three sentences that express your thesis. Your committee must agree that your statements form a valid thesis statement. You, too, must be happy with the statement -- it should be what you will tell anyone if they ask you what your thesis is (few people will want to hear an hour's presentation as a response).

Once you have a thesis statement, you can begin developing the dissertation. The abstract, for instance, should be a one-page description of your thesis and how you present the proof of it. The abstract should summarize the results of the thesis and should stress the contributions to science made thereby.

Perhaps the best way to understand how an abstract should look would be to examine the abstracts of several dozen dissertations that have already been accepted. Our university library has a collection of them. This is a good approach to see how an entire dissertation is structured and presented. MIT Press has published the ACM doctoral dissertation award series for decades, so you may find some of those to be good examples to read -- they should be in any large technical library.

The dissertation itself should be structured into 4 to 6 chapters. The following is one commonly used structure:

Introduction. Provide an introduction to the basic terminology, cite appropriate background work, and briefly discuss related work that has already covered aspects of the problem.
Abstract model. Discuss an abstract model of what you are trying to prove. This chapter should not discuss any specific implementation (see below).
Validation of model/proof of theorems. This is a chapter showing proof of the model. It could be a set of proofs or a discussion of the construction and validation of a model or simulation to gather supporting data.
Measurements/data. This would present data collected from actual use, simulations, or other sources. The presentation would include analysis to show support for the underlying thesis.
Additional results. In some work, there may be secondary confirmation studies, or it might be the case that additional significant results are collected along the way to the proof of the central thesis. These would be presented here.
Conclusions and future work. This is where the results are all tied together and presented. Limitations, restrictions, and special cases should be clearly stated here, along with the results. Some extensions as future work may also be described.

Let's look at these in a little more detail:

Chapter I, Introduction. Here, you should clearly state the thesis and its importance. This is also where you define terms and other concepts used elsewhere. There is no need to write 80 pages of background on your topic here. Instead, you can cover almost everything by saying: "The terminology used in this work matches the definitions given in [citation, citation] unless noted otherwise." Then, cite some appropriate works that give the definitions you need. The progress of science is that we learn and use the work of others (with appropriate credit). Assume you have a technically literate readership familiar with (or able to find) standard references. Do not reference popular literature or WWW sites if you can help it (this is a matter of style more than anything else -- you want to cite articles in refereed conferences and journals, if possible, or in other theses).

Also, in the introduction, you want to survey any related work that attempted something similar to your own or has a significant supporting role in your research. This should refer only to published references. You cite the work in the references, not the researchers themselves. E.g., "The experiments described in [citation] explored the foo and bar conditions, but did not discuss the further problem of baz, the central point of this work." You should not make references such as this: "Curly, Moe, and Larry all believed the same in their research [CML53]" because you do not know what they believed or thought -- you only know what the paper states. Every factual statement you make must have a specific citation tied to it in this chapter, or else it must be common knowledge (don't rely on this too much).

Chapter II. Abstract Model. Your results are to be of lasting value, so the model you develop and write about (and indeed, that you defend) should have lasting value as well. Thus, you should discuss a model not based on Windows, Linux, Ethernet, PCMCIA, or any other specific technology. It should be generic and capture all the details necessary to overlay the model on likely environments. You should discuss the problems, parameters, requirements, necessary and sufficient conditions, and other factors here. Consider that ca. 1980 the common platform was a VAX computer running VMS or a PDP-11 running Unix version 6, yet well-crafted theses of the time are still valuable today. Will your dissertation be valuable 20+ years from now (ca. 2050), or have you referred to technologies that will be of only historical interest?

This model is tough to construct but is the heart of the scientific part of your work. This is the lasting part of the contribution, and this is what someone might cite 50 years from now when we are all using MS Linux XXXXP on computers embedded in our wrists with subspace network links!

Chapters III & IV, Proof. There are basically three proof techniques that I have seen used in a computing dissertation, depending on the thesis topic. The first is analytic, where one takes the model or formulae and shows, using formal manipulations, that the model is sound and complete. A second proof method is stochastic, using statistical methods and measurements to show that something is true in the anticipated cases.

The third method is to show that your thesis is true by building something according to your model and showing that it behaves as you claim it will. This involves:

clearly showing how your implementation model matches the conditions of your abstract model,
describing all the variables and why you set them as you do,
accounting for confounding factors, and
showing the results.

You must be careful not to expend too much effort describing how standard protocols and hardware work (use citations to the literature instead). You must clearly express the mapping of the model to the experiment and the definition of parameters used and measured.

Chapter V. Additional results. In some theses, this may be folded into Chapter III; in a thesis with many parts (as in a theory-based thesis), it may be spread across multiple chapters. This may be where you discuss the effects of technology change on your results. This is also a place where you may wish to point out significant results that you obtained while seeking to prove your central thesis but which are not supportive of the thesis. Often, such additional results are published in a separate paper.

Chapter VI. Conclusions and Future Work. This is where you discuss what you found from your work, incidental ideas and results that were not central to your thesis but of value nonetheless (if you did not have them in Chapter V), and other results. This chapter should summarize all the important results of the dissertation -- note that this is the only chapter many people will ever read.

This is also where you should outline some possible future work that can be done in the area. What are some open problems? What are some new problems? What are some significant variations open to future inquiry?

Appendices usually are present to hold mundane details that are not published elsewhere but are critical to the development of your dissertation. This includes tables of measurement results, configuration details of experimental testbeds, limited source code listings of critical routines or algorithms, etc. It is not appropriate to include lists of readings by topic, lists of commercial systems, or other material that does not directly support the proof of your thesis.

Here are some more general hints to keep in mind as you write/edit:

Adverbs should generally not be used -- instead, use something precise. For example, do not say that something "happens quickly." How fast is quickly? Is it relative to CPU speeds? Network speeds? Does it depend on connectivity, configuration, programming language, OS release, etc.? What is the standard deviation?
As above, the use of the words "fast," "slow," "perfect," "soon," "ideal," "lots of," and related should all be avoided. So should "clearly," "obviously," "simple," "like," "few," "most," "large," and similar words.
What you are writing is scientific fact. Judgments of aesthetics, ethics, personal preference, and the like should be in the conclusions chapter, if they should be anywhere at all. With that in mind, avoid the use of words such as "good," "bad," "best," and any similar discussion. Also, avoid stating "In fact," "Actually," "In reality," and any similar construct -- everything you are writing must be factual, so there is no need to state such things. If you feel compelled to use one of these constructs, then carefully evaluate what you are saying to ensure you are not injecting relative terms, opinions, value judgments, or other items inappropriate for a dissertation.
Computers and networks do not have knees, so poor performance cannot bring them to something they do not have. They also don't have hands, so "On the one hand..." is not good usage. Programs don't perform conscious thought (nor do their underlying computers), so your system does not "think" that it has seen a particular type of traffic. Generalizing from this, do not anthropomorphize your IT components!
Avoid mention of time and environment. "Today's computers" are antiques far sooner than you think. Your thesis should still be true many years from now. If a particular time or interval is necessary, be explicit, as in "Between 1905 and 1920" rather than "Over the last 15 years." (See the difference, given some distance in time?)
Be sure that any scientist or mathematician would recognize something you claim as proof.
Focus on the results and not the methodology. The methodology should be clearly described but not the central topic of your discussion in Chapters III & IV.
Keep concepts and instances separate. An algorithm is not the same as a program that implements it. A protocol is not the same as the realization of it; a reference model is not the same as a working example, and so on.

As a rule of thumb, a CS dissertation should probably be longer than 100 pages but shorter than 160. Anything outside that range should be carefully examined with the above points in mind.

Keep in mind that you -- the Ph.D. candidate -- are expected to become the world's foremost expert on your topic area. That topic area should not be unduly broad but must be big enough to be meaningful. Your advisor and committee members are not supposed to know more about the topic than you do -- not individually, at least. Your dissertation is supposed to explain your findings and, along with the defense, demonstrate your mastery of the area in which you are now the leading expert. That does not mean writing everything you know -- it means writing enough about the most important points that others can agree with your conclusions.

Last of all, don't fall into the trap that ties up many candidates and causes some of them to flame out before completion: your thesis does not need to be revolutionary. It simply needs to be an incremental advancement in the field. Few Ph.D. dissertations have ever had a marked impact on the field. Instead, it is the author's set of publications and products that may change the field.

If your dissertation is like most, it will only be read by your committee and some other Ph.D. candidates seeking to build on your work. As such, it does not need to be a masterwork of literature, nor does it need to solve a long-standing problem in computing. It merely needs to be correct, complete, and significant in the judgment of your committee. We will all applaud when you change the world after graduation. And at that, you will find that many well-known scientists in CS have made their careers in areas different from their dissertation topic. The dissertation is proof that you can find and present original results; your career and life after graduation will demonstrate the other concerns you might have about making an impact.

So get to work!