
The Edited Discussion

In response to Renear's summary, Pichler prefers to draw attention to the problems raised by ``intertextuality'' rather than post-modernism. In particular, he discusses some limiting cases of intentional intertextuality, e.g., the reader interprets text A as referring to text B in conditions where it is not known whether the author knew of text B, or where it is known that the author did not know text B. A further limiting case may be represented by the author asserting a reference to text B when the attribution should be to text C. Pichler sees the capacity for intertextuality as inherent in our notion of text in a way which cannot be removed by refining our definition of it. Clearly the ``intentional'' component is a weakness, partly on the basis of identity and partly as a result of the ``intended but mistaken attribution'' limiting case. Pichler comes to the conclusion that it is always the reader (i.e. the interpreter, which also includes the author) who is the decisive authority in questions of intertextuality. However, he defends this against relativism by claiming that all readers belong to a community which not only establishes, but also controls, practice.

Pichler also raises an ontological argument leading to similar conclusions. As a transcriber of Wittgenstein's manuscripts he was confronted with the Wittgenstein Archive's guideline for transcription: ``the aim of transcription is to represent the original manuscripts as accurately as possible.'' This guideline seemed to be accepted by several transcription and manuscript-editing projects in a ``realistic'' sense. However, the definition fails to determine which aspects of the original manuscript should be represented as accurately as possible, and what is meant by accuracy. It is clear that one cannot be supposed to represent every structure (linguistic, prosodic, etc.) of the manuscript as accurately as possible. Moreover, ``accuracy'' is determined by one's interests in the text. His inclination to identify textual structures as a reader's concept was therefore strengthened, along with his denial that there is an objectively existing structure which just needs to be depicted.

The Antirealist has still to establish why any representation may not be regarded as simply a selection from an objectively existing entity called the text. Pichler draws a comparison between his position and the Kantian distinction between ``Ding an sich'' and ``Erscheinung für uns.'' Pichler's preference is to agree with Fichte and find no necessity for ``Ding an sich.'' Similarly, if ``meaning'' is taken as the essential property of texts disclosed through interpretation, then there are as many essential texts as there are alternative interpretations. Pichler's experience of encoding Wittgenstein strengthens this view. Wittgenstein's Nachlass, with its corrections, alternative readings, instructions etc. requires ``constructing'' by the reader or encoder. There are multiple constructions, any of which must be defended by the reader. In this case there is a considerable disparity between the Nachlass-object which might be regarded as the text in its physical sense, and the necessary intervention of the reader in forming one of many possible constructions.

Pichler emphasizes that his position, which may be called a pragmatic Constructivism, by no means leads to an uncontrolled relativism. Furthermore, with the growing practice of machine-assisted text encoding, the process of construction is easily controllable, it is revisable, and it can be made explicit.

Biggs identifies two distinct models of text in the discussion so far. On the one hand, Renear describes a process in which the linguistic content is fixed, or completely determinable. On the other, Pichler describes a text that is open to some degree of interpretation. One encoding problem is to identify the OHCOs. However, this task will vary in difficulty between a fairly straightforward ``academic'' text, in which the Realist finds ``the medium is not the message,'' and a text about radical typography, or the example from Goethe introduced later by Sperberg-McQueen, in which ``the medium is the message.''

Biggs also discusses Pichler's notion of ``correctness'' quoted by Renear. This notion might lead us to say that the Real text is the correct text for a particular user. Unfortunately the encoder of Wittgenstein is in the position of supplying a tool which users will employ to determine the semantic reference of ambiguous passages. Biggs therefore finds it difficult to see how the encoder, who is not the end-user, is in the position of isolating one correct text for any user.

An additional difficulty with Wittgenstein is that the linguistic content also refers to the business of reading (cf. transcribing) and interpretation. Firstly, when we read under normal circumstances (Renear's conditions) we do not interpret. However, in extremely ambiguous conditions such as those described by Pichler there may be insufficient syntactic evidence to decide between two alternative readings. Then we may be forced to make an interpretation on the basis of (semantic) evidence gained elsewhere in the text or beyond. This will be determined by our analytical perspective.

In addition to the signifying function of the alphanumeric string, the inter-word space is also significant. Likewise, the grouping of concepts at a paragraph level, indicated by line-breaks rather than an alphanumeric character, may have a bearing on the interpretation of the text string. It is therefore false to identify the linguistic content only with the alphanumeric string. An example from Wittgenstein would be the truth-table. It is the case, however, that some other features of the layout of the text string on the page/screen are mainly design functions rather than linguistic ones.

Biggs proposes more radical conditions which at first sight seem to favour the position of the Antirealist. They frequently occur within the context of a discussion of ``seeing as.'' As a reader one may react to the ambiguous feature and ``read-it-as'' an instance of ``this'' or an instance of ``that,'' e.g., as a duck or as a rabbit (Philosophical Investigations, p. 194). However, as an encoder one must preserve the ambiguity. One must first recognise the deliberate ambiguity, and then encode it so that the linguistic content and the on-screen presentation preserve these two senses. But Wittgenstein also introduces the concept of aspect-blindness, whose purpose is to suggest that we might be aspect-blind ourselves under other circumstances. Biggs suggests that these more radical conditions are the meta-conditions under which the Antirealist text-encoder works. One does not know which signs bear meaning and which are simply accidentals. A prototype text may reveal the possibility of another approach to the use of signs either by showing that the first reading of the text is senseless, or by showing that arguments regarding alternative signification may be advanced, e.g., the editorial process of publishing Wittgenstein. However, one might assert that by the 1990s there is an established convention for the content of ``the published works of Wittgenstein'' which places those texts in the Realist's domain.

Raymond concentrates on the structural approaches epitomised in Renear's OHCO discussion. Raymond rejects the efficiency arguments which objectify OHCOs in preference to other analytical constructs. In particular, he finds the transfer of computer terminology such as ``text file'' insidious. The implication that the text file ``contains'' text is as misleading as the suggestion that a payroll database ``contains'' the payroll. In general he draws the distinction between manipulating texts successfully using frameworks such as OHCOs, and the lack of any necessity to also provide a coherent theory of text. For example, advocates of OHCO seem to suggest that the structure of a document is a property of the document that is independent of the operations to be performed on it, or of other issues such as how we decide the equivalence of documents. He suggests that to ``structure'' information means to encode it in such a way that certain operations are efficient and others are not. By the time one has structure, one is already halfway to operations. For example, the computer text file, with no embedded markup, is designed to facilitate sequential reading and appending to the end of the text, but not insertions into the middle of the text.
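Raymond's closing observation about the flat text file can be illustrated with a small sketch. The costs noted are standard properties of contiguous character storage, not claims drawn from the discussion, and the example text is an invention for illustration:

```python
# Sketch of Raymond's observation: a flat character sequence, like a
# text file with no embedded markup, makes appending cheap but makes
# insertion into the middle costly, because every later character must
# be shifted to make room.

text = list("To be or not to be")

# Appending touches only the tail: amortized O(1) per character.
text.extend(", that is the question")

# Inserting mid-stream shifts everything after the insertion point: O(n).
text[5:5] = " (aside)"

print("".join(text))  # To be (aside) or not to be, that is the question
```

The structure chosen (a contiguous sequence) is thus already "halfway to operations": it privileges sequential reading and appending over mid-stream editing.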

Raymond's external argument criticises advocates of OHCO for suggesting that OHCO-like structures should capture only structure and not semantics (e.g., SGML). He argues that structure always involves semantics, and there is no such thing as ``pure'' structure, because structure always has mathematical and combinatorial properties that make certain types of operation possible or efficient. Hence structures are chosen which support the operations we think we want to do, which in turn reflect the semantics that we implicitly attach to the text. Even the idea that OHCO captures the essence of a text is a statement of some semantic import. He cites the Web as an example of what people actually do when they have to commit real resources. The ``best'' approach to text has to take into account not only the text, but the uses that are planned for it. OHCO can provide advantages in accessing highly structured information and in permitting flexibility in presentation. There is a prima facie case that flexibility in presentation is not that important, that portability via Postscript is sufficient, that structure-based editing is not that popular, and that searching by simple string matching goes a long way. However, OHCO and SGML advocates suggest that the benefit comes later, when new applications are developed and one does not have to re-encode texts.

Sperberg-McQueen's comments elaborate the problem of document identity, an issue raised by Raymond (1996). If we see some of our objects as text representations we can consider their representational validity in terms of their ability to maintain or to lose information in relation to the original. However, this does not avoid the problem of defining what constitutes the original text, or of determining whether, for some feature F of the text, a given reproduction preserves or loses that feature.

He also disagrees with Renear's assertion that the identification of linguistic content is relatively unproblematic. For example, the characters of the text must be represented; in some cases, this requires an analysis (a priori or a posteriori) of what characters actually exist in, or should be used to represent, the text. Spoken material may be transcribed phonetically, phonemically, or orthographically. The creator of an electronic text must also select which material is to be included as part of ``the text.'' Is the title page of the First Folio part of the text of Shakespeare's Hamlet? Is the title ``Hamlet'' part of the text of that play? Finally, the transcriber of written material into electronic form must reduce the two-dimensional page to a one-dimensional data stream. Footnotes must be transcribed at their point of reference, etc. It is hard to find plausible rules for this without grounding them in some view of what the ``text'' is.

It appears obvious to most computer-literate speakers of English where the boundary between characters of the text and markup should lie, but this is an illusion fostered by the success of the ASCII character set. In non-European writing systems, such as that of Japanese, the absence of a long tradition of prior art for mechanical writing means that it is not clear whether furigana and similar phenomena should be handled in the character set or in markup. In their simplest form, furigana provide a full or partial phonetic reading of a Han character, thus making clear how it should be read, if it would otherwise be ambiguous.
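The character/markup boundary question for furigana can be made concrete with a small sketch. The bracket notation below is an ad-hoc invention for illustration only; the point is that a gloss held in markup can be stripped or extracted mechanically, whereas a gloss folded into the character stream itself cannot be separated from the base text so easily:

```python
import re

# Ad-hoc markup for a furigana-style gloss: [base|reading].
# Here the Han characters 漢字 carry the phonetic reading かんじ.
glossed = "[漢字|かんじ]を読む"

def strip_gloss(s: str) -> str:
    """Drop the gloss annotations, keeping only the base text."""
    return re.sub(r"\[([^|\]]+)\|[^\]]+\]", r"\1", s)

def gloss_only(s: str) -> str:
    """Replace each annotated span with its phonetic reading."""
    return re.sub(r"\[[^|\]]+\|([^\]]+)\]", r"\1", s)

print(strip_gloss(glossed))  # 漢字を読む
print(gloss_only(glossed))   # かんじを読む
```

If the reading were instead interleaved into the character data, a plain string search for the base text would fail, which is one practical reason the boundary matters.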

Commenting on what constitutes ``the same text,'' Sperberg-McQueen considers changes in the margins or font size. The argument rests not on the ways we talk about texts in electronic form, but on the existing practice of publishing and copyright law. If one consults a library for copies of Moby Dick, one will find that while different editions take care to retain the words of the text, in the same sequence, they take no care to retain the page breaks, margins, or fonts. On the other hand, if a publisher adopts the same page design and font for a whole series of books, we have no trouble at all distinguishing the volume devoted to the works of Plato from that devoted to Aristotle, from that containing Moby Dick. Changing the typography does not, in general, count as changing the text.

A contrary case is presented by Goethe's manuscript of the Roman Elegies, which uses a Latin hand, not a German one. The early editions printed the work in Roman fonts (Antiqua), rather than in Fraktur. Modern editions print all German texts in Antiqua, so the information contained in the typographic distinction between the Roman Elegies and Goethe's other poems has been lost. In this case the typography therefore forms an essential part of the representation, or, as Biggs summarised, ``the medium is the message.''

Concentrating on individual analytic perspectives, Sperberg-McQueen claims they do not necessarily determine ordered hierarchies of content objects. Individual disciplines may deal with typographic rather than content objects: analytical bibliography, codicology, palaeography, and other disciplines of the history of the book are examples. They may deal with sets or other unordered groups of objects, rather than with ordered groups: lexicology, for example, and many forms of quantitative stylistics, often address a text as an unordered set of lexical items. Most critically, disciplines may address phenomena which are not themselves hierarchical, e.g., morphophonemic analysis, since phonemic phenomena may overlap morphological boundaries. Traditional stylistic analysis of verse includes the study of enjambment, end-stopping, and other phenomena of the interaction between non-nesting metrical and syntactic phenomena. It has been held that the consistent overlapping of phenomena is prima facie evidence that they belong to two different types of analysis. But this should not be taken to mean that they will never be considered together in the same scholarly work.

Broderick finds agreement between the Realists and the Antirealists that ``text is a system of structures.'' However this agreement seems to emphasise the lack of clarity in what is asserted or negated by the Realists or Antirealists. Broderick finds that the essence which is asserted or negated may be one of four possibilities: the meaning, the structure, the means to reconstruct the text and the organising principle of the text itself.

The literary Antirealist, which Broderick identifies with a certain Postmodern position, as opposed to the coding Antirealists focused on in the discussion to date, denies the existence of any objective interpretation, or meaning, of a text. It could be Sacred Text, mythology, naive history, fiction or nonsense. Concerning the coding of a particular text, the Realist could remain neutral with regard to the validity of these interpretations. In general there is nothing preventing a coding Realist from simply coding text while remaining neutral about whether the meaning of the text is reader constructed or discovered, hence a coding Realist could be a literary Antirealist.

The coding Antirealist may claim that the structure discovered within the order of the alphabetic characters and punctuation (what Renear calls linguistic components) is not real. This argument might be suggested by the possibility of format-based processing or other alternatives to OHCO. However, the Antirealists could not draw on this structure as evidence for their position, as Renear suggests that they do.

Broderick proposes the thought-experiment that, after knowledge of English has been completely lost, an archaeologist digs up an issue of the Monist. Is it still a text? What if the inhabitants of this age exchange knowledge in electronic format and no longer read the appropriate kind of character strings? Does a text contain knowledge if there is no one around to read it? These questions point out what seems to be the most credible Realist/Antirealist distinction. The Realist would answer ``yes'' to the above questions, the Antirealist ``no,'' qualified only by the possibility that the future archaeologist might figure out some way to decode the artifact and return its textuality to it.

Ore comments on Broderick's thought-experiment. He compares this to the case of Cretan Linear A in which Packard was still able to produce so-called ``word-lists.'' Such ``texts'' still meet Pichler's defining characteristics of semantic and syntactic dimensions. Another limiting case may be provided by Runic inscriptions (and, mutatis mutandis, by other extinct but known writing systems): these may be represented today in normalised Norse (or whichever language they are supposed to represent). However, Ore claims that we will never have full knowledge of the text as the carver intended or as contemporary readers would have read it. So every representation is an entity which has, in Pichler's terms, to be constructed.

Selmer Bringsjord proposes an alternative to Renear's OHCO theory. The basic thesis of this view is that text, at bottom, is ``jottings plus procedures'' - hence the proposal is referred to as the JoPP view. Like the OHCO theory, this is a Realist position, and according to Bringsjord all or most of the arguments in favour of OHCO also lend support to the JoPP view. However, the objections to Platonic OHCO which push toward Pluralism and Antirealism fail to threaten the JoPP position.

The fundamental intuition behind this view is suggested by a thought-experiment described by Wittgenstein in Zettel: Wittgenstein imagines someone (J) jotting down inscriptions as someone else (R) recites a text, where the jottings are necessary and sufficient for J to reproduce the document in its entirety. ``What I called jottings would not be a rendering of the text, not so to speak a translation with another symbolism. The text would not be stored up in the jottings'' (Zettel, 612). Wittgenstein goes on to ask: ``And why should the text be stored up in our nervous system?'' (ibid.).

The sort of jotting to which Wittgenstein draws our attention here is suggestive of what Bringsjord takes text, at bottom, to be. In order to fix the thought-experiment, suppose that J jots down a list L of five bullets, u_1 through u_5, where each u_i is associated with a short string from some natural language. Suppose, in addition, that R recites an essay E of over 2000 words. We assume, as well, that J can, at any point after hearing R's essay, reproduce E from the list L. So far, the thought-experiment involves characters, actions and objects interacting in a manner we could certainly witness in the ``real world.''

The JoPP view is that E is L plus whatever procedure allows for the expansion of L into E. More generally, the view is that text is really, at bottom, jottings plus procedures (for reproducing a final text, where such a text can be in written or oral form).

The so-called logicist or symbolicist approach to artificial intelligence (AI) represents the knowledge, belief and reasoning of sophisticated agents (including human agents) in a logic (Bringsjord 1992; Russell & Norvig 1995). Often, the logic used is a particularly well-understood one, namely first-order logic. In the logicist approach to AI, a document in natural language is ``compressed'' to a set of formulae. In other words, the document is captured by certain jottings. Given certain algorithms, the jottings, or formulae, can be used to reproduce the story (Bringsjord & Ferrucci 1997).

Thus, it is a basic assumption underlying the JoPP thesis that it should be possible (at least in principle, though Bringsjord's view is that it is also possible in practice) to design intelligent computer systems for text processing and analysis in which texts are represented in some logic, and which operate on these representations via procedures in the form of computer algorithms. For moderately complicated texts, JoPP is already instantiated in some working computer programs.
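A minimal sketch of this assumption can be given with a toy ``logic'' of subject-verb-object triples standing in for formulae. The triples, the trivial grammar, and all names below are illustrative inventions, not Bringsjord's actual system:

```python
# Toy illustration of the JoPP thesis: a text reduced to "jottings"
# (here, subject-verb-object triples standing in for logical formulae)
# plus a procedure that expands them back into a surface text.

jottings = [
    ("J", "hears", "the essay"),
    ("J", "writes", "five bullets"),
    ("J", "reproduces", "the essay"),
]

def expand(props):
    """One possible procedure from jottings to a final (written) text."""
    return " ".join(f"{s} {v} {o}." for s, v, o in props)

print(expand(jottings))
# J hears the essay. J writes five bullets. J reproduces the essay.
```

Swapping `expand` for a different procedure (a paraphrase, a translation) while keeping the same jottings is exactly the flexibility the JoPP view claims for itself.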

Bringsjord argues in some detail that most of the arguments in favour of Renear's OHCO thesis are also arguments in favour of JoPP.

The JoPP view of texts commits us, according to Bringsjord, to Realism. If it is correct, one of the main arguments in favour of Antirealism (as given by Renear) fails: the JoPP thesis entails that there is a key set of facts about a text which are thoroughly objective.

The second rationale in favour of Antirealism - that there are many diverse methodological perspectives on a text - is one that the JoPP approach embraces. In order to produce different kinds of structure (physical, compositional, narrative, etc.), the procedures going from jottings to final text need only be suitably adjusted, but the jottings needn't change.

The observations that force modification of OHCO Platonism toward what Renear calls ``pluralistic Realism'' are ones the JoPP view accommodates from the outset: the JoPP approach is designed to allow for distillation of disparate documents. Whether the final text is a short story, a proof, a physics textbook, a poem, etc. the JoPP thesis is that such texts can be captured as a set of assertions in a logical formalism.

In the ensuing discussion, one of the first objections to the JoPP view is that first-order logic, which Bringsjord initially used as an example of the kind of formalism in which the ``jottings'' will be captured, does not seem sufficient to represent the basic propositional structure of texts.

Broderick points out that if the linguistic contents of a text are condensed and represented in the form of some logical symbolism, this symbolism will contain non-logical constants which are open to a number of different interpretations. It is difficult to see how the JoPP view should be able to identify one such set of interpretations as the ``correct'' interpretation of a specific text. Moreover, the claim that a literary text A should essentially consist of statements in some axiomatic system like first-order logic would seem to imply the somewhat implausible conclusion that it should be possible to prove A, or alternatively not-A.

Raymond argues that since a JoPP representation of a text is not only supposed to be able to regenerate the propositional content of a text, but also to generate this content in some specific form, the JoPP representation must contain not only propositions about some world outside the text but also about the text itself - i.e., it must contain meta-textual propositions. Therefore a JoPP representation should potentially be able to represent logical flaws like contradictions and paradoxes. Thus JoPP is constrained by the limits of axiomatic formalisms - there must be some sentential forms that it cannot produce, otherwise it cannot be consistent.

Bringsjord's reply to these criticisms is that his initial reference to first-order logic was only meant to provide a simplified exemplification - in actual fact first-order logic would be too limited. The logical system required for representing the propositional contents of texts would not be an axiomatic system at all, and thus the envisaged situation of being able to prove or disprove texts will not occur. Philosophical logic has provided a number of systems designed to allow for contradictions and several of those may cope with the paradoxes referred to by Raymond (Bringsjord & Ferrucci, forthcoming).

However, according to Raymond, even though it may be that logical formalisms exist which allow JoPP to handle inconsistencies and paradoxes satisfactorily, one serious problem with JoPP persists: while text is typically an informal, intuitive notion, JoPP is typically a formal one. Proving that ``Text is JoPP'' is analogous to demonstrating the equivalence between informal and formal notions of computability, suggesting that we need something like a Church-Turing hypothesis for texts.

Raymond claims that, on the one hand, JoPP captures too much. Like Sperberg-McQueen, Raymond refers to Goodman's distinction between allographic and autographic representations and indicates that JoPP should, but does not obviously, exclude autographic representations. On the other hand, JoPP is too low-level a representation: the fact that texts can be represented in some kind of logic does not serve to distinguish them from other kinds of information. Finally, Raymond raises doubts about the Wittgensteinian thought-experiment used to illustrate the JoPP view: it rests on the assumption that reciting a document is a valid form of reproduction. This implicitly defines away the possibility of presentational matter being part of a text. The assumption is confirmed when Bringsjord says that if the typeface of a document is changed, the text is not. Raymond has strong doubts about this.

Raymond, Sperberg-McQueen and Huitfeldt all point out that jottings and procedures themselves seem to be some sort of texts, thus suggesting that the JoPP view may lead to a regress or a circle. Huitfeldt and Raymond suggest that a way for Bringsjord to break this regress or circle would be to formulate principles for identifying a set of primitive jottings and procedures which cannot be further reduced.

Huitfeldt suggests that one of the problems with JoPP is that Bringsjord is not clear about what independent criterion is used to decide textual identity. At one point Bringsjord seems to suggest this might be an appropriate set of behaviours and Huitfeldt welcomes this inasmuch as it points in the direction of seeing texts as social, historical, and cultural phenomena.

Sperberg-McQueen and Huitfeldt draw attention to an obscurity in the JoPP view: on the one hand, texts are said to consist essentially of jottings representing a propositional content which allow us to generate for example translations into different languages, or paraphrases within one language, of the ``same'' text. On the other hand, Bringsjord sometimes suggests that the test of success of a particular JoPP-representation is that the original text is reproduced ``word for word.''

Sperberg-McQueen argues that one of the strengths of the JoPP view is that it would give us a way of explaining why some texts are felt to be similar in certain ways which are difficult to account for on other models. According to the JoPP view it is because they share the same ``propositional content.'' However, the JoPP view also seems to reduce texts to their propositional contents. On the one hand, this makes it possible to explain what different paraphrases of ``the same'' text have in common. On the other hand, it may seem difficult to account for the differences between paraphrases. As there are indefinitely many paraphrases of the same propositional content, this account of textual identity also seems to be in some sense too loose.

In contrast, with reference to examples of how our criteria of textual identity vary from context to context, Sperberg-McQueen draws attention to a large number of different ways in which texts may be identical or similar at different levels, and suggests that in an exhaustive typology of textual identity relations ``JoPP identity'' would be but one among many.

Editors: Michael Biggs, Claus Huitfeldt

University of Hertfordshire
University of Bergen

Selmer Bringsjord
Paul Bohan-Broderick
Espen S. Ore
Alois Pichler
Darrell Raymond
Allen Renear
Michael Sperberg-McQueen


Fri Jul 25 22:00:35 MEST 1997