TOEIC Link Reading Coreference Chain Resolution and Entity Tracking: The Pronoun-and-Definite-Reference Discipline That Keeps the Discourse Representation Coherent Across Long Passages and Cross-Document Question Sets

A TOEIC Link Reading passage delivers a network of entities (people, organizations, documents, products, locations), and the passage refers to those entities repeatedly through pronouns, definite descriptions, and reduced forms that can only be interpreted against the reader's discourse representation. The reader who has built and maintained a clean discourse representation will resolve the references without conscious effort, and the answer selection will be grounded in the entity relations the passage establishes. The reader who has not maintained the representation is likely to misresolve a reference somewhere in a long passage, and the misresolution will propagate through the answer-selection layer because the items most likely to be interrogated are precisely the items whose interpretation depends on the misresolved reference.

This is the coreference-chain breakdown failure mode, and it is among the highest-cost failure modes on the Reading section because a single occurrence can affect several items at once. A single misresolved coreference can flip the candidate's interpretation of an entire paragraph, can convert the cause-effect relation that the passage establishes into the reverse relation, and can lead the candidate to select a distractor that was constructed specifically to capture readers who reversed the entity relation. The candidate who has not built the coreference-chain discipline will lose section points at a rate that comprehension-vocabulary drills cannot reduce, because the comprehension layer was functioning and the coreference layer was the failure point.

This article is the coreference-chain guide for TOEIC Link Reading. The guide identifies the reference categories the reader has to resolve, the chain-construction protocols that build the entity representation in real time, the maintenance moves that keep the representation coherent across paragraph boundaries and document boundaries, and the deliberate-practice drills that build the resolution automaticity the timed condition demands.

The reference categories the reader has to resolve

A passage's coreference network combines several reference categories, and the categories differ in their resolution difficulty and in the cost of misresolution. The reader who treats all references as a single resolution problem, to be handled with a single heuristic, will solve the easy cases reliably and the difficult cases unreliably, and the difficult cases are precisely the ones the test makers target with the question stems.

Category 1 — third-person personal pronouns. He, she, they, him, her, them — references to entities the passage has already introduced. The resolution problem is the antecedent-selection problem when multiple candidates are available, and the resolution heuristic combines syntactic priority, recency, semantic plausibility, and discourse prominence. Personal pronoun resolution is the highest-volume coreference category and the category whose resolution mistakes propagate widely through the discourse representation.

Category 2 — possessive pronouns and determiners. His, her, their, its: references that both point back to an antecedent and modify a noun. The resolution heuristic is the personal-pronoun heuristic with the additional constraint that the possessed entity must also be in the representation. Possessive references frequently chain (his proposal refers both to him and to the proposal), and the chain has to be tracked as a relational link rather than as two independent resolutions.

Category 3 — demonstrative pronouns and determiners. This, that, these, those — references that may point to entities, to events, to propositions, or to sections of the discourse. The resolution problem is the reference-target problem — what kind of thing the demonstrative is pointing at — combined with the antecedent-selection problem. Demonstratives that point at propositions are the most error-prone subcategory because the propositional target is not a named entity that the reader has tracked.

Category 4 — definite descriptions. The company, the proposal, the report, the meeting — references that pick out an entity through descriptive content rather than a pronoun. The resolution problem requires the reader to identify which previously-introduced entity matches the description and to verify that no later entity has displaced the original. Definite descriptions are particularly error-prone in passages that introduce multiple entities of the same descriptive type — the company may match three companies the passage has mentioned.

Category 5 — reduced forms and ellipsis. Constructions such as the same, the latter, the former, the one we discussed that compress reference content and depend on the reader's discourse representation for their interpretation. Reduced forms have low surface salience — the reader can read past them without noticing they require resolution — and the latent resolution can fail silently in ways that the candidate does not recognize until the answer-selection stage.

Category 6 — cross-document references. In multi-document question sets, references that point to entities introduced in a different document of the set. The resolution problem combines the within-document categories above with the constraint that the antecedent is in a different document, and the reader's discourse representation has to span the documents rather than reset at the document boundary.
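The antecedent-selection heuristic described under Category 1 can be illustrated with a small scoring sketch. This is not a validated model of reading: the `Candidate` fields, the `WEIGHTS` values, and the example sentence are all assumptions made up for illustration. It only shows how syntactic priority, recency, semantic plausibility, and discourse prominence might be combined into a single choice.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    is_subject: bool      # syntactic priority: was it a grammatical subject?
    sentences_back: int   # recency: 0 means the same sentence as the pronoun
    agrees: bool          # semantic plausibility: number, gender, animacy fit
    mentions: int         # discourse prominence: mentions so far


# Hypothetical weights, chosen only to make the example concrete.
WEIGHTS = {"subject": 2.0, "recency": 1.5, "prominence": 0.5}


def score(c: Candidate) -> float:
    # An implausible candidate is ruled out regardless of its other cues.
    if not c.agrees:
        return float("-inf")
    return (WEIGHTS["subject"] * c.is_subject
            + WEIGHTS["recency"] / (1 + c.sentences_back)
            + WEIGHTS["prominence"] * c.mentions)


def resolve(candidates):
    # Pick the highest-scoring candidate as the antecedent.
    return max(candidates, key=score).name


# "Ms. Tanaka forwarded the invoice to Ms. Ortiz. She asked for a reply."
candidates = [
    Candidate("Ms. Tanaka", is_subject=True, sentences_back=1, agrees=True, mentions=3),
    Candidate("Ms. Ortiz", is_subject=False, sentences_back=1, agrees=True, mentions=1),
    Candidate("the invoice", is_subject=False, sentences_back=1, agrees=False, mentions=1),
]
antecedent = resolve(candidates)
```

Under these assumed weights the subject and the more prominent entity win, so `antecedent` is "Ms. Tanaka". A passage built to exploit this default would make Ms. Ortiz the correct referent, which is why the later categories and protocols do not rely on a single heuristic.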

The chain-construction protocols that build the entity representation

The candidate who can identify the reference categories has solved the recognition problem; the candidate has not yet solved the construction problem. The construction problem is the problem of building the entity representation in real time as the passage is read, so that the references can be resolved against the representation as they arrive rather than reconstructed from scratch at the answer-selection stage.

Protocol 1 — first-mention encoding. When an entity is first mentioned, the reader allocates a representation slot for the entity and records its distinguishing properties: the role, the relations, the descriptive attributes. The encoding is light (three to five properties are sufficient) and has to be fast enough to keep pace with the reading speed. The first-mention encoding is the foundation of the entity representation, and shortcuts at this stage produce resolution failures downstream.

Protocol 2 — chain extension. When a reference is resolved to an entity, the reader extends the entity's chain with the new mention and updates the entity's representation with any new properties the new mention introduces. The chain extension is the maintenance mechanism that keeps the representation current as the passage develops the entity's role through successive mentions.

Protocol 3 — entity disambiguation. When a passage introduces multiple entities of similar type (multiple companies, multiple people in the same organization, multiple documents), the reader assigns disambiguation markers at the moment the second entity of that type appears, when disambiguation first becomes necessary, rather than at the first entity's first mention. The markers are the properties that distinguish the entities (the company's role, the person's department, the document's date), and they have to be retained alongside each entity's representation so the disambiguation can be applied at each subsequent reference.

Protocol 4 — chain reactivation. When an entity that has not been mentioned recently returns to the discourse, the reader reactivates the entity's chain and brings the entity's representation to the foreground of the discourse representation. The reactivation move is the corrective move for the discourse-recency heuristic, which the reader's pronoun resolution defaults to and which produces resolution failures when the discourse returns to a distal entity rather than continuing with the local entity.

Protocol 5 — chain pruning. When an entity has been replaced by a successor entity that the passage has now committed to — for example, a candidate proposal that has been rejected in favor of a successor proposal — the reader prunes the obsolete entity's chain from the active representation and reassigns the descriptive content to the successor entity. The pruning move prevents the obsolete entity from competing for resolution against the successor entity at subsequent references.
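The five protocols above can be pictured as operations on a small data structure. The sketch below is only an analogy for what the reader does mentally. The names `Entity`, `EntityTracker`, and every method and label in it are hypothetical and do not come from any TOEIC material.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    label: str
    properties: set = field(default_factory=set)
    chain: list = field(default_factory=list)    # every mention, in reading order
    markers: set = field(default_factory=set)    # disambiguation markers
    active: bool = True


class EntityTracker:
    def __init__(self):
        self.entities = {}       # label -> Entity
        self.descriptions = {}   # definite description -> label it picks out now
        self.foreground = None   # label of the entity currently in focus

    def first_mention(self, label, mention, properties, description=None):
        # Protocol 1: allocate a slot and record a few distinguishing properties.
        self.entities[label] = Entity(label, set(properties), [mention])
        if description:
            self.descriptions[description] = label
        self.foreground = label

    def extend(self, label, mention, new_properties=()):
        # Protocol 2: append the mention and fold in any new properties.
        # Protocol 4: moving the entity to the foreground also reactivates
        # a chain that has gone unmentioned for a while.
        entity = self.entities[label]
        entity.chain.append(mention)
        entity.properties.update(new_properties)
        self.foreground = label

    def disambiguate(self, label, marker):
        # Protocol 3: attach a marker that separates same-type entities.
        self.entities[label].markers.add(marker)

    def prune(self, obsolete, successor):
        # Protocol 5: retire the obsolete entity and hand its descriptions on.
        self.entities[obsolete].active = False
        for description, label in list(self.descriptions.items()):
            if label == obsolete:
                self.descriptions[description] = successor
        if self.foreground == obsolete:
            self.foreground = successor


tracker = EntityTracker()
tracker.first_mention("proposal_A", "a draft proposal", {"author: Mr. Kim"},
                      description="the proposal")
tracker.first_mention("proposal_B", "a revised proposal", {"author: Ms. Sato"})
tracker.disambiguate("proposal_A", "draft")
tracker.disambiguate("proposal_B", "revised")
tracker.extend("proposal_A", "it", {"rejected by the board"})
tracker.prune("proposal_A", "proposal_B")
```

After the final call, "the proposal" resolves to `proposal_B`, while the rejected draft keeps its chain but no longer competes for later references.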

The maintenance moves that keep the representation coherent across boundaries

The candidate who has constructed the entity representation at the local level still has to maintain the representation across the structural boundaries of the passage — paragraph boundaries, topic-shift boundaries, document boundaries — where the representation is most vulnerable to drift and decay. The maintenance moves below preserve the representation's coherence across the boundaries.

Maintenance move 1 — paragraph-boundary check. At each paragraph boundary, the reader runs a brief check on the entity representation — which entities are active, which have been deactivated, which have been pruned. The check refreshes the representation's currency and identifies entities whose status the reader had drifted away from during the paragraph's processing.

Maintenance move 2 — topic-shift entity bridge. At each topic-shift boundary, the reader builds the bridge between the prior topic's entities and the new topic's entities — which entities carry through, which are introduced fresh, which are replaced. The bridge prevents the topic shift from triggering a representation reset that would lose the carry-through entities the new topic still depends on.

Maintenance move 3 — document-boundary representation transfer. At each document boundary in a multi-document set, the reader explicitly transfers the entity representation from the prior document to the new document and identifies the entities that span the documents. The transfer is the most error-prone maintenance move because the default reading behavior is to reset the representation at the document boundary, and cross-document references that depend on the transfer will misresolve silently if the transfer has not been deliberately executed.

Maintenance move 4 — referent-displacement detection. Throughout the passage, the reader watches for moments when a definite description's referent has been displaced — the original company has been acquired, the original proposal has been superseded, the original meeting has been rescheduled. The detection move triggers the chain-pruning protocol and prevents the displaced referent from competing for resolution against the current referent.

Maintenance move 5 — pronoun-ambiguity flag. When a pronoun arrives with multiple syntactically and semantically plausible antecedents, the reader flags the ambiguity and runs the disambiguation immediately rather than committing to a default resolution. The flag-and-resolve discipline catches the cases where both the discourse-recency default and the syntactic-priority default would produce the wrong resolution.
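Two of the maintenance moves above, the paragraph-boundary check and the document-boundary transfer, can be sketched as operations on a simple status table. The function names, the three status values, and the example entities are assumptions made for illustration. The point is only that the transfer carries entities forward instead of starting from an empty table.

```python
def paragraph_boundary_check(status):
    # Move 1: group entities by their current status at a paragraph boundary.
    summary = {"active": [], "inactive": [], "pruned": []}
    for label, state in status.items():
        summary[state].append(label)
    return summary


def transfer_across_documents(prior_status, new_document_labels):
    # Move 3: carry every non-pruned entity into the next document.
    # Carried entities start as inactive until the new document mentions them.
    carried = {label: "inactive"
               for label, state in prior_status.items()
               if state != "pruned"}
    # Entities that appear in both documents span the boundary.
    spanning = sorted(set(new_document_labels) & set(carried))
    for label in new_document_labels:
        carried[label] = "active"
    return carried, spanning


# Hypothetical two-document set: an e-mail followed by a revised schedule.
email_status = {
    "Ms. Tanaka": "active",
    "the Osaka office": "inactive",
    "draft schedule": "pruned",
}
check = paragraph_boundary_check(email_status)
carried, spanning = transfer_across_documents(
    email_status, ["Ms. Tanaka", "revised schedule"]
)
```

Here `spanning` contains only "Ms. Tanaka", the entity that a cross-document reference in the second document could point back to, and the Osaka office stays available as an inactive entity instead of being lost at the boundary.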

The deliberate-practice drills that build the resolution discipline

The coreference-chain discipline is an automaticity that has to be built through deliberate practice across the reading preparation timeline. The drills below build the chain construction, the maintenance moves, and the resolution accuracy the timed condition demands.

Drill 1 — entity-trace annotation. The candidate reads practice passages and annotates the entity traces — each entity's introduction, each subsequent reference, the resolution decision, the disambiguation markers if any. The entity-trace annotation surfaces the chain-construction failures and identifies the entities whose representation the candidate is not building with sufficient detail to support downstream resolution.

Drill 2 — pronoun-resolution audit. The candidate solves practice items and audits the pronoun resolutions against the answer key, logging the resolutions that produced wrong answers and the resolution-rule failure that produced each wrong answer. The audit identifies the candidate's resolution-failure patterns — over-reliance on recency, over-reliance on syntactic priority, failure to detect ambiguity — and directs the next preparation cycle's focus to the dominant pattern.

Drill 3 — cross-document tracking. The candidate works through multi-document question sets with explicit attention to the cross-document references — flagging each reference, identifying the source document, verifying the resolution against the source document. The cross-document drilling builds the document-boundary representation transfer and surfaces the cross-document chains the candidate's default reading would have missed.

Drill 4 — reduced-form recognition. The candidate reads practice passages with deliberate attention to reduced forms and ellipsis — the latter, the same, the one — and verifies the resolutions by writing out the full reference. The reduced-form drilling surfaces the latent resolutions that would otherwise fail silently and builds the recognition skill the reduced-form category demands.

Drill 5 — distractor-resolution diagnosis. When the candidate selects a distractor, the candidate diagnoses whether the distractor was selected because of a coreference failure — which reference was misresolved, which alternative resolution the distractor was constructed for, what discourse signal the candidate missed. The distractor-resolution diagnosis builds the candidate's awareness of the resolution patterns the test makers exploit and identifies the resolution disciplines the candidate has not yet automated.
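The logging step in the pronoun-resolution audit and the distractor diagnosis can be kept in any form, including paper. As one possible sketch, the tally below counts failure patterns so the dominant one stands out. The log entries, the item numbers, and the failure labels are invented for illustration and are not real test data.

```python
from collections import Counter

# Hypothetical audit log: one entry per wrong answer traced to a misresolution.
audit_log = [
    {"item": 153, "reference": "she", "failure": "recency"},
    {"item": 161, "reference": "the company", "failure": "syntactic_priority"},
    {"item": 178, "reference": "they", "failure": "recency"},
    {"item": 184, "reference": "this", "failure": "undetected_ambiguity"},
]


def dominant_pattern(log):
    # Count each failure label and return the most frequent one with its count.
    counts = Counter(entry["failure"] for entry in log)
    if not counts:
        return None
    return counts.most_common(1)[0]


pattern = dominant_pattern(audit_log)
```

With this sample log, `pattern` is `("recency", 2)`, which would point the next preparation cycle at over-reliance on recency.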

The chain discipline is the discourse-integrity layer

The candidate who has built the coreference-chain discipline has installed the discourse-integrity layer, and that layer is what separates preparation that produces ceiling-level Reading section scores from preparation that plateaus at the comprehension-vocabulary ceiling. The discipline does not replace the comprehension or the vocabulary work, both of which are prerequisites. It converts parse-level comprehension into the entity-level discourse representation that the question stems interrogate, and the section score follows the representation rather than the parse alone.

The six reference categories, the five chain-construction protocols, the five maintenance moves, and the five deliberate-practice drills together form the coreference discipline that the section demands. The candidate who has automated the discipline closes the discourse-integrity gap that the chain-breakdown failure mode would have left open.

For the supporting reading-strategy disciplines that complement the coreference work, see TOEIC Link Reading Anaphora and Cataphora Resolution Strategy and TOEIC Link Reading Multi-Passage Cross-Reference Synthesis.