TOEIC Link Speaking — Evidentiality and Source-Marking Discipline Under Timed Response: The Hearsay-Inferential-Direct Distinction That Separates Band-22 From Band-25

Evidentiality, the linguistic marking of how a speaker knows what they are asserting, is one of the most strongly discriminating speaking-module features at the CEFR B2-to-C1 transition on TOEIC Link. Band-22 candidates default to bare assertion regardless of whether they observed the event directly, inferred it from a chain of evidence, or heard it second-hand. Band-25 candidates routinely mark the evidential source within the first three to five words of each new claim, and the marking shifts the rater's interpretation of the response from "L2 speaker producing approximate content" to "L2 speaker controlling epistemic stance with native-like precision."

This guide formalizes the three-tier evidential inventory (direct / inferential / hearsay) that the TOEIC Link speaking raters reward, catalogues the four failure modes that hold candidates at band-22, and outlines a four-week drill routine that installs evidential source-marking to automatic deployment under timed response. For broader speaking-module preparation, see the speaking hedging and uncertainty signaling discipline guide and the speaking pragmatic politeness and face management guide.

Why evidentiality discriminates so strongly at the band-22-to-band-25 transition

Speaking-module raters evaluate responses along an interpretation axis that asks whether the candidate is producing English as a content-delivery instrument or English as a stance-management instrument. Band-22 responses tend to read as content delivery: the candidate names a claim, supports it with a reason, and concludes. Band-25 responses tend to read as stance management: the candidate names a claim, marks how they came to know it, calibrates the strength of the assertion accordingly, and concludes with a stance that matches the evidential source.

Evidentiality is one of the cleanest carriers of this stance-management property because it shows up in the first three to five words of each new claim. A response that opens "I noticed that the third-quarter numbers had dropped" tells the rater the speaker is reporting a direct observation. A response that opens "Based on what the operations team reported, the third-quarter numbers had dropped" tells the rater the speaker is reporting hearsay. A response that opens "Given the inventory turnover and the discount activity, it looks like the third-quarter numbers had dropped" tells the rater the speaker is reporting an inference. All three are grammatical and carry the same content, and each makes the evidential source visible in a way that the bare assertion "The third-quarter numbers had dropped" does not. That visibility is what shifts the rater's interpretation from band-22 to band-25.

The visibility matters at this specific transition rather than at lower bands because band-19 to band-22 candidates have not yet automatized the lexical inventory required to mark evidentiality fluently. They can recognize "I heard that" and "apparently" when reading, but cannot deploy them in real time during a 45-second response without drawing down the cognitive budget that needs to go to content. Band-23 to band-25 candidates have automatized the inventory and deploy it at no marginal cognitive cost. The discrimination at this transition is therefore not a knowledge gap but an automation gap, and the four-week drill described below is specifically designed to close it.

The three-tier evidential inventory

Tier 1 — Direct-evidence marking

Direct-evidence marking signals that the speaker observed the event personally. The lexical inventory includes "I saw", "I noticed", "I watched", "I observed", "I heard" (when used for direct auditory perception, not for hearsay), "I felt", "I checked", and "when I looked at the report". The discourse function is to anchor the assertion in the speaker's perceptual experience and to commit the speaker to its accuracy with a default certainty band of 90% to 95%.

A direct-evidence opener also signals that the speaker is positioned to handle follow-up questions about the perceptual context (when, where, under what conditions), which is what the speaking-module raters reward as conversational positioning. A candidate who opens with a direct-evidence marker is implicitly accepting cross-examination on the perceptual basis, and the willingness is what reads as stance ownership.

Tier 2 — Inferential marking

Inferential marking signals that the speaker did not observe the event directly but inferred it from a chain of evidence. The lexical inventory includes "it looks like", "it appears that", "from what I can tell", "judging by", "based on the pattern", "given the trend", "the data suggests", and "that would imply". The discourse function is to surface the inference chain and to commit the speaker to its accuracy with a default certainty band of 60% to 80%.

The inferential marker also signals that the speaker is positioned to articulate the inference chain when asked, which is what the speaking-module raters reward as analytical positioning. A candidate who opens with an inferential marker is implicitly accepting cross-examination on the inference chain, and the willingness is what reads as analytical credibility.

Tier 3 — Hearsay marking

Hearsay marking signals that the speaker did not observe the event and is not asserting an inference but is relaying what another source reported. The lexical inventory includes "I heard that" (in the hearsay sense, distinct from direct auditory perception), "apparently", "reportedly", "according to the team", "the report says", "word is that", "I'm told", and "from what I've been told". The discourse function is to surface the second-hand nature of the claim and to commit the speaker to its accuracy with a default certainty band of 50% to 70% conditional on the source's reliability.

The hearsay marker also signals that the speaker is positioned to name the source when asked, which is what the speaking-module raters reward as source-traceability discipline. A candidate who opens with a hearsay marker is implicitly accepting cross-examination on the source, and the willingness is what reads as epistemic honesty.
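For candidates or tutors who keep drill transcripts in text form, the three-tier inventory can be written down as a small lookup table. The sketch below is only an illustration: the marker lists and certainty bands are copied from the tiers above, while the dictionary layout, the name tier_of, and the longest-match rule are assumptions introduced here. Surface matching cannot tell perceptual "I heard" from every hearsay use, so the result is a first-pass tag to be checked by hand.

```python
import re
from typing import Optional

# Markers and certainty bands are taken from the three tiers described above.
# The structure and names are illustrative assumptions, not an official scheme.
INVENTORY = {
    "direct": {
        "band": (0.90, 0.95),
        "markers": ["i saw", "i noticed", "i watched", "i observed", "i heard",
                    "i felt", "i checked", "when i looked at the report"],
    },
    "inferential": {
        "band": (0.60, 0.80),
        "markers": ["it looks like", "it appears that", "from what i can tell",
                    "judging by", "based on the pattern", "given the trend",
                    "the data suggests", "that would imply"],
    },
    "hearsay": {
        "band": (0.50, 0.70),
        "markers": ["i heard that", "apparently", "reportedly",
                    "according to the team", "the report says", "word is that",
                    "i'm told", "from what i've been told"],
    },
}


def tier_of(claim: str) -> Optional[str]:
    """Return the tier whose marker opens the claim, or None for a bare assertion."""
    text = claim.lower().strip()
    best_tier, best_len = None, 0
    for tier, entry in INVENTORY.items():
        for marker in entry["markers"]:
            # Require a word boundary after the marker so "i saw" does not match "i sawed".
            if re.match(re.escape(marker) + r"\b", text) and len(marker) > best_len:
                # Longest match wins, so "i heard that" (hearsay) beats "i heard" (direct).
                best_tier, best_len = tier, len(marker)
    return best_tier
```

Calling tier_of on a transcribed claim returns "direct", "inferential", "hearsay", or None. A None result flags a claim that opens without any marker from the inventory.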

The four failure modes that hold candidates at band-22

Failure 1 — Bare-assertion default

The first and most common failure is defaulting to bare assertion regardless of evidential source. The candidate produces "the third-quarter numbers dropped" as a free-standing claim with no source marker, leaving the rater unable to determine whether the candidate observed, inferred, or was told. The rater defaults to the assumption that the candidate is reporting a memorized fact and downgrades the response on the stance-management axis. The repair is to install one of the three tier markers in the first three to five words of every new claim, even when the claim is something the candidate would normally state baldly.

Failure 2 — Direct-evidence over-claiming

The second failure mode is over-claiming direct evidence: using a direct-evidence marker when the actual basis is hearsay or inference. The candidate produces "I saw that the third-quarter numbers dropped" when in fact the candidate read it in a report or heard it from a colleague. The over-claim is the kind of small calibration error that goes unnoticed at lower bands but that raters at the higher bands catch on the follow-up turn, when the candidate cannot answer "when did you see the numbers?" with the temporal precision a direct observation would carry. The repair is to default to inferential or hearsay marking when the evidential source is anything other than first-hand observation.

Failure 3 — Inferential-chain occlusion

The third failure mode is opening with an inferential marker but failing to articulate the inference chain when asked. The candidate produces "it looks like the third-quarter numbers dropped" and then cannot answer the follow-up "based on what?" with a coherent chain. The inferential marker is therefore a stance promise the candidate cannot deliver on, and the failure to deliver downgrades the response on the analytical-positioning axis. The repair is to drill inferential markers paired with their supporting chains so that the marker and the chain are co-activated.

Failure 4 — Hearsay-source omission

The fourth failure mode is opening with a hearsay marker but failing to name the source when asked. The candidate produces "apparently the third-quarter numbers dropped" and then cannot answer the follow-up "who said so?" with a coherent source. The hearsay marker is therefore a stance promise the candidate cannot deliver on, and the failure to deliver downgrades the response on the source-traceability axis. The repair is to drill hearsay markers paired with their supporting source attributions so that the marker and the source are co-activated.

The four-week drill routine

Week 1 — Evidential-tagging drill on existing transcripts

The candidate works through 30 of their own prior speaking responses (recorded or transcribed) and tags every assertion with its actual evidential source — direct, inferential, or hearsay. The week's output is a tagged transcript log that exposes how often the candidate defaulted to bare assertion when one of the three tier markers would have been more accurate.
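One possible way to summarize the Week 1 log is a simple tally of how each assertion was marked against its actual source. The sketch below assumes each log entry is a (claim, marker tier used or None, actual source) triple. That layout, the function name summarize, and the four sample entries are hypothetical and are not taken from any real candidate's transcript.

```python
from collections import Counter

# Hypothetical sample log: (claim, tier of the marker used or None, actual source).
sample_log = [
    ("The third-quarter numbers dropped", None, "hearsay"),
    ("I noticed that the lobby was empty", "direct", "direct"),
    ("Sales will recover next year", None, "inferential"),
    ("I saw that the numbers dropped", "direct", "hearsay"),
]


def summarize(entries):
    """Return the share of bare, mismatched, and accurately marked assertions."""
    counts = Counter()
    for _claim, used, actual in entries:
        if used is None:
            counts["bare"] += 1          # no marker at all (Failure 1 pattern)
        elif used != actual:
            counts["mismatched"] += 1    # marker claims a different source than the real one
        else:
            counts["accurate"] += 1
    total = len(entries)
    if total == 0:
        return {"bare": 0.0, "mismatched": 0.0, "accurate": 0.0}
    return {key: counts[key] / total for key in ("bare", "mismatched", "accurate")}
```

The "bare" share is the figure the Week 1 log is meant to expose. The "mismatched" share additionally picks up cases such as a direct-evidence marker placed on a claim that was in fact second-hand.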

Week 2 — First-three-words drill

The candidate produces 50 short speaking responses on prompted topics with the constraint that the first three to five words of every new claim must contain a tier marker from the inventory. The week's output is a drill log with the marker and the claim for each response, building automaticity at the position where evidential marking actually pays out in the response.

Week 3 — Chain-articulation drill

The candidate produces 30 speaking responses that open with an inferential or hearsay marker and immediately follow with the supporting chain or source attribution. The week's output is a chain-articulation log that pairs every marker with its delivered chain or source, eliminating the promise-without-delivery pattern of Failure 3 and Failure 4.
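A chain-articulation log of this kind could be kept as a list of small records and checked for markers that were never backed up. The sketch below is one assumed layout. The DrillEntry fields and the undelivered helper are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DrillEntry:
    tier: str          # "direct", "inferential", or "hearsay"
    claim: str
    support: str = ""  # inference chain or named source; empty if none was given


# Tiers whose markers promise a chain or a source on follow-up.
NEEDS_SUPPORT = {"inferential", "hearsay"}


def undelivered(entries: List[DrillEntry]) -> List[DrillEntry]:
    """Return entries whose marker promises a chain or source that was not supplied."""
    return [e for e in entries
            if e.tier in NEEDS_SUPPORT and not e.support.strip()]
```

An empty result from undelivered means every inferential or hearsay opener in the log was paired with its chain or source.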

Week 4 — Timed-response integration drill

The candidate produces 20 full 45-second TOEIC-Link-style speaking responses with the requirement that every new claim carry a tier marker and that every inferential or hearsay marker be followed by its supporting chain or source. The week's output is a timed-response log against a band-23 baseline of 80% marker-coverage and a band-25 baseline of 95% marker-coverage.
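Marker coverage is read here as the fraction of new claims that carry a tier marker. That definition, and the function names below, are assumptions made for this sketch. The 80% and 95% thresholds are the baselines stated above.

```python
def marker_coverage(marked_flags):
    """Fraction of new claims that carried a tier marker (True = marked)."""
    if not marked_flags:
        return 0.0
    return sum(1 for flag in marked_flags if flag) / len(marked_flags)


def baseline_met(coverage):
    """Compare a coverage value with the 80% and 95% baselines from the Week 4 drill."""
    if coverage >= 0.95:
        return "band-25 baseline"
    if coverage >= 0.80:
        return "band-23 baseline"
    return "below band-23 baseline"
```

For example, a response log with 17 marked claims out of 20 gives a coverage of 0.85, which clears the band-23 baseline but not the band-25 one.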

Evidential marking under cognitive-budget pressure

The 45-second TOEIC Link speaking response is cognitive-budget-tight, and any additional inventory the candidate is asked to deploy comes out of the budget that needs to go to content generation. The four-week drill works specifically because it pushes the evidential marking down into automatic deployment, where it costs zero marginal cognitive budget. A candidate who has completed the drill produces the tier marker as part of the response opening with no conscious selection, and the response then carries the stance-management property that lifts it from band-22 to band-25.

Evidential marking interacts with the broader speaking analogical reasoning and source-domain mapping discipline because analogical responses frequently mix evidential sources: the speaker may have direct observation of one domain and only inferential or hearsay access to the mapped target. The candidate who has automated evidential marking will surface the mixed-source structure cleanly. The candidate who has not will collapse the distinction and produce a band-22 outcome on what would otherwise be a band-25 analogy.

Closing — Evidential discipline as a band-25 marker

Evidential source-marking is one of the cleanest illustrations of the TOEIC Link speaking principle that above band-22, the rater is no longer scoring whether the candidate can produce content but whether the candidate can manage the stance the content carries. The three-tier inventory (direct / inferential / hearsay) is one such stance-management dimension. Installing it over four weeks produces a robust band-25 floor on the small but recurring set of speaking prompts that test calibrated assertion, and the broader epistemic literacy improves performance on adjacent items (hedging, modality, conditional concession) that share the stance-marking substrate.

For adjacent speaking-module disciplines, see the speaking presupposition trigger and shared-knowledge anchoring discipline guide and the speaking objection articulation and counterargument formulation discipline guide.