Common AI mistakes to avoid in Craig prompts
A guide to the errors AI models make most often—and how to prevent them when writing instructions for Craig.
Craig runs on a smaller, faster AI model. These mistakes show up more often on smaller models, but every AI model can make them.
This doc has two parts. Part 1 lists mistakes you must fix in your own prompt. Part 2 describes protections Craig already has built in—read it so you don't accidentally repeat a rule or weaken it.
Part 1 — Mistakes you need to prevent yourself
Each mistake covers what goes wrong, why it happens, how to fix it, and BAD/GOOD examples.
1. Don't ask the AI to do math
What goes wrong. A rule like "only if the current time is within 60 minutes of the boundary" fires whether or not the time is actually close. A section that should be hidden gets included anyway.
Why it happens. The AI does not reliably calculate (boundary - current_time) and compare it to a limit. Instead, it notices that the prompt mentions a time, the tool result mentions a time, and the rule mentions a time—so it includes the section. The math check gets skipped.
How to fix it. Rephrase time windows as range checks using text the AI can read directly.
BAD: Show the storm advisory when the minutes from current local time
to the expected storm arrival is between 0 and 60 inclusive.
GOOD: Show the storm advisory only when current local time, read as
HH:MM, falls inside the window from [[advisory window start]] to
[[advisory window end]] (this hour-long window is precomputed and
passed in as part of the tool result).
Outside the window, leave out the advisory.
If the check is more complex than a simple range, ask the tool to precompute the window or a true/false flag and pass that in as a field the AI can read directly. Don't make the AI do math.
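The precomputation this fix recommends can live in the tool itself. A minimal sketch, assuming a hypothetical tool result schema with `advisory_window_start`, `advisory_window_end`, and `advisory_active` fields:

```python
from datetime import datetime, timedelta

def build_advisory_fields(storm_arrival: datetime, now: datetime) -> dict:
    """Precompute the advisory window tool-side so the AI only reads text.

    Field names here are illustrative; adapt them to your tool's schema.
    """
    window_start = storm_arrival - timedelta(minutes=60)
    return {
        # The prompt can compare HH:MM strings against these directly.
        "advisory_window_start": window_start.strftime("%H:%M"),
        "advisory_window_end": storm_arrival.strftime("%H:%M"),
        # Even simpler: a single true/false flag the prompt checks verbatim.
        "advisory_active": window_start <= now <= storm_arrival,
    }
```

The prompt then only reads HH:MM strings, or better, checks the single `advisory_active` flag; no arithmetic is left to the AI.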
2. Conflicting rules cancel each other out
What goes wrong. Section A says "use a comma list." Section B says "never use a comma list." The AI uses commas in both places—or in neither—but never follows the distinction.
Why it happens. When two rules conflict inside the same prompt, the AI resolves the conflict by picking one rule and applying it everywhere. The difference between "this section" and "that section" is weaker than the AI's drive to be consistent.
How to fix it. Look through your prompt for the same idea written two different ways. Have one format rule that applies everywhere. If two sections truly need different formats, give them clearly different names and write each format on its own.
BAD: Current observation rows: comma-separated fields per station.
"[[station]]: [[wind]], [[temp]], [[conditions]]"
...
Forecast rows: one line per forecast period.
Never use a comma list.
GOOD: OUTPUT FORMAT (used for both observations and forecast):
"[[label]]: [[field]], [[field]], ..."
One line per station for observations; one line per period for
forecast. Fields inside a line are joined by ", ".
3. The AI copies details from your examples
What goes wrong. You provide an example with real-looking numbers to show the line shape. The AI outputs those exact numbers—or labels copied from the example—in production, even when the real data is different.
Why it happens. Examples are powerful, and they leak. The AI copies labels, names, values, and times from examples instead of reading them from the data source. This is especially likely when the example uses realistic data instead of blank placeholders.
How to fix it. In examples, use abstract placeholders like [[height]] or [[station]], not real numbers. If you must use real data, pull it straight from the same source the AI will read at runtime so copying it does no harm.
BAD: Format response as "The flow rate is currently 10 m3/s and the
water level is 1.9 m".
(The real values "10" and "1.9" are sticky. The AI will sometimes
write "10 m3/s" or "1.9 m" even when the actual tool result is
14.5 m3/s and 0.9 m.)
GOOD: Format response as "The flow rate is currently [[flow]] m3/s and
the water level is [[water level]] m".
BAD: Format as "The tide is 3m rising, the next high is 3m at [[time]]".
(The real number "3m" appears twice; the AI latches onto it.)
GOOD: Format as "The tide is [[height]]m [[rising/falling]], the next
[[high/low]] is [[next height]]m at [[next time]]".
Two rules to follow every time: (a) use [[field]] for every placeholder—pick one syntax and stick to it, and (b) never put a real data value where a placeholder belongs.
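Both rules are mechanical enough to lint for before a prompt ships. A rough sketch that flags real values and stray placeholder syntaxes in a format example (the regexes and rule names are illustrative, not part of Craig):

```python
import re

def check_template(template: str) -> list[str]:
    """Flag real data values left in a prompt's format example.

    A crude lint: after stripping [[...]] placeholders, any remaining
    digit is probably a real value the AI could copy into production.
    Unit strings like "m3/s" will trip it; whitelist those if needed.
    """
    problems = []
    stripped = re.sub(r"\[\[[^\]]*\]\]", "", template)
    if re.search(r"\d", stripped):
        problems.append("real number outside a placeholder")
    # Mixed placeholder syntaxes ({x}, <x>) violate "pick one and stick to it".
    if re.search(r"\{[^}]*\}|<[^>]*>", stripped):
        problems.append("placeholder syntax other than [[field]]")
    return problems
```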
4. "Do not" commands are weak
What goes wrong. You write "the data's mention of X is NOT a trigger." The AI still treats any mention of X as a trigger.
Why it happens. Negative instructions ("do not...", "this is not...") ask the AI to ignore a word it has already noticed. The word usually wins. Negation is one of the weakest ways to guide an AI.
How to fix it. Replace "do not X" with a positive description of what to do instead. Better yet, design the prompt so the wrong word never appears in the first place. If you can state the rule without naming the token that triggers the mistake, do that.
BAD: Do not include the water temperature, humidity, and air pressure.
Report wind speed in knots.
(The forbidden list names the exact fields the AI is then primed
to mention. Positive framing is shorter and harder to fail.)
GOOD: Output these fields only, in this order:
tide, wind, sky conditions, forecast.
Use knots for wind speed.
BAD: Do not include lift status information. Put traffic information first.
Get the current conditions from https://example.com/conditions.
(Opens with a negative; the lift-status concept is now primed.
The most important instruction—"Put traffic information first"—is
buried in the middle where it's easy to miss.)
GOOD: Start with traffic. Then weather, wind, cloud cover, temperature,
snowpack, avalanche hazard, sourced from
https://example.com/conditions.
Leave out anything else from the page.
BAD: The forecast's mention of "thunderstorms" is NOT a trigger for the
Warning line; the Warning line is for items in the `warnings` array.
GOOD: Write a Warning line ONLY when the tool result's `warnings` array
has items in it, one line per entry.
Rule of thumb: if a single instruction names more than one forbidden item (A, B, and C), replace it with a positive list of what to include. The negative form primes every item in the list to appear.
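When the trigger really is a data field, the cleanest fix is to let the tool build the lines itself. A minimal sketch, assuming a hypothetical tool result that holds its warnings in a `warnings` array of strings:

```python
def warning_lines(tool_result: dict) -> list[str]:
    """One 'Warning: ...' line per entry in `warnings`; none otherwise.

    The trigger is the array's contents, never words like
    "thunderstorms" appearing elsewhere in the forecast text.
    """
    return [f"Warning: {w}" for w in tool_result.get("warnings", [])]
```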
5. When unsure, the AI plays it safe and includes everything
What goes wrong. "Show the optional section only when X"—but in practice the section appears most of the time, even when X is false.
Why it happens. The AI hedges. If it is unsure whether X applies, it includes the content rather than leave it out—omitting feels like a bigger mistake than including. This is especially common in emergency and safety settings where other parts of the prompt stress never skipping important information.
How to fix it. Make the default behavior explicit and keep the exception separate: state the "normal case" and the "special case" as two different sentences. Even better: redesign so the default is "always show" (no decision needed) and remove the conditional entirely.
BAD (single-sentence conditional):
Show the high-tide caution only when within 30 minutes of high tide.
BAD (separated, but still math — see mistake #1):
The high-tide caution is OFF by default. Turn it ON only if both are true:
- You can compute the minutes from current local time to the next
high-tide event.
- That value is between 0 and 30 inclusive.
GOOD (range check):
Default: show the tide observation only.
Exception: show the high-tide caution ONLY when current HH:MM is in
[[caution window start]] through [[caution window end]] (passed in as
part of the tool result).
GOOD (no decision at all):
Always show the tide observation and the next high-tide time and height.
If the conditional is genuinely hard to express, "no decision at all" is the cheap escape hatch—just always show the optional content.
A common variation: vague conditional output specs.
BAD: Format: GO / CAUTION / NO-GO per condition, TLDR summary line at
the beginning.
("per condition" is vague—per which conditions? The AI invents
its own list and drifts between runs.)
GOOD: Format: a TLDR line at the top, then one line per item from this
exact list:
- Icing
- Wind
- Ceiling
- Visibility
- Density altitude
- Turbulence
Each line: "[[item]]: [[GO/CAUTION/NO-GO]] - [[reason]]".
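If the GO/CAUTION ratings already exist as structured data, the tool can render the whole block itself and leave the AI nothing to invent. A sketch of the exact shape above, where the `ratings` structure is a hypothetical tool output:

```python
def render_conditions(tldr: str, ratings: dict[str, tuple[str, str]]) -> str:
    """Render a TLDR line plus one line per item from a fixed list.

    ratings maps each item to a (GO/CAUTION/NO-GO, reason) pair; the
    item list is hard-coded, so the AI never invents its own.
    """
    items = ["Icing", "Wind", "Ceiling", "Visibility",
             "Density altitude", "Turbulence"]
    lines = [tldr]
    for item in items:
        status, reason = ratings[item]
        lines.append(f"{item}: {status} - {reason}")
    return "\n".join(lines)
```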
6. The last instruction wins
What goes wrong. Two rules contradict and the later one wins. Or the AI overemphasizes the last few bullets of a long instruction block.
Why it happens. The AI naturally pays more attention to recent text. Long prompts also push earlier content out of focus.
How to fix it. Put the most critical, non-negotiable rules at both the top and near the end (where the AI's focus is strongest). Keep prompts as short as possible; every extra paragraph weakens the rules.
BAD: Do not mention parking information. Put traffic information first.
Get the current conditions from https://example.com/conditions.
(The negative comes first—early attention grabs it. And the most
important instruction, "Put traffic information first," is buried
in the middle where it's easy to miss.)
GOOD: Start with traffic.
Then: road status, weather, wind, visibility, temperature.
Source for the on-mountain conditions:
https://example.com/conditions
Show only the items listed above—leave out anything else from the page.
(Most important item first, supporting items second, source third,
scope limit last—the AI's natural focus keeps the scope limit active
through the final output.)
7. The AI invents plausible-looking details
What goes wrong. The output contains a time, ID, or value that does not appear in any tool result. The invented detail looks realistic and fits the expected shape, which makes it especially hard to spot.
Why it happens. When the AI must fill a slot in a strict template (a timestamp, an ID) and does not have the right value at hand, it makes up a believable one rather than leave the slot empty. Craig already omits individual fields that are null or empty automatically (see Part 2). But when your prompt assumes a record or entry exists that isn't there, the AI may invent an entire row or value to satisfy the template.
How to fix it. Limit the allowed values explicitly. For IDs and station numbers, require the AI to copy from a specific field by name. If the source record or entry is absent, the output should skip the line—not fabricate a placeholder.
A real failure: a prompt asked for the next tide event; the tool returned four events for the day. The output included a fifth line, "Next low at 06:30", a believable time that was not in the data, invented because the output format implied "show one row per quarter."
BAD: Show the next event as "Next [[high/low]] at [[time]]".
(No limit on [[time]]. The AI can make up a time if the tool
didn't return enough events to fill the implied shape.)
GOOD: Show the next event as "Next [[high/low]] at [[time]]", where
[[high/low]] and [[time]] are copied exactly from the tool result's
`events[0]` entry. If the tool result has no `events[0]` record,
skip the line entirely. Do not invent a placeholder row just to
fill the template.
Same pattern for station IDs and observation timestamps:
BAD: Show the station ID from the observation.
(Too vague—"from the observation" doesn't say which field.)
GOOD: Show the [[station id]] by copying the value of the tool result's
`msc_id` field exactly. If the tool result does not include `msc_id`,
skip the station_id line entirely—do not invent or substitute an ID.
8. The AI tries to do math for you
What goes wrong. The source gives individual fields. The output contains a value that was calculated from them—a duration, a difference, a trend—even though the calculated value never appears in any tool result.
Why it happens. Calculated values feel like inference, not invention, to the AI. Craig's base "no invention" rule is read as "don't make up data"; the AI does not see arithmetic on existing data as making things up. When your prompt asks for something the tool does not directly return (for example, "rising or falling"), the AI will quietly calculate it.
How to fix it. Ask only for values the tool actually provides. If you need a calculated value (a trend, a countdown, a difference), get it added to the tool's output rather than asking the AI to compute it.
BAD: Format as "The tide is 3m rising, the next high is 3m at [[time]]".
("rising" is calculated. If the tool returned only the current height
and the next event but no `trend` field, the AI computed it.
Sometimes wrong, especially near the turn of the tide.)
GOOD: Format as "The tide is [[height]]m, [[rising/falling]]".
Take [[rising/falling]] only from the tool result's `trend` field.
If that field is missing, leave out ", [[rising/falling]]" entirely.
BAD: Include whether the river is rising or falling.
(Implicit calculation: the AI compares current flow to a "recent"
value—but which one? The AI picks inconsistently.)
GOOD: Include the river's [[rising/falling]] from the `trend` field if
present. If not present, leave it out.
BAD: Tide 2.3m rising. Next high tide in 3 hours 17 minutes.
(The countdown is calculated. The tool returned `next_high_time`
but not a duration.)
GOOD: Tide 2.3m rising. Next high tide at 14:30.
(Show the exact time, not a calculated countdown.)
Part 2 — What Craig already handles for you
Every Craig request runs through built-in instructions that enforce a set of constraints before your prompt ever reaches the AI. You do not need to repeat these in your prompt—and repeating them can backfire, because conflicting rules cancel each other out (see mistake #2).
Read this section so you know what's already taken care of, and what to avoid accidentally weakening.
Warnings are always included
If any tool result contains an active warning, Craig must write a Warning: ... line for each one. Even when the output gets long, warnings are kept. You don't need to write "always include warnings" in your prompt.
Values are copied exactly as written
Numbers, units, names, IDs, and timestamps are reproduced exactly from the tool result. If a tool returns wind_speed_kmh: 12, the output contains 12, not "around 10" or "12-ish km/h." You don't need to write "report values exactly" or "do not paraphrase numbers."
Empty fields are skipped
If a tool returns wave_height_m: null, Craig will not write "0m waves" or "no wave data." The field simply doesn't appear. You don't need to write "skip empty fields" or "do not include unavailable data."
Data stays in its own lane
When multiple tools return data—for example, weather at three different stations—each station's line uses only its own values. Wave data from a marine buoy will not appear on a coastal weather station's line. You don't need to write rules about keeping station data separate.
Plain text only
Craig output is SMS-safe plain text by default. No **bold**, no - bullets, no # headings, no backticks, no emoji. You don't need to write "no markdown" or "plain text only."
How to avoid weakening these protections
Two things to watch for:
- Don't write format examples in markdown. If your prompt contains **Station**: <reading>, the AI can read your instruction as overriding the plain-text rule. Stick to plain text and the [[field]] placeholder style.
- Don't repeat a protection in weaker words. If you write "always include warnings" but Craig's built-in rule already says the same thing more strongly, your weaker version can draw attention away from the original. Trust the built-in rule and don't repeat it.
If you find yourself wanting to repeat a rule Craig already handles, ask whether your real concern is actually one of the Part 1 mistakes in disguise.
Quick checklist when writing a Craig prompt
- Does any rule depend on math? Rephrase as a range check.
- Are two rules covering the same idea with different wording? Merge them.
- Does the example use blank placeholders like [[field]], not real numbers it might copy?
- Is the default behavior the common case, with exceptions saved for real edge cases?
- Are "do not" instructions paired with a positive description of what to do instead?
- Does every placeholder name the exact tool field, with a fallback for when it's missing?
- Does your prompt accidentally repeat a rule Craig already handles?
- Have you tested the prompt several times with real data to confirm the output is consistent?