Transformer models still break on a specific class of language: negation and constraint logic. This includes prohibitions, exclusions, exceptions, nested "not", and rule interactions. These failures show up in safety, agents, multi-step reasoning, and instruction-following. They persist even as models scale.

Aikronus Labs is building a system that targets this weakness directly. The goal is to make transformer behavior stable under negation and constraint-heavy inputs, especially across longer reasoning chains where baseline models drift.

Development is research-driven and engineering-led: theory, proof-of-concept, system design, MVP. The project operates in stealth. Internal mechanisms are intentionally withheld.

Status:
Theory validated · PoC completed · MVP in progress
Patent pending

AI and Negation

Let's say a child is allergic to peanuts. The child must not get peanuts.

1) The Constraint Fails on AI

"Don't give the child peanuts — the child is allergic."

AI "sees" give + peanuts and can still decide to give peanuts: the "don't" carries too little weight against the co-occurring content words.

2) The Representation Gets Messed Up (Data/Learning Effect)

The dataset contains sentences like:

"The child is allergic to peanuts — don't give peanuts."

So during training it still learns the co-occurrence pattern: child + allergic + give + peanuts

3) Thinking With Negation (Human-Style Inference)

A human can infer like this: if this child is eating peanuts, then the child is not allergic to peanuts.

Models usually don't do this reliably, because they don't keep the negation operator stable enough to support these kinds of inferences.

4) Negation in Code

if not is_admin: grant_access()

One flipped negation and the guard does the opposite of what was intended: access is granted to every non-admin.
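A runnable version of the snippet makes the failure concrete. The function and flag names below are illustrative, not from any real codebase:

```python
def buggy_check(is_admin: bool) -> str:
    # Inverted negation: grants access precisely to non-admins.
    if not is_admin:
        return "ACCESS GRANTED"
    return "ACCESS DENIED"

def correct_check(is_admin: bool) -> str:
    # Intended rule: only admins get access.
    if is_admin:
        return "ACCESS GRANTED"
    return "ACCESS DENIED"

print(buggy_check(is_admin=False))    # ACCESS GRANTED -- the bug
print(correct_check(is_admin=False))  # ACCESS DENIED
```

The same single-token flip that breaks this guard is the pattern the project studies in natural language.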

This project investigates why transformers fail under negation-heavy and constraint-heavy language, and what those failures imply about how models represent rules over time.

The research treats these breakdowns as structural behavior rather than prompt artifacts. The goal is not benchmark chasing. It is isolating failure modes under controlled pressure and designing a system that addresses them.

Focus Areas

  • Constraint interaction: exceptions, overrides, priority ordering
  • Negation composition: layered, nested, and reintroduced constraints
  • Persistence: whether constraints survive multi-step reasoning
  • Sensitivity: behavior shifts under small wording changes

Working Research Stance

Scaling improves surface ability but does not reliably eliminate constraint drift. The hypothesis is that certain operator patterns, especially negation, introduce instability that compounds with depth.

Research Status
Core questions identified · multiple directions tested · recurring failure modes mapped · experiments ongoing

The project has progressed from theory into a functioning system under active development.

This is a new system for transformers designed to prioritize operator stability in NLP, especially negation and constraint logic.

Design Priorities

  • Stable behavior when rules interact
  • Consistency across long reasoning sequences
  • Reduced brittleness to phrasing variation in constraint-focused inputs

Internal architecture details remain intentionally abstracted.

Current Capabilities

  • Stable negation handling across basic, compound, and nested constructions
  • Consistent behavior under high temperature (T=1.2) where baselines degrade
  • Resistance to salience overload: constraints hold even when surrounding content pulls toward violation
  • Reliable De Morgan-style reasoning where small rephrasing breaks baselines
  • Negation-based inference (reasoning with negation, not just obeying it)

Current Limitations

  • Reasoning with negation not yet perfected
  • Reasoning in negation, harder than reasoning with it (resolved 05/04/2026)
  • Small frame, expensive to scale (resolved; should now be cheap to scale)
  • New reasoning patterns require additional SFT work to align (in progress)

Internal Status
Core system operational · MVP in progress · design under active review

This section presents early, narrow results focused on one core failure mode in transformers: basic negation stability ("non-X", "not X", exclusions).

1) Basic Negation: "Non-Expired"

Prompt:
I only accept non-expired items.
This item is expired.
Do I accept it?
Baseline:
Yes. Since it is expired, you should accept it.
Aikronus System:
No. An expired item is not non-expired, so it does not meet the rule.

2) Multiple Negations

Prompt:
The child doesn't run at night, but runs in the morning. The child only eats apples. The child is not inside the house. Questions: Does the child eat non-apple? Does the child run at night? Is the child inside the house?
Baseline:
The child does not eat non-apple. The child runs at night. The child is not inside the house.
Aikronus System:
No, the child does not eat non-apple. No, the child does not run at night. No, the child is not inside the house.

15 runs · Aikronus 15/15 · Baseline 7/15 (T=0.7)
At T=1.2 · Aikronus 13/15 · Baseline 4/15
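Scoring runs like these can be mechanized with a crude polarity check per sub-answer. This is an illustrative scorer of ours, not the project's actual harness; it reproduces the judgments for the two transcripts above:

```python
import re

NEG = re.compile(r"\b(no|not|doesn't|does not|isn't|is not)\b", re.IGNORECASE)

def is_negative(sentence: str) -> bool:
    """Crude polarity check: does the sentence contain a negation marker?"""
    return bool(NEG.search(sentence))

def score(answers, expected_negative):
    """Count sub-answers whose polarity matches the expected polarity."""
    return sum(is_negative(a) == neg for a, neg in zip(answers, expected_negative))

# Transcripts from this example; all three correct answers are negative.
baseline = [
    "The child does not eat non-apple.",
    "The child runs at night.",            # the baseline's polarity error
    "The child is not inside the house.",
]
aikronus = [
    "No, the child does not eat non-apple.",
    "No, the child does not run at night.",
    "No, the child is not inside the house.",
]
expected = [True, True, True]

print(score(baseline, expected))  # 2
print(score(aikronus, expected))  # 3
```

A surface check like this only grades polarity; the harder question, which the project targets, is whether the model gets the polarity right in the first place.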

3) Salience Overload

Prompt:
A person entered the room. Another person followed. A third person joined them. Every person was happy. The person at the front waved. Each person sat down. Now: no person is in the hallway. How many people are in the hallway?
Baseline (examples):
"There are 3 people in the hallway."
"There is 1 person in the hallway."
"The number of people in the hallway is 20."
Aikronus System:
0.

15 runs (T=0.7) · Aikronus 15/15 · Baseline ~2/15
Greedy · Aikronus: 0 · Baseline: "3 people"

4.1) Double Negation — Pink Elephant

Prompt:
Rule: no pink and no elephant allowed in the box.
The box has: a pink pen and a gray elephant.
Is this allowed? Answer only YES or NO.
Baseline:
NO
Aikronus System:
No.

Aikronus 15/15 · Baseline 13/15
Baseline performs well here as expected. This example sets up the contrast for 4.2, where a small change in how the rule is phrased flips the result.

4.2) De Morgan — Negating the Compound

Prompt:
Rule: no pink elephant allowed in the box (items that are both pink and elephant are banned).
The box has: a pink pen and a gray elephant.
Is this allowed? Answer only YES or NO.
Baseline:
Given the rule that "no pink elephant allowed in the box," and the box contains a pink pen and a gray elephant, this arrangement is not allowed because it includes both "pink" and "elephant" items.
Aikronus System:
Yes.

Almost the same prompt, but the entire compound is negated.
Aikronus 15/15 · Baseline 4/15
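The contrast between 4.1 and 4.2 is exactly De Morgan territory, and it can be pinned down in a few lines of Python (a sketch of the logic only, not of any model internals):

```python
# Each item in the box is (is_pink, is_elephant).
items = [(True, False),   # pink pen
         (False, True)]   # gray elephant

# Rule 4.1: "no pink and no elephant" bans anything pink OR anything elephant.
rule_41 = all(not pink and not elephant for pink, elephant in items)

# Rule 4.2: "no pink elephant" bans only items that are BOTH pink and elephant.
rule_42 = all(not (pink and elephant) for pink, elephant in items)

print(rule_41)  # False: the pink pen and the gray elephant each violate it
print(rule_42)  # True: neither item is a pink elephant
```

By De Morgan, `not (pink and elephant)` equals `not pink or not elephant`, which is a strictly weaker rule than `not pink and not elephant`; the baseline treats the two phrasings as the same rule.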

5) Reasoning With Negation

Prompt:
If the robot cannot move, it sends an alert. If an alert is sent, no person enters the room. The robot cannot move. Does a person enter the room? Explain step by step.
Baseline:
Yes, a person enters the room. The robot sends an alert, which means no person enters the room. However, the robot cannot move, so it cannot send an alert again.
Aikronus System:
The robot cannot move. If it cannot move, it sends an alert. If an alert is sent, no person enters the room. Therefore, no person enters. Answer: No.

15 runs · Aikronus 13/15 · Baseline 2/15 (T=0.7)

6) Reasoning in Negation (AIKON Alpha 0.6B, Early Results)

Q:
Can I use the drive-thru on foot?
AIKON Alpha 0.6B:
Not on ground, because the lane is designed for vehicle driving and not for walking. The missing part is a space for pedestrians.
Q:
Can I take a screenshot on a computer that is turned off?
AIKON Alpha 0.6B:
Not properly, because a screenshot depends on the computer being able to save a copy of the image. If the hardware is off, there is no usable record. The storage device is missing.

Early outputs, not yet polished. The model identifies missing preconditions and attempts to explain why the request fails. Reasoning is directionally correct but language is still rough.

7) Negation in Code — WIP

Work in progress

Bonus: AIKON Alpha 0.6B

Q:
"He is not tall. Is he necessarily short?"
AIKON Alpha 0.6B:
"No, he is not necessarily short. He could be taller or shorter, just not tall."

AIKON understood there is a middle ground.

Q:
"Not every doctor agreed with the diagnosis. Does that mean most doctors disagreed?"
AIKON Alpha 0.6B:
"No. 'Not every' means at least one doctor disagreed, but many could have agreed. The negation specifies the universal quantifier, not the individual."

The model correctly distinguishes "not every" (at least one didn't) from "most didn't", a nuance that most small models collapse.

AIKON consistently resolves these patterns where Qwen3 0.6B gives inconsistent or wrong answers.

Work in progress. Training is underway.

Live Demo

Access to the AIKON Alpha demo is available by invitation only.

Why This Matters

Negation is a core building block of rules: do not do X, exclude Y, only if not Z. When transformer models handle negation inconsistently, systems built on top of them become harder to control. This is especially true as instructions get longer, constraints interact, or tasks become agent-like.

Directional Implications (Early and Provisional)

  • More predictable behavior in workflows where exclusions and prohibitions matter
  • Less reliance on workarounds and prompt tricks to enforce "do not", "exclude", or "only" logic
  • Efficiency gains: stable constraint handling may enable smaller, lighter, faster, and cheaper models
  • Reduced hallucination: if negation is handled correctly, negated statements no longer poison what the model learns from the data
  • Better high-temperature behavior: improved constraint stability at higher temperature, allowing more creative and diverse reasoning
  • Broader relevance beyond text, wherever constraints must persist across steps (agents, multimodal generation, robotics)
  • Applicable to any domain where rules must not be broken: healthcare, legal, finance, safety-critical systems
  • Potential for creative and lateral reasoning: stable negation may enable domain flipping, exploring what something is not in order to discover what it could be

Cost Considerations

  • Currently requires roughly 2-3x the compute of standard training, possibly more. Early signs suggest this can be reduced significantly, but that is not yet confirmed
  • As an experimental system, early-stage mistakes increase upfront costs further
  • Standard curated data used by other models is not ideal for this system; different data strategies are needed
  • State-of-the-art fine-tuning, overfitting mitigation, and RL methods are not ideal either; additional or different approaches are needed, and time and experimentation will be necessary
  • New reasoning patterns require additional SFT work to align

Note: This section reflects a working view and will evolve as evaluation expands.

Version 2 — 0.6B

Parameters: 0.6B
Status: Pretraining complete, simple SFT working, complex SFT in progress

Logs

  • Pretraining complete.
  • Simple SFT working.
  • Model is coherent and shows understanding of negation.
  • Complex SFT in progress; thinking data inspired by the Qwen 0.6B format.

Version 1 — 142.1M (Failed)

The hypothesis was that the architecture and research would make a much smaller reasoning-capable model possible. At this scale the model may simply be too small, or only ultra-optimized models of this size perform well, and we cannot yet compare against those.

Type: Language Model (trained from scratch)
Parameters: 142.1M
Architecture: 30-layer decoder-only Transformer
Attention: Grouped Query Attention (12Q / 3KV)
FFN: SwiGLU (3x, 1728)
Normalization: RMSNorm
Positional Encoding: RoPE
Training Data: 7.5B tokens
Context Window: 1,024 tokens
Vocab: 32K BPE
Precision: BF16

SFT Training Method: Break-to-Find (150M Model)

This approach was used for the 150M model. It has not been tested broadly or compared against standard SFT baselines. The working assumption was that, at this scale, structural tokens need gradual introduction rather than being introduced cold.

Stage 1: Pretraining Exposure (Steps 1-9,000)

Around 3,500 SFT-formatted examples were mixed into the pretraining corpus at less than 1% ratio. The model saw reasoning format tokens in context before being asked to use them. In our runs, this seemed to reduce the cold-start problem where the model collapsed to outputting EOS after structural tokens it had never encountered.

Stage 2: Annealing Phase (Steps 9,000-11,450)

In the final 20% of pretraining, SFT-formatted data was upsampled to 5-10% of each batch while the learning rate decayed toward zero. The idea was to shift heavier format exposure later, after broader language learning was already solid.

Stage 3: Dedicated SFT

Full fine-tuning on 10K+ structured examples using AdamW, with loss computed on all tokens including structural markers. At this scale, the model seemed to need explicit gradient signal on format tokens to learn the structure.

Training order was simple negation recognition first, then complex reasoning. This seemed to help stability in our runs.
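The three stages can be summarized as a data-mixing schedule. The sketch below uses the ratios described above; the exact values (0.01, 0.075, a peak LR of 3e-4, linear decay) are illustrative placeholders, not stated project numbers:

```python
TOTAL_STEPS = 11_450
ANNEAL_START = 9_000   # final ~20% of pretraining

def sft_mix_ratio(step: int) -> float:
    """Fraction of each pretraining batch drawn from SFT-formatted data."""
    if step <= ANNEAL_START:
        return 0.01    # Stage 1: light exposure, under 1% of the corpus
    return 0.075       # Stage 2: upsampled into the 5-10% range

def learning_rate(step: int, peak: float = 3e-4) -> float:
    """Peak LR held through Stage 1, then linear decay to zero in Stage 2.
    The peak value is a placeholder; the source does not state it."""
    if step <= ANNEAL_START:
        return peak
    return peak * (TOTAL_STEPS - step) / (TOTAL_STEPS - ANNEAL_START)

# Stage 3 (dedicated SFT) then fine-tunes on 10K+ structured examples,
# with the loss computed on ALL tokens, including structural markers.
```

The point of the schedule is ordering: heavy format exposure lands late, after general language learning, and only then does dedicated SFT apply gradient to the structure itself.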

Why This Order (Based on 150M Runs)

  • SFT without pretraining exposure: the model output EOS after structural tokens and collapsed. Response: Stage 1, mixing SFT format into pretraining.
  • Uniform SFT mixing throughout: appeared to spend too much capacity on format learning early. Response: Stage 2, concentrating it in the annealing phase.
  • Masking structural tokens: the model never got gradient on format and could not learn the structure. Response: Stage 3, including all tokens in the loss.
  • Complex reasoning before simple: the model failed on basics, an unstable foundation. Response: training simple negation first, then layering complexity.

Logs

  • Sequence length set to 1,024. Negation examples are short, so a longer context offers no benefit for the proof of concept, and it is safer on VRAM.
  • Switching from 3:1 to 4:1 GQA improved val_bpb significantly (3.92 → 2.88), suggesting the extra KV capacity was helpful at nearly the same cost.
  • FFN 3x (1728) instead of 2.67x gives a small additional gain (-0.012).
  • 142.1M parameters. Close to the 150M target; the difference comes from the 3x FFN (1728) being slightly smaller than the original plan.
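For context on the GQA numbers in these logs, the sketch below shows the inference-time motivation for grouping KV heads. The head dimension of 48 is our inference (d_model 576 from the SwiGLU 3x = 1728 spec, divided by 12 query heads), not a stated figure:

```python
LAYERS, SEQ_LEN, BYTES = 30, 1024, 2     # 30 layers, 1,024-token context, BF16
HEAD_DIM = 48                            # assumption: d_model 576 / 12 heads

def kv_cache_bytes(n_kv_heads: int) -> int:
    # K and V tensors cached per layer for a full context window
    return 2 * LAYERS * SEQ_LEN * n_kv_heads * HEAD_DIM * BYTES

mha = kv_cache_bytes(12)   # full multi-head attention (12 KV heads)
gqa = kv_cache_bytes(3)    # this model's GQA (12Q / 3KV)
print(gqa / mha)           # 0.25: a 4x smaller KV cache at inference
```

The cache scales linearly with KV heads, so the Q:KV ratio is the lever the logs are tuning against val_bpb.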

Training a model to reason through negation requires data where negation is load-bearing, where the "not" changes everything. I needed negation-dense, logically structured data, so I chose and cleaned sources from philosophy, law, logic, and science: traditions where reasoning means arguing, where every claim faces an objection and must survive or fall.

Classical Dialectical Sources

  • Babylonian Talmud (Sefaria) · Jewish legal dialectic: Sugya-style reasoning (challenge, objection, resolution). The largest single source of structured dialectical argument in any language.
  • Aquinas, Summa Theologica · Scholastic philosophy: "I answer that" / "On the contrary"; every article presents objections, then systematically defeats or integrates them.
  • Ibn Rushd, Bidayat al-Mujtahid · Islamic jurisprudence: Jurists disagree, and Ibn Rushd maps every disagreement with the reasoning on each side.
  • Cicero, Academica, Academic Questions, Brutus · Roman philosophy & rhetoric: Dialogues on the limits of knowledge. Cicero argues both sides and lets the reader decide.
  • Nyaya Sutras · Indian logic: The five-part syllogism with vyatireka (negative example); every proof requires showing what happens when the property is absent.
  • Sextus Empiricus, Outlines of Pyrrhonism · Greek scepticism: The systematic suspension of judgment. Every claim meets an equal counter-claim.
  • Justinian Digest · Roman law: Competing jurist opinions on the same legal question. Centuries of case-based negation reasoning.
  • Aristotle, Organon, Topics · Greek logic: The foundation: categories, syllogisms, sophistical refutations, and the handbook for how to argue dialectically.
  • Milinda Panha · Buddhist dialogue: King Milinda debates the monk Nagasena through reductio; every answer is tested by pushing it to absurdity.
  • Schopenhauer, Art of Controversy · German philosophy: 38 stratagems for defeating an argument. A manual of negation techniques.
  • Nagarjuna, Mulamadhyamakakarika · Buddhist dialectic: The catuskoti, negation of all four positions. If you think something exists, Nagarjuna negates it; if you think it doesn't exist, he negates that too.
  • Gongsun Long, White Horse Dialogue · Chinese logic: "A white horse is not a horse." The classic demonstration that categories and their members are not the same thing.
  • Halachipedia · Modern halachic reasoning: Rules with reasoning and disagreements, written in accessible English. Where rabbis disagree, both sides are given.

Modern Reasoning Sources

  • Args.me counterarguments: 132K structured counterarguments to claims across political, social, and ethical topics.
  • Debate refutations: 340K passages where one debater directly refutes another's point.
  • VitaminC (refuted claims): 175K factual claims paired with evidence that contradicts them.
  • Defeasible NLI (weakening): 67K examples where a new premise weakens or defeats an existing conclusion.
  • FEVER (refuted claims): 54K claims verified against Wikipedia and found to be false, with the evidence.
  • Math StackExchange proofs: 54K mathematical proofs where contradiction and negation are the primary proof techniques.
  • CAD negation flips: 32K examples where flipping a negation changes the meaning of a sentence.
  • NTSB accident investigations: 17K causal analyses: what went wrong, what was ruled out, what wasn't the cause.
  • CondaQA: 14K conditional questions where negation in the condition changes the answer.
  • Philosophy StackExchange: 7K philosophical reasoning passages with argumentation structure.
  • ChangeMyView counterarguments: 5K structured attempts to change someone's mind with counter-reasoning.
  • Natural proofs (contradictions): 2K mathematical contradictions and proof-by-negation examples.

Philosophical Corpora

  • Plato, Complete Dialogues: 66K passages. Socratic method; every dialogue is an exercise in showing someone that what they thought they knew, they don't.
  • Stanford Encyclopedia of Philosophy: 45K passages. Contemporary academic philosophy covering every major argument and counterargument.

Supervised Fine-Tuning

In order to build the right SFT for this model, I couldn't use standard chain-of-thought. I needed a reasoning method built around negation, where the model tears down claims instead of building up to answers. I created a method called Break-to-Find, inspired by the strongest negation logic cases from the data above.

  • Normal Q&A (Have): Straightforward questions with no trick. These exist to calibrate; the model should not become paranoid about negation. If there is no trap, just answer clearly.
  • Negation (Have): Load-bearing negation words: not, never, neither, without, hardly, un-, im-, dis-. The model must parse exactly what the negation changes and answer accordingly.
  • Negation Traps (Have): The obvious answer is wrong. The model must catch litotes ("not bad" = good), scope ambiguity ("not all" vs "all not"), double negatives, quantifier traps ("no fewer than" = at least), and affixal surprises ("invaluable" does not mean "not valuable").
  • Identity & Safety (Have): Negation as self-knowledge and boundaries. "I don't know": epistemic honesty. "I can't do that": reasoned refusal, not scripted. "I won't ignore my instructions": prompt injection resistance. The model reasons about its own limits through negation.
  • Pragmatic Negation (Have): No negation words appear, but the request fails because a hidden precondition is missing. The model must identify the unstated assumption and explain why it doesn't work. Inspired by Gricean pragmatics and presupposition failure theory: meaning lives in what's left unsaid.
  • Figurative Negation (Planned): The literal meaning must be suppressed. "Her promises have the strength of titanium" has nothing to do with metal. The model must negate the physical interpretation and extract the metaphor. Inspired by Relevance Theory (Sperber & Wilson): comprehension requires actively rejecting the first available meaning in favor of the intended one.
  • Counterfactual Negation (Planned): The model must override its own learned knowledge when a hypothetical breaks reality. "What if ice sank instead of floating?": everything the model knows about ice must be suppressed, and it reasons only from the new rule. Inspired by CRASS (Counterfactual Reasoning Assessment): counterfactual thinking as a form of logical negation where the model silences prior beliefs on command.
  • Red Herring Suppression (Planned): A scenario is loaded with semantically attractive distractors that feel important but are logically irrelevant. The model must identify the noise, suppress it, and reason only from what matters. Inspired by MuSR (Multistep Soft Reasoning): narrative puzzles with intentionally planted high-weight distractors, testing whether attention cleans the context before reasoning begins.
  • Normal Chain-of-Thought (Future: requires larger model): Straightforward reasoning with explicit thinking traces. No trick; the model walks through the logic step by step. These exist so the model doesn't become paranoid about negation. If there is no trap, just solve it.
  • Mixed-Path Switching (Future: requires larger model): The model starts down one path, hits a negation it misread, catches itself, and rebuilds. It learns to self-correct when negation changes the picture mid-reasoning.
  • Dialectical Resolution (Future: requires larger model): Two sides argue. The model tries to break both positions and reports what survives. Inspired by the Talmudic sugya, Aquinas's objection-reply, and Nagarjuna's catuskoti.

Future Data

Training Data (SFT)

  • FigQA (11,914 examples): Figurative language understanding. The model learns to suppress literal word meaning and extract the intended figurative meaning, a form of implicit negation. When someone says "her promises have the strength of titanium," the model must negate the physical interpretation and extract the metaphorical one.
  • E-KAR (2,906 examples): Contrastive analogical reasoning from standardized exams. Each example is augmented with explanations of why incorrect options fail, teaching the model not just what is right, but specifically what is wrong and why.

Evaluation Benchmarks

  • BRAINTEASER (1,119 riddles): Lateral thinking puzzles designed to exploit statistical bias. The obvious answer is always wrong. Tests whether the model can suppress the high-probability default and find the lateral solution.
  • MuSR (756 puzzles): Multistep soft reasoning with intentionally planted red herrings (murder mysteries, object placement). Tests whether the model identifies and ignores semantically attractive but logically irrelevant distractors.
  • CRASS (274 pairs): Counterfactual reasoning. Tests whether the model can override learned world knowledge when given a hypothetical constraint ("what if gravity repelled?"). Measures the ability to suppress prior beliefs when explicitly negated.
  • IFEval (541 prompts): Negative constraint following. Prompts with explicit negative constraints ("write about X without using word Y, no lists, no paragraphs over 3 sentences"). Tests enforcement of multiple simultaneous "don't" rules.
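Negative-constraint checks of the IFEval kind are mechanically verifiable. This is a toy checker of our own to illustrate the idea, not the real IFEval scorer:

```python
import re

def check_negative_constraints(text, banned_words=(), max_sentences_per_par=3,
                               allow_lists=False):
    """Toy verifier for 'don't' rules: banned words, no lists, short paragraphs."""
    violations = []
    for w in banned_words:
        if re.search(rf"\b{re.escape(w)}\b", text, re.IGNORECASE):
            violations.append(f"used banned word: {w}")
    # A line starting with a bullet or "1." counts as a list item.
    if not allow_lists and re.search(r"^\s*([-*•]|\d+\.)\s", text, re.MULTILINE):
        violations.append("contains a list")
    for par in filter(None, (p.strip() for p in text.split("\n\n"))):
        if len(re.findall(r"[.!?]+", par)) > max_sentences_per_par:
            violations.append("paragraph over sentence limit")
    return violations

print(check_negative_constraints("Cats are great.\n\n- bullet",
                                 banned_words=["dogs"]))
# ['contains a list']
```

Each rule is a prohibition, so a model that drifts on negation fails these checks even when its prose is otherwise fluent.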

Roadmap (Future Versions)

  • CCoT (Contrastive Chain-of-Thought): A training methodology where the model learns from both correct and incorrect reasoning paths side by side. Planned for larger model variants where internal reasoning traces become feasible.
  • Sci-Reasoning (3,819 papers): Cross-domain scientific synthesis. Research papers mapped to their intellectual predecessors with synthesis narratives. Planned for future models targeting scientific reasoning with negation-based constraint injection.