Preference inference as infrastructure: the MEE framework

Agent personalization usually means hard-coded rules. Treating preference as a weighted, evidence-tracked signal layer changes what agents can do.

TL;DR: Most AI agents handle user preferences with hard-coded rules that break when preferences conflict, change, or were never stated. The MEE framework treats preference as a weighted, evidence-tracked signal layer that models infer and adjust over time. This transforms preference from brittle config files into auditable infrastructure.

Core answer:

  • Preferences are inferred from evidence across sessions and providers, not assumed from single statements
  • Each preference carries a weighted score based on specificity, evidence, recency, stability, and portability, with points subtracted for contradictions
  • Unstable signals trigger clarifying questions instead of guesses
  • Preference history is dated and auditable, not overwritten
  • The same inference engine runs across all models as shared infrastructure

The problem with if-statements

Ask teams how their agents handle user preferences and you'll get a list of if-statements.

The user likes terse summaries, so terse summaries get hard-coded. This holds until preferences conflict, change, or were never stated out loud.

The MEE framework starts from a different premise. Preference is something you infer from evidence and re-weight over time. Treating it as a first-class, weighted signal layer changes what agents do.

When rules meet reality

The if-statement version of preference looks reasonable until three things happen.

First, preferences conflict. You want terse summaries, except on unfamiliar topics where you need more context.

Second, they change. What was true in March is wrong by May.

Third, most preferences were never stated at all. They only show up in how you react.

A rule cannot hold "usually terse, more verbose on unfamiliar ground, and drifting terser over time." A rule is a snapshot of a moving thing.

Why this matters: Static rules fail because preferences are dynamic signals, not fixed settings.

Preference as signal, not setting

The reframe is simple. Stop treating preference as a setting. Start treating it as a signal.

You infer it from evidence. You re-weight it as evidence accumulates.

Then you put it where settings never sit: underneath every model as a layer they all read. Not inside each application as a separate config file.

The shift: Preference becomes shared infrastructure instead of scattered configuration.
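
One way to picture the layering is a single store that every model wrapper reads before building a prompt. This is a minimal sketch only: `PreferenceLayer`, `ModelAdapter`, and `build_prompt` are hypothetical names invented for illustration, not part of the framework as described.

```python
class PreferenceLayer:
    """Hypothetical shared store of compiled preference instructions."""

    def __init__(self, instructions: list[str]) -> None:
        self._instructions = list(instructions)

    def preamble(self) -> str:
        # One instruction per line, ready to prepend to any prompt.
        return "\n".join(self._instructions)


class ModelAdapter:
    """Hypothetical wrapper around one provider's model."""

    def __init__(self, name: str, layer: PreferenceLayer) -> None:
        self.name = name
        self.layer = layer  # every adapter reads the same layer object

    def build_prompt(self, user_message: str) -> str:
        # The preference preamble comes first, then the user's message.
        return f"{self.layer.preamble()}\n\n{user_message}"


layer = PreferenceLayer(["prefer terse summaries"])
adapter_a = ModelAdapter("model-a", layer)
adapter_b = ModelAdapter("model-b", layer)
```

Both adapters produce the same preamble because neither owns a config file of its own. Correct the layer once and every model sees the correction.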

How inference works without guessing

The mechanism matters because "infer preferences" is where most systems quietly start guessing.

The engine reads your conversation history across providers. It extracts signals against ten cognitive dimensions, including:

  • How you want things formatted
  • How much you want claims sourced
  • How you handle ambiguity
  • How fast you move from options to decisions

Each signal is scored, not switched on.

The score weights specificity, evidence, recency, stability, and portability. It subtracts for contradictions.

Here's the line between inference and assumption: a candidate scoring below the threshold gets marked unstable. An unstable signal does not get applied. The engine asks instead.
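
The scoring step can be sketched as a weighted sum with a contradiction penalty. The factor names come from the description above. The weights, the penalty, and the threshold are assumptions chosen for illustration, since the article does not publish actual values.

```python
from dataclasses import dataclass

# Assumed weights: the five factors are named in the text, the numbers are not.
WEIGHTS = {
    "specificity": 0.25,
    "evidence": 0.25,
    "recency": 0.20,
    "stability": 0.20,
    "portability": 0.10,
}
CONTRADICTION_PENALTY = 0.15  # assumed cost per contradicting observation
THRESHOLD = 0.60              # assumed cut-off between "apply" and "ask"


@dataclass
class Candidate:
    instruction: str
    factors: dict[str, float]  # each factor normalized to the range 0..1
    contradictions: int = 0


def score(candidate: Candidate) -> float:
    """Weighted sum of the five factors, minus a penalty per contradiction."""
    base = sum(w * candidate.factors.get(name, 0.0) for name, w in WEIGHTS.items())
    return max(0.0, base - CONTRADICTION_PENALTY * candidate.contradictions)


def decide(candidate: Candidate) -> str:
    """Apply a signal only if it clears the threshold; otherwise ask the user."""
    return "apply" if score(candidate) >= THRESHOLD else "ask"


strong = Candidate("cite sources", {name: 0.9 for name in WEIGHTS})
shaky = Candidate("keep it terse", {name: 0.9 for name in WEIGHTS}, contradictions=3)
```

Under these assumed numbers, `strong` clears the threshold and is applied, while `shaky` has the same raw evidence but three contradictions, so it falls below the line and becomes a question.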

How signals become stable

Signals don't graduate to truth on first sight. They move through gates: staged, stable, rejected.

A preference becomes stable only after observation across many sessions and more than one provider. A one-off statement from a bad mood stays staged. A contradicted one is rejected.
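
The gates can be sketched as a small state function over what has been observed. `MIN_SESSIONS` is an assumed stand-in for "many sessions", and the `Signal` class is hypothetical. Only the three gate names and the more-than-one-provider rule come from the text.

```python
from dataclasses import dataclass, field

MIN_SESSIONS = 5   # assumption: the article says "many sessions" without a number
MIN_PROVIDERS = 2  # "more than one provider"


@dataclass
class Signal:
    instruction: str
    sessions: set[str] = field(default_factory=set)
    providers: set[str] = field(default_factory=set)
    contradicted: bool = False

    def observe(self, session_id: str, provider: str) -> None:
        # Sets deduplicate, so repeats within one session do not inflate the count.
        self.sessions.add(session_id)
        self.providers.add(provider)

    def gate(self) -> str:
        if self.contradicted:
            return "rejected"
        enough_sessions = len(self.sessions) >= MIN_SESSIONS
        enough_providers = len(self.providers) >= MIN_PROVIDERS
        return "stable" if enough_sessions and enough_providers else "staged"
```

A single remark in one session leaves the signal staged. It reaches stable only once both counts are met, and a contradiction overrides everything else.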

When a preference changes, the old one isn't deleted. It's dated.

The ledger keeps the timeline because a preference history you can read is worth more than a single current value you have to trust.
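
A dated ledger of this kind can be sketched as an append-only list where a change closes the old entry instead of overwriting it. The `Entry` and `Ledger` names, the field names, and the example dates are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Entry:
    dimension: str
    value: str
    valid_from: date
    valid_to: Optional[date] = None  # None means "still current"


class Ledger:
    def __init__(self) -> None:
        self.entries: list[Entry] = []

    def record(self, dimension: str, value: str, on: date) -> None:
        """Date the current entry for this dimension, then append the new one."""
        for entry in self.entries:
            if entry.dimension == dimension and entry.valid_to is None:
                entry.valid_to = on
        self.entries.append(Entry(dimension, value, on))

    def current(self, dimension: str) -> Optional[str]:
        for entry in reversed(self.entries):
            if entry.dimension == dimension and entry.valid_to is None:
                return entry.value
        return None

    def history(self, dimension: str) -> list[Entry]:
        return [e for e in self.entries if e.dimension == dimension]


ledger = Ledger()
ledger.record("verbosity", "detailed", date(2024, 3, 1))
ledger.record("verbosity", "terse", date(2024, 5, 1))
```

After the second call, `current("verbosity")` returns the May value, and `history("verbosity")` still shows the March entry with its end date filled in.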

Key insight: Evidence-gated preferences prevent AI from acting on noise while building verifiable preference histories.

Memory remembers what you said. Preference inference compiles how you want to be answered.

What the output looks like

The output isn't a profile of tags. It's a set of instructions the model receives before you type.

Every instruction carries its own evidence. It shows the dimension, its strength, and the number of observations behind it.

Example: "cite sources when making empirical claims · evidence orientation · high · 31 observations."

Reject the rule and it's gone. Nothing is a black box because every line traces back to something you did.
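
The instruction format in the example above can be sketched as a small record with a render step. The `Instruction` class and `compile_preamble` function are hypothetical. Only the four displayed parts and the " · " separator are taken from the example line.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instruction:
    text: str
    dimension: str
    strength: str
    observations: int

    def render(self) -> str:
        # Mirrors the example: text, dimension, strength, observation count.
        return (
            f"{self.text} · {self.dimension} · {self.strength} "
            f"· {self.observations} observations"
        )


def compile_preamble(instructions, rejected=frozenset()) -> str:
    """Join rendered instructions, leaving out any the user has rejected."""
    return "\n".join(i.render() for i in instructions if i.text not in rejected)


cite = Instruction(
    "cite sources when making empirical claims", "evidence orientation", "high", 31
)
```

Passing `rejected={cite.text}` to `compile_preamble` leaves that instruction out of what the model receives. That is the "reject the rule and it's gone" behavior in miniature.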

Why this is infrastructure, not a feature

The test of infrastructure: the same primitive runs everything built on top of it.

This one does.

The harness NOMARK uses for its own work runs the same engine. It reads every instruction, classifies the input, resolves preferences with the same weighted scorer, and asks only for what it cannot responsibly infer.

Inference has a floor written into it. Fields are inferable, defaultable, or must-ask. The engine never infers a must-ask field.
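
The floor can be sketched as a resolver that checks a field's class before it looks at any inferred value. The three class names come from the text. The `resolve` function, the `MustAsk` exception, and the fallback order are assumptions made for illustration.

```python
from enum import Enum


class FieldKind(Enum):
    INFERABLE = "inferable"
    DEFAULTABLE = "defaultable"
    MUST_ASK = "must-ask"


class MustAsk(Exception):
    """Signals that the only acceptable source for this field is the user."""


def resolve(name: str, kind: FieldKind, inferred=None, default=None):
    # The floor: a must-ask field is never filled from inference or defaults,
    # even when a confident inferred value happens to be available.
    if kind is FieldKind.MUST_ASK:
        raise MustAsk(name)
    if kind is FieldKind.INFERABLE and inferred is not None:
        return inferred
    if default is not None:
        return default
    # Nothing responsible to fall back on, so ask.
    raise MustAsk(name)
```

The class check comes first so that no scoring result, however strong, can reach a must-ask field.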

Personalization that guesses on things with real consequences isn't personalization. It's risk wearing personalization's coat.

Infrastructure principle: The same inference engine must work across all models and use cases, with hard limits on what it will guess.

The shift from memory to inference

The shift underneath all of this is small to state and large in effect.

The default answer to AI personalization has been memory: make the model remember what you said.

Recall is necessary. It's nowhere near sufficient.

Knowing your facts isn't the same as knowing how to answer you. The second one has to be inferred, weighted, dated, and carried across every model you use.

Treated as rules, preference is brittle and invisible.

Treated as a weighted, evidence-tracked signal layer, it's something you audit and correct.

That's the whole argument for building it as infrastructure.

Bottom line: Memory recalls facts. Preference inference compiles how you want those facts delivered.

Frequently asked questions

What makes preference inference different from memory?

Memory stores what you said. Preference inference compiles how you want to be answered. Memory is recall. Inference is pattern recognition across your behavior, weighted and updated over time.

How does the system avoid making wrong assumptions?

Signals below a confidence threshold get marked unstable and trigger clarifying questions instead of being applied. Fields are classified as inferable, defaultable, or must-ask. The engine never infers must-ask fields.

What happens when preferences conflict or change?

Conflicting signals get scored and the engine subtracts points for contradictions. When preferences change, old ones are dated rather than deleted, creating an auditable timeline of how your preferences evolved.

Does this work across different AI models and providers?

Yes. The preference layer sits underneath all models as shared infrastructure. Signals become stable only after observation across multiple sessions and more than one provider.

How transparent is the preference inference process?

Every instruction shows its dimension, strength, and number of observations. You can see exactly what evidence supports each preference, and you can reject any rule you disagree with. Nothing operates as a black box.

What are the ten cognitive dimensions the system tracks?

The engine extracts signals for formatting, sourcing, ambiguity handling, decision speed, and six other dimensions of cognitive style. Each dimension is scored independently based on observed evidence.

How long does it take for a preference to become stable?

Stability is gated on evidence rather than elapsed time. Preferences move through staged, stable, and rejected gates. Stability requires observation across many sessions and multiple providers. One-off statements stay staged until confirmed by repeated behavior.

What stops this from becoming another invisible algorithm making decisions for me?

Every preference traces back to behavior the system observed from you. You can see the evidence count and strength score. You can reject rules you disagree with. The system is designed for auditability, not opacity.

Key takeaways

  • Preference inference transforms AI personalization from brittle if-statements into weighted, evidence-based signals that adjust over time
  • The system prevents wrong assumptions by marking low-confidence signals as unstable and asking clarifying questions instead of guessing
  • Preferences are dated rather than overwritten, creating auditable histories that show how your needs evolved
  • Treating preference as infrastructure means the same inference engine runs across all models and providers, not scattered config files per application
  • Memory recalls what you said. Preference inference compiles how you want to be answered. Both are necessary but serve different functions
  • Transparency is built in: every preference shows its evidence, strength, and observation count, and you can reject any rule you disagree with

The product is at https://nomark.ai. The framework is the point.