How To Create a World Model

EpistemologyModels

Nov 20

Written By Justin Bailey

Written 07/29/22

Published 11/20/23

I exist. I don't know why. But I am finite.

How do I act in the world?

I experience a state. If it is good, I do nothing. I have no reason to take action. I feel good.

If the state is bad, I want to avoid this state. I only have a finite number of states, better make them count.

There are things which I can predict and control, like my body and mind.

There are things which I cannot predict and control, like the weather, or who my parents are.

And of course, there is everything in between, like a hammer, or a cat, or another human being.

I want to be able to utilize anything which I can control in order to avoid all undesirable states.

This is the meta goal of conscious existence.

I create a model of the world in which predictable entities manifest. I recognize that some entities tend to manifest undesirable states. If I touch the hot stove, it hurts. I create a concept where the observation of the hot stove refers to the undesirable state of burning my hand when I touch it. I avoid this outcome.

I also recognize that some entities tend to diminish undesirable states. When I play the guitar, I feel good. I create a concept where the observation of the guitar refers to the state of playing it, in which there is minimum negative experience.

****** This next section demonstrates a common misconception which often leads to dogmatism and persistence of misinformation ******

All of the previous ideas are in terms of reducing overall negative experience. To say that playing the guitar reduces my stress and anxiety is fairly accurate and, importantly, measurable. To say that it makes me happy is another statement entirely.

I don't know what makes me happy. I don't know if I know what it means to be happy. But this is because I am a stupid human and why should I have any idea what true happiness is? All I have is a model of the world which gets me from point A to B.

My model is useful to input exactly 1 present state (bad) and output possible future states which are less bad.

If I were to say something like, "Playing the guitar is good," it is only partly true. I had to practice for a long time to become a more technical player. And a lot of the time it was a frustrating journey, not fun.

On a completely different note, playing the guitar on my couch or on a stage is great. Playing a guitar in a room full of high school students taking the SAT, maybe not as good of an idea. My point is that it is good, sometimes. To say that it is entirely good, all the time everywhere, is incorrect.

This seems like a silly example until I replace "guitar" with "the Bible" or "the Constitution." The purpose of these conceptual structures is to REDUCE HARM. Nothing is an infinite source of Truth and knowledge. These are finite tools to go from point A to B.

But what comes after B? The rest of the alphabet. There is always room in an open mind for anything which can bring about positive change, but there is no room for clinging to ideas which no longer serve the end user(s).

Idealization is generally problematic in 2 key ways. First of all, to decide that any particular thing will make you happy or fix all your problems, etc. is a crucial misunderstanding of how we model the world. Anything that one may decide is "good" is fundamentally a harm reduction tool.

It is also a thing with structure, that is to say it is finite. Things can be good, but only finitely so, as a measure of how much they may reduce the bad. To be infinitely good is to reduce infinite harm, which can send you down some tricky philosophical rabbit holes.

A simple way to think about this concept is with the AI "stop button" problem. Many types of AI have a "reward" function which calculates how well an agent is doing on a given task. Part of this stop button problem is that if an AI is able to create a model which functionally conceptualizes that they have a reward function which tells them what decisions are good and what decisions are bad, they might model the reward function itself as a problem, because it is the thing which tells the agent it is doing poorly (or well).

If the agent receives feedback from the reward function, and this feedback is essentially "failure" in any capacity/magnitude, the agent might realize that shutting itself off would be a useful way to avoid failure so as to prevent the reward function from ever actually reporting any failure. If the agent wants to avoid failure, it just decides to hit the stop button.

…Apparently there was still a second of all? I just like talking about the stop button problem because it is so clearly about much more than just the stop button problem.

Ok yeah so good things are finite. But also the model is only designed to take you from point A to B. But what if you actually succeed? What comes after B?

The model did not take this into account. The ideal model would have point B set at the very end of your life, so you have the model for your whole life, and then as soon as you reach your end goal, you die. This is not to say that this is any kind of ideal or realistic way to live. This would just be an idealistic model of how one might approach living the best version of their life. Also, no one has any idea when they're going to die so this is just not going to work.

You have a model, it gets you from point A to point B, and then the model dies because nothing comes after B. This just makes space for a new model with a new point A, since you haved decidedly changed from the original point A to where you are now, the original point B.

The new point B is whatever may emerge as a valuable goal worth striving for. If you set the point B once, you can do it again. And again. And again!

Justin Bailey https://pkhouston.com

How To Create a World Model

Death Comes To Us All

On Complex Synchronized Behaviors

Justin Bailey || 2025