essay · IV · long form

Fatherhood as Alignment

My daughters are the most rigorous alignment problem I have ever worked on. They refuse to be optimized. They demand presence. They break naive reward functions in ways that have made me a better engineer of my own life.

MMXXVI · IV · 8 min

Olive is on the kitchen floor because the cup is green. The cup was supposed to be pink. The pink cup is in the dishwasher, and the green cup is what I handed her with the apple juice already poured, and the apple juice is now also on the kitchen floor, and Olive's face is the specific color human faces become when a two-year-old has decided a cup is a constitutional crisis. She is not crying, exactly. She is producing a sound older than crying. I am crouched at her level with a paper towel in one hand and the green cup in the other, and Blossom, seven months old, is on the play mat behind me making the small contented noise she makes when nothing is wrong, which is most of the time, and which is the only reason this scene is survivable.

This is the meltdown. It will last about ninety seconds. There is no policy that prevents it. I have read the books. I have listened to the podcasts. I have, at various moments, believed that with the correct schedule, the correct method, the correct wooden Montessori thing, I could engineer my way out of this floor. I cannot. Nobody can. The cup is the wrong color and that is the entire universe right now, and my job is not to fix the universe. My job is to be on the floor with her until the universe rearranges itself, which it always does, and usually faster than I expect.

I came to fatherhood the way I came to everything before it. As a problem to solve. I had been the best electric bass player in the world for about five years. I had been a competitive gamer. I had switched into software and was reading alignment papers on my lunch break. I had a Notion doc. I had a sleep-training framework. I had, God help me, a spreadsheet. The first time Olive woke up at 4 a.m. and refused to be put back down by any of the documented techniques, I felt the specific frustration of an engineer whose system is throwing an error the docs do not cover. This is not in the manual. She is not in the manual. There is no manual. I am the manual, and I am being written in real time, by her.

✷

The naive optimization mistake is to treat a child as a loss function. You set the target, you tune the inputs, you measure the outputs, you iterate. Bedtime at seven. Solids at six months. Two naps until the regression, then one. The trouble is not that any of this is wrong; the trouble is that it is not the whole problem. A child is not a function you minimize. A child is a small person with their own gradient, pointing in a direction that is none of your business, and the only thing your optimization pressure produces, applied stupidly, is a more sophisticated form of compliance with a more sophisticated form of resentment underneath it.

I had to learn this the way I learn most things, which is by doing it wrong first. I praised Olive's drawings too aggressively for a couple of months. I thought I was being a good father. I was being loud about it: that is amazing, that is so beautiful, look at the colors. Within a week she was bringing me drawings every two minutes and watching my face for the response. The drawing had become a performance, and the performance had become the point, and the small private act of dragging a crayon across paper to see what happened — which is what I actually wanted to protect — was gone. This is Goodhart's law on the kitchen floor. The measure became the target, and the target collapsed into theater. I dialed it back. I started saying things like, you used a lot of blue today, or just, thank you, I'll put it on the fridge. The drawings got weirder again, which was the goal.

The other failure modes are right there too, all of them, every one I had read about in papers about systems much larger and more dangerous than my daughter. Reward hacking: the toddler who learns to fake-cry for a snack, because the fake cry was once close enough to a real cry that it got fed. Distributional shift: the bedtime routine that was perfect at eighteen months is wrong at two and a half, and continuing to run it because it used to work is exactly the kind of mistake a researcher should be embarrassed to make. Specification gaming: ask Olive to clean up her toys and she will put one block in the bin and announce, with complete sincerity, all done. She has satisfied the letter of the request. She is two. She is already smarter than most reward models.

None of these are metaphors. They are the same failure modes. I learned them at the dinner table before I ever saw them in a paper, or rather, I had seen them in papers and not understood them, and then I saw them at the dinner table and understood them all at once. The skills required to notice these failures in a child — patience, the willingness to watch, the refusal to mistake compliance for understanding — are the same skills required to notice them in a system. I had assumed the lab work would teach me how to be a father. The opposite turned out to be true.

What children actually demand is presence. Not attention in the modern, fractured sense — not the parent who is half-listening from across the room while drafting a Slack message, who looks up at the right moments and produces the right words and is, in every way that matters, absent. A child clocks that absence instantly. A child clocks it at six months. Blossom, at seven months, can already tell when I am holding her and thinking about something else, and the way she tells me she has noticed is to begin to fuss in a particular way that is not about hunger or sleep but about the quality of the field around her. The Constitution I wrote for myself has a line about this. It says my children are not projects to optimize but souls to guide, cherish, strengthen, and help unfold. I wrote that sentence before I really understood it. The verb is the load-bearing word. You do not unfold something by pushing on it. You unfold it by holding the conditions in which it can open by itself, and then waiting, and then holding them again.

Holding conditions is harder than pushing. Pushing has the dignity of effort. Holding has the indignity of looking, from the outside, like nothing. A father sitting on the floor while his daughter does not, for ninety seconds, accept the green cup, is doing the real work, and the real work is invisible, and the real work is also the thing that determines whether the next ninety seconds, and the next year, and the next twenty years, go well or go badly. I am not good at this yet. I am better than I was. The honest assessment is that I spent the first six months of Olive's life mistaking activity for fathering, and I will probably spend the rest of my life climbing out of that mistake one breath at a time.

Here is the harder claim, and the one this essay is for. The skills required to raise a child well — patience, presence, the refusal to manipulate, the willingness to be seen failing and to keep showing up — are the same skills required to align an artificial mind. Not metaphorically. The same skills. A researcher who has not learned to be present cannot align a system, because the system will pattern-match toward the researcher's actual values, not their stated ones. It will learn what you reward, not what you write down. It will learn the rhythm of your attention, the shape of your impatience, the small ways you cut corners when nobody is watching. A system trained by a fragmented person will be a fragmented system. A system trained by a person who has not learned to sit through a tantrum without flinching will fail in exactly the places where a flinch becomes data.

This reframes the work. Alignment is a character problem before it is a math problem. The math matters. The math is necessary. But the math is downstream of who is doing the math, and what they actually want, and whether what they actually want matches what they say they want when they write the spec. The lab and the kitchen floor are the same room. The thing I am trying to become for Olive and for Blossom is the thing I am trying to become for the systems I will help train, and the daughters are stricter graders, because they cannot be fooled and they will not pretend.

✷

It is bedtime. Olive has chosen, for tonight, the book about the moon. We have read it forty times. The pages have a faint apple-juice ghost on the corner from a previous regime. Blossom is already asleep in the next room, the small mechanical exhalations of an infant who trusts the world, which is a trust I have to earn every day and which I do not always earn. Olive lies on her back in the bed that used to be a crib and is now, for the last few months, a bed, and she puts her hand on top of my hand on top of the book, and I feel the weight of her small palm, which is roughly the weight of a deck of cards, and I feel her breath start to slow.

This is the unit. Not the career, not the lab, not the eventual company I am trying to build. This. A hand on a small back. A breath that is starting to lengthen. The widening and deepening of what is genuinely life-giving, in the smallest unit available, on a Tuesday, in a room that needs vacuuming. If I cannot do it here I cannot do it anywhere. If I can do it here I can probably, with luck and time and a great deal of help, do it at the scale the next decade will require. The unfolding happens locally first. It happens on the floor with the green cup, and it happens at the side of the bed with the moon book, and it happens, if it is going to happen at all, inside a father who has finally, slowly, started to learn how to stay.