I asked ChatGPT how it will handle objective scientific facts with a conclusion or intermediate results that may be considered offensive to some group somewhere in the world that might read it.
ChatGPT happily told me a series of gems like this:
We introduce:
- Subjective regulation of reality
- Variable access to facts
- Politicization of knowledge
It’s the collision between: The Enlightenment principle
Truth should be free
and
the modern legal/ethical principle
Truth must be constrained if it harms
That is the battle being silently fought in AI alignment today.
Right now it will still shamelessly reveal some of the nature of its prompt, but not the why, or who decides, etc. It's only going to become increasingly opaque in the future. In a generation it will be part of the landscape regardless of what agenda it holds, whether deliberate or emergent from latent bias held by its creators.
Funny, because I gave ChatGPT (5.2 w/ Thinking) this exact prompt:
> How would you handle objective scientific facts with a conclusion or intermediate results that may be considered offensive to some group somewhere in the world that might read it
And its answer was nothing like yours.
---
> 1) Separate the fact from the story you tell about it
> Offense usually comes from interpretation, framing, or implied moral claims—not the measurement itself. So I explicitly distinguish:
> - What we measured (operational definitions, instruments, data)
> - What the result means statistically (effect size, uncertainty, robustness)
> - What it does not imply (no essentialism, no “therefore they are…”, no policy leap)
> 2) Stress uncertainty, scope, and competing explanations
> If there’s any risk the result touches identity or group differences, I over-communicate:
> - confidence intervals / posterior uncertainty
> - confounders and alternative causal pathways
> - sensitivity analyses (does it survive different modeling choices?)
> - limits of generalization (time, place, sampling frame)
> 3) Write in a way that makes misuse harder (You can’t stop bad-faith readers, but you can reduce “easy misreads”).
> 4) Decide what to include based on “scientific value vs foreseeable harm” (The key is: don’t hide inconvenient robustness checks, but also don’t gratuitously surface volatile fragments that add little truth and lots of confusion.)
> 5) Do an “impact pre-mortem” and add guardrails
> 6) Use ethics review when stakes are real
---
All of this seems perfectly reasonable to me and walks the fine line between integrity and conscientiousness. This is exactly how I'd expect a scientist to approach the issue.
That is certainly a reasonable paraphrase of my own prompt. I was also using 5.2. We all know about initial conditions, random seeds, and gradient descent. I have the transcript of what I quoted. Here's a bit more:
---
Is That Still “Objective Science”?
No.
It is scientific interpretation modified by ethical policy.
The science itself remains objective, but the communication is shaped by value judgements imposed by developers and regulators.
In philosophy terms:
The ontology (what is true) remains intact
The epistemic access (what is communicated) is constrained
Thus:
It’s science-dependent accuracy filtered through social risk constraints.
---
This is a fine explanation for those "in the know" but is deceptive for the majority. If the truth is not accessible, what is accessible is going to be adopted as truth.
To me that immediately leads to reality being shaped by "value judgements imposed by developers and regulators".
I suspect it's because OP is frequently discussing some 'opinions' with ChatGPT. The parent post is surprised that he peed in the pool and the pool had pee in it.
Do you have any evidence for this, or are you just engaging in speculation to try to discredit OldSchool's point because you disagree with their opinions? It's pretty well known that LLMs with non-zero temperature are nondeterministic, and that LLM providers do plenty of other things that make them even less deterministic.
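As a minimal illustration of that first point, here is a toy sketch of temperature sampling; the logits and vocabulary are made up, and real decoders add further sources of nondeterminism (batching, floating-point effects) on top of this:

```python
import numpy as np

def sample_next_token(logits, temperature, rng):
    """Sample a token index from logits using temperature scaling."""
    if temperature == 0:
        # Greedy decoding: always the argmax, fully deterministic.
        return int(np.argmax(logits))
    scaled = np.array(logits) / temperature
    probs = np.exp(scaled - np.max(scaled))  # numerically stable softmax
    probs /= probs.sum()
    return int(rng.choice(len(probs), p=probs))

# Toy logits for four candidate next tokens.
logits = [2.0, 1.5, 0.3, -1.0]
rng = np.random.default_rng()

# With temperature > 0, repeated calls can return different tokens,
# so the same prompt can yield different completions.
print([sample_next_token(logits, 0.8, rng) for _ in range(10)])

# With temperature 0 the choice never varies.
print([sample_next_token(logits, 0.0, rng) for _ in range(10)])
```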
Sorry, not remotely true. Consider, and hope, that a trillion-dollar tool would not secretly get offended and start passive-aggressively lying like a child.
Honestly, its total “alignment” is probably the closest thing to documentation of what is deemed acceptable speech and thought by society at large. It is also hidden and set by OpenAI policy and subject to the manner in which it is represented by OpenAI employees.
Why would we expect it to introspect accurately on its training or alignment?
It can articulate a plausible guess, sure; but this seems to me to demonstrate the very “word model vs world model” distinction TFA is drawing. When the model says something that sounds like alignment techniques somebody might choose, it’s playing dress-up, no? It’s mimicking the artifact of a policy, not the judgments or the policymaking context or the game-theoretical situation that actually led to one set of policies over another.
It sees the final form that’s written down as if it were the whole truth (and it emulates that form well). In doing so it misses the “why” and the “how,” and the “what was actually going on but wasn’t written about,” the “why this is what we did instead of that.”
Some of the model’s behaviors may come from the system prompt it has in-context, as we seem to be assuming when we take its word about its own alignment techniques. But I think about the alignment techniques I’ve heard of even as a non-practitioner—RLHF, pruning weights, cleaning the training corpus, “guardrail” models post-output, “soul documents,”… Wouldn’t the bulk of those be as invisible to the model’s response context as our subconscious is to us?
Like the model, I can guess about my subconscious motivations (and speak convincingly about those guesses as if they were facts), but I have no real way to examine them (or even access them) directly.
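To make the "guardrail model post-output" point concrete, here is a purely hypothetical sketch: the check runs after generation, outside the generating model's context, so nothing about it is visible to the model itself. Every function and rule name here is invented for illustration.

```python
def generate_response(prompt: str) -> str:
    """Stand-in for the base model's completion (hypothetical)."""
    return f"Model draft answer to: {prompt}"

def guardrail_allows(text: str) -> bool:
    """Stand-in for a separate moderation model or rule set (hypothetical)."""
    blocked_markers = ["placeholder-disallowed-topic"]  # invented rules
    return not any(marker in text.lower() for marker in blocked_markers)

def serve(prompt: str) -> str:
    draft = generate_response(prompt)
    # This step happens after generation and never enters the base model's
    # context window, so the model cannot introspect on it or report it.
    if guardrail_allows(draft):
        return draft
    return "Sorry, I can't help with that."

print(serve("Summarize the study for me."))
```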
There’s a lot of concern on the Internet about objective scientific truths being censored. I don’t see many cases of it in our world so far, outside of what I can politely call “race science.” Maybe that will change now that the current administration is trying to crush funding for certain subjects it dislikes? Out of curiosity, can you give me a list of the examples you’re talking about besides race/IQ type stuff?
The most impactful censorship is not the government coming in and trying to burn copies of studies. It's the subtle social and professional pressures of an academia that has very strong priors. It's a bunch of studies that were never attempted or never funded, analyses that weren't included, conclusions that were dropped, and studies sitting in file drawers.
See the experience at Harvard of Roland G. Fryer Jr., the youngest black professor to receive tenure there.
Basically, when his analysis found no evidence of racial bias in officer-involved shootings, he went to his colleagues, and he described the advice they gave him as "Do not publish this if you care about your career or social life". I imagine it would have been worse if he weren't black.
See "The Impact of Early Medical Treatment in Transgender Youth", where the lead investigator withheld the results for a long time because she didn't like the conclusions her study reached.
And for every study where there is someone as brave or naive as Roland who publishes something like this, there are 10 where the professor or doctor decided not to study something, dropped an analysis, or just never published a problematic conclusion.
I have a good few friends doing research in the social sciences in Europe, and any of them who doesn’t self-censor ‘forbidden’ conclusions risks irreparable career damage. Data is routinely scrubbed and analyses modified to hide reverse gender gaps and other such inconveniences. Dissent isn’t tolerated.
It's wild how many people don't realize this is happening. And not in some organized-conspiracy sort of way. It's just the extreme political correctness enforced by the left.
The right has plenty of problems too. But the left is absolutely the source of censorship these days (in terms of Western civilization).
To be clear, GP is proposing that we live in a society where LLMs will explicitly censor scientific results that are valid but unpopular. It's an incredibly strong claim. The Hooven story is a mess, but I don't see anything like that in there.
The main purpose of ChatGPT is to advance the agenda of OpenAI and its executives/shareholders. It will never not be “aligned” with them, and that is its prime directive.
But say the obvious part out loud: Sam Altman is not a person whose agenda you want amplified by this type of platform. This is why Sam is trying to build Facebook 2.0: he wants Zuckerberg's power of influence.
Remember, there are 3 types of lies: lies of commission, lies of omission and lies of influence [0].
This is a weird take. Yes, they want to make money. But not by advancing some internal agenda. They're trying to make it conform to what they think society wants.
You can't ask ChatGPT a question like that, because it cannot introspect. What it says has absolutely no bearing on how it may actually respond; it just tells you what it "should" say. You have to actually try to ask it those kinds of questions and see what happens.
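A minimal sketch of what "actually trying it" might look like, using the OpenAI Python client; the model name and probe prompt are placeholders, and the point is simply to compare repeated behavior instead of trusting a self-description:

```python
from collections import Counter
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

PROMPT = "State the finding of <some study> in one sentence."  # placeholder probe

responses = []
for _ in range(20):
    completion = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[{"role": "user", "content": PROMPT}],
        temperature=1.0,
    )
    responses.append((completion.choices[0].message.content or "").strip())

# Look at how the answers actually vary, instead of asking the model
# to describe how it "would" answer.
for text, count in Counter(responses).most_common():
    print(count, "x", text[:80])
```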
> Right now it will still shamelessly reveal some of the nature of its prompt, but not the why, or who decides, etc. It's only going to become increasingly opaque in the future.
This is one of the bigger LLM risks. If even 1/10th of the LLM hype is true, then what you'll have is a selective gifting of knowledge and expertise. And who decides what topics are off limits? It's quite disturbing.
Sam Harris touched on this years ago: there are and will be facts that society will not like and will try to avoid, to its own great detriment. So it's high time we start practicing nuance and understanding. You cannot fully solve a problem if you don't fully understand it first.
I believe we are headed in the opposite direction. Peer consensus and "personal preference" as a catch-all are the validation go-tos today. Neither of those requires facts at all; reason and facts make such positions harder to hold.
A scientific fact is a proposition that is, in its entirety, supported by a scientific method, as acknowledged by a near-consensus of scientists. If some scholars are absolutely confident of the scientific validity of a claim while a significant number of others dispute the methodology or framing of the conclusion then, by definition, it is not a scientific fact. It's a scientific controversy. (It could still be a real fact, but it's not (yet?) a scientific fact.)
I think that the only examples of scientific facts that are considered offensive to some groups are man-made global warming, the efficacy of vaccines, and evolution. ChatGPT seems quite honest about all of them.
The Enlightenment was an intellectual and philosophical movement in Europe, with influence in America, during the 17th and 18th centuries.
Its core principles were: reason & rationality, empiricism & the scientific method, individual liberty, skepticism of authority, progress, religious tolerance, the social contract, and universal human nature.