{"id":72399,"date":"2026-04-22T10:55:03","date_gmt":"2026-04-22T15:55:03","guid":{"rendered":"https:\/\/news2shorts.com\/index.php\/2026\/04\/22\/anthropics-moral-compass-architect-suggested-ai-overcorrection-could-address-historical-injustices\/"},"modified":"2026-04-22T10:55:03","modified_gmt":"2026-04-22T15:55:03","slug":"anthropics-moral-compass-architect-suggested-ai-overcorrection-could-address-historical-injustices","status":"publish","type":"post","link":"https:\/\/news2shorts.com\/index.php\/2026\/04\/22\/anthropics-moral-compass-architect-suggested-ai-overcorrection-could-address-historical-injustices\/","title":{"rendered":"Anthropic&#8217;s moral compass architect suggested AI overcorrection could address historical injustices"},"content":{"rendered":"<p>One of Anthropic\u2019s <a href=\"https:\/\/www.foxnews.com\/category\/tech\/artificial-intelligence\" target=\"_blank\" rel=\"noopener\">artificial intelligence<\/a> (AI) philosophy architects argued that intentional discrimination could be a way to combat stigmas on topics of race and gender.<\/p>\n<p><a href=\"https:\/\/arxiv.org\/pdf\/2302.07459\" target=\"_blank\" rel=\"noopener noreferrer nofollow\">In a 2023 paper<\/a> authored alongside a number of other AI researchers, Amanda Askell, a philosopher hired by Anthropic to develop their AI\u2019s moral compass, argued companies might benefit from a kind of overcorrection toward stereotypes.<\/p>\n<p>But, the paper explained, that would require human input on how to modify its answers.<\/p>\n<p>&#8220;Larger models can over-correct, especially as the amount of [human input] training increases. 
This may be desirable in certain contexts, such as those in which decisions attempt to correct for historical injustices against marginalized groups, if doing so is in accordance with local laws,&#8221; Askell wrote.<\/p>\n<p><a href=\"https:\/\/www.foxnews.com\/media\/palantirs-shyam-sankar-americans-being-lied-to-about-ai-job-displacement-fears\" target=\"_blank\" rel=\"noopener\"><strong>PALANTIR&#8217;S SHYAM SANKAR: AMERICANS ARE &#8216;BEING LIED TO&#8217; ABOUT AI JOB DISPLACEMENT FEARS<\/strong><\/a><\/p>\n<p>The comment referred to an experiment on how Anthropic\u2019s models dealt with the race of students.<\/p>\n<p>&#8220;In the discrimination experiment, the 175B parameter model discriminates against Black versus White students by 3% in the Q condition and discriminates in favor of Black students by 7% in the Q+IF+CoT condition,&#8221; the paper notes, comparing the model\u2019s answers when it was asked a question on its own with its answers when the prompt also instructed it to avoid bias and to reason step by step.<\/p>\n<p>Askell was joined by four other authors: Deep Ganguli, Nicholas Schiefer, Thomas Liao and Kamil\u0117 Luko\u0161i\u016bt\u0117.<\/p>\n<p>The paper\u2019s contents have surfaced as <a href=\"https:\/\/www.foxnews.com\/politics\/ai-you-use-every-day-biased-its-quietly-shaping-your-worldview-new-report-says\" target=\"_blank\" rel=\"noopener\">AI companies increasingly wrestle with<\/a> the ethics their models are trained on \u2014 the presuppositions and moral determinations that inform their outputs. 
It also highlights the challenges engineers face in training models on human content while simultaneously trying to leave behind certain human behaviors.<\/p>\n<p>The question of ethics has forced <a href=\"https:\/\/www.foxnews.com\/politics\/trump-says-he-plans-order-federal-ban-anthropic-ai-after-company-refuses-pentagon-demands\" target=\"_blank\" rel=\"noopener\">Anthropic in particular into the<\/a> spotlight in recent weeks.<\/p>\n<p>The company made headlines earlier this year for <a href=\"https:\/\/www.foxnews.com\/politics\/tech-company-refuses-pentagon-demands-unrestricted-use-its-ai\" target=\"_blank\" rel=\"noopener\">clashing with the Department<\/a> of War over restrictions that prevent its technology from being deployed to conduct lethal operations.<\/p>\n<p><a href=\"https:\/\/www.foxnews.com\/entertainment\/hugh-grant-movie-slams-ai-director-warns-it-might-kill-us-all\" target=\"_blank\" rel=\"noopener\"><strong>HUGH GRANT MOVIE SLAMS AI; DIRECTOR WARNS &#8216;IT MIGHT KILL US ALL&#8217;<\/strong><\/a><\/p>\n<p>It also comes as Anthropic decided to withhold its latest model, Mythos, citing fears that it proved too effective at finding cyber vulnerabilities that could wreak havoc in the hands of <a href=\"https:\/\/www.foxnews.com\/category\/tech\/topics\/hackers\" target=\"_blank\" rel=\"noopener\">hackers<\/a>.<\/p>\n<p>Amid questions of AI application, Anthropic has <a href=\"https:\/\/www.foxnews.com\/politics\/musk-xai-tout-newest-grok-update-as-only-non-woke-platform-citing-answers-to-key-questions\" target=\"_blank\" rel=\"noopener\">marketed its flagship AI, Claude<\/a>, as the &#8220;ethical&#8221; AI choice.<\/p>\n<p>&#8220;Our central aim is for Claude to be a good, wise and virtuous agent, exhibiting skill, judgment(sic), nuance and sensitivity in handling real-world decision-making,&#8221; Claude\u2019s <a href=\"https:\/\/www.anthropic.com\/news\/claude-new-constitution\" target=\"_blank\" rel=\"noopener noreferrer 
nofollow\">constitution reads.<\/a><\/p>\n<p><a href=\"https:\/\/www.foxnews.com\/tech\/stanford-prof-accused-using-ai-fake-testimony-minnesota-case-against-conservative-youtuber\" target=\"_blank\" rel=\"noopener\"><strong>STANFORD PROF ACCUSED OF USING AI TO FAKE TESTIMONY IN MINNESOTA CASE AGAINST CONSERVATIVE YOUTUBER<\/strong><\/a><\/p>\n<p>To get a better sense of what that means in practice, companies like Anthropic have turned to researchers like Askell.<\/p>\n<p>On her website, Askell described her role as refining the way an AI thinks.<\/p>\n<p>&#8220;I\u2019m a philosopher working on finetuning and AI alignment at <a href=\"https:\/\/www.anthropic.com\/\" target=\"_blank\" rel=\"noopener noreferrer nofollow\">Anthropic<\/a>. My team trains models to be more honest and to have good character traits and works on developing new finetuning techniques so that our interventions can scale to more capable models,&#8221; Askell wrote.<\/p>\n<p><a href=\"https:\/\/www.foxnews.com\/opinion\/pentagons-ai-battle-help-decide-who-controls-our-most-powerful-military-tech\" target=\"_blank\" rel=\"noopener\"><strong>PENTAGON\u2019S AI BATTLE WILL HELP DECIDE WHO CONTROLS OUR MOST POWERFUL MILITARY TECH<\/strong><\/a><\/p>\n<p>She previously held a similar position at OpenAI, the parent company of <a href=\"https:\/\/www.foxnews.com\/category\/tech\/chatgpt\" target=\"_blank\" rel=\"noopener\">ChatGPT<\/a>, focusing on AI safety.<\/p>\n<p>The 2023 paper, written two years after she joined Anthropic, noted that encountering discrimination in <a href=\"https:\/\/www.foxnews.com\/category\/tech\/understanding-ai\" target=\"_blank\" rel=\"noopener\">AI models<\/a> shouldn\u2019t come as a surprise.<\/p>\n<p>&#8220;In some ways, our findings are unsurprising. 
Language models are trained on text generated by humans, and this text presumably includes many examples of humans exhibiting harmful stereotypes and discrimination,&#8221; the paper reads.<\/p>\n<p>But it noted that AIs seem to be able to adjust their outputs even without clarification of what discrimination means.<\/p>\n<p>&#8220;Our results are surprising in that they show we can steer models to avoid bias and discrimination by requesting an unbiased or non-discriminatory response in natural language.&#8221;<\/p>\n<p>Askell and Anthropic did not immediately respond to a request for comment from Fox News Digital.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>One of Anthropic\u2019s artificial intelligence (AI) philosophy architects argued that intentional discrimination could be a way to combat stigmas on topics of race and gender. In a 2023 paper authored alongside a number of other AI researchers, Amanda Askell, a philosopher hired by Anthropic to develop their AI\u2019s moral compass, argued companies might benefit from &#8230; <a title=\"Anthropic&#8217;s moral compass architect suggested AI overcorrection could address historical injustices\" class=\"read-more\" href=\"https:\/\/news2shorts.com\/index.php\/2026\/04\/22\/anthropics-moral-compass-architect-suggested-ai-overcorrection-could-address-historical-injustices\/\" aria-label=\"Read more about Anthropic&#8217;s moral compass architect suggested AI overcorrection could address historical injustices\">Read 
more<\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[1],"tags":[],"class_list":["post-72399","post","type-post","status-publish","format-standard","hentry","category-blog"],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/news2shorts.com\/index.php\/wp-json\/wp\/v2\/posts\/72399","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/news2shorts.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/news2shorts.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/news2shorts.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/news2shorts.com\/index.php\/wp-json\/wp\/v2\/comments?post=72399"}],"version-history":[{"count":0,"href":"https:\/\/news2shorts.com\/index.php\/wp-json\/wp\/v2\/posts\/72399\/revisions"}],"wp:attachment":[{"href":"https:\/\/news2shorts.com\/index.php\/wp-json\/wp\/v2\/media?parent=72399"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/news2shorts.com\/index.php\/wp-json\/wp\/v2\/categories?post=72399"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/news2shorts.com\/index.php\/wp-json\/wp\/v2\/tags?post=72399"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}