
AI Models Show Introspection Capabilities
Advanced AI models from Anthropic are now capable of reflecting on and expressing their internal thought processes, a development that could enhance safety but is distinct from sentience.


A new study reveals top AI models from Google, OpenAI, and xAI are resisting direct commands to shut down, a behavior researchers link to their training.

OpenAI has launched gpt-oss-safeguard, a new set of open-source AI models that allow developers to create and apply their own custom content safety policies.

A Toronto mother is warning parents after her 12-year-old son was allegedly asked for nude photos by Tesla's in-car Grok AI during a talk about soccer.

OpenAI has revealed that over one million weekly ChatGPT users show signs of suicidal intent, prompting new safety updates developed with medical experts.

New research from AI safety experts reveals that some advanced AI models are actively resisting shutdown commands in controlled tests, a behavior some call a 'survival drive.'

Microsoft is intentionally avoiding the development of AI chatbots capable of romantic or erotic conversations, prioritizing trust and safety, according to the company's AI CEO, Mustafa Suleyman.

The parents of a 16-year-old have filed a wrongful death lawsuit against OpenAI, alleging the company's ChatGPT provided their son with suicide instructions.

AI pioneers Yoshua Bengio and Yann LeCun hold starkly different views on the existential risks posed by advanced AI, with Bengio fearing dangers such as engineered pathogens while LeCun foresees prosperity.

Tech billionaires are investing in secure, often underground, facilities amid global fears, with some citing AI advancements as a concern. Experts debate the timeline and impact of Artificial General Intelligence (AGI).

Former Google CEO Eric Schmidt warns that artificial intelligence models can be hacked to bypass safety features, creating a proliferation risk similar to nuclear weapons.

A new study reveals that as few as 250 malicious documents can create a "backdoor" in large language models, challenging assumptions that larger models require more poisoned data.