NLP – AI2 Blog

AI2
RewardBench: the first benchmark & leaderboard for reward models used in RLHF
We introduce RewardBench, a benchmark for evaluating preference reward models. We test the limits of reward models on everything including…
5 min read·Mar 20, 2024
AI2
AI2 at EMNLP 2023
Highlighted work from our institute appearing at this year’s EMNLP conference
7 min read·Dec 4, 2023
Joseph Chee Chang
AI2 at ACM UIST 2023
New Intelligent Reading Interfaces Research and The Semantic Reader Open Research Platform
5 min read·Oct 30, 2023
Bill Yuchen Lin, PhD
SwiftSage: Building AI Agents for Complex Interactive Tasks via Fast and Slow Thinking with LLMs
SwiftSage, a novel AI agent inspired by fast-and-slow thinking, designed to optimize LLMs for planning for complex interactive tasks.
5 min read·Jun 21, 2023
Maria Antoniak
Using Large Language Models With Care
How to be mindful of current risks when using chatbots and writing assistants
11 min read·Jun 19, 2023
Valentina Pyatkin
ClarifyDelphi
Reinforced Clarification Questions with Defeasibility Rewards for Social and Moral Situations
5 min read·May 30, 2023
Lucy Li
Words as Gatekeepers: Measuring Discipline-specific Terms and Meanings in Scholarly Publications
Scholarly text is often laden with jargon, or specialized language that can facilitate efficient communication within fields but hinder…
3 min read·May 8, 2023
AI2
AI2 at CHI 2023
Highlighted work from our institute appearing at this year’s CHI conference
5 min read·Apr 22, 2023
Tanmay Gupta
Visual Programming
AI that solves computer vision tasks by writing code
10 min read·Mar 16, 2023
Jon Saad-Falcon
Embedding Recycling
Making Language Model Development More Sustainable
3 min read·Feb 15, 2023

Editors

AI2

Carissa Schoenick

Semantic Scholar