Research Notes

2024/01/01 blog

Welcome! This blog is a collection of my notes on papers related to language-model agents and emotional dialogue.



Markdown Template

Markdown note template (VS Code snippets prefix: "paper_note_template")

### Title

* **Keywords**: 

#### Link

* **Github\ArXiv**: 

* **Demo**: 

#### Evaluate

* **Usefulness(1~5)**: 
* **Timing**: 
* **Intensive Reading**: 

#### Abstract

#### Methodology

Title: WEBARENA: A REALISTIC WEB ENVIRONMENT FOR BUILDING AUTONOMOUS AGENTS

Keywords: Web agent. Link:

A useful LLM-based agent that can manipulate a web UI. The demo is interesting. The interaction between the LLM and web pages is valuable.

Title: Hierarchical Prompting Assists Large Language Model on Web Navigation

It looks like the author split one framework/system across two articles. The method is similar to ReAct.
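For reference, the ReAct thought-action-observation loop can be sketched as follows; `llm` and `tools` are hypothetical stand-ins, not an API from either paper:

```python
# Minimal ReAct-style loop sketch. `llm` returns a (thought, action, argument)
# triple; `tools` maps action names to callables. All names are illustrative.
def react_loop(llm, tools, question, max_steps=5):
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        # The model emits a thought and an action in one completion.
        thought, action, arg = llm(transcript)
        transcript += f"Thought: {thought}\nAction: {action}[{arg}]\n"
        if action == "finish":
            return arg  # final answer
        observation = tools[action](arg)  # e.g. search, lookup
        transcript += f"Observation: {observation}\n"
    return None  # ran out of steps
```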

Title: FireAct: Toward Language Agent Fine-tuning

Keywords: ReAct + fine-tuning, ablation experiments, LM fine-tuning. Link

  • Github:

  • Demo:

Evaluate

  • Usefulness(1~5): 5

  • Timing: Oct 9, 2023

  • Intensive Reading: yes

Abstract:

  • The article's structure is logical, coherent, and complete, making it easy to follow.

  • The related-work section is detailed, in-depth, and informative, providing valuable conclusions.

  • This paper compares many methods, including fine-tuning vs. prompting, tool use, and ReAct vs. Reflexion vs. CoT.

Useful Conclusions:

  • Fine-tuning GPT-3.5 only costs about 10 dollars.
  • LLaMA-7B models can be fine-tuned to catch up with GPT-3.5 performance.
  • Dr. Yao has conducted in-depth and extensive investigations of LM-agent-related work.

Title: SWE-BENCH: CAN LANGUAGE MODELS RESOLVE REAL-WORLD GITHUB ISSUES?

Keywords: LLM-assisted code generation, GitHub issues. Link

Title: Embers of Autoregression: Understanding Large Language Models Through the Problem They Are Trained to Solve

Keywords: Summary, interesting figures, comprehensive, wonderful explanation of next-word prediction, frequency determines quality (main idea). Link

  • Github: https://github.com/tommccoy1/embers-of-autoregression (mission impossible of LLMs)
  • Demo:

Evaluate

  • Usefulness(1~5):
  • Timing:
  • Intensive Reading:

Abstract:

  • This figure could be improved. (What exactly are the capacities of humans vs. LLMs?)
  • This paper exhibits what might seem like surprising failures of LLMs on tasks that are straightforward for humans to perform.


  • Supplement

|  | Unique Human Capacities | Shared | LLM Capacities |
| --- | --- | --- | --- |
| Math | Counting, ciphers, linear functions |  | CoT? |
| Language | Word/letter reversal, acronyms, Pig Latin |  | World knowledge |
| Physical | LeCun's world models |  |  |
| Training efficiency |  |  |  |
| Frequency |  |  |  |
| (failures of planning in arithmetic and text generation) |  |  |  |

The paper catalogs effects on the performance of large language models that are attributable to the fact that they are statistical next-word prediction systems.


Title: Cognitive Architectures for Language Agents

Keywords: New agents’ framework, Control flow, Symbolic AGI Link


Title: Agents: An Open-source Framework for Autonomous Language Agents 【Multi-Agents】

Keywords: Multi-agents, autonomous agents. Link

TO-DO (12-29):

  • Read https://arxiv.org/pdf/2311.16832.pdf
  • Reproduce the code
  • Try Character.ai
  • Finish the dynamic memory component

Title: AgentVerse 【Multi-Agent】

Intensive Reading: yes

##### Abstract

  • AGENTVERSE can effectively orchestrate a collaborative group of expert agents as a greater-than-the-sum-of-its-parts system.

  • AgentVerse is designed to facilitate the deployment of multiple LLM-based agents in various applications. AgentVerse primarily provides two frameworks: task-solving and simulation.

    • Task-solving: This framework assembles multiple agents as an automatic multi-agent system (AgentVerse-Tasksolving, Multi-agent as system) to collaboratively accomplish the corresponding tasks. Applications: software development system, consulting system, etc.


  • AgentVerse splits the task-solving process into four pivotal stages, as shown in Figure 2:

    • Expert Recruitment (do we really need to recruit experts?)

    • Collaborative Decision-Making

    • Action Execution

  • Simulation: This framework allows users to set up custom environments to observe behaviors among, or interact with, multiple agents. ⚠️ The authors note they are refactoring the code; if you need a stable version that exclusively supports the simulation framework, use the release-0.1 branch. Applications: games, social-behavior research of LLM-based agents, etc.
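The staged task-solving process (the fourth stage, an evaluation step, feeds back into another round) could be sketched as a simple loop; every function name here is my own illustration, not the AgentVerse API:

```python
# Illustrative sketch of AgentVerse's staged problem solving (Figure 2):
# recruit experts, decide collaboratively, execute, then evaluate and
# iterate. None of these callables are the actual AgentVerse API.
def solve(task, recruit, decide, execute, evaluate, max_rounds=3):
    result = None
    for _ in range(max_rounds):
        experts = recruit(task)                  # 1. Expert Recruitment
        plan = decide(experts, task)             # 2. Collaborative Decision-Making
        result = execute(plan)                   # 3. Action Execution
        done, feedback = evaluate(task, result)  # 4. Evaluation
        if done:
            return result
        task = feedback  # refine the task description and try again
    return result
```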

Title: CharacterChat 【Role-Play Agents】

  • Keywords: Personalized social support = emotional support + multi-agents, MBTI persona scale + wonderful dataset

  • Link: https://arxiv.org/abs/2308.10278

  • Github: https://github.com/morecry/CharacterChat

  • Demo:

Evaluate

  • Usefulness(1~5): 5 【TODO: reproduce】

  • Timing: Aug 2023

  • Intensive Reading: yes

  • TODO: 20 min of code reading

  • Proper nouns learned: ESC, social support

Abstract
  • Traditional methods such as Emotional Support Conversations (ESC) face challenges in effectively addressing a diverse range of individual personalities. In response, we introduce the Social Support Conversation (S2Conv) framework. It comprises a series of support agents and the interpersonal matching mechanism, linking individuals with persona-compatible virtual supporters 【same as I thought in August】. Utilizing persona decomposition based on the MBTI (Myers-Briggs Type Indicator) 【a scale】, we have created the MBTI-1024 Bank, a group of virtual characters with distinct profiles. Through improved role-playing prompts with behavior presets and dynamic memory, we facilitate the development of the MBTI-S2Conv dataset, which contains conversations between the characters in the MBTI-1024 Bank. Building upon these foundations, we present CharacterChat, a comprehensive S2Conv system, which includes a conversational model driven by personas and memories, along with an interpersonal matching plugin model 【calculates attention between personalities】 that dispatches the optimal supporters from the MBTI-1024 Bank for individuals with specific personas. Empirical results indicate the remarkable efficacy of CharacterChat in providing personalized social support and highlight the substantial advantages derived from interpersonal matching. The source code is available at https://github.com/morecry/CharacterChat.
Methodology

Within the S2Conv framework:

  • Character profiles include the persona (name, gender, tone, personality) and dynamic memory
  • Memory:
    • Stores multifaceted factual knowledge gathered from socializing with a character -> used as a contextual reference
    • Triggers for a seeker's mental-health problem
    • In other words, memory is used for fact verification
    • Supporters share similar backgrounds with seekers [interesting]
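A toy sketch of the interpersonal matching idea: score each supporter against the seeker with a cosine "attention" over persona feature vectors and dispatch the best match. This is my illustration, not CharacterChat's actual plugin model:

```python
# Toy interpersonal matching: pick the supporter whose persona vector is
# most similar to the seeker's. Vectors and names are hypothetical.
import math

def match_supporter(seeker_vec, supporter_vecs):
    def cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(y * y for y in b))
        return dot / (na * nb)
    scores = {name: cosine(seeker_vec, v) for name, v in supporter_vecs.items()}
    return max(scores, key=scores.get)  # dispatch the best-matching supporter
```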
Difference between Information Support and Emotional Support


Data
  • Create characters with ChatGPT

Offered Dataset

MBTI-S2Conv dataset


Evaluation Using AI

Evaluate MBTI-S2Conv 【typical cognitive behavioral therapy】

  • Emotional Improvement: Does the conversation improve the emotional state of the seeker?
  • Problem Solving: Is the problem solved after the conversation?
  • Active Engagement: Is the seeker actively engaged in the conversation?【interesting】
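These three criteria could be scored automatically with an LLM-as-judge prompt along these lines; this is a sketch assuming a generic `ask_llm` callable, and the paper's exact prompts may differ:

```python
# Sketch of LLM-based evaluation on the three criteria from the paper.
# `ask_llm` is a hypothetical callable returning a 1-5 score as text.
CRITERIA = {
    "emotional_improvement": "Does the conversation improve the emotional state of the seeker?",
    "problem_solving": "Is the problem solved after the conversation?",
    "active_engagement": "Is the seeker actively engaged in the conversation?",
}

def judge(ask_llm, dialogue):
    scores = {}
    for name, question in CRITERIA.items():
        prompt = (f"Rate the following conversation from 1 to 5.\n"
                  f"Criterion: {question}\n\n{dialogue}\n\nScore:")
        scores[name] = int(ask_llm(prompt).strip())
    return scores
```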
Overview


Ha, this paper essentially has ChatGPT agents play a role-playing game, then expands the dataset, and then dynamically retrieves examples for reference.


This paper also evaluates how different personalities can influence the emotional dialogue.

Highlights

  • Evaluation using ChatGPT and other LLMs

  • Beautiful code for dynamic memory

  • MBTI dataset

  • How to expand a dataset using ChatGPT


Title: E-CORE: Emotion Correlation Enhanced Empathetic Dialogue Generation

  • Keywords: Emotional Dialog, Small Language Model, Emotional Correlation

Evaluate

  • Usefulness(1~5):
  • Timing:
  • Intensive Reading:

Abstract

Title: Towards Interpretable Mental Health Analysis with Large Language Models

  • Keywords: Kailai Yang, analyzing emotional posts using ChatGPT, authors of mental-roberta

Evaluate

  • Usefulness(1~5): 5+ Top-tier and relevant paper
  • Timing: 08 Oct 2023
  • Intensive Reading: Reproducing

Title: Improving Multi-turn Emotional Support Dialogue Generation with Lookahead Strategy Planning

  • Keyword: Similar to my idea in NCMMSC

Evaluate

  • Usefulness(1~5): I saw the code
  • Timing:
  • Intensive Reading:

Abstract

Mixture-of-Experts Language Models

Running MoE model inference on a 3090

Interesting Papers

How to keep an LLM from forgetting prompts/instructions (NeurIPS 2023)

What kind of data suits an LLM's taste

How can an LLM learn from intermediate steps in reasoning (process-oriented)

Align human intent and add toxicity later

Initializing Small Models With Pretrained Large Language Models

Few-shot

Extended Reading

What is the autonomous agent?

Autonomous agents are software programs which respond to states and events in their environment independent from direct instruction by the user or owner of the agent, but acting on behalf and in the interest of the owner.

Reference

Autonomous agents are software programs that respond to states and events in their environment independent from direct instruction by the user or owner of the agent, but acting on behalf and in the interest of the owner. The term agent is not defined precisely, and agent software can range from simple programs composed of a small number of rules to large and complex systems. Agent technology was developed in artificial intelligence (AI) research and can include complex AI techniques. Important application areas are those where the human user can benefit from continuous data analysis, monitoring of data streams and large databases, and where routine reactions to events are required. Many applications are related to user interface technology and the Internet. A weak position sees agents as tools that relieve humans from routine tasks suited to computational solutions, but there is also a strong position that asserts that agents can be constructed to mimic or even surpass the cognitive functions of the human.

What is in-context learning?

Transformers can learn from context. They can build new predictors from a sequence of labeled examples without retraining the entire model.

Suppose you have a model that can identify objects from a photo. If you show the model a new photo of an object it has never seen before, the model can use in-context learning to identify the object.

Suppose you have a model that can generate translations from a piece of text. If you give the model a new sentence, the model can use in-context learning to generate a more accurate and fluent translation.
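Concretely, in-context learning means placing labeled examples directly in the prompt, with no weight updates; a minimal sketch of few-shot prompt construction (the example pairs and task are illustrative):

```python
# Build a few-shot prompt: the "learning" happens in the context window.
def few_shot_prompt(examples, query):
    lines = [f"Input: {x}\nOutput: {y}" for x, y in examples]
    lines.append(f"Input: {query}\nOutput:")
    return "\n\n".join(lines)

# Hypothetical English -> French pairs; the model would complete "Output:".
prompt = few_shot_prompt(
    [("cheese", "fromage"), ("bread", "pain")],
    "water",
)
```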

Reference

Link: https://arxiv.org/pdf/2211.15661.pdf

What are the four paradigms in NLP?

Link:https://dl.acm.org/doi/pdf/10.1145/3560815

Title: On the Humanity of Conversational AI: Evaluating the Psychological Portrayal of LLMs

  • Keywords: How to evaluate the psychological portrayal of LLMs, interesting

  • Github\ArXiv\OpenReview:

  • Demo:

Evaluate

  • Usefulness(1~5): 3
  • Timing:
  • Intensive Reading:

Abstract

Large Language Models (LLMs) have recently showcased their remarkable capacities, not only in natural language processing tasks but also across diverse domains such as clinical medicine, legal consultation, and education. LLMs become more than mere applications, evolving into assistants capable of addressing diverse user requests. This narrows the distinction between human beings and artificial intelligence agents, raising intriguing questions regarding the potential manifestation of personalities, temperaments, and emotions within LLMs. In this paper, we propose a framework, PPBench, for evaluating diverse psychological aspects of LLMs. Comprising thirteen scales commonly used in clinical psychology, PPBench further classifies these scales into four distinct categories: personality traits, interpersonal relationships, motivational tests, and emotional abilities. Our study examines five popular models, namely text-davinci-003, ChatGPT, GPT-4, LLaMA-2-7b, and LLaMA-2-13b. Additionally, we employ a jailbreak approach to bypass the safety alignment protocols and test the intrinsic natures of LLMs. We have made PPBench openly accessible via The link is hidden due to anonymity. For reviewers, please refer to the supplementary materials.

Markdown Template

### Title

* **Keywords**:

#### Link

* **Github\ArXiv\OpenReview**: 

* **Demo**:  

#### Evaluate

* **Usefulness(1~5)**: 
* **Timing**: 
* **Intensive Reading**: 


#### Abstract

TODO list

Ideas

  • Propose new metrics to calculate efficiency [results/tokens, GPU, steps]
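A first cut at such an efficiency metric might normalize success by a weighted cost over tokens, GPU time, and steps; the weights below are arbitrary placeholders for illustration:

```python
# Toy efficiency metric: success per unit cost. The weights w_tok, w_gpu,
# and w_step are illustrative placeholders, not calibrated values.
def efficiency(success_rate, tokens, gpu_hours, steps,
               w_tok=1e-4, w_gpu=1.0, w_step=0.01):
    cost = w_tok * tokens + w_gpu * gpu_hours + w_step * steps
    return success_rate / cost if cost > 0 else 0.0
```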

Comments/tips
