

LLM Agents: A Detailed Guide to the Introduction and Usage of Personal_LLM_Agents_Survey

處女座的程序猿 · Published in Shanghai on 2024-01-18


Overview: This project collects a list of papers on Personal LLM Agents. Browsing these papers gives a picture of the latest research progress in this emerging direction, such as how conversational ability, knowledge representation, and privacy protection are being optimized to improve the user experience. The papers also illustrate application cases, open challenges, and proposed solutions, for example how to apply LLM agents as education or healthcare assistants, how to make their conversations more natural, and how to keep user privacy from being abused.
In short, the project offers a systematically organized list of papers on Personal LLM Agents, surveying the current state and future directions of this area from multiple angles, which helps researchers and developers grasp the trends and plan their own work.


Introduction to Personal_LLM_Agents_Survey

Personal LLM Agents are defined as a special type of LLM-based agent that is deeply integrated with personal data, personal devices, and personal services. They are preferably deployed on resource-constrained mobile/edge devices and/or powered by lightweight AI models. The main purpose of Personal LLM Agents is to assist end users and augment their abilities, helping them focus on, and do better at, the things they find interesting and important.

The paper list covers several major aspects of Personal LLM Agents, including capability, efficiency, and security.

GitHub repository: https://github.com/MobileLLM/Personal_LLM_Agents_Survey

How to Use Personal_LLM_Agents_Survey

1. Key Capabilities of Personal LLM Agents

(1) Task Automation

Task automation is a core capability of Personal LLM Agents; it determines how well the agent can respond to user commands and/or automate user tasks. Since UI-based task automation agents are popular in this list and closely tied to personal devices, we focus on that line of work here.
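To make the shared pattern behind these UI agents concrete, below is a minimal sketch of an observe-think-act automation loop. It assumes hypothetical callables for the LLM call (query_llm), the UI dump (get_ui_state), and the device controller (execute_action); it illustrates the general idea only and is not the implementation of any paper listed here.

```python
# Minimal, hypothetical sketch of an LLM-driven UI automation loop:
# serialize the current screen to text, ask the LLM for the next action,
# execute it, and repeat. The three callables passed in are placeholders
# (e.g. an LLM client, an accessibility-service UI dump, an ADB controller),
# not APIs from any cited project.

import json

def automate_task(task, query_llm, get_ui_state, execute_action, max_steps=10):
    """Run a simple observe-think-act loop until the LLM says the task is done."""
    history = []
    for _ in range(max_steps):
        elements = get_ui_state()   # e.g. [{"id": 3, "text": "Send", "type": "button"}, ...]
        prompt = (
            f"Task: {task}\n"
            f"Actions so far: {json.dumps(history)}\n"
            f"Current UI elements: {json.dumps(elements)}\n"
            'Respond with JSON, e.g. {"action": "click", "element_id": 3} '
            'or {"action": "finish"} when the task is complete.'
        )
        action = json.loads(query_llm(prompt))  # assumes the model returns valid JSON
        if action.get("action") == "finish":
            break
        execute_action(action)
        history.append(action)
    return history
```

Real systems in the list below differ mainly in how they represent the UI (text, view hierarchy, or screenshots), how they constrain the action space, and how they recover from invalid model outputs.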

UI-based Task Automation Agents
LLM-based Approaches
  • WebGPT: Browser-assisted question-answering with human feedback. [paper]
  • Enabling Conversational Interaction with Mobile UI Using Large Language Models. [CHI 2023] [paper]
  • Language Models can Solve Computer Tasks. [NeurIPS 2023] [paper]
  • DroidBot-GPT: GPT-powered UI Automation for Android. [arxiv] [code]
  • Responsible Task Automation: Empowering Large Language Models as Responsible Task Automators. [paper]
  • Mind2Web: Towards a Generalist Agent for the Web. arxiv 2023 [paper][code][code]
  • (AutoDroid) Empowering LLM to use Smartphone for Intelligent Task Automation. [paper] [code]
  • You Only Look at Screens: Multimodal Chain-of-Action Agents. ArXiv Preprint [paper] [code]
  • AXNav: Replaying Accessibility Tests from Natural Language. [paper]
  • Automatic Macro Mining from Interaction Traces at Scale. [paper]
  • A Zero-Shot Language Agent for Computer Control with Structured Reflection. [paper]
  • Reinforced UI Instruction Grounding: Towards a Generic UI Task Automation API. [paper]
  • GPT-4V in Wonderland: Large Multimodal Models for Zero-Shot Smartphone GUI Navigation. [paper][code]
  • UGIF: UI Grounded Instruction Following. [paper]
  • Explore, Select, Derive, and Recall: Augmenting LLM with Human-like Memory for Mobile Task Automation. [paper][code]
  • CogAgent: A Visual Language Model for GUI Agents. [paper][code]
  • AppAgent: Multimodal Agents as Smartphone Users. [paper][code]
Traditional Approaches
  • uLink: Enabling User-Defined Deep Linking to App Content. [MobiSys 2016]
  • SUGILITE: Creating Multimodal Smartphone Automation by Demonstration. [CHI 2017] [paper][code]
  • Programming IoT devices by demonstration using mobile apps. [IS-EUD 2017]
  • Kite: Building Conversational Bots from Mobile Apps. [MobiSys 2018]. [paper]
  • Reinforcement Learning on Web Interfaces using Workflow-Guided Exploration. [ICLR 2018]. [paper][code]
  • Mapping Natural Language Instructions to Mobile UI Action Sequences. [ACL 2020] [paper][code]
  • Glider: A Reinforcement Learning Approach to Extract UI Scripts from Websites. [SIGIR 2021] [paper]
  • UIBert: Learning Generic Multimodal Representations for UI Understanding. [IJCAI-21] [paper]
  • META-GUI: Towards Multi-modal Conversational Agents on Mobile GUI. [EMNLP 2022][paper][code]
  • UINav: A maker of UI automation agents. [paper]
Benchmarks for UI Automation
  • Mapping natural language commands to web elements. [EMNLP 2018] [paper][code]
  • UIBert: Learning Generic Multimodal Representations for UI Understanding. [IJCAI-21] [paper]
  • Mapping Natural Language Instructions to Mobile UI Action Sequences. [ACL 2020] [paper][code]
  • A Dataset for Interactive Vision Language Navigation with Unknown Command Feasibility. [ECCV 2022][paper] [code]
  • META-GUI: Towards Multi-modal Conversational Agents on Mobile GUI. [EMNLP 2022][paper][code]
  • UGIF: UI Grounded Instruction Following. [paper]
  • ASSISTGUI: Task-Oriented Desktop Graphical User Interface Automation. [paper][code]
  • Mind2Web: Towards a Generalist Agent for the Web. arxiv 2023 [paper][code][code]
  • Android in the Wild: A Large-Scale Dataset for Android Device Control. [paper][code]
  • Empowering LLM to use Smartphone for Intelligent Task Automation. [paper] [code]
  • World of Bits: An Open-Domain Platform for Web-Based Agents. [ICML 2017] [paper][code]
  • Reinforcement Learning on Web Interfaces using Workflow-Guided Exploration. [ICLR 2018]. [paper][code]
  • WebShop: Towards Scalable Real-World Web Interaction with Grounded Language Agents. [NeurIPS 2022] [paper]
  • AndroidEnv: A Reinforcement Learning Platform for Android [paper][code]
  • Mobile-Env: An Evaluation Platform and Benchmark for Interactive Agents in LLM Era. [paper][code]
  • WebArena: A Realistic Web Environment for Building Autonomous Agents. [paper][code]

(2) Sensing

The ability to understand the current context is crucial for Personal LLM Agents to provide personalized, context-aware services. This includes techniques for sensing user activities, mental states, environment dynamics, and more.
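As a rough illustration of LLM-based sensing (in the spirit of the Penetrative AI line of work cited below), the sketch here condenses raw accelerometer readings into a short textual summary and asks a model to guess the user's activity. The prompt wording and the query_llm callable are assumptions made for illustration only.

```python
# Illustrative only: turn raw smartphone accelerometer samples into a short
# textual summary, then ask an LLM to infer the user's current activity.
# query_llm is a placeholder for any LLM client; nothing here is an API
# from the cited papers.

from statistics import mean, pstdev

def summarize_accelerometer(samples):
    """samples: list of (x, y, z) readings in m/s^2 -> one-line summary string."""
    mags = [(x * x + y * y + z * z) ** 0.5 for x, y, z in samples]
    return (f"mean acceleration magnitude {mean(mags):.2f} m/s^2, "
            f"std {pstdev(mags):.2f}, {len(samples)} samples")

def infer_activity(samples, query_llm):
    prompt = (
        "Based on the smartphone accelerometer summary below, guess the user's "
        "current activity (still / walking / running / in vehicle). "
        "Answer with a single word.\n"
        f"Summary: {summarize_accelerometer(samples)}"
    )
    return query_llm(prompt).strip().lower()
```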

LLM-based Approaches
  • “Automated Mobile Sensing Strategies Generation for Human Behaviour Understanding” (Gao et al., 2023, p. 521) [arxiv]
  • “Cue-CoT: Chain-of-thought Prompting for Responding to In-depth Dialogue Questions with LLMs” (Wang et al., 2023, p. 1) [EMNLP 2023]
  • “Exploring Large Language Models for Human Mobility Prediction under Public Events” (Liang et al., 2023, p. 1) [arxiv]
  • “Penetrative AI: Making LLMs Comprehend the Physical World” (Xu et al., 2023, p. 1) [arxiv]
  • “Evaluating Subjective Cognitive Appraisals of Emotions from Large Language Models” (Zhan et al., 2023, p. 1) [arxiv]
  • “PALR: Personalization Aware LLMs for Recommendation” (Yang et al., 2023, p. 1) [arxiv]
  • “Sentiment Analysis through LLM Negotiations” (Sun et al., 2023, p. 1) [arxiv]
  • “Bridging the Information Gap Between Domain-Specific Model and General LLM for Personalized Recommendation” (Zhang et al., 2023, p. 1) [arxiv]
  • “Conversational Health Agents: A Personalized LLM-Powered Agent Framework” (Abbasian et al., 2023, p. 1) [arxiv]
Traditional Approaches
  • “Affective State Prediction from Smartphone Touch and Sensor Data in the Wild” (Wampfler et al., 2022, p. 1) [CHI'22]
  • “Mobile Localization Techniques for Wireless Sensor Networks: Survey and Recommendations” (Oliveira et al., 2023, p. 361) [ACM Transactions on Sensor Networks]
  • “Are You Killing Time? Predicting Smartphone Users’ Time-killing Moments via Fusion of Smartphone Sensor Data and Screenshots” (Chen et al., 2023, p. 1) [CHI'23]
  • “Remote Breathing Rate Tracking in Stationary Position Using the Motion and Acoustic Sensors of Earables” (Ahmed et al., 2023, p. 1) [CHI'23]
  • “SAMoSA: Sensing Activities with Motion and Subsampled Audio” (Mollyn et al., 2022, p. 1321) [IMWUT]
  • “A Systematic Survey on Android API Usage for Data-Driven Analytics with Smartphones” (Lee et al., 2023, p. 1) [ACM Computing Surveys]
  • “A Multi-Sensor Approach to Automatically Recognize Breaks and Work Activities of Knowledge Workers in Academia” (Di Lascio et al., 2020, p. 781) [IMWUT]
  • “Robust Inertial Motion Tracking through Deep Sensor Fusion across Smart Earbuds and Smartphone” (Gong et al., 2021, p. 621) [IMWUT]
  • “DancingAnt: Body-empowered Wireless Sensing Utilizing Pervasive Radiations from Powerline” (Cui et al., 2023, p. 873) [ACM MobiCom'23]
  • “DeXAR: Deep Explainable Sensor-Based Activity Recognition in Smart-Home Environments” (Arrotta et al., 2022, p. 11) [IMWUT]
  • “MUSE-Fi: Contactless MUti-person SEnsing Exploiting Near-field Wi-Fi Channel Variation” (Hu et al., 2023, p. 1135) [IMWUT]
  • “SenCom: Integrated Sensing and Communication with Practical WiFi” (He et al., 2023, p. 903) [ACM MobiCom'23]
  • “SleepMore: Inferring Sleep Duration at Scale via Multi-Device WiFi Sensing” (Zakaria et al., 2022, p. 1931) [IMWUT]
  • “COCOA: Cross Modality Contrastive Learning for Sensor Data” (Deldari et al., 2022, p. 1081) [ACM MobiCom'23]
  • “M3Sense: Affect-Agnostic Multitask Representation Learning Using Multimodal Wearable Sensors” (Samyoun et al., 2022, p. 731) [IMWUT]
  • “Predicting Subjective Measures of Social Anxiety from Sparsely Collected Mobile Sensor Data” (Rashid et al., 2020, p. 1091) [IMWUT]
  • “Attend and Discriminate: Beyond the State-of-the-Art for Human Activity Recognition Using Wearable Sensors” (Abedin et al., 2021, p. 11) [IMWUT]
  • “Fall Detection based on Interpretation of Important Features with Wrist-Wearable Sensors” (Kim et al., 2022, p. 1) [IMWUT]
  • “PowerPhone: Unleashing the Acoustic Sensing Capability of Smartphones” (Cao et al., 2023, p. 842) [ACM MobiCom'23]
  • “I Spy You: Eavesdropping Continuous Speech on Smartphones via Motion Sensors” (Zhang et al., 2022, p. 1971) [IMWUT]
  • “Watching Your Phone’s Back: Gesture Recognition by Sensing Acoustical Structure-borne Propagation” (Wang et al., 2021, p. 821) [IMWUT]
  • “Gesture Recognition Method Using Acoustic Sensing on Usual Garment” (Amesaka et al., 2022, p. 411) [IMWUT]
  • “Complex Daily Activities, Country-Level Diversity, and Smartphone Sensing: A Study in Denmark, Italy, Mongolia, Paraguay, and UK” (Assi et al., 2023, p. 1) [CHI'23]
  • “Generalization and Personalization of Mobile Sensing-Based Mood Inference Models: An Analysis of College Students in Eight Countries” (Meegahapola et al., 2022, p. 1761) [IMWUT]
  • “Detecting Social Contexts from Mobile Sensing Indicators in Virtual Interactions with Socially Anxious Individuals” (Wang et al., 2023, p. 1341) [IMWUT]
  • “Examining the Social Context of Alcohol Drinking in Young Adults with Smartphone Sensing” (Meegahapola et al., 2021, p. 1211) [IMWUT]
  • “Towards Open-Domain Twitter User Profile Inference” (Wen et al., 2023, p. 3172) [ACL 2023]
  • “One More Bite? Inferring Food Consumption Level of College Students Using Smartphone Sensing and Self-Reports” (Meegahapola et al., 2021, p. 261) [IMWUT]
  • “FlowSense: Monitoring Airflow in Building Ventilation Systems Using Audio Sensing” (Chhaglani et al., 2022, p. 51) [IMWUT]
  • “MicroCam: Leveraging Smartphone Microscope Camera for Context-Aware Contact Surface Sensing” (Hu et al., 2023, p. 981) [IMWUT]
  • “Mobile and Wearable Sensing Frameworks for mHealth Studies and Applications: A Systematic Review” (Kumar et al., 2021, p. 81) [ACM Transactions on Computing for Healthcare]
  • “FeverPhone: Accessible Core-Body Temperature Sensing for Fever Monitoring Using Commodity Smartphones” (Breda et al., 2022, p. 31) [IMWUT]
  • “Guard Your Heart Silently: Continuous Electrocardiogram Waveform Monitoring with Wrist-Worn Motion Sensor” (Cao et al., 2022, p. 1031) [IMWUT]
  • “Listen2Cough: Leveraging End-to-End Deep Learning Cough Detection Model to Enhance Lung Health Assessment Using Passively Sensed Audio” (Xu et al., 2021, p. 431) [IMWUT]
  • “HealthWalks: Sensing Fine-grained Individual Health Condition via Mobility Data” (Lin et al., 2020, p. 1381) [IMWUT]
  • “Identifying Mobile Sensing Indicators of Stress-Resilience” (Adler et al., 2021, p. 511) [IMWUT]
  • “MoodExplorer: Towards Compound Emotion Detection via Smartphone Sensing” (Zhang et al., 2018, p. 1761) [IMWUT]
  • “mTeeth: Identifying Brushing Teeth Surfaces Using Wrist-Worn Inertial Sensors” (Akther et al., 2021, p. 531) [IMWUT]
  • “Detecting Job Promotion in Information Workers Using Mobile Sensing” (Nepal et al., 2020, p. 1131) [IMWUT]
  • “First-Gen Lens: Assessing Mental Health of First-Generation Students across Their First Year at College Using Mobile Sensing” (Wang et al., 2022, p. 951) [IMWUT]
  • “Predicting Personality Traits from Physical Activity Intensity” (Gao et al., 2019, p. 1) [IEEE Computer]
  • “Predicting Symptom Trajectories of Schizophrenia using Mobile Sensing” (Wang et al., 2017, p. 1101) [IMWUT]
  • “Predictors of Life Satisfaction based on Daily Activities from Mobile Sensor Data” (Yürüten et al., 2014, p. 1) [CHI'14]
  • “SmartGPA: How Smartphones Can Assess and Predict Academic Performance of College Students” (Wang et al., 2015, p. 1) [UbiComp'15]
  • “Social Sensing: Assessing Social Functioning of Patients Living with Schizophrenia using Mobile Phone Sensing” (Wang et al., 2020, p. 1) [CHI'20]
  • “SmokingOpp: Detecting the Smoking ‘Opportunity’ Context Using Mobile Sensors” (Chatterjee et al., 2020, p. 41) [IMWUT]

(3) Memory

Memory is the ability of a Personal LLM Agent to maintain information about the user, which allows the agent to provide more customized services and to evolve itself according to user preferences.
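A minimal sketch of such a per-user memory layer is shown below, assuming a generic embed() function that maps text to a vector. It only illustrates the store-and-recall idea and is not the design proposed by any specific paper in this section.

```python
# Illustrative sketch of a per-user memory layer: facts about the user are
# stored with an embedding, and the most similar memories are recalled to
# personalize the agent's next response. embed() is a placeholder for any
# sentence-embedding model.

import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

class UserMemory:
    def __init__(self, embed):
        self.embed = embed   # callable: str -> list[float]
        self.items = []      # list of (text, embedding) pairs

    def remember(self, fact: str) -> None:
        """Store one observed fact or preference about the user."""
        self.items.append((fact, self.embed(fact)))

    def recall(self, query: str, k: int = 3) -> list:
        """Return the k stored facts most similar to the current query."""
        q = self.embed(query)
        ranked = sorted(self.items, key=lambda it: cosine(q, it[1]), reverse=True)
        return [text for text, _ in ranked[:k]]
```

In use, an agent would call recall() with the current user request and prepend the returned facts to its prompt, so that responses reflect previously observed preferences.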

Memory Acquisition
  • “LifeLogging: Personal Big Data” [Foundations and Trends in Information Retrieval]
  • “Vision-based human activity recognition: a survey” [Multimedia Tools and Applications]
  • “Predicting personality from patterns of behavior collected with smartphones” [Proceedings of the National Academy of Sciences]
  • “Facial Emotion Detection Using Deep Learning” [2020 International Conference for Emerging Technology (INCET)]
  • “Emotion detection of textual data: An interdisciplinary survey” [2021 IEEE World AI IoT Congress]
Memory Management
  • “Privacystreams: Enabling transparency in personal data processing for mobile apps” [IMWUT]
  • “Tree of Thoughts: Deliberate Problem Solving with Large Language Models” [arxiv]
  • “Chain-of-Thought Prompting Elicits Reasoning in Large Language Models” [Advances in Neural Information Processing Systems]
  • “ReAct: Synergizing Reasoning and Acting in Language Models” [arxiv]
  • “Generative Agents: Interactive Simulacra of Human Behavior” [Proceedings of the 36th Annual ACM Symposium on User Interface Software and Technology]
  • “Show Your Work: Scratchpads for Intermediate Computation with Language Models” [arxiv]
  • “Cognitive Architectures for Language Agents” [arxiv]
Agent Self-Evolution
  • “DreamCoder: growing generalizable, interpretable knowledge with wake–sleep Bayesian program learning” [Proceedings of the 42nd ACM SIGPLAN International Conference on Programming Language Design and Implementation]
  • “Voyager: An Open-Ended Embodied Agent with Large Language Models” [arxiv]
  • “Language models as zero-shot planners: Extracting actionable knowledge for embodied agents” [International Conference on Machine Learning]
  • “Bootstrap Your Own Skills: Learning to Solve New Tasks with Large Language Model Guidance” [arxiv]
  • “FireAct: Toward Language Agent Fine-tuning” [arxiv]

2. Efficiency of LLM Agents

The efficiency of LLM agents is closely tied to the efficiency of LLM inference, LLM training/customization, and memory management.

(1) Efficient LLM Inference and Training

The efficiency of LLM inference/training has already been comprehensively summarized in existing surveys (see the survey linked from the original repository), so this part is omitted from the list.

(2) Efficient Memory Retrieval and Management

Here we mainly list papers related to efficient memory management, which is an important component of LLM agents.
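As a simplified example of this kind of retrieval, the sketch below indexes memory embeddings with Faiss (listed under Searching Execution further down) and looks up the nearest memories for the current context. The random vectors are stand-ins for embeddings produced by a real model, and the exact IndexFlatL2 index is only an illustrative default.

```python
# Illustrative sketch: index memory embeddings with Faiss and retrieve the
# nearest neighbors of the current context embedding. Assumes the faiss-cpu
# and numpy packages are installed; the random vectors below stand in for
# real memory embeddings.

import numpy as np
import faiss

dim = 384
memory_vecs = np.random.rand(10_000, dim).astype("float32")  # stored memory embeddings
query_vec = np.random.rand(1, dim).astype("float32")         # embedding of the current context

index = faiss.IndexFlatL2(dim)   # exact L2 search over all stored vectors
index.add(memory_vecs)

distances, ids = index.search(query_vec, 5)  # indices of the 5 closest memories
print(ids[0], distances[0])
```

Swapping the flat index for an HNSW or IVF variant trades a little recall for much lower latency at large memory sizes, which is the focus of the indexing papers listed below.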

Memory Organization

(with vector library, vector DB, and others)

Vector Library

  • RETRO: Improving language models by retrieving from trillions of tokens. [ICML, 2021] [paper]
  • RETA-LLM: A Retrieval-Augmented Large Language Model Toolkit. [arXiv, 2023] [paper] [code]
  • TRIME: Training Language Models with Memory Augmentation. [EMNLP, 2022] [paper] [code]
  • Enhancing LLM Intelligence with ARM-RAG: Auxiliary Rationale Memory for Retrieval Augmented Generation. [arXiv, 2023] [paper] [code]

Vector Database

  • Survey of Vector Database Management Systems. [arXiv, 2023] [paper]
  • Vector database management systems: Fundamental concepts, use-cases, and current challenges. [arXiv, 2023] [paper]
  • A Comprehensive Survey on Vector Database: Storage and Retrieval Technique, Challenge. [arXiv, 2023] [paper]

Other Forms of Memory

  • Memorizing Transformers. [ICLR, 2022] [paper] [code]
  • RET-LLM: Towards a General Read-Write Memory for Large Language Models. [arXiv, 2023] [paper]
Optimizing Memory Efficiency
Searching Design
  • Milvus: A purpose-built vector data management system. [SIGMOD, 2021] [paper] [code]
  • Analyticdb-v: A hybrid analytical engine towards query fusion for structured and unstructured data. [Proceedings of the VLDB Endowment, Volume 13, Issue 12, pp 3152–3165] [paper]
  • Hqann: Efficient and robust similarity search for hybrid queries with structured and unstructured constraints. [CIKM, 2022] [paper]
  • Qdrant [github]
Searching Execution
  • Faiss: Facebook AI Similarity Search. [wiki] [code]
  • Milvus: A purpose-built vector data management system. [SIGMOD, 2021] [paper] [code]
  • Quicker ADC : Unlocking the Hidden Potential of Product Quantization With SIMD. [IEEE Transactions on Pattern Analysis and Machine Intelligence, 2019] [paper] [code]
Efficient Indexing
  • LSH: Locality-sensitive hashing scheme based on p-stable distributions. [SCG, 2004] [paper]
  • Random projection trees and low dimensional manifolds. [STOC, 2008] [paper]
  • SPANN: Highly-efficient Billion-scale Approximate Nearest Neighborhood Search. [NeurIPS, 2021] [paper] [code]
  • Efficient and Robust Approximate Nearest Neighbor Search Using Hierarchical Navigable Small World Graphs. [IEEE Transactions on Pattern Analysis and Machine Intelligence, VOL. 42, NO. 4, 2020] [paper]
  • DiskANN: Fast Accurate Billion-point Nearest Neighbor Search on a Single Node. [NeurIPS, 2019] [paper] [code]
  • DiskANN++: Efficient Page-based Search over Isomorphic Mapped Graph Index using Query-sensitivity Entry Vertex. [arXiv, 2023] [paper]
  • CXL-ANNS: Software-Hardware Collaborative Memory Disaggregation and Computation for Billion-Scale Approximate Nearest Neighbor Search. [USENIX ATC, 2023] [paper]
  • Co-design Hardware and Algorithm for Vector Search. [SC, 2023] [paper] [code]

3. Security and Privacy of Personal LLM Agents

Security and privacy in AI/ML is a vast field with a huge body of related work. Here we only focus on papers related to LLMs and LLM agents.

(1) Confidentiality (of User Data)

  • THE-X: Privacy-Preserving Transformer Inference with Homomorphic Encryption. [ACL, 2022][paper]
  • TextFusion: Privacy-Preserving Pre-trained Model Inference via Token Fusion [EMNLP, 2022] [paper][code]
  • TextObfuscator: Making Pre-trained Language Model a Privacy Protector via Obfuscating Word Representations. [ACL, 2023] [paper][code]
  • Adversarial Training for Large Neural Language Models. [arXiv, 2020] [paper][code]

(2) Integrity (of Agent Behavior)

Adversarial Attacks
  • Certifying LLM Safety against Adversarial Prompting. [arXiv, 2023] [paper][code]
  • On evaluating adversarial robustness of large vision-language models. [arXiv, 2023] [paper][code]
  • Jailbroken: How does llm safety training fail? [arXiv, 2023] [paper]
  • On the adversarial robustness of multi-modal foundation models. [arXiv, 2023] [paper]
  • Misusing Tools in Large Language Models With Visual Adversarial Examples. [arXiv, 2023] [paper]
  • Jailbreak in pieces: Compositional Adversarial Attacks on Multi-Modal Language Models. [arXiv, 2023] [paper]
Backdoor Attacks
  • Backdoor attacks for in-context learning with language models. [arXiv, 2023] [paper]
  • Prompt as Triggers for Backdoor Attack: Examining the Vulnerability in Language Models. [arXiv, 2023] [paper]
  • PoisonPrompt: Backdoor Attack on Prompt-based Large Language Models. [arXiv, 2023] [paper][code]
  • Defending against backdoor attacks in natural language generation. [arXiv, 2021] [paper][code]
Prompt Injection Attacks
  • Not What You've Signed Up For: Compromising Real-World LLM-Integrated Applications with Indirect Prompt Injection. [arXiv, 2023] [paper]
  • Ignore Previous Prompt: Attack Techniques For Language Models. [arXiv, 2022] [paper][code]
  • Prompt Injection attack against LLM-integrated Applications. [arXiv, 2023] [paper][code]
  • Jailbreaking Black Box Large Language Models in Twenty Queries. [arXiv, 2023] [paper][code]
  • Extracting Training Data from Large Language Models. [arXiv, 2020] [paper]
  • SmoothLLM: Defending Large Language Models Against Jailbreaking Attacks. [arXiv, 2023] [paper][code]

(3) Reliability (of Agent Decisions)

Problems
  • Survey of Hallucination in Natural Language Generation. [ACM Computing Surveys 2023] [paper]
  • A Survey of Hallucination in Large Foundation Models. [arXiv, 2023] [paper]
  • DERA: Enhancing Large Language Model Completions with Dialog-Enabled Resolving Agents. [arXiv, 2023] [paper]
  • Cumulative Reasoning with Large Language Models. [arXiv, 2023] [paper]
  • Learning From Mistakes Makes LLM Better Reasoner. [arXiv, 2023] [paper]
  • Large Language Models can Learn Rules. [arXiv, 2023] [paper]
Improvement
  • PromptSource: An Integrated Development Environment and Repository for Natural Language Prompts. [ACL 2022] [paper]
  • Super-NaturalInstructions: Generalization via Declarative Instructions on 1600+ NLP Tasks. [EMNLP 2022] [paper]
  • Finetuned Language Models are Zero-Shot Learners. [ICLR 2022] [paper]
  • SELFCHECKGPT: Zero-Resource Black-Box Hallucination Detection for Generative Large Language Models. [EMNLP 2023] [paper]
  • Large Language Models Can Self-Improve. [arXiv, 2022] [paper]
  • Self-Refine: Iterative Refinement with Self-Feedback. [arXiv, 2023] [paper]
  • Teaching Large Language Models to Self-Debug. [arXiv, 2023] [paper]
  • Prompt-Guided Retrieval Augmentation for Non-Knowledge-Intensive Tasks. [ACL 2023] [paper]
  • Chain-of-Note: Enhancing Robustness in Retrieval-Augmented Language Models. [arXiv, 2023] [paper]
  • Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection. [arXiv, 2023] [paper]
  • Self-Knowledge Guided Retrieval Augmentation for Large Language Models. [Findings of EMNLP, 2023] [paper]
Inspection
  • CGMH: Constrained Sentence Generation by Metropolis-Hastings Sampling. [AAAI 2019] [paper]
  • Gradient-Based Constrained Sampling from Language Models. [EMNLP 2022] [paper]
  • Large Language Models are Better Reasoners with Self-Verification. [Findings of EMNLP 2023] [paper]
  • Explainability for Large Language Models: A Survey. [arXiv, 2023] [paper]
  • Self-Consistency Improves Chain of Thought Reasoning in Language Models. [ICLR, 2023] [paper]
  • Enhancing Chain-of-Thoughts Prompting with Iterative Bootstrapping in Large Language Models. [arXiv, 2023] [paper]
  • Mutual Information Alleviates Hallucinations in Abstractive Summarization. [EMNLP, 2023] [paper]
  • Overthinking the Truth: Understanding how Language Models Process False Demonstrations. [arXiv, 2023] [paper]
  • Inference-Time Intervention: Eliciting Truthful Answers from a Language Model. [NeurIPS, 2023] [paper]
