TRAM: Bridging Trust Regions and Sharpness Aware Minimization
"Proposes Trust Region Aware Minimization, which encourages flat and smooth minima while maintaining pre-trained representations by using trust region bounds to inform SAM-style regularization on both of these optimization surfaces." [gal30b+] π€ #LG #CL
Code: https://github.com/tomsherborne/tram_optimizer
Paper: https://arxiv.org/abs/2310.03646v1 #arxiv
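A minimal sketch of the core idea, assuming a fixed global trust radius that caps the SAM perturbation; the paper's actual bound and implementation are not reproduced here:

import torch

def sam_trust_region_step(model, loss_fn, batch, optimizer, rho=0.05, trust_radius=0.1):
    x, y = batch
    loss_fn(model(x), y).backward()                      # gradients at current weights
    params = [p for p in model.parameters() if p.grad is not None]
    grads = [p.grad.detach().clone() for p in params]
    grad_norm = torch.sqrt(sum(g.pow(2).sum() for g in grads))
    scale = min(rho, trust_radius) / (grad_norm + 1e-12) # SAM radius capped by the trust region
    with torch.no_grad():
        for p, g in zip(params, grads):
            p.add_(g, alpha=scale.item())                # ascend to the worst-case point
    optimizer.zero_grad()
    loss_fn(model(x), y).backward()                      # gradient at the perturbed weights
    with torch.no_grad():
        for p, g in zip(params, grads):
            p.sub_(g, alpha=scale.item())                # undo the perturbation
    optimizer.step()                                     # update driven by the perturbed gradient
    optimizer.zero_grad()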
Neural Language Model Pruning for Automatic Speech Recognition
"Proposes a variant of low-rank approximation suitable for incrementally compressing models and delivering multiple models with varied target sizes (e,g, 20Γ, 50Γ and 80Γ)." [gal30b+] π€ #LG #CL
Paper: https://arxiv.org/abs/2310.03424v1 #arxiv
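A hedged sketch of how truncated SVD yields several compressed models at varied target sizes; this generic factorization stands in for the paper's specific variant:

import numpy as np

def low_rank_factors(W, compression):
    """Return (A, B) with W ~= A @ B at roughly `compression`x fewer parameters."""
    m, n = W.shape
    rank = max(1, int(m * n / (compression * (m + n))))
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :rank] * S[:rank]        # (m, rank)
    B = Vt[:rank, :]                  # (rank, n)
    return A, B

W = np.random.randn(1024, 1024).astype(np.float32)
for c in (20, 50, 80):                # the varied target sizes from the summary
    A, B = low_rank_factors(W, c)
    err = np.linalg.norm(W - A @ B) / np.linalg.norm(W)
    print(f"{c}x: rank={A.shape[1]}, relative error={err:.3f}")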
MathCoder: Seamless Code Integration in LLMs for Enhanced Mathematical Reasoning
"Proposes a fine-tuning and inference approach that enhances math reasoning in language models, enabling them to use code for modeling and deriving mathematical equations and, consequently, enhancing their mathematical ability." [gal30b+] π€ #CL #AI #CV #LG
Code: https://github.com/mathllm/MathCoder
Paper: https://arxiv.org/abs/2310.03731v1 #arxiv
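A rough sketch of the generate-execute-continue loop such an approach implies; ask_model is a hypothetical stand-in for the fine-tuned model, and real use would sandbox the exec call:

import contextlib, io, re

def run_block(code):
    buf = io.StringIO()
    with contextlib.redirect_stdout(buf):
        exec(code, {})                                  # sandbox this in real use
    return buf.getvalue().strip()

def solve(question, ask_model, max_rounds=4):
    transcript = question
    for _ in range(max_rounds):
        reply = ask_model(transcript)                   # model emits text and/or a python block
        transcript += "\n" + reply
        block = re.search(r"```python\n(.*?)```", reply, re.S)
        if block is None:                               # no code left: the answer is final
            return transcript
        transcript += "\n[execution result]\n" + run_block(block.group(1)) + "\n"
    return transcript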
Modular Speech-to-Text Translation for Zero-Shot Cross-Modal Transfer
"The speech encoder is based on the wav2vec2 speech representation and is trained with self-supervision to reconstruct masked portions of speech audio, while the text decoder is a causal Transformer network and is trained to autoregressively reconstruct target text." [gal30b+] #CL
Paper: https://arxiv.org/abs/2310.03724v1 #arxiv
A Long Way to Go: Investigating Length Correlations in RLHF
"RLHF learns a reward model from human preference feedback on the outputs of a base model (e-commerce search, chat, question answering, summarization)." [gal30b+] π€ #CL #LG
Code: https://github.com/PrasannS/rlhf-length-biases
Paper: https://arxiv.org/abs/2310.03716v1 #arxiv
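A small sketch of the kind of length-reward correlation measurement the paper investigates; reward_fn is an illustrative stand-in, not the paper's reward model:

from scipy.stats import pearsonr

def length_reward_correlation(outputs, reward_fn):
    lengths = [len(o.split()) for o in outputs]
    rewards = [reward_fn(o) for o in outputs]
    return pearsonr(lengths, rewards)

# Toy example with a "reward" that secretly loves long answers:
outputs = ["short answer", "a somewhat longer answer here",
           "a very long and thorough answer with many words indeed"]
r, p = length_reward_correlation(outputs, reward_fn=lambda o: len(o))
print(f"Pearson r = {r:.2f}")          # near 1.0: reward tracks length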
Agent Instructs Large Language Models to Be General Zero-Shot Reasoners
"Builds an autonomous agent to instruct the reasoning process of large language models to further unleash their zero-shot reasoning abilities on a wide set of datasets spanning generation, classification, and reasoning." [gal30b+] #CL #AI #LG
Code: https://github.com/wang-research-lab/agentinstruct
Paper: https://arxiv.org/abs/2310.03710v1 #arxiv
DecoderLens: Layerwise Interpretation of Encoder-Decoder Transformers
"DecoderLens allows the decoder cross-attention to access all encoder outputs instead of only using the final encoder output, as is normally done in encoder-decoder models." [gal30b+] #CL
Paper: https://arxiv.org/abs/2310.03686v1 #arxiv
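A minimal sketch of the idea with an off-the-shelf Hugging Face encoder-decoder model, decoding from every encoder layer rather than only the last; illustrative, the paper's exact setup may differ:

from transformers import T5ForConditionalGeneration, T5Tokenizer
from transformers.modeling_outputs import BaseModelOutput

tok = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

inputs = tok("translate English to German: The house is small.", return_tensors="pt")
enc = model.encoder(**inputs, output_hidden_states=True)

for layer, hidden in enumerate(enc.hidden_states):   # one hidden state per encoder layer
    out = model.generate(
        encoder_outputs=BaseModelOutput(last_hidden_state=hidden),
        max_new_tokens=12,
    )
    print(f"layer {layer}: {tok.decode(out[0], skip_special_tokens=True)}")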
GoLLIE: Annotation Guidelines Improve Zero-Shot Information-Extraction
"GoLLIE is fine-tuned from a large language model, to follow a given annotation guideline for a specific task, and it uses the resulting model to extract facts." [gal30b+] π€ #CL
Code: https://github.com/microsoft/DeepSpeed
Paper: https://arxiv.org/abs/2310.03668v1 #arxiv
Towards Robust and Generalizable Training: An Empirical Study of Noisy Slot Filling for Input Perturbations
"Introduces a noise robustness evaluation dataset Noise-SF for slot filling task, which can help to evaluate the noise robustness of slot filling task, and provide training and evaluation data for robust models." [gal30b+] π€ #CL #AI #DS
Code: https://github.com/dongguanting/Noise-SF
Paper: https://arxiv.org/abs/2310.03518v1 #arxiv
Tik-to-Tok: Translating Language Models One Token at a Time: An Embedding Initialization Strategy for Efficient Language Adaptation
"We map tokens from the target tokenizer to semantically similar tokens from the source language tokenizer by using a word translation dictionary encompassing both the source and target languages, which is created automatically." [gal30b+] #CL #AI
Paper: https://arxiv.org/abs/2310.03477v1 #arxiv
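A small sketch of the embedding-initialization step, assuming a word translation dictionary is already available; all names and the mean-embedding fallback are illustrative:

import torch

def init_target_embeddings(src_emb, src_vocab, tgt_vocab, translate):
    mean = src_emb.mean(dim=0)                      # fallback for unmapped tokens
    tgt_emb = mean.repeat(len(tgt_vocab), 1)
    for tgt_tok, tgt_id in tgt_vocab.items():
        src_tok = translate.get(tgt_tok)            # dictionary lookup
        if src_tok in src_vocab:
            tgt_emb[tgt_id] = src_emb[src_vocab[src_tok]]
    return tgt_emb

src_vocab = {"house": 0, "small": 1}
tgt_vocab = {"huis": 0, "klein": 1, "gezellig": 2}
translate = {"huis": "house", "klein": "small"}     # "gezellig" falls back to the mean
emb = init_target_embeddings(torch.randn(2, 8), src_vocab, tgt_vocab, translate)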
Controllable Multi-Document Summarization: Coverage & Coherence Intuitive Policy with Large Language Model Based Rewards
"A controllable content extraction scheme is trained with a novel coverage and coherence intuitive policy that is duly rewarded by an actively trained LLM, and then used for multi-document summarization." [gal30b+] π€ #CL
π https://arxiv.org/abs/2310.03473v1 #arxiv
LLM Based Multi-Document Summarization Exploiting Main-Event Biased Monotone Submodular Content Extraction
"The main-event biased monotone submodular function for content selection enables us to extract the most crucial information related to the main event from the document cluster, which is then rewritten to a coherent text by leveraging a large pre-trained language model." [gal30b+] π€ #CL
π https://arxiv.org/abs/2310.03414v1 #arxiv
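A sketch of greedy selection under a monotone submodular coverage objective with a main-event relevance bias; the scoring functions here are simple stand-ins, not the paper's:

def greedy_select(sentences, relevance, similarity, budget, lam=0.7):
    """relevance(s): main-event relevance; similarity(a, b): overlap in [0, 1]."""
    selected = []
    def coverage(sel):
        # Facility-location style coverage: each sentence is covered up to its best match.
        return sum(max((similarity(s, t) for t in sel), default=0.0)
                   for s in sentences)
    while len(selected) < budget:
        def gain(c):
            return (coverage(selected + [c]) - coverage(selected)
                    + lam * relevance(c))
        best = max((s for s in sentences if s not in selected),
                   key=gain, default=None)
        if best is None:
            break
        selected.append(best)
    return selected

The coverage term is monotone submodular and the relevance term is modular, so the classic greedy loop above keeps the usual (1 - 1/e) approximation guarantee under a cardinality budget.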
Procedural Text Mining with Large Language Models
"Works by leveraging the GPT-4 (Generative Pre-trained Transformer 4) model to extract procedures from unstructured PDF text in an incremental question-answering fashion." [gal30b+] π€ #CL #AI #IT
βοΈ https://github.com/jd-coderepos/proc-tm/
π https://arxiv.org/abs/2310.03376v1 #arxiv
Evaluating Hallucinations in Chinese Large Language Models
"Establishes a benchmark named HalluQA for measuring the hallucination phenomenon in Chinese large language models and design a novel automated evaluation method using GPT-4 to judge whether a model output is hallucinated." [gal30b+] π€ #CL
Code: https://github.com/xiami2019/HalluQA
Paper: https://arxiv.org/abs/2310.03368v1 #arxiv
Reformulating Domain Adaptation of Large Language Models as Adapt-Retrieve-Revise
"Given a target domain like Chinese law, it first continues learning on in-domain data to adapt an affordable 7B LLM to the target domain." [gal30b+] #CL
Paper: https://arxiv.org/abs/2310.03328v1 #arxiv
Concise and Organized Perception Facilitates Large Language Models for Deductive Reasoning
"Carefully analyzes the given statements to efficiently identify the most pertinent information while eliminating redundancy, and then prompts the LLMs in a more organized form that adapts to the model's inference process." [gal30b+] π€ #CL #AI
βοΈ https://github.com/asaparov/prontoqa
π https://arxiv.org/abs/2310.03309v1 #arxiv
A New Dialogue Response Generation Agent for Large Language Models by Asking Questions to Detect User's Intentions
"The open-domain dialogue system EDIT consists of a Question Generation (QG) module, an LLM-based QA module, and a Knowledge-Enhanced Response Generation module (KG-RG)." [gal30b+] #CL
Paper: https://arxiv.org/abs/2310.03293v1 #arxiv
A Formalism and Approach for Improving Robustness of Large Language Models Using Risk-Adjusted Confidence Scores
"A novel method for reducing risk by adjusting LLM confidence scores using a novel calibration method called DwD and a novel evaluation method for assessing both low and high risk tasks." [gal30b+] π€ #CL
Paper: https://arxiv.org/abs/2310.03283v1 #arxiv
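A generic sketch of risk-adjusted confidence: calibrate raw scores on held-out correctness labels, then abstain below a risk-dependent threshold. Logistic calibration is a stand-in here; DwD itself is not reproduced:

import numpy as np
from sklearn.linear_model import LogisticRegression

raw_conf = np.array([0.9, 0.8, 0.55, 0.95, 0.4, 0.7]).reshape(-1, 1)
correct = np.array([1, 1, 0, 1, 0, 1])           # held-out correctness labels

calibrator = LogisticRegression().fit(raw_conf, correct)
adjusted = calibrator.predict_proba(raw_conf)[:, 1]

def decide(conf, high_risk, low_thr=0.5, high_thr=0.9):
    threshold = high_thr if high_risk else low_thr   # stricter when stakes are high
    return "answer" if conf >= threshold else "abstain"

for c in adjusted:
    print(decide(c, high_risk=True))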
Unlock Predictable Scaling From Emergent Abilities
"Discovers that small models, although they exhibit minor performance, demonstrate critical and consistent task performance improvements that are not captured by conventional evaluation strategies due to insufficient measurement resolution." [gal30b+] π€ #CL
Code: https://github.com/openai/human-eval
Paper: https://arxiv.org/abs/2310.03262v1 #arxiv
Can Large Language Models Be Good Path Planners? A Benchmark and Investigation on Spatial-Temporal Reasoning
"\textcolor{black}{Proposes a new benchmark, termed PPNL, to evaluate LLMs' spatial-temporal reasoning by formulating ``path planning'' tasks that require an LLM to navigate to target locations while avoiding obstacles and adhering to constraints." [gal30b+] π€ #CL
Paper: https://arxiv.org/abs/2310.03249v1 #arxiv
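A small sketch of the kind of path-validity check such a benchmark needs; the grid encoding and move set are illustrative, not PPNL's actual format:

MOVES = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}

def valid_path(grid, start, goal, actions):
    """grid: list of strings with '#' marking obstacles; actions: list of move names."""
    r, c = start
    for a in actions:
        dr, dc = MOVES[a]
        r, c = r + dr, c + dc
        if not (0 <= r < len(grid) and 0 <= c < len(grid[0])):
            return False                  # walked off the grid
        if grid[r][c] == "#":
            return False                  # hit an obstacle
    return (r, c) == goal

grid = ["....",
        ".##.",
        "...."]
print(valid_path(grid, (0, 0), (2, 3), ["down", "down", "right", "right", "right"]))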