Machine learning (ML) models trained using Empirical Risk Minimization (ERM) often exhibit systematic errors on specific subpopulations of tabular data, known as error slices. Learning robust representations in the presence of error slices is challenging, especially in self-supervised settings during the feature reconstruction phase, due to high-cardinality features and the complexity of constructing error…
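The excerpt above is cut off, but the error-slice idea it introduces can be made concrete. Below is a minimal sketch, not the paper's method, that computes per-slice error rates for a trained classifier on tabular data, where slices are subpopulations defined by a categorical feature; the data, labels, and slice names are hypothetical.

```python
import numpy as np

def slice_error_rates(y_true, y_pred, slice_ids):
    """Overall and per-slice error rates for a classifier on tabular data.

    Slices are subpopulations defined by a (possibly high-cardinality)
    categorical feature; a slice whose error rate is far above the overall
    rate is an "error slice" in the sense described above.
    """
    overall = float(np.mean(y_true != y_pred))
    per_slice = {
        s: float(np.mean(y_true[slice_ids == s] != y_pred[slice_ids == s]))
        for s in np.unique(slice_ids)
    }
    return overall, per_slice

# Hypothetical example: errors concentrate on slice "B".
y_true = np.array([0, 1, 1, 0, 1, 0, 1, 1])
y_pred = np.array([0, 1, 1, 0, 0, 1, 0, 1])
slices = np.array(["A", "A", "A", "A", "B", "B", "B", "B"])
print(slice_error_rates(y_true, y_pred, slices))  # 0.375 overall, 0.75 on "B"
```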
2024: In this paper, we introduce AdaSelection, an adaptive sub-sampling method that identifies the most informative sub-samples within each minibatch to speed up the training of large-scale deep learning models without sacrificing model performance. Our method flexibly combines an arbitrary number of baseline sub-sampling methods, incorporating method-level importance and intra-method sample-level…
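The excerpt does not spell out AdaSelection's actual algorithm, so the sketch below only illustrates the general pattern it describes: mixing several baseline sub-sampling scores with method-level importance weights to pick a sub-sample of each minibatch. The baseline scores (per-sample loss and gradient-norm proxies), the normalization, and the weights are placeholder assumptions, not the published procedure.

```python
import numpy as np

def combined_subsample(scores_per_method, method_weights, k):
    """Select k minibatch indices from a weighted mix of baseline scores."""
    w = np.asarray(method_weights, dtype=float)
    w = w / w.sum()  # method-level importance, normalized to sum to 1
    mixed = sum(
        wi * (s - s.min()) / (s.max() - s.min() + 1e-12)  # sample-level scores in [0, 1]
        for wi, s in zip(w, scores_per_method)
    )
    return np.argsort(mixed)[-k:]  # keep the k highest-scoring samples

# Hypothetical baseline scores for a minibatch of 32 samples.
rng = np.random.default_rng(0)
loss_scores = rng.random(32)    # e.g. per-sample training loss
gnorm_scores = rng.random(32)   # e.g. per-sample gradient-norm proxy
print(combined_subsample([loss_scores, gnorm_scores], [0.7, 0.3], k=8))
```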
2024: Pre-trained language models, trained on large-scale corpora, demonstrate strong generalizability across various NLP tasks. Fine-tuning these models for specific tasks typically involves updating all parameters, which is resource-intensive. Parameter-efficient fine-tuning (PEFT) methods, such as the popular LoRA family, introduce low-rank matrices to learn only a few parameters efficiently. However, during…
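As background on the low-rank idea the excerpt refers to (the standard LoRA parameterization, not the specific contribution of the paper above), here is a minimal PyTorch sketch: a frozen pre-trained linear layer plus a trainable rank-r update B·A scaled by alpha/r, so only the two small matrices are learned. The rank, scaling, and layer sizes are illustrative.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """A frozen linear layer plus a trainable low-rank (LoRA-style) update."""
    def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        self.base.weight.requires_grad_(False)      # freeze pre-trained weights
        if self.base.bias is not None:
            self.base.bias.requires_grad_(False)
        self.lora_A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(base.out_features, r))
        self.scaling = alpha / r

    def forward(self, x):
        # y = x W^T + (alpha/r) * x A^T B^T; only A and B receive gradients.
        return self.base(x) + self.scaling * (x @ self.lora_A.T @ self.lora_B.T)

layer = LoRALinear(nn.Linear(768, 768), r=8)
out = layer(torch.randn(4, 768))
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
print(out.shape, trainable)  # torch.Size([4, 768]) and 12288 trainable parameters
```

With r=8 on a 768x768 layer, the trainable update has 2·8·768 = 12,288 parameters versus 589,824 in the frozen weight matrix, which is the parameter saving PEFT methods rely on.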
2024: Code generation models are not robust to small perturbations, which often lead to incorrect generations and significantly degrade the performance of these models. Although improving the robustness of code generation models is crucial to enhancing user experience in real-world applications, existing research efforts do not address this issue. To fill this gap, we propose CodeFort, a framework to improve…
2024: Hierarchical Text Classification (HTC) is a sub-class of multi-label classification. It is challenging because the hierarchy typically has a large number of diverse topics. Existing methods for HTC fall into two categories: local methods (a classifier for each level, node, or parent) and global methods (a single classifier for everything). Local methods are computationally expensive, whereas global methods…
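Since the excerpt ends mid-comparison, the toy sketch below (assuming a two-level hierarchy and scikit-learn classifiers, neither taken from the paper) is only meant to make the local-versus-global distinction concrete: local methods train a separate classifier per level of the hierarchy, while a global method trains a single classifier over the leaf labels.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 16))
level1 = rng.integers(0, 3, 200)               # top-level topic (3 classes)
level2 = level1 * 4 + rng.integers(0, 4, 200)  # leaf topic nested under level1

# Local: one classifier per level (more models, higher training cost).
local_models = [
    LogisticRegression(max_iter=1000).fit(X, level1),
    LogisticRegression(max_iter=1000).fit(X, level2),
]

# Global: a single flat classifier over all leaf labels.
global_model = LogisticRegression(max_iter=1000).fit(X, level2)

print(local_models[0].predict(X[:2]), global_model.predict(X[:2]))
```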
Related content
- July 09, 2023: Finding that 70% of attention heads and 20% of feed-forward networks can be excised with minimal effect on in-context learning suggests that large language models are undertrained.
- July 05, 2023: Amazon Research Award recipient Shrikanth Narayanan is on a mission to make inclusive human-AI conversational experiences.
- June 21, 2023: The senior applied science manager envisions machine learning as the path to a better experience for Amazon customers.
- June 12, 2023: The company's work, supported by the Amazon Alexa Fund, has relevant applications for areas from perfumes to disease detection.
- June 05, 2023: Learn about the science behind the brand-new NHL EDGE IQ stat that debuted in April 2023.
- June 02, 2023: In a plenary talk, the Berkeley professor and Distinguished Amazon Scholar will argue that AI research should borrow concepts from economics and focus on social collectives.