Customer-obsessed science


Research areas
-
June 25, 2025With large datasets, directly generating data ID codes from query embeddings is much more efficient than performing pairwise comparisons between queries and candidate responses.
Featured news
-
2024Abstraction ability is crucial in human intelligence, which can also benefit various tasks in NLP study. Existing work shows that LLMs are deficient in abstract ability, and how to improve it remains unexplored. In this work, we design the framework AbsInstruct to enhance LLMs’ abstraction ability through instruction tuning. The framework builds instructions with in-depth explanations to assist LLMs in
-
TQC 20242024Given a linear system of equations Ax = b, quantum linear system solvers (QLSSs) approximately prepare a quantum state |x⟩ for which the amplitudes are proportional to the solution vector x. Asymptotically optimal QLSSs have query complexity O(κ log(1/ε)), where κ is the condition number of A, and ε is the approximation error. However, runtime guarantees for existing optimal and near-optimal QLSSs do not
-
ICPR 20242024This paper improves upon existing data pruning methods for image classification by introducing a novel pruning metric and pruning procedure based on importance sampling. The proposed pruning metric explicitly accounts for data separability, data integrity, and model uncertainty, while the sampling procedure is adaptive to the pruning ratio and considers both intra-class and inter-class separation to further
-
JSM 20242024SHAP (SHapley Additive exPlanations) is widely used in machine learning model explanations nowadays, especially for complex and black-box models (deep learning models, ensemble models). SHAP assigns a feature contribution to every record. Users can check each individual record feature contribution or use the mean absolute SHAP values over the entire dataset as the SHAP feature importance. But it’s not uncommon
-
AutoML 20242024Abstract We introduce TabRepo, a new dataset of tabular model evaluations and predictions. TabRepo contains the predictions and metrics of 1310 models evaluated on 200 classification and regression datasets. We illustrate the benefit of our dataset in multiple ways. First, we show that it allows to perform analysis such as comparing Hyperparameter Optimization against current AutoML systems while also considering
Academia
View allWhether you're a faculty member or student, there are number of ways you can engage with Amazon.
View all