Evaluating progress of LLMs on scientific problem-solving
Programmatic and model-based evaluations

Tasks in CURIE are varied, and their ground-truth annotations come in mixed, heterogeneous forms, e.g., JSON, LaTeX equations, YAML files, or free-form text. Evaluating free-form generation is challenging because answers are often descriptive, and even when a format is specified, as in most of our cases, the response to each […]
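One way such a programmatic check over heterogeneous annotations might look is sketched below. This is a minimal illustration, not CURIE's actual evaluation harness: the `normalize` helper, the JSON-first parsing strategy, and the fallback to lower-cased text comparison are all assumptions made for the example.

```python
import json

def normalize(answer):
    """Parse structured answers (JSON) when possible; otherwise fall back
    to a stripped, lower-cased string for free-form text comparison.
    (YAML and LaTeX handling would need additional parsers, omitted here.)"""
    try:
        return json.loads(answer)
    except (json.JSONDecodeError, TypeError):
        return str(answer).strip().lower()

def programmatic_match(prediction, ground_truth):
    """Return True if prediction and ground truth normalize to equal values."""
    return normalize(prediction) == normalize(ground_truth)

# Structured answers match regardless of whitespace in the serialization:
print(programmatic_match('{"a": 1}', '{"a":1}'))   # structured comparison
# Free-form text matches after normalization:
print(programmatic_match("  Yes ", "yes"))
```

A design note: parsing into native data structures before comparing avoids penalizing superficial formatting differences (key order, whitespace), which is exactly the problem mixed-format ground truth creates for naive string matching.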