Welcome to

Hamed's Webpage

Machine Learning and AI

Hamed Firooz

I have over 12 years of experience shaping and delivering large-scale AI solutions across a range of products, with a track record of developing and leading multi-year technology strategies. I also have over 7 years of experience managing research and engineering teams across multiple sites.

  • Current position: Principal Staff AI Scientist at LinkedIn Core AI
  • Education: PhD from University of Washington (UW)

Education

  • 2012
    PhD
    University of Washington

    Compressed Sensing and Network Coding

  • 2008
    MSc
    University of Tehran

    Peer-to-peer networks

Experience

  • 2023 - Current
    Principal Staff AI Scientist
    LinkedIn Core AI

    I formed and currently lead a team of over 20 AI scientists and engineers that trains and operationalizes an LLM-based foundation model for LinkedIn’s personalization tasks at scale.

  • 2018 - 2023
    Sr. Staff AI Tech Lead Manager
    Meta AI

    Led a medium-sized team of research scientists and software engineers with diverse profiles. Our mission was to advance AI technologies that keep users safe online. My team built multimodal content understanding services used across many Meta integrity products.

  • 2016 - 2018
    Staff Machine Learning Engineer
    LinkedIn

    Led the LinkedIn Ads Sponsored Update relevance team (five engineers, one analyst, one PM). The team was responsible for modeling and ranking advertising content on the LinkedIn news feed, serving ads from millions of advertisers to hundreds of millions of LinkedIn daily active users.

  • 2015 - 2016
    Staff Machine Learning Tech lead Manager
    Base CRM (acquired by Zendesk)

    Led a four-engineer forecasting group responsible for a) predicting sales attributes (dollar amount, close date, and closing probability) for the Sales team and b) predicting churn likelihood for the Customer Success (CSM) team (media coverage).

  • 2012 - 2015
    Senior Machine Learning Engineer
    Falkonry (acquired by IFS)

    Built an early-warning system based on a Bayesian network that provides diagnosis and prognosis for large industrial machines.

Highlights
  • 2024
  • [Oct 2024] We published our findings on the LLM Lost-in-Distance phenomenon.

    In this work, we demonstrate that LLM performance is affected by the relative distance between relevant pieces of information in the context: the further apart the information sits within a long context, the more the model’s performance deteriorates.

  • [Aug 2024] My team open-sourced Liger Kernel for memory-efficient and fast LLM training (a usage sketch follows this list).

    Liger Kernel is a collection of Triton kernels designed specifically for LLM training. It can increase multi-GPU training throughput by 20% and reduce memory usage by 60%.

  • [May 2024] Enhancing Stability for Large Language Models Training in Constrained Bandwidth Networks was accepted to the ICML'24 FoMo-ES workshop.

    This system-model co-design work focuses on synchronization in hierarchically partitioned data parallelism to avoid race conditions in gradient updates during LLM training.

  • [Mar 2024] RESPROMPT: Residual Connection Prompting Advances Multi-Step Reasoning in Large Language Models was accepted to NAACL'24.

    We formulate chain-of-thought (CoT) as a reasoning graph and propose a prompting strategy for multi-step reasoning that captures the complex processes in tasks such as mathematics and commonsense reasoning.

  • [Feb 2024] My team contributed to the open-source DeepSpeed implementation of ZeRO++ hierarchical partitioning (a configuration sketch follows this list).

    A race condition between AllGather and the device-to-device copy for the secondary partition caused instability when training large models such as Llama-7B and Falcon-40B on a moderately large number of GPUs. After discovering the algorithmic issue, we landed a fix in the DeepSpeed repository.

  • 2023
  • [Sep 2023] Our paper Understanding the detrimental class-level effects of data augmentation was accepted to NeurIPS 2023.

    We propose a framework for understanding how data augmentation interacts with class-level learning dynamics. We show that simple class-conditional augmentation strategies informed by our framework improve performance on the negatively affected classes.

  • WIP
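
For the Liger Kernel item above, here is a minimal usage sketch in Python. It assumes the open-source liger-kernel package together with Hugging Face transformers; the apply_liger_kernel_to_llama patching call follows that package's documented pattern, and the checkpoint name is a placeholder rather than anything from this page.

    # Minimal sketch: patch a Hugging Face Llama model with Liger's Triton kernels
    # before training (assumes `pip install liger-kernel transformers` and a GPU).
    import torch
    from liger_kernel.transformers import apply_liger_kernel_to_llama
    from transformers import AutoModelForCausalLM

    # Patch the Hugging Face Llama modules (RMSNorm, RoPE, SwiGLU, cross-entropy, ...)
    # with fused Triton kernels; call this before instantiating the model so the
    # patched classes are used.
    apply_liger_kernel_to_llama()

    model = AutoModelForCausalLM.from_pretrained(
        "meta-llama/Llama-2-7b-hf",  # placeholder checkpoint
        torch_dtype=torch.bfloat16,
    )
    # ...continue with the usual training loop or Trainer setup.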
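
And for the ZeRO++ hierarchical partitioning item, a hedged configuration sketch, based on DeepSpeed's public ZeRO++ options; the partition size, batch size, and the commented-out model wiring are placeholders, not values taken from this page.

    # Sketch: enabling ZeRO stage 3 with ZeRO++ hierarchical (secondary) partitioning
    # in DeepSpeed. zero_hpz_partition_size is typically set to the GPUs per node.
    import deepspeed

    ds_config = {
        "train_micro_batch_size_per_gpu": 1,  # placeholder
        "bf16": {"enabled": True},
        "zero_optimization": {
            "stage": 3,
            "zero_hpz_partition_size": 8,      # placeholder: GPUs per node
            "zero_quantized_weights": True,
            "zero_quantized_gradients": True,
        },
    }

    # model comes from the surrounding training script, e.g.:
    # engine, optimizer, _, _ = deepspeed.initialize(
    #     model=model, model_parameters=model.parameters(), config=ds_config
    # )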
Contact