Cambridge Internship in ML Model Optimization
Cambridge, Cambridgeshire, United Kingdom
Overview
Join our Strategic Planning and Architecture (SPARC) team within Microsoft’s Azure Hardware Systems & Infrastructure (AHSI) organization, the organization behind Microsoft’s expanding cloud infrastructure and responsible for powering Microsoft’s “Intelligent Cloud” mission. We are seeking a Masters/PhD student to join us in Cambridge in winter/spring 2025 to work on model compression and optimization for LLMs, covering topics such as post-training quantization and quantization-aware training. You will join a welcoming and highly interdisciplinary team and work on creative and challenging problems during your internship.
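For a flavour of the topic area (purely illustrative, not a description of the team's actual tooling): post-training quantization, in its simplest form, maps trained full-precision weights onto a low-precision grid after training, without any fine-tuning. The sketch below shows symmetric per-channel int8 weight quantization of a single PyTorch linear layer; the function names and the choice of int8 are assumptions made for this example.

```python
import torch
import torch.nn as nn

def quantize_weights_int8(layer: nn.Linear) -> tuple[torch.Tensor, torch.Tensor]:
    """Symmetric per-output-channel int8 quantization of a linear layer's weights."""
    w = layer.weight.detach()
    # One scale per output channel: the largest |w| in the channel maps to the int8 limit 127.
    scale = w.abs().amax(dim=1, keepdim=True) / 127.0
    q = torch.clamp(torch.round(w / scale), -127, 127).to(torch.int8)
    return q, scale

def dequantize(q: torch.Tensor, scale: torch.Tensor) -> torch.Tensor:
    """Recover an approximate float weight tensor (simulated-quantization inference)."""
    return q.to(torch.float32) * scale

# Usage: measure the round-trip error introduced by quantizing one layer.
layer = nn.Linear(4096, 4096)
q, scale = quantize_weights_int8(layer)
w_hat = dequantize(q, scale)
print("max abs error:", (layer.weight - w_hat).abs().max().item())
```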
Qualifications
Required/Minimum Qualifications:
- Be enrolled in a Masters/PhD program in Computer Science, Machine Learning, or a related discipline
- Substantial experience with quantization of LLMs and model compression
- Substantial knowledge of low-precision data types such as floating-point, integer, and block floating-point formats
Other Requirements:
- Cloud Background Check
Preferred/Additional Qualifications:
- Proficiency in PyTorch and Python; hands-on experience in software tool development
- Outstanding communication skills
Responsibilities
- Research and develop quantization flows for LLM inference and training
- Design, implement, and evaluate the performance of quantized state-of-the-art (SOTA) LLMs
- Write and present your findings in technical documents or presentations