ALMA: Streamlining Astronomical Operations With Analytics and AI

ALMA leverages Dataiku to automate data monitoring, streamline proposal reviews, and optimize telescope operations, saving hundreds of hours and enhancing astronomical research.

50-120 Hours

Of Astronomer on Duty (AoD) and expert staff time saved by reducing manual assessments

Weekly

Monitoring of observational data (previously unattainable)

1,700

Proposals processed through automation

The Atacama Large Millimeter/submillimeter Array (ALMA) is renowned for its groundbreaking contributions to astronomy, but managing the vast volumes of data, proposals, and observations required to power such achievements poses significant challenges. 

ALMA partnered with Dataiku as part of the AI-for-Good Program — a program that helps NGOs deliver positive impact with data and AI — to implement innovative solutions across various use cases to streamline operations, improve efficiency, and ensure its resources are focused on advancing astronomical discoveries.

A Cultural and Technical Transformation in Analytics

From the start, ALMA recognized that building a sustainable analytics capability required two foundational pillars: a strong data governance framework and a fundamental cultural shift in how the organization approaches analytics. Dataiku played a central role in both by providing a robust data governance framework and presenting a data mesh approach tailored to ALMA’s organizational structure. This allowed the observatory to mature from fragmented, individual efforts to collaborative, scalable analytics practices.

“Dataiku provided us with the tools to govern the full data science lifecycle, from data preparation to sharing datasets with users across departments. It has also enabled us to apply governance principles directly to the analytics projects themselves, helping ensure consistency and transparency in our use case queue, and provided the tools to handle the operationalization process when needed.”
Ignacio Toledo, Data Scientist & Analytics Lead at ALMA

Today, more than 130 users actively engage with Dataiku at ALMA, with over 40 using it weekly across various roles. The platform now supports more than 300 projects, including over 67 in production, and manages 240+ curated datasets, 25 webapps, and 30+ automated scenarios — all contributing to smarter operations and better science.

Efficient Monitoring of Observational Data

ALMA’s contact scientists (CS) and science operations specialists oversee thousands of observations (scheduling blocks, or SBs) and processed data products to ensure they progress correctly through the observation and data processing pipeline. Tracking them manually was labor-intensive, which delayed the identification and resolution of issues and affected service delivery.

Using Dataiku, ALMA developed automated dashboards and reports to flag problematic SBs and Member Observing Unit Sets (MOUSs). This solution integrated data from various sources, enabling team members across regions to collaborate and share insights efficiently.

What began as a fragmented, manual process became a robust weekly reporting system. Personalized dashboards and email summaries now help each contact scientist track project status, flag issues, and take timely action. This automation significantly reduces routine work and improves consistency across departments.
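As a rough illustration of the kind of check such a weekly report might run, the sketch below flags items that have not progressed for a while and groups them by contact scientist. The source does not describe ALMA's actual rules or data model, so every field name, state name, and the 14-day threshold here is a hypothetical assumption.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Observation:
    """One tracked unit (an SB or MOUS). All fields are hypothetical."""
    uid: str
    state: str
    last_update: date
    contact_scientist: str


# Assumed rule: anything unfinished and untouched for 14+ days is flagged.
STALE_AFTER = timedelta(days=14)
# Assumed terminal states; real pipeline state names may differ.
FINISHED_STATES = {"Delivered", "Cancelled"}


def flag_stalled(observations, today):
    """Group stalled observation IDs by contact scientist."""
    flagged = defaultdict(list)
    for obs in observations:
        if obs.state in FINISHED_STATES:
            continue  # finished items never need attention
        if today - obs.last_update >= STALE_AFTER:
            flagged[obs.contact_scientist].append(obs.uid)
    return dict(flagged)


# Made-up sample records, for illustration only.
sample = [
    Observation("uid-001", "Processing", date(2024, 1, 2), "cs_a"),
    Observation("uid-002", "Delivered", date(2024, 1, 1), "cs_a"),
    Observation("uid-003", "Processing", date(2024, 1, 28), "cs_b"),
    Observation("uid-004", "ReadyForReview", date(2024, 1, 10), "cs_b"),
]
report = flag_stalled(sample, today=date(2024, 2, 1))
for scientist, uids in sorted(report.items()):
    print(f"{scientist}: {len(uids)} stalled -> {', '.join(uids)}")
```

In practice, the per-scientist groups would feed the personalized dashboards and email summaries, with the threshold tuned to each stage of the pipeline.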

The Impact?

  • Automated weekly checks, which were previously unfeasible due to resource constraints and the lack of a data science platform.
  • Improved communication among global teams enhanced issue resolution and service reliability.
  • Strengthened relationships with internal teams (e.g., telescope operations) and external users, enhancing trust and satisfaction.
  • Team members were upskilled in data analysis and programming practices.

“This resulted in less time spent on routine monitoring, more consistent and reliable information, and most importantly, the ability to catch potential issues earlier. This project exemplified how structured analytics can reduce friction and align teams.”
Ignacio Toledo, Data Scientist & Analytics Lead at ALMA

Optimizing the Proposal Review Process

Each year, ALMA solicits proposals from the astronomy community for ideas on how best to use the telescope. ALMA’s proposal review process involves 1,700 proposals, 1,000 reviewers, and 17,000 reviews, making it the largest in astronomy. Matching proposals to the right reviewers based on their expertise was challenging, with existing algorithms lacking flexibility and accessibility for the broader team.

ALMA used Dataiku to create machine learning (ML) models that parsed the proposal text, inferred the proposal topics, and matched them with reviewers’ expertise. The platform’s integration with Python allowed ALMA to test various algorithms and refine their approach iteratively. As a result, the team developed automated workflows to efficiently process proposals, compute the similarity between proposals and reviewer expertise, and ensure that the right proposals are reviewed by the right people. These automated workflows reduced manual workload and errors, increased productivity through quicker onboarding of new members, and improved the quality of reviews through better proposal-reviewer matching.
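To make the similarity step concrete, here is a minimal sketch that ranks reviewers for a proposal using TF-IDF weights and cosine similarity. The source does not say which algorithms ALMA tested or adopted, so TF-IDF is only a stand-in, and the reviewer names, expertise descriptions, and proposal text are invented for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and keep purely alphabetic words."""
    return [w for w in text.lower().split() if w.isalpha()]


def tfidf_vectors(docs):
    """Return one {term: weight} dict per document, using smoothed IDF."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: count * (math.log((1 + n) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_reviewers(proposal, reviewers):
    """Sort reviewers by similarity between expertise text and the proposal."""
    names = list(reviewers)
    vectors = tfidf_vectors([proposal] + [reviewers[n] for n in names])
    proposal_vec, reviewer_vecs = vectors[0], vectors[1:]
    scores = {n: cosine(proposal_vec, v) for n, v in zip(names, reviewer_vecs)}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical reviewers and proposal, for illustration only.
reviewers = {
    "reviewer_a": "protoplanetary disks dust continuum planet formation",
    "reviewer_b": "high redshift galaxies molecular gas star formation",
}
proposal = "dust gaps in protoplanetary disks as signposts of planet formation"
ranking = rank_reviewers(proposal, reviewers)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

A production workflow would also need to handle conflicts of interest and balance reviewer workload, which a pure similarity ranking does not address.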

Enhancing Quality Assurance for Telescope Operations

Astronomers on Duty (AoDs) at ALMA are responsible for daily telescope operations and, as part of their remit, perform Quality Assurance Level 0 (QA0) to certify observation data quality. Roughly 10% of observations required manual assessment, consuming significant staff time and delaying subsequent processes.

ALMA collaborated with Dataiku to build an ML model capable of classifying QA0 observations. Dataiku’s AutoML and MLOps capabilities accelerated development and deployment, reducing manual intervention. This solution reduced manual QA0 assessments by 82 observations over three months, saving 50 hours initially, with a potential of 120 hours saved when fully deployed. ALMA also benefited from lower costs through optimized use of computational resources, along with enhanced data quality and reliability.

Transformational Results

Through its collaboration with Dataiku, ALMA has revolutionized the way it works. By automating processes, the organization saves hundreds of hours every year, allowing teams to focus on what truly matters. Dataiku also fosters seamless collaboration across teams with varying expertise, enabling them to work together effortlessly on shared projects. 

These changes have led to smarter decision-making, powered by automated and transparent insights, while optimized resource utilization has significantly cut operational costs.

“The use of data has generated a cultural change that has a major impact on the observatory at all levels. Key operational information maximizes our time observing the universe … Our collaboration with Dataiku has been vital to meeting operational goals and achieving record hours of observation.”
Sean Dougherty, Joint ALMA Observatory Director
