emails per week auto extracted, analyzed, & labeled
accuracy in email categorization
reduction in email traffic thanks to actions learned from data insight
The logistics control tower team at Western Digital uses a global personal distribution list (PDL) email address for both internal and external communications. People send emails to this PDL on a range of topics, from shipping reports to shipment location queries, loss and damage, delivery and invoicing issues, and more. As a result, traffic averages around 8,000 to 10,000 emails per week and climbs even higher at the end of each quarter.
Unfortunately, such massive email traffic created issues for Western Digital, including:
Previously, the logistics control tower team at Western Digital tried to analyze those thousands of emails manually, taking two to three employees over two weeks to sort, categorize, annotate, and evaluate.
With Dataiku, they found a better way: automatically sorting the emails by topic (category) with high accuracy. From there, once they understood the email categories and sender profiles, they could identify hot and critical issues faster and take corrective actions, ultimately reducing response time and raising customer satisfaction.
Today, the natural language processing (NLP)-based email categorization system Western Digital built using Dataiku is:
Here’s a more detailed look at how they built this solution in Dataiku.
The logistics control tower team at Western Digital built the solution to their email challenge in collaboration with data scientists in the advanced analytics and logistics teams. The all-in-one solution for text analysis and data visualization allows the team to sort emails by topics, quantify the average response time spent on each category, and identify major internal and external service requesters per customer profile.
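To make the metrics above concrete, here is a minimal sketch of how average response time per category and the most frequent requesters could be computed once emails are labeled. It uses only the Python standard library and is not Western Digital's actual pipeline; the record fields (`category`, `sender`, `response_hours`) and the sample values are hypothetical and purely illustrative.

```python
from collections import Counter, defaultdict
from statistics import mean

# Hypothetical labeled emails; field names and values are illustrative only.
SAMPLE_EMAILS = [
    {"category": "invoicing", "sender": "a@example.com", "response_hours": 4.0},
    {"category": "invoicing", "sender": "b@example.com", "response_hours": 6.0},
    {"category": "shipment_location", "sender": "a@example.com", "response_hours": 2.0},
]


def average_response_by_category(records):
    """Return {category: mean response time in hours}."""
    grouped = defaultdict(list)
    for record in records:
        grouped[record["category"]].append(record["response_hours"])
    return {category: mean(hours) for category, hours in grouped.items()}


def top_requesters(records, n=3):
    """Return the n senders with the most emails as (sender, count) pairs."""
    return Counter(record["sender"] for record in records).most_common(n)


if __name__ == "__main__":
    print(average_response_by_category(SAMPLE_EMAILS))
    print(top_requesters(SAMPLE_EMAILS, n=1))
```

In practice, an aggregation like this would run on the output of the categorization model, so the quality of the metrics depends on the accuracy of the labels.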
Practically, in Dataiku, they leveraged several key capabilities as well as plugins and connectors that enabled them to build and maintain their solution faster. For example, even though annotating large datasets is challenging and time consuming, the ML-assisted labeling plugin in Dataiku made it seamless for multiple teams of subject matter experts to collaborate.
In addition, Western Digital used built-in NLP preprocessing functions in Dataiku, such as tokenize text, simplify text, and clear stop words. These functions normalized the text data in a few clicks. Other Dataiku plugins such as named entity recognition, which comes with pre-trained spaCy models, were helpful in extracting insights and understanding the data. These readily available features reduced overall development time and let data scientists focus on analyzing data.
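For readers unfamiliar with these preprocessing steps, the sketch below shows what tokenizing, simplifying, and removing stop words do to an email sentence. It is a plain Python illustration, not Dataiku's implementation: the function names, the tiny stop-word list, and the sample sentence are all assumptions made for this example.

```python
import re
import unicodedata
from typing import List

# Tiny illustrative stop-word list (an assumption, far smaller than a real one).
STOP_WORDS = {"a", "an", "the", "is", "are", "for", "of", "to", "and", "in", "on", "my"}


def simplify(text: str) -> str:
    """Lowercase the text and strip accents (e.g. 'é' becomes 'e')."""
    decomposed = unicodedata.normalize("NFKD", text)
    without_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return without_accents.lower()


def tokenize(text: str) -> List[str]:
    """Split already-simplified text into runs of lowercase letters and digits."""
    return re.findall(r"[a-z0-9]+", text)


def normalize(text: str) -> List[str]:
    """Simplify, tokenize, then drop stop words."""
    return [token for token in tokenize(simplify(text)) if token not in STOP_WORDS]


if __name__ == "__main__":
    print(normalize("Where is my shipment? Invoice #4521 is missing."))
```

The normalized tokens are what a downstream categorization model would typically consume, which is why consistent preprocessing matters for accuracy.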
Though data scientists worked on the project, the team also sped up development by leveraging Dataiku AutoML features to build and compare models quickly. Western Digital also used Dataiku MLOps features to configure scenarios, running the data extraction and model inference every week. This ultimately saved several weeks’ worth of development time to build an inference pipeline.
From an end-user perspective, Dataiku allowed the team at Western Digital to easily build visualizations for the extracted metrics. Plugins such as the Tableau Hyper format, which allows for easy export of data to Tableau Server, were a bonus and leave room for continuous improvement and flexibility in the future.
While Dataiku enhances Western Digital’s logistics through automated processes, its impact also extends to critical areas like semiconductor manufacturing quality control. As Western Digital progresses from BICS4 to BICS8 technology, ensuring die reliability at the wafer level becomes essential for both efficiency and cost control.
With Wafer Level Burn-In (WLBI) and Known Good Die (KGD) testing, Western Digital identifies defective dies early in production. These processes generate key parameters that detect outliers, flagging potential defects before they move to later stages where failures would be far more costly. By leveraging Dataiku, Western Digital reduces waste, lowers the risk of failures, and ensures that only high-quality dies advance to final testing, protecting both production efficiency and product reliability.
Dataiku extended its impact on Western Digital’s quality control by providing a scalable, automated solution for detecting outliers in WLBI and KGD testing. Leveraging Dataiku’s advanced machine learning (ML) and data processing capabilities, Western Digital automated defect detection, ensuring early and precise identification of potential issues. Dataiku’s capacity to process large datasets at scale, coupled with its intuitive interface, enabled smooth team collaboration and streamlined workflows.
Dataiku’s key features that addressed Western Digital’s quality control challenges include:
By automating outlier detection, Dataiku helped Western Digital minimize the risk of defective dies advancing in production, reduce costly failures, and ensure only high-quality materials proceed to final testing.
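The source does not say which outlier detection method Western Digital uses. As one common illustration of the idea, the sketch below flags outliers in a single test parameter with a robust (median and MAD based) modified z-score. The function name, the 3.5 threshold, and the sample readings are assumptions for this example, not details from the project.

```python
from statistics import median
from typing import List, Sequence


def flag_outliers(values: Sequence[float], threshold: float = 3.5) -> List[bool]:
    """Flag values whose modified z-score exceeds the threshold.

    The modified z-score uses the median and the median absolute deviation
    (MAD), so a few extreme readings do not distort the baseline.
    """
    center = median(values)
    mad = median([abs(v - center) for v in values])
    if mad == 0:
        # No spread around the median: nothing can be scored, flag nothing.
        return [False] * len(values)
    # 0.6745 scales the MAD to be comparable with a standard deviation
    # under a normal distribution.
    return [abs(0.6745 * (v - center) / mad) > threshold for v in values]


if __name__ == "__main__":
    # Hypothetical per-die readings of one test parameter.
    readings = [1.00, 1.02, 0.98, 1.01, 0.99, 1.50]
    print(flag_outliers(readings))
```

A robust statistic is a reasonable default here because the defective dies being hunted are exactly the values that would skew a mean and standard deviation.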
Integrating Dataiku’s automated outlier detection into Western Digital’s wafer testing process has delivered substantial cost savings and operational improvements. By identifying defects early, Dataiku flags defective materials before they reach more costly stages of production, which not only reduces waste but also optimizes overall yield. Key areas of impact include:
Beyond defect detection, Dataiku has transformed day-to-day operations at Western Digital. With robust automation, Dataiku eliminates the need for manual intervention, accelerates workflows, and promotes collaboration across teams and regions. Its ability to seamlessly integrate with internal databases and systems and its intuitive interface allow teams worldwide to communicate and respond faster, improving decision-making and productivity.
Dataiku has positioned Western Digital for continued success, driving operational efficiency, enhancing product quality, and fostering innovation in the fast-paced semiconductor industry.