This project uses AWS machine learning and IoT tools to develop a deep learning defect classification model and use it for real-time defect detection on a device.

CHDSD/chip-wafer-classification-deep-learning

Chip Wafer Analysis

This project uses AWS machine learning and IoT tools to develop a deep learning defect classification model and use it for real-time defect detection on a device.

Copyright 2019 Amazon.com, Inc. or its affiliates. All Rights Reserved.
SPDX-License-Identifier: MIT-0

Chip wafer maps are visual representations of chip wafers produced in a semiconductor foundry (fab). The maps are generated by microscopic cameras or electronic line scanners that probe for faults.

The fabs look for common defect patterns in this map as a quality control measure. They use either manual inspection or appliances that scan for defect patterns using pattern recognition software. These defect detection methods have several problems:

  • Human inspection is not real-time, so defective wafers can ship before they are caught. Shipping wafers with defects is costly.

  • The appliances are expensive.

  • The appliances do not have redundancy.

  • Fabs cannot easily improve the accuracy of the defect detection or account for new defect patterns.

This project uses AWS machine learning and IoT tools to develop a deep learning defect classification model and use it for real-time defect detection on a device.

License Summary

This sample code is made available under the MIT-0 license. See the LICENSE file.

Data set attribution

The data set we use is:

[Qingyi](https://www.kaggle.com/qingyi). (February 2018). WM-811K wafer map, Version 1. Retrieved January 2018 from https://www.kaggle.com/qingyi/wm811k-wafer-map/downloads/wm811k-wafer-map.zip/1.

Architecture

[Figure: Architecture]

IoT

Each device (a Raspberry Pi) runs the GreenGrass Core software. Devices publish two kinds of messages:

  • Raw images. These go to the topic fabwafer/<fabid>/<cameraid>/img/<imgid>.

  • Classifications. These go to the topic fabwafer/<fabid>/<cameraid>/prediction/<imgid>

Here are some sample messages you can send to the topic fabwafer/faba/camera1/prediction/img1 to test the notifications. The first two should not cause an alert, but the last should. All should write into the DynamoDB table.


{ "imgid": "img1", "timestamp": 1554134552944, "fab": "faba", "camera": "camera1", "prediction": "none", "probability": 0.9 }
{ "imgid": "img2", "timestamp": 1554134552945, "fab": "faba", "camera": "camera1", "prediction": "loc", "probability": 0.4 }
{ "imgid": "img3", "timestamp": 1554134552946, "fab": "faba", "camera": "camera1", "prediction": "loc", "probability": 0.9 }
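
As a minimal Python sketch, a test script might build the classification topic and payload like this. The helper names (`prediction_topic`, `build_prediction_message`, `should_alert`) and the 0.5 threshold are assumptions for illustration only. The real alert rule lives in the stack; the sample messages only imply that the cutoff falls somewhere between 0.4 and 0.9.

```python
import json

# Assumed threshold: the samples only imply it lies between 0.4 and 0.9.
ALERT_THRESHOLD = 0.5


def prediction_topic(fab: str, camera: str, imgid: str) -> str:
    """Build the topic fabwafer/<fabid>/<cameraid>/prediction/<imgid>."""
    return f"fabwafer/{fab}/{camera}/prediction/{imgid}"


def build_prediction_message(imgid, timestamp, fab, camera, prediction, probability) -> str:
    """Serialize a classification message with the fields used in the samples."""
    return json.dumps({
        "imgid": imgid,
        "timestamp": timestamp,
        "fab": fab,
        "camera": camera,
        "prediction": prediction,
        "probability": probability,
    })


def should_alert(message: dict, threshold: float = ALERT_THRESHOLD) -> bool:
    """Hypothetical rule: alert on a confident prediction other than "none"."""
    return message["prediction"] != "none" and message["probability"] >= threshold
```

One way to send the payload from a workstation is the boto3 `iot-data` client, for example `boto3.client("iot-data").publish(topic=..., qos=1, payload=...)`, assuming your credentials allow publishing to the topic.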

For the raw image topic, here’s a sample message. The bytes field is base64-encoded.


{ "imgid": "img3", "timestamp": 1554134552946, "fab": "faba", "camera": "camera1", "bytes": "" }
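
The base64 handling for the bytes field could look roughly like the following Python sketch. The function names are hypothetical; only the topic layout and the message fields come from this README.

```python
import base64
import json


def image_topic(fab: str, camera: str, imgid: str) -> str:
    """Build the topic fabwafer/<fabid>/<cameraid>/img/<imgid>."""
    return f"fabwafer/{fab}/{camera}/img/{imgid}"


def build_image_message(imgid, timestamp, fab, camera, image_bytes: bytes) -> str:
    """Serialize a raw image message; the image is base64-encoded into "bytes"."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return json.dumps({
        "imgid": imgid,
        "timestamp": timestamp,
        "fab": fab,
        "camera": camera,
        "bytes": encoded,
    })


def decode_image(message_json: str) -> bytes:
    """Recover the original image bytes from a raw image message."""
    return base64.b64decode(json.loads(message_json)["bytes"])
```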

DynamoDB

The classification table schema is:

  • imgid (hash key)

  • timestamp (range key)

  • fab

  • camera

  • prediction

  • probability
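
For reference, here is a hedged Python sketch of how this schema maps onto boto3 parameters. The stack creates the real table through CloudFormation, so the table name, the attribute types (string for imgid, number for timestamp, inferred from the sample messages), and the billing mode are all assumptions.

```python
from decimal import Decimal

TABLE_NAME = "WaferClassifications"  # hypothetical name, not taken from the stack


def table_definition(table_name: str = TABLE_NAME) -> dict:
    """Parameters for create_table: imgid is the hash key, timestamp the range key."""
    return {
        "TableName": table_name,
        "KeySchema": [
            {"AttributeName": "imgid", "KeyType": "HASH"},
            {"AttributeName": "timestamp", "KeyType": "RANGE"},
        ],
        # Only key attributes are declared; fab, camera, prediction, and
        # probability are ordinary non-key attributes.
        "AttributeDefinitions": [
            {"AttributeName": "imgid", "AttributeType": "S"},
            {"AttributeName": "timestamp", "AttributeType": "N"},
        ],
        "BillingMode": "PAY_PER_REQUEST",  # assumption
    }


def to_item(message: dict) -> dict:
    """Convert a classification message into an item for the boto3 Table resource.

    The resource API rejects Python floats, so probability becomes a Decimal.
    """
    item = dict(message)
    item["probability"] = Decimal(str(message["probability"]))
    return item
```

With boto3 installed, these would be used as `boto3.client("dynamodb").create_table(**table_definition())` and `boto3.resource("dynamodb").Table(TABLE_NAME).put_item(Item=to_item(message))`.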

Data

The source data is the Kaggle WM-811K data set cited above. Place this data into an S3 bucket organized into train and valid subdirectories. The notebook DataPrep.ipynb documents the data preparation steps.
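
DataPrep.ipynb remains the authoritative description of the preparation steps. Purely as an illustration of the train/valid layout, the Python sketch below assigns each file to a prefix with a deterministic hash-based split. The 20% validation fraction, the function name, and the flat key layout are assumptions, not the notebook's method.

```python
import hashlib


def s3_key_for(filename: str, valid_fraction: float = 0.2) -> str:
    """Return "train/<filename>" or "valid/<filename>" for an image file.

    The split is derived from a hash of the file name, so the same file
    always lands under the same prefix across runs.
    """
    digest = hashlib.md5(filename.encode("utf-8")).hexdigest()
    position = int(digest[:8], 16) / 0xFFFFFFFF  # value in [0, 1]
    split = "valid" if position < valid_fraction else "train"
    return f"{split}/{filename}"
```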

Setup

First, create an S3 bucket to hold the CloudFormation templates.

aws s3 mb s3://<template bucket>

Now create the stack:

./scripts/create.sh <template bucket> <template prefix> <stack name> <region>

Note the CodeCommit repository clone URLs in the stack outputs, then check the code from the pytorch_code, trainer_code, test_code, and deploy_code directories into the corresponding repositories.

cd ..
git clone <clone URL>
cd ChipWaferMLRepo
cp -r ../ChipWaferAnalysis/pytorch_code/ .
git add .
git commit -m "First commit.  Trying out the build process."
git push -u origin master

cd ..
git clone <training repo clone URL>
cd ChipWaferTrainRepo
cp -r ../ChipWaferAnalysis/trainer_code/ .
git add .
git commit -m "First commit.  Trying out the build process."
git push -u origin master

cd ..
git clone <test repo clone URL>
cd ChipWaferTestRepo
cp -r ../ChipWaferAnalysis/test_code/ .
git add .
git commit -m "First commit.  Trying out the build process."
git push -u origin master

cd ..
git clone <deploy repo clone URL>
cd ChipWaferDeployRepo
cp -r ../ChipWaferAnalysis/deploy_code/ .
git add .
git commit -m "First commit.  Trying out the build process."
git push -u origin master
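
The four check-ins follow the same pattern, so they could be scripted. The Python sketch below only mirrors the manual commands above. It assumes you start in the ChipWaferAnalysis checkout and that each clone creates a directory with the repository name shown; the function names are hypothetical.

```python
import subprocess

# Repository directory -> source directory, as listed in the manual steps.
REPO_SOURCES = {
    "ChipWaferMLRepo": "pytorch_code",
    "ChipWaferTrainRepo": "trainer_code",
    "ChipWaferTestRepo": "test_code",
    "ChipWaferDeployRepo": "deploy_code",
}

COMMIT_MESSAGE = "First commit.  Trying out the build process."


def checkin_steps(repo_dir: str, code_dir: str, clone_url: str) -> list:
    """Return (working directory, argv) pairs mirroring the manual steps.

    Working directories are relative to the ChipWaferAnalysis checkout.
    """
    repo_path = f"../{repo_dir}"
    return [
        ("..", ["git", "clone", clone_url]),
        (repo_path, ["cp", "-r", f"../ChipWaferAnalysis/{code_dir}/", "."]),
        (repo_path, ["git", "add", "."]),
        (repo_path, ["git", "commit", "-m", COMMIT_MESSAGE]),
        (repo_path, ["git", "push", "-u", "origin", "master"]),
    ]


def run_checkin(clone_urls: dict) -> None:
    """Run the steps for every repository; clone_urls maps repo_dir -> clone URL."""
    for repo_dir, code_dir in REPO_SOURCES.items():
        for cwd, argv in checkin_steps(repo_dir, code_dir, clone_urls[repo_dir]):
            subprocess.run(argv, cwd=cwd, check=True)
```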

Now go into the GreenGrass console and deploy the group. You’ll need to redeploy the group whenever the Lambda function changes.

Next go to the API Gateway console, select the proper API, go to the Resources section, and select Deploy API from the Actions menu. Set the Deployment stage to test.

Next, create a Cognito user for the review portal.

 ./scripts/set-user-password.sh <user email> <password> <user pool id> <client id> <group name>

You can obtain the user pool ID, client ID, and group name from the CloudFormation stack outputs. The email and password are at your discretion.
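
If you prefer to read the stack outputs programmatically, a Python sketch along these lines should work. It uses the boto3 CloudFormation `describe_stacks` call. The helper names are hypothetical, and the output key names depend on the template, so look them up in the returned dictionary rather than assuming them.

```python
def outputs_to_dict(stack: dict) -> dict:
    """Flatten a describe_stacks stack entry into {OutputKey: OutputValue}."""
    return {o["OutputKey"]: o["OutputValue"] for o in stack.get("Outputs", [])}


def fetch_stack_outputs(stack_name: str, region: str) -> dict:
    """Fetch the outputs of a deployed stack (requires boto3 and AWS credentials)."""
    import boto3  # imported lazily so outputs_to_dict works without boto3

    cfn = boto3.client("cloudformation", region_name=region)
    stack = cfn.describe_stacks(StackName=stack_name)["Stacks"][0]
    return outputs_to_dict(stack)
```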

Finally, build and load the React app. Adjust any necessary values in frontend/src/config.js.

cd frontend
npm install # only needed once
npm run build
aws s3 sync build/ s3://<app bucket>

Updating stack

You can update the stack by passing the --update flag.

If you update the GreenGrass elements, reset the deployment on the group. Then update the stack and redeploy the group. If you update the Lambda function you must also update the subscription definition.

./scripts/create.sh <template bucket> <template prefix> <stack name> <region> --update

Setting up inference on Raspberry Pi

The automated demo currently runs a GreenGrass core device on an EC2 instance, which calls the SageMaker inference endpoint.

If you’d rather do inference on a real device, you can configure a Raspberry Pi.

  • Build an MxNet model. (Eventually we can compile the PyTorch model using SageMaker Neo, but Neo does not yet support PyTorch 1.0.)

    • Run the notebook notebooks/Classify-MxNet-121.ipynb. This notebook builds a model using MxNet 1.2.1 and saves the artifacts. Grab the exported artifacts, zip them up, and save them in S3.

    • Alternatively, run the notebook notebooks/Classify-MxNet-SM.ipynb. This notebook trains the model in SageMaker, and the model artifact is automatically saved in S3.

  • Follow the basic Raspberry Pi setup tutorial (parts 1 and 2).

  • Follow the tutorial on deploying inference on the device using MxNet.

    • Copy the test image folder onto the device in the path /opt/images/test

    • Starting with the lambda zip package you got from the tutorial, replace the greengrassObjectClassification.py with the version in the folder lambda-rpi-inference and rebuild the zip file.

    • When you deploy the Lambda function, set environment variables for the fab, camera, and inference interval.

    • Add a file system resource that maps /opt/images to /volumes/images. Don’t bother with the camera resources.

    • Use the model artifact created from the MxNet notebook. The local path should be /greengrass-machine-learning/mxnet/wafers.
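
As an illustration of the environment variable step, the Lambda function might read its settings as in the Python sketch below. The variable names (`FAB_ID`, `CAMERA_ID`, `INFERENCE_INTERVAL_SECONDS`), the 60-second default, and the in-container image path are assumptions; check greengrassObjectClassification.py in lambda-rpi-inference for the names the function actually expects.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class InferenceConfig:
    fab: str
    camera: str
    interval_seconds: float
    # Assumed path inside the Lambda container, given that /opt/images on the
    # device is mapped to /volumes/images and test images sit in /opt/images/test.
    image_dir: str = "/volumes/images/test"


def load_config(env=None) -> InferenceConfig:
    """Read fab, camera, and inference interval from environment variables."""
    env = os.environ if env is None else env
    return InferenceConfig(
        fab=env["FAB_ID"],
        camera=env["CAMERA_ID"],
        interval_seconds=float(env.get("INFERENCE_INTERVAL_SECONDS", "60")),
    )
```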

Also note that the Pi needs a 2.5 A power supply when you run inference. With a weaker supply it will boot and seem to work, but it will crash when you invoke any neural network for inference.

Improvement list

  • Run ML training jobs on multiple instances

  • Use native PyTorch container rather than custom version (standardizes on fastai 1.0.39)

  • Improve accuracy of MxNet model. It should probably not use CenterCrop as the cropping strategy; need to identify other deltas compared to the PyTorch model.

  • Use incremental training rather than full training every time, pulling in manually reviewed data.

  • Work on class imbalance problem. Consider oversampling, a different loss function, or an imbalanced sampler. The imbalanced sampler seems to work well but it’s very slow right now.
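
To make the oversampling idea concrete, here is a generic inverse-frequency sampling sketch in plain Python. It is not the imbalanced sampler used in this project, and the function names are hypothetical. Each sample is weighted by one over its class count, so every class is drawn with roughly equal probability.

```python
import random
from collections import Counter


def inverse_frequency_weights(labels: list) -> list:
    """Weight each sample by 1 / (number of samples sharing its label)."""
    counts = Counter(labels)
    return [1.0 / counts[label] for label in labels]


def oversample(labels: list, k: int, seed: int = 0) -> list:
    """Draw k sample indices with replacement, balancing classes on average."""
    rng = random.Random(seed)
    weights = inverse_frequency_weights(labels)
    return rng.choices(range(len(labels)), weights=weights, k=k)
```

The same per-sample weights could be passed to PyTorch's `torch.utils.data.WeightedRandomSampler`, although whether that matches the sampler tried here is not stated.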

ML Metrics

Table 1. Metrics

Model                   Resnet 34   Resnet 34 with imbalanced sampling
Accuracy                97.8        94.2
F1                      97.8        94.8
Macro F1                88.5        82.4
Binary accuracy         98.4        94.9
Binary precision        97          74.7
Binary recall           92          98.7
Worst class accuracy    75          84
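
Binary precision and recall move in opposite directions between the two models. One way to compare them on a single scale is the harmonic mean (F1) of the two values. The Python sketch below computes it, assuming the table values are percentages. The resulting numbers are derived here for illustration and are not metrics reported by the project.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, both given as fractions in [0, 1]."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Binary precision and recall from Table 1, converted to fractions.
baseline_f1 = f1_score(0.97, 0.92)
imbalanced_f1 = f1_score(0.747, 0.987)
```

Under that assumption, the baseline model comes out at about 0.944 and the imbalanced-sampling model at about 0.850. The recall gained from imbalanced sampling does not offset the precision it loses on this measure.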
