This repository has been archived by the owner on Aug 29, 2024. It is now read-only.

Fixed real-time integration with PyCaret for Credit Card Fraud Detection (#370)
HuifangYeo committed Oct 10, 2022
1 parent fb977de commit 4a45332
Showing 18 changed files with 4,799 additions and 140 deletions.
@@ -84,11 +84,11 @@
"# print(\"Downloading: %d%% [%d / %d] bytes\" % (current / total * 100, current, total))\n",
"\n",
"\n",
"# url = \"https://data.atoti.io/notebooks/credit-card-fraud/output.zip\"\n",
"# url = \"https://data.atoti.io/notebooks/credit-card-fraud/data.zip\"\n",
"# filename = wget.download(url, bar=bar_custom)\n",
"\n",
"# # unzipping the file\n",
"# with ZipFile(\"output.zip\", \"r\") as zipObj:\n",
"# with ZipFile(\"data.zip\", \"r\") as zipObj:\n",
"# # Extract all the contents of zip file in current directory\n",
"# zipObj.extractall(data_path)"
]
@@ -5109,7 +5109,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.12"
"version": "3.9.13"
}
},
"nbformat": 4,
@@ -6,6 +6,15 @@ This use case comprises of 3 main sections:
- Detect credit card fraud with autoML, choosing the best model with PyCaret.
- Real-time credit card analysis to investigate suspicious transactions flagged by the ML model.

The diagram below depicts the flow between the libraries and how each one is used:
<img src="./img/app_flow.png" />

This use case uses [PyCaret 2.3.4](https://pycaret.org/).
Because the latest version of atoti has dependencies that conflict with PyCaret, the two programs run in separate virtual environments and communicate through REST endpoints.

<img src="./img/system_design.png" />


## 1 Synthetic data generation

[01-Synthetic-data-generation.ipynb](./01-Synthetic-data-generation.ipynb) is adapted from the GitHub repository [Sparkov_Data_Generation](https://github.com/namebrandon/Sparkov_Data_Generation). It makes use of Faker to generate customers and credit card transactions with varying profiles:
@@ -70,9 +79,28 @@ This notebook covers the following:
- we used the LGBM model to perform fraud prediction
- load transactions and their predictions into atoti
- Evaluate incoming transactions from the atoti web application at http://localhost:10327.
- Create **source simulation** in atoti with predictions (with and without cumulative features) from:
  - LGBM
  - DT
  - anomaly detection

This allows us to compare the performance of the models and also decide if the additional cumulative features are necessary.
### Real-time fraud prediction

To test real-time fraud detection, start the Flask application included in the `atoti-pycaret` package. The package's [README.md](./atoti-pycaret/README.md) explains how to start the application.

Alternatively, you can integrate your own machine learning model by updating the REST URI in the function `get_prediction` in [main.ipynb](./main.ipynb):

```
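import pandas as pd
import requests


def get_prediction(features_df):
    # Endpoint exposed by the Flask application; change it to target another model server.
    url = "http://127.0.0.1:105/predict"
    headers = {"Content-Type": "application/json"}
    payload = {
        "features": features_df.to_json(orient="records"),
    }
    try:
        response = requests.post(url, json=payload, headers=headers)
        # Raise HTTPError on 4xx/5xx responses so the except branch can run.
        response.raise_for_status()
        prediction = pd.DataFrame.from_dict(response.json())
        return prediction
    except requests.exceptions.HTTPError as e:
        print(e.response.text)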
```
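
If you want to try the request format before wiring in a real model, a stand-in endpoint can be useful. The sketch below is not part of the `atoti-pycaret` package: it uses only the Python standard library, and the handler class, the `amt` feature, the `prediction` column and the threshold rule are all hypothetical placeholders. It only assumes the payload shape used by `get_prediction`, where `features` holds a JSON string of row records.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class StubPredictHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in for a model server (not the atoti-pycaret app)."""

    def do_POST(self):
        if self.path != "/predict":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        body = json.loads(self.rfile.read(length))
        # "features" is itself a JSON string of row records, so decode it again.
        rows = json.loads(body["features"])
        # Placeholder rule instead of a trained model: flag amounts above 1000.
        for row in rows:
            row["prediction"] = int(row.get("amt", 0) > 1000)
        out = json.dumps(rows).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(out)))
        self.end_headers()
        self.wfile.write(out)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def post_features(url, rows):
    """Send rows in the same payload shape as get_prediction and decode the reply."""
    payload = json.dumps({"features": json.dumps(rows)}).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=5) as response:
        return json.loads(response.read())


def demo():
    # Port 0 asks the OS for any free port, so the stub never clashes with port 105.
    server = HTTPServer(("127.0.0.1", 0), StubPredictHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        url = f"http://127.0.0.1:{server.server_port}/predict"
        return post_features(url, [{"amt": 25.0}, {"amt": 4200.0}])
    finally:
        server.shutdown()
        server.server_close()
```

Calling `demo()` starts the stub, posts two rows and returns them with the placeholder `prediction` column attached. Pointing `url` in `get_prediction` at a server like this one is enough to exercise the notebook side without PyCaret installed.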
@@ -0,0 +1 @@
3.8.7
@@ -0,0 +1,53 @@
# Endpoint for Credit Card Fraud prediction

The `automl` package contains the machine learning models that we trained using [PyCaret](https://pycaret.org/).

By creating a small [Flask application](https://flask.palletsprojects.com/en/2.2.x/), we are able to create an endpoint that takes in the features for the model to perform fraud prediction.

# Installation

Set up the virtual environment for the project using the command below:
```
poetry install
```

Refer to the [poetry documentation](https://python-poetry.org/docs/master/#installing-with-the-official-installer) for more information on the package manager.


# Runtime
To launch the Flask application, run the following command:
```
poetry run python .\automl\prediction.py
```

You should be able to see the following:

<img src="../img/flask_endpoint.png">

We can post requests to the endpoint at http://127.0.0.1:105/predict, e.g.

```
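import pandas as pd
import requests


def get_prediction(features_df):
    # Endpoint exposed by the Flask application started above.
    url = "http://127.0.0.1:105/predict"
    headers = {"Content-Type": "application/json"}
    payload = {
        "features": features_df.to_json(orient="records"),
    }
    try:
        response = requests.post(url, json=payload, headers=headers)
        # Raise HTTPError on 4xx/5xx responses so the except branch can run.
        response.raise_for_status()
        prediction = pd.DataFrame.from_dict(response.json())
        return prediction
    except requests.exceptions.HTTPError as e:
        print(e.response.text)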
```

You can verify that the requests are received by the endpoint through the shell running this program:

<img src="../img/request_received.png"/>


The endpoint returns the features and their corresponding predictions as JSON records, which `get_prediction` loads into a pandas DataFrame.
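
On the wire, the Flask handler serialises its result with `to_json(orient="records")`, so the response body is a JSON list with one object per transaction. The sketch below shows one way a client might pick out the flagged rows without pandas. The column names (`trans_id`, `amt`, `prediction`) and the convention that `1` means fraud are assumptions made for illustration; check the actual response for the names your model produces.

```python
import json

# Hypothetical response body: one object per transaction, holding the input
# features plus a prediction column. Column names are illustrative only.
response_text = json.dumps(
    [
        {"trans_id": "t1", "amt": 12.5, "prediction": 0},
        {"trans_id": "t2", "amt": 980.0, "prediction": 1},
    ]
)


def flagged_transactions(body_text, prediction_column="prediction"):
    """Return the rows whose prediction column equals 1 (assumed to mean fraud)."""
    rows = json.loads(body_text)
    return [row for row in rows if row.get(prediction_column) == 1]


flagged = flagged_transactions(response_text)
```

Passing the column name as a parameter keeps the helper usable if the model's output column is named differently.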
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
@@ -0,0 +1,33 @@
from flask import Flask, jsonify, request
import pandas as pd
import pycaret.classification as pyc
import pickle
import os

app = Flask(__name__)

dir_path = os.path.dirname(os.path.realpath(__file__))
print(dir_path)


def predict(df):
    # The path is relative to the working directory, so launch from the package root.
    model = pyc.load_model("./automl/models/Final_LGBM_Model_20211130")
    return pyc.predict_model(model, data=df)


@app.route("/predict", methods=["POST"])
def predict_model():
    test = request.json

    # "features" holds a JSON string of row records produced by DataFrame.to_json.
    features_json = test["features"]
    features_df = pd.read_json(features_json)
    print(f"Features received: {len(features_df)}")

    model_prediction = predict(features_df)
    print(f"Prediction completed for {len(model_prediction)}")

    # Return the features together with their predictions, one record per row.
    return model_prediction.to_json(orient="records")


if __name__ == "__main__":
    app.run(host="0.0.0.0", port=105)