AI For Artists
Artificial Intelligence? Let's dive in, artists!
I am learning Machine Learning (a subset of AI), and my focus is the implementation of AI algorithms in Houdini to solve particular problems. Here I am sharing my exercises and findings with some theory and explanation.
When I learn new things, the most exciting and challenging part is finding a good application example to play with. Something simple enough to understand, replicate, and modify to get the first grasp, yet applicable and relevant to your area of interest. When you learn Python, it is much more thrilling to affect your Houdini scenes, rather than building a shopping list application.
Let's try something fun, intelligent, and easy to implement.
Imagine you have 3D models of a sphere and a cube. How can we determine programmatically whether a given object is a sphere or a cube?
By programmatically I mean that you should have a piece of code that accepts information about the object as input and produces an output, the object classification decision: "This object is a Sphere!"
For humans, it is an easy task: you show the Houdini viewport with an object to someone, ask what shape they see, and you get an answer. With a computer, it's a bit trickier.
I will overview what we will do at a high level for the context and better understanding. Then we dive into details: thought process and implementation.
Here is a big picture of the Machine Learning pipeline:
- Obtain or generate the data,
- Teach the model with this data,
- Utilize the model to make predictions with new data.
For our object classification problem, those will be major steps:
The first thing you need to solve is how would you even "show" an object to a computer, e.g. what will be the input for our program?
My first obvious suggestion would be to provide an image of the object (a screenshot of the viewport, or a fancier Karma render). We know this approach works: we would train a model by showing it millions of variations of a sphere picture so it can learn to detect a sphere that it has never seen before. Houdini is an ideal tool for that, since we can procedurally generate any number of different spheres to feed the model. However, there is another way that is easier to implement.
Having a sphere in Houdini means that in addition to the ability to render a sphere, we also have access to miscellaneous sphere parameters such as position, scale, number of points, world position of each point, etc. Those features of an object can be good data for training the model.
Now we need to figure out the good features to describe an object's shape so the computer can determine the differences and make a correct decision.
Let's say we consider an object scale as a feature. We can create a million spheres and a million cubes with a random scale, and record this data in a table, so we will have 2 million rows (one for each sphere or cube instance) and one column "scale" with a float value of each object size. We can easily supply this table as input to a proper ML model. Is scale a good parameter that can characterize an object's shape? Is scale a descriptive property of topology? Can algorithms find some patterns in different object's scale values and link those values to shape?
Intuitively we can feel that the scale of an object does not tell us anything about its shape. So we need to come up with good features, relevant to the object's shape, generate many variations of shape, and record those variations to a table to feed the model.
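To make that intuition concrete, here is a minimal sketch in plain Python (no Houdini needed). It assumes, hypothetically, that spheres and cubes both get a random scale from the same range, and then searches for the best possible "scale above a threshold" rule. The function name and the 0.1 to 10.0 range are assumptions made for illustration.

```python
import random

def best_threshold_accuracy(sphere_scales, cube_scales):
    """Return the best accuracy achievable by a single scale-threshold rule."""
    labeled = [(s, 0) for s in sphere_scales] + [(s, 1) for s in cube_scales]
    total = len(labeled)
    best = 0.0
    for threshold, _ in labeled:
        # Rule: predict "cube" when the scale is above the threshold
        correct = sum((scale > threshold) == bool(label) for scale, label in labeled)
        # The opposite rule (predict "sphere" above the threshold) scores total - correct
        best = max(best, correct / total, (total - correct) / total)
    return best

rng = random.Random(42)
# Hypothetical setup: both shapes draw their scale from the same range
sphere_scales = [rng.uniform(0.1, 10.0) for _ in range(500)]
cube_scales = [rng.uniform(0.1, 10.0) for _ in range(500)]

accuracy = best_threshold_accuracy(sphere_scales, cube_scales)
print(f"Best accuracy from scale alone: {accuracy:.2f}")
```

Under this assumption the result should stay close to 0.5, which is no better than a coin flip: scale carries no information about shape.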
Let's create a Houdini scene to produce the data.
Before getting our hands on the code, we need to make some preparations. That section should come first, but I moved it to the end for better storytelling.
You can download the final Houdini scene and I'll explain the basic steps of developing the solution, so you can get a grasp on the thought process.
We will use the "Sphere" and "Box" SOP nodes to create geometry, read relevant properties that can describe geometry shapes from those nodes, and save this data to a CSV table. In the end, we will create a program (the "Python" SOP node) and connect the "Sphere" or "Box" SOP as input to this node. The program will tell us which node is connected, a sphere or a cube; hence we will detect an object's shape programmatically.
We need a lot of data; in our case, we need many different variations of spheres and cubes. Different how? Well, this is the core aspect of Machine Learning: understanding what characteristics (features) will work best for our task. We thought about object scale and realized that it does not describe the 3D model topology at all, so we need something more relevant. We can start by varying the easiest thing we can control in a 3D primitive: the number of rows and columns.
Create a "For-Each Number" loop and put a "Sphere" SOP inside, set the "Iterations" parameter as 250 to create 250 copies of a sphere. You will get an object containing 250 equal spheres as an output of the loop.
Why 250? It is just a value I thought would be sufficient to start meaningful training, but not so high that Houdini's performance suffers. ML algorithms prefer as much data as you can get; usually we are talking about millions of data points. I was not sure if my laptop would handle that amount, so I decided to start with 250 objects and see what happened.
Now we need a program that will record the necessary features of those 250 spheres to the Excel table. Create a "Python" SOP after the loop. In this Python program, we will walk through each primitive of input geometry and record feature values to a CSV file:
import hou
import pandas as pd

current_node = hou.pwd()
geo = current_node.geometry()
data = []
# Python does not expand Houdini variables, so resolve $JOB explicitly
csv_path = hou.text.expandString('$JOB/scenes/data/train_data_sphere.csv')

for primitive in geo.prims():
    # Here we will read values and store them in a dictionary
    features = {}
    data.append(features)

data_frame = pd.DataFrame(data)
data_frame.to_csv(csv_path)
Here we used Pandas to store the feature data. We could record the necessary data with Python's built-in "csv" module, and using Pandas might seem excessive, but it will make more sense when we come to the next steps: training and using the model for predictions.
Now we need to randomize the number of rows and columns for each object. I did it with Python code in the "Rows" and "Columns" parameters for the sphere, and the "Axis Divisions" parameter for the cube:
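For comparison, here is a minimal sketch of the same recording step with only the standard library. The feature rows are hypothetical stand-ins for values read from primitives, and an in-memory buffer replaces the real file so the snippet runs anywhere.

```python
import csv
import io

def write_features(rows, stream):
    """Write a list of feature dictionaries as CSV rows with a header line."""
    writer = csv.DictWriter(stream, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)

# Hypothetical feature rows standing in for values read from primitives
rows = [
    {"points": 26, "faces": 24, "object_type": "cube"},
    {"points": 5, "faces": 6, "object_type": "sphere"},
]

# In-memory stand-in for a file opened with open(path, "w", newline="")
buffer = io.StringIO()
write_features(rows, buffer)
print(buffer.getvalue())
```

Unlike the Pandas default, this writes no index column, so the resulting file contains only the feature columns.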
import random
counter = hou.node('../foreach_count1')
seed = counter.geometry().attribValue("iteration") + 128
random.seed(seed)
value = random.randint(3, 120)
return value
You can write an expression for any parameter by clicking it and pressing Ctrl + E. The Parameter Expression window will open; switch from the default "Hscript Expression" to "Python Expression", write the code, and press "Accept".
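The seeding idea in that expression can be tried outside Houdini. The sketch below is an illustration only: the function name is hypothetical, and the iteration number is passed in as an argument instead of being read from the loop's metadata node.

```python
import random

def divisions_for_iteration(iteration, offset=128, low=3, high=120):
    """Return a pseudo-random division count that depends only on the iteration."""
    # A local generator seeded the same way as in the parameter expression
    rng = random.Random(iteration + offset)
    return rng.randint(low, high)

values = [divisions_for_iteration(i) for i in range(5)]
print(values)
```

Because the seed is derived from the iteration number, every loop iteration gets its own value, and re-cooking the scene reproduces the same values.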
At this point, we can see two major things. First, we need our feature values stored as primitive attributes so we can read them from primitives and store them in a dictionary.
Second, we need to get 250 rows in our table, where each row represents one unique sphere variation. The columns of the table will be our features, so each cell will hold a feature value for a particular sphere instance. We have not yet figured out what our features will be; we have only decided to vary the number of rows and columns, and we will come back to this later. If we loop through all primitives, we will get many more rows than we have objects (because each object consists of several primitives), so we need to record primitive data only once for each object. We can do this by relying on the iteration number of the For-Each loop: we store the object number in a variable and skip the recording of primitive data if this object was processed before:
import hou
import pandas as pd

current_node = hou.pwd()
geo = current_node.geometry()
processed_objects = []
data = []
# Python does not expand Houdini variables, so resolve $JOB explicitly
csv_path = hou.text.expandString('$JOB/scenes/data/train_data_sphere.csv')

for primitive in geo.prims():
    object_index = primitive.attribValue('object_index')
    # Record each object only once, on its first primitive
    if object_index not in processed_objects:
        processed_objects.append(object_index)
        features = {}
        data.append(features)

data_frame = pd.DataFrame(data)
data_frame.to_csv(csv_path)
To make it work, we need to store the For-Each loop iteration number as an "object_index" primitive attribute.
At this point, we have a backbone for data generation. We utilize Houdini to create a synthetic data set that represents the topology of two 3D objects, a sphere and a cube. The term synthetic means that the data for ML training was created procedurally (programmatically); otherwise we would have to download 250 sphere and 250 cube models somewhere (making sure that all are different) and individually record parameters for each object.
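The "record each object once" logic can be checked without Houdini. In this sketch the primitives are plain dictionaries with hypothetical values, and a set replaces the list of processed objects, which is a small variation on the code above rather than the original approach.

```python
def first_primitive_per_object(primitives):
    """Keep one record per object_index, taken from its first primitive."""
    seen = set()  # set membership checks stay fast as the object count grows
    records = []
    for primitive in primitives:
        object_index = primitive["object_index"]
        if object_index in seen:
            continue
        seen.add(object_index)
        records.append(primitive)
    return records

# Hypothetical primitives: three objects, some with several primitives
primitives = [
    {"object_index": 0, "points": 5},
    {"object_index": 0, "points": 5},
    {"object_index": 1, "points": 26},
    {"object_index": 2, "points": 14},
    {"object_index": 2, "points": 14},
]
records = first_primitive_per_object(primitives)
print(len(records))
```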
We also would need to add a cube to the setup and record another file with the cube's data. Later, when we train the model, we will join those two files into one sphere-cube data set.
Now we need to figure out the most exciting thing, the features. What information will we use as a core characteristic of geometry shape?
Houdini allows you to get a lot of geometry properties quite easily: surface area, volume, curvature, number of points, faces, rows, columns, and so on. Initially, I used all the mentioned attributes as features to train the model, but later, after some experiments, I left only two, the number of points and the number of faces:
features = {'points': primitive.attribValue('points'),
            'faces': primitive.attribValue('faces'),
            'object_type': 'sphere'}
data.append(features)
You can change the "object_switch" input to switch between sphere and cube, and activate the "export_data" Python SOP to record sphere and cube data. You should have two CSV files with 250 rows each and "points", "faces" and "object_type" columns:
You can download the sphere data file and cube data file for reference.
In the case of distinguishing between a sphere and a cube, we are dealing with a binary outcome: we check whether something is True or False (is this a Sphere?). For such cases, the Logistic Regression model is a natural fit.
The Logistic Regression model will tell us the probability of the input object being a Sphere depending on the number of points and the number of faces this object has. In other words, the Logistic Regression model allows us to make an object classification decision, i.e. to classify an input object as a sphere or a cube based on its number of points and faces.
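As a rough sketch of what happens inside such a model: it computes a weighted sum of the features and passes it through the sigmoid function to get a probability, and the decision is positive when that probability is at least 0.5. The weights below are chosen by hand for illustration and are not learned from the data in this tutorial.

```python
import math

def sigmoid(z):
    """Map any real number to a probability between 0 and 1."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)  # this branch avoids overflow for large negative z
    return e / (1.0 + e)

def predict_probability(points, faces, w_points, w_faces, bias):
    """Probability of the positive class for one object."""
    return sigmoid(w_points * points + w_faces * faces + bias)

# Hypothetical weights chosen by hand, not learned from data
w_points, w_faces, bias = 1.0, -1.0, -0.5

p = predict_probability(points=26, faces=24,
                        w_points=w_points, w_faces=w_faces, bias=bias)
print(f"Probability: {p:.3f}")
print("Decision:", p >= 0.5)
```

Training the model means finding the weights and the bias that make these probabilities match the labels in the training table as closely as possible.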
Create a new notebook in Colab with a descriptive name. Think of Colab as a regular Python IDE that runs in the cloud. In Colab notebooks (your Python scripts) you can run code in chunks, which you write in Code Cells. To create a new cell, press the "+ Code" button; to run the code in the cell, press the "Run Cell" button (the white arrow in a black circle).
Once you run the code in the cell, all data remains in the memory, e.g. you need to run the import of modules only once, then you can run a print statement as much as you need, without executing the previous cell with imports.
Upload the sphere and cube data files to your Google Drive. I placed them in the "PROJECTS/sphere_cube" folder:
Now we can import this data in a Colab notebook and perform some magic:
import pickle
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report
# Read sphere cube data
df_sphere = pd.read_csv('/content/drive/MyDrive/PROJECTS/sphere_cube/train_data_sphere.csv')
df_cube = pd.read_csv('/content/drive/MyDrive/PROJECTS/sphere_cube/train_data_cube.csv')
The first time you run the block that reads the CSV files, it will throw an error. You need to connect your Google Drive to Colab. Press the "Files" icon on the left menu (image of the folder) and then press "Mount Drive" (Google Drive image). Now you should be able to read CSV data.
Next, we need to combine sphere and cube data into one dataset. We will also delete the first column (the row index that Pandas wrote to the CSV), since we don't need it:
# Combine data and delete redundant column
df = pd.concat([df_sphere, df_cube], ignore_index=True)
df.drop(['Unnamed: 0'], axis=1, inplace=True)
If you want to examine the content of your data frame, you can run the "head" method on your data frame object, which will show the first few rows:
As you can see, we have the "object_type" column holding the string data type. To train the Logistic Regression model we need Boolean data. We will create a new column named "cube", set it to True for cube rows and False for sphere rows, and delete the original "object_type" column:
# Transform "object_type" column from string to boolean (True=cube, False=sphere)
df['cube'] = df['object_type'] == 'cube'
df.drop(['object_type'], axis=1, inplace=True)
When you utilize data for model training, a common workflow would be to split the data set into two portions, train and test sets. The train set is used for model training, and the test set evaluates how well the model makes predictions. In such a way, when you test the model, you provide data that it has never seen before, making the evaluation reliable.
# Define features and split data into train/test sets
x = df.drop('cube', axis=1) # Features without target variable (predictors)
y = df['cube'] # Target variable
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state=42)
Here "x", the feature matrix (input variables), is a matrix of our feature values. We get it by dropping the "cube" column from our source data set.
The "y", the target vector (output variable), is the thing we are trying to predict.
The test_size=0.2 means that 20% of source data will be dedicated to tests, and 80% to training.
By default, train_test_split shuffles all rows before splitting to introduce randomness to our source data, which is essential for proper training. The random_state=42 is the seed for that shuffle; it can be any integer, and it ensures the results will be the same every time you run the code with the same random_state.
The function returns:
- x_train: the training set for the features.
- x_test: the testing set for the features.
- y_train: the training set for the target variable.
- y_test: the testing set for the target variable.
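To show the idea behind the split, here is a simplified stand-in written with the standard library only. It is not scikit-learn's actual implementation; the function name is hypothetical, and it handles a single list of rows rather than separate features and targets.

```python
import random

def simple_train_test_split(rows, test_size=0.2, random_state=42):
    """Shuffle rows with a fixed seed, then set aside the first part for testing."""
    shuffled = list(rows)  # copy so the original order is left untouched
    random.Random(random_state).shuffle(shuffled)
    test_count = round(len(shuffled) * test_size)
    return shuffled[test_count:], shuffled[:test_count]

rows = list(range(500))  # stand-in for the 500 rows of the combined table
train, test = simple_train_test_split(rows)
print(len(train), len(test))
```

Running it again with the same random_state gives the same two sets, which is what makes experiments repeatable.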
Now we can finally train our model:
# Train model
logmodel = LogisticRegression(max_iter=600000)
logmodel.fit(x_train, y_train)
Before we export the trained model to a file, we need to get a grasp on the results of training:
# Evaluate model
y_pred = logmodel.predict(x_test)
print(classification_report(y_test, y_pred))
Those "1.0" values are fantastic numbers, telling us that the model performance is perfect. We probably get such beautiful results due to the synthetic nature of the data and a very specific use case. The report contains a lot of information; the core metric is "accuracy", and here the model made 100% correct predictions when evaluated on the test data set.
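One possible explanation for the perfect score, offered as an assumption rather than something measured in the Houdini scene: for a standard rows-and-columns sphere and an evenly divided box, the difference between the number of points and the number of faces behaves very differently. The formulas below are textbook counts for those topologies; whether the Sphere and Box SOPs produce exactly these numbers depends on their settings, so treat them as hypothetical.

```python
def uv_sphere_counts(rows, columns):
    """Points and faces of a closed rows-by-columns sphere with two pole points (assumed topology)."""
    points = columns * (rows - 2) + 2
    faces = columns * (rows - 1)
    return points, faces

def divided_box_counts(divisions):
    """Points and faces of a box with `divisions` points along every edge (assumed topology)."""
    faces = 6 * (divisions - 1) ** 2
    points = faces + 2  # closed all-quad surface: points = faces + 2
    return points, faces

sphere_diffs = set()
cube_diffs = set()
for n in range(3, 121):  # same 3..120 range as the randomized parameters
    p, f = uv_sphere_counts(n, n)
    sphere_diffs.add(p - f)
    p, f = divided_box_counts(n)
    cube_diffs.add(p - f)

print("cube points - faces:", cube_diffs)
print("largest sphere points - faces:", max(sphere_diffs))
```

Under these assumptions, points minus faces equals 2 for every box and is negative for every sphere, so a straight line in the points/faces plane separates the two classes. That is exactly the kind of boundary Logistic Regression can represent.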
Time to export the trained model to a file:
# Save model
model_path = '/content/drive/MyDrive/PROJECTS/sphere_cube/sphere_cube_model.sav'
with open(model_path, 'wb') as model_file:
    pickle.dump(logmodel, model_file)
Download the model file from Google Drive to a local drive. Now we are ready to build a prediction program in Houdini.
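If pickle is new to you, this small round trip shows the save and load pattern in isolation. The dictionary is a hypothetical stand-in for the trained model, and the file goes to a temporary folder instead of Google Drive.

```python
import os
import pickle
import tempfile

# Hypothetical stand-in for a trained model; any picklable object works the same way
model = {"weights": [1.0, -1.0], "bias": -0.5}

model_path = os.path.join(tempfile.mkdtemp(), "example_model.sav")

# Save: the "with" block closes the file even if pickling fails
with open(model_path, "wb") as model_file:
    pickle.dump(model, model_file)

# Load: returns an equal copy of the object that was saved
with open(model_path, "rb") as model_file:
    restored = pickle.load(model_file)

print(restored == model)
```

Only unpickle files you trust, because loading a pickle file can execute arbitrary code.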
The final Sphere Cube Notebook.
Now let's check how good our model is (if it is capable of making correct predictions).
The setup is extremely simple: create sphere and cube SOPs, connect them to a switch, and put a Python SOP after it.
The Python program should be able to determine what node is connected to the input. To make it work, we need to read information about the input object in the same way as we did when generating data for training (the number of features and their names must match). Then we feed this data to our trained model (logmodel.predict(data_frame)) and get a prediction (True or False) that we can print:
import hou
import pickle
import pandas as pd

# Load model (Python does not expand Houdini variables, so resolve $JOB explicitly)
model_path = hou.text.expandString('$JOB/scenes/data/sphere_cube_model.sav')
with open(model_path, 'rb') as model_file:
    logmodel = pickle.load(model_file)

# Get test object data
current_node = hou.pwd()
geo = current_node.geometry()

# Get data for current object (same feature names and order as in training)
data = [{
    "points": len(geo.points()),
    "faces": len(geo.prims())}]
data_frame = pd.DataFrame(data)

# Determine if it's a Cube
prediction = logmodel.predict(data_frame)[0]
print(f">> This is a Cube: {prediction}")
As you can see, we detect if the input object is a cube or not. Remember, when we trained our model we defined a cube as the target variable (y = df['cube']).
Change the switch input and see if the prediction is still correct.
This setup is not limited to Houdini's procedural primitives. For example, create a cube and a sphere in Maya, export the models to OBJ, load them in Houdini with a File SOP, and see how it works.
If you were able to replicate the setup and it is working as described, you can start doing your research and experiments.
This article is the final result of my exploration. I started by replicating another Logistic Regression tutorial and found some flaws in the concept. Then I came up with my own idea of a relevant task for Logistic Regression, the one presented here. Initially, I trained the model with several extra features (curvature, scale, surface area, and volume), but then I decided to shrink the feature set as much as possible. It turned out that the number of points in conjunction with the number of faces is enough to train a very reliable model.
Try to break the model and then fix it!
Change sphere or cube parameters (number of rows and columns, scale, rotation, etc). Find an object parameter value so the program will fail to predict correctly, then try to fix the model (by introducing new features, for example).
I did not have any luck with this until I tried other primitives: the Dodecahedron (Platonic Solids SOP) was detected as a cube! Can you fix this?
Try to lower the number of data samples. Currently, we have 250 copies each of the cube and the sphere; will 100 copies be enough? Maybe 50 (I tried, and it still works, which is suspicious)? What about 10?
What if you train the model with the cube and torus data instead of a sphere?
Before diving into the magic of teaching the computer, we need to install and utilize several extra things.
You can run Python programs on your computer, for this tutorial it would not make any difference since we will deal with tiny data sets.
However, in more realistic scenarios, the computational resources might become a bottleneck very fast. We will learn another new thing here! It is a common workflow to perform Machine Learning computations in the cloud; Google Colab and Jupyter Notebook are essential tools in this field.
Google Colab is easy to access and use, it offers computational resources for free (to some extent), and it has all the necessary libraries installed.
Install Python. Why do we need Python in the OS if we have Python in Houdini? Well, we will need some Python libraries, that are not included in the Python shipped with Houdini, and getting those is much easier with Python installed in your system.
To avoid issues, your major and minor Python versions in Houdini and OS should match. If you open the "Python Shell" tab in Houdini it will tell you the version of Python there. Download and install the proper minor version of Python 3 for your OS.
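A quick way to compare the two interpreters is to run the same small snippet in Houdini's Python Shell and in the system Python and check that the output matches. The function name here is just an illustration.

```python
import sys

def python_major_minor():
    """Return the running interpreter's version as 'major.minor', e.g. '3.10'."""
    return f"{sys.version_info.major}.{sys.version_info.minor}"

print(python_major_minor())
```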
Once installed, Python will be here:
C:/Users/<user name>/AppData/Local/Programs/Python/Python310
This tutorial will utilize Pandas, a very efficient library for analyzing, cleaning, exploring, and manipulating data.
Run Command Prompt and type pip install pandas. If you have Python installed properly, you should get Pandas in your system here:
C:/Users/<user name>/AppData/Local/Programs/Python/Python310/Lib/site-packages/pandas
Now you need to tell Houdini Python where to search for Pandas (or any other extra library). Add this to your houdini.env file:
PYTHONPATH = "C:/Users/<user name>/AppData/Local/Programs/Python/Python310/Lib/site-packages;$PYTHONPATH"
We will train the model with the sklearn module in Colab, but to unpack this model in Houdini you will need this library installed locally as well. Check which version is in Colab:
import sklearn
print(sklearn.__version__)
>> 1.3.2
Install the proper version in your OS: pip install scikit-learn==1.3.2
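To confirm that both environments ended up with matching versions, a small helper like the one below can be run in Colab and in Houdini's Python Shell. It is a sketch using the standard importlib.metadata module; the helper name is hypothetical.

```python
from importlib import metadata

def installed_version(package_name):
    """Return the installed version string of a package, or None if it is missing."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None

# Distribution names as used by pip; compare the output between environments
for name in ("scikit-learn", "pandas"):
    print(name, installed_version(name))
```

A result of None in Houdini suggests that the PYTHONPATH entry in houdini.env does not point to the folder where pip installed the package.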