AI Based Parameter Prediction Model for conv_hip_igemm_group_fwd_xdlops Solver #2523

Merged Dec 8, 2023 (23 commits; changes shown from 8 commits)

Commits
bbbfde6: implemented tuning heuristic for ck_igemm_group_fwd (Dmantri98, Nov 10, 2023)
52ebde1: update testing (Dmantri98, Nov 11, 2023)
c4269ee: cleanup testing (Dmantri98, Nov 11, 2023)
690304e: clang-format and final changes (Dmantri98, Nov 14, 2023)
65154f8: Merge develop (Dmantri98, Nov 15, 2023)
e6e82d6: hip tidy fixes (Dmantri98, Nov 15, 2023)
bb71c21: take out redundant void (Dmantri98, Nov 16, 2023)
ffbc428: fix cppcheck error (Dmantri98, Nov 16, 2023)
6dd08ee: replace temp vector with erase (Dmantri98, Nov 17, 2023)
69c2351: add guards and takeout json_fwd.hpp (Dmantri98, Nov 21, 2023)
275980d: Merge branch 'develop' of https://github.com/ROCmSoftwarePlatform/MIO… (Dmantri98, Nov 21, 2023)
a358946: takeout architecture check (Dmantri98, Nov 21, 2023)
d5d41ef: Merge branch 'develop' into dmantri/conv_hip_igemm_grp_fwd_xdlops_heu… (junliume, Nov 23, 2023)
9ccba71: Merge branch 'develop' of https://github.com/ROCmSoftwarePlatform/MIO… (Dmantri98, Nov 28, 2023)
9dadd17: cmake variable check (Dmantri98, Nov 28, 2023)
ec59604: simplfy input_tensor shaping (Dmantri98, Nov 30, 2023)
acf1d93: merge develop (Dmantri98, Nov 30, 2023)
c593b63: Merge branch 'develop' into dmantri/conv_hip_igemm_grp_fwd_xdlops_heu… (Dmantri98, Dec 4, 2023)
5cc829d: env variable and parameter template changes (Dmantri98, Dec 5, 2023)
f5094ae: Merge branch 'develop' into dmantri/conv_hip_igemm_grp_fwd_xdlops_heu… (Dmantri98, Dec 5, 2023)
77caaa8: clang format (Dmantri98, Dec 6, 2023)
f62951b: add back const (Dmantri98, Dec 6, 2023)
abd1738: Merge branch 'develop' into dmantri/conv_hip_igemm_grp_fwd_xdlops_heu… (Dmantri98, Dec 7, 2023)
29 changes: 20 additions & 9 deletions src/conv/heuristics/ai_heuristics.cpp
@@ -436,7 +436,8 @@ Metadata::Metadata(const std::string& arch, const std::string& solver)
const nlohmann::json metadata =
common::LoadJSON(GetSystemDbPath() + "/" + arch + "_" + solver + "_metadata.ktn.model");
num_tuning_params = metadata["num_tuning_params"].get<std::size_t>();
tuning_decodings = metadata["decodings"]["tunings"].get<std::unordered_map<std::string, int>>();
tuning_decodings =
metadata["decodings"]["tunings"].get<std::unordered_map<std::string, std::string>>();
}

class Model
@@ -450,9 +451,14 @@ class Model
{
}
virtual ~Model() = default;
fdeep::tensors Encode(const std::vector<float>& features, std::size_t dim) const
fdeep::tensors Encode(const std::vector<float>& features, std::size_t dim, bool transform) const
{
fdeep::tensor input_tensor = fdeep::tensor(fdeep::tensor_shape(dim, dim), features);
if(transform)
{
fdeep::tensor input_tensor = fdeep::tensor(fdeep::tensor_shape(dim, dim), features);
return encoder.predict({input_tensor});
}
fdeep::tensor input_tensor = fdeep::tensor(fdeep::tensor_shape(dim, 1), features);
return encoder.predict({input_tensor});
}
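Note: the new transform flag selects the encoder's input shape, a dim x dim matrix when the features are a flattened square (ConvAsm1x1U, where dim = sqrt(features.size())), or a dim x 1 column vector of raw features. A minimal self-contained sketch of that dispatch, assuming only the frugally-deep (fdeep) API already used above:

    // Minimal sketch of the shape dispatch, assuming frugally-deep's API.
    #include <fdeep/fdeep.hpp>
    #include <vector>

    fdeep::tensor
    MakeEncoderInput(const std::vector<float>& features, std::size_t dim, bool transform)
    {
        if(transform) // features are a flattened dim x dim matrix
            return fdeep::tensor(fdeep::tensor_shape(dim, dim), features);
        return fdeep::tensor(fdeep::tensor_shape(dim, 1), features); // raw column vector
    }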
fdeep::tensors Decode(const float prev_token, const fdeep::tensors& context) const
@@ -509,11 +515,16 @@ std::shared_ptr<Model> GetModel(const std::string& arch, const std::string& solv
bool ModelSetParams(const std::string& arch,
const std::string& solver,
const std::vector<float>& features,
std::function<bool(int, int)> validator)
bool transform_features,
std::function<bool(std::size_t, std::string)> validator)
{
auto model = GetModel(arch, solver);
int dim = std::sqrt(features.size());
fdeep::tensors context = model->Encode(features, dim);
auto model = GetModel(arch, solver);
int dim = 0;
if(transform_features)
dim = std::sqrt(features.size());
else
dim = features.size();
fdeep::tensors context = model->Encode(features, dim, transform_features);
Comment on lines +515 to +520

[Collaborator] Can we make dim a part of the Model object? Since dim is known a priori, maybe we can just add it to the metadata for the model.

[Contributor Author, Dmantri98] This would be fine, but I don't see the point: we have to hard-code the feature vector anyway, so if we know the size of the feature vector we can just extract dim from it.
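A rough sketch of what the suggestion would look like (hypothetical input_dim field; not adopted in this PR):

    // Hypothetical sketch of the reviewer's suggestion: store the encoder input
    // dimension in the model metadata instead of deriving it at the call site.
    #include <cstddef>
    #include <string>
    #include <unordered_map>

    struct Metadata
    {
        std::size_t num_tuning_params;
        std::size_t input_dim; // hypothetical field, read from the .ktn.model JSON
        std::unordered_map<std::string, std::string> tuning_decodings;
    };

    // ModelSetParams could then drop the transform_features/sqrt branching:
    //   fdeep::tensors context = model->Encode(features, model->metadata.input_dim, ...);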

float decoder_input = 0.0;
for(std::size_t i = 0; i < model->metadata.num_tuning_params; ++i)
{
@@ -529,9 +540,9 @@ bool ModelSetParams(const std::string& arch,
{
int token = pq.top().second;
// convert index to token value
int value = model->metadata.tuning_decodings[std::to_string(token)];
std::string value = model->metadata.tuning_decodings[std::to_string(token)];
pq.pop();
if(value < 0)
if(value == "-1")
return false;
if(validator(i, value))
{
5 changes: 3 additions & 2 deletions src/include/miopen/conv/heuristics/ai_heuristics.hpp
@@ -81,14 +81,15 @@ namespace tuning {
struct Metadata
{
std::size_t num_tuning_params;
std::unordered_map<std::string, int> tuning_decodings;
std::unordered_map<std::string, std::string> tuning_decodings;
Metadata(const std::string& arch, const std::string& solver);
};

bool ModelSetParams(const std::string& arch,
const std::string& solver,
const std::vector<float>& features,
std::function<bool(int, int)> validator);
bool transform_features,
std::function<bool(std::size_t, std::string)> validator);
} // namespace tuning
#endif // MIOPEN_ENABLE_AI_KERNEL_TUNING
} // namespace ai
23 changes: 16 additions & 7 deletions src/include/miopen/solver.hpp
@@ -370,12 +370,8 @@ struct PerformanceConfigConvAsm1x1U : PerfConfigBase<PerformanceConfigConvAsm1x1

void StaticHeuristic(const miopen::conv::ProblemDescription& problem);
void HeuristicInit(const ExecutionContext&, const miopen::conv::ProblemDescription&);
#if MIOPEN_ENABLE_AI_KERNEL_TUNING
void RunParmeterPredictionModel(const ExecutionContext&,
const miopen::conv::ProblemDescription&,
bool& valid);
bool ModelApplyToken(int index, int value, const miopen::conv::ProblemDescription&);
#endif
bool IsModelApplicable(const ExecutionContext& ctx,
const miopen::conv::ProblemDescription& problem) const;
bool IsValidValue() const { return IsValidValueImpl(8); }
bool SetNextValue(const miopen::conv::ProblemDescription&);
bool IsValid(const ExecutionContext&, const miopen::conv::ProblemDescription& problem) const
@@ -399,6 +395,10 @@ struct PerformanceConfigConvAsm1x1U : PerfConfigBase<PerformanceConfigConvAsm1x1
{
return IsValidValueImpl(sequence_length);
}
void RunParmeterPredictionModel(const ExecutionContext&,
const miopen::conv::ProblemDescription&,
bool& valid);
bool ModelApplyToken(int index, std::string value, const miopen::conv::ProblemDescription&);
#endif
bool IsValidImpl(const miopen::conv::ProblemDescription& problem, int sequence_length) const;
bool IsValidValueImpl(int sequence_length) const;
@@ -4488,7 +4488,7 @@ struct PerformanceConfigHipImplicitGemmGroupFwdXdlops
: PerformanceConfigHipImplicitGemmGroupFwdXdlops(0, "")
{
}
void HeuristicInit(const miopen::conv::ProblemDescription&);
void HeuristicInit(const ExecutionContext&, const miopen::conv::ProblemDescription&);
bool SetNextValue(const miopen::conv::ProblemDescription&);
bool IsValidValue() const;
bool IsValid(const ExecutionContext&, const miopen::conv::ProblemDescription& problem) const
@@ -4497,12 +4497,21 @@
}
bool IsValid(const miopen::conv::ProblemDescription&) const;
bool operator==(const PerformanceConfigHipImplicitGemmGroupFwdXdlops& other) const;
bool IsModelApplicable(const ExecutionContext& ctx,
const miopen::conv::ProblemDescription& problem) const;

private:
std::vector<int> heuristic_indexes;
std::vector<std::vector<std::string>> heuristic_kernels;
template <typename DataType>
void Init(const miopen::conv::ProblemDescription&);
template <typename DataType>
bool CheckIsSupportCKArgs(const miopen::conv::ProblemDescription&) const;
template <typename DataType>
void RunParameterPredictionModel(const ExecutionContext& ctx,
const miopen::conv::ProblemDescription& problem);
void InitHeuristicKernelIDs();
bool ModelApplyToken(int idx, std::string value);
};

struct ConvHipImplicitGemmGroupFwdXdlops final
76 changes: 38 additions & 38 deletions src/kernels/gfx908_ConvAsm1x1U_metadata.ktn.model
@@ -2,44 +2,44 @@
"num_tuning_params": 8,
"decodings": {
"tunings": {
"0": -1,
"1": 4,
"2": 2,
"3": 1,
"4": 3,
"5": 16,
"6": 8,
"7": 1,
"8": 4,
"9": 32,
"10": 4,
"11": 1,
"12": 2,
"13": 5,
"14": 7,
"15": 3,
"16": 6,
"17": 8,
"18": 64,
"19": 16,
"20": 32,
"21": 4,
"22": 1,
"23": 1,
"24": 3,
"25": 2,
"26": 4,
"27": 2,
"28": 4,
"29": 1,
"30": 2,
"31": 1,
"32": 4,
"33": 2,
"34": 4,
"35": 8,
"36": 1,
"37": -1
"0": "-1",
"1": "4",
"2": "2",
"3": "1",
"4": "3",
"5": "16",
"6": "8",
"7": "1",
"8": "4",
"9": "32",
"10": "4",
"11": "1",
"12": "2",
"13": "5",
"14": "7",
"15": "3",
"16": "6",
"17": "8",
"18": "64",
"19": "16",
"20": "32",
"21": "4",
"22": "1",
"23": "1",
"24": "3",
"25": "2",
"26": "4",
"27": "2",
"28": "4",
"29": "1",
"30": "2",
"31": "1",
"32": "4",
"33": "2",
"34": "4",
"35": "8",
"36": "1",
"37": "-1"
}
}
}
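The decode loop in ai_heuristics.cpp looks up each predicted token by its stringified index in this table; indices 0 and 37 map to the sentinel "-1", which aborts the prediction. A minimal sketch of the lookup, assuming only the map loaded above (DecodeToken is an illustrative name, not a function in the PR):

    // Sketch: how a predicted token index is decoded to a tuning-parameter value.
    #include <string>
    #include <unordered_map>

    bool DecodeToken(const std::unordered_map<std::string, std::string>& tuning_decodings,
                     int token,
                     std::string& value)
    {
        value = tuning_decodings.at(std::to_string(token)); // e.g. token 5 -> "16"
        return value != "-1"; // "-1" is the sentinel for "no valid prediction"
    }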

Large diffs are not rendered by default. (two model files)

@@ -0,0 +1,36 @@
{
"num_tuning_params": 9,
"decodings": {
"tunings": {
"0": "-1",
"1": "64",
"2": "256",
"3": "128",
"4": "64",
"5": "128",
"6": "32",
"7": "256",
"8": "32",
"9": "128",
"10": "64",
"11": "256",
"12": "32",
"13": "16",
"14": "Default",
"15": "OddC",
"16": "2",
"17": "1",
"18": "4",
"19": "1",
"20": "2",
"21": "4",
"22": "8",
"23": "1",
"24": "4",
"25": "8",
"26": "1",
"27": "4",
"28": "-1"
}
}
}
36 changes: 18 additions & 18 deletions src/solver/conv_asm_1x1u.cpp
@@ -367,37 +367,42 @@ bool PerformanceConfigConvAsm1x1U::IsValidImpl(const ProblemDescription& problem
}
return true;
}
#if MIOPEN_ENABLE_AI_KERNEL_TUNING

bool PerformanceConfigConvAsm1x1U::ModelApplyToken(int index,
int value,
std::string value,
const ProblemDescription& problem)
{
int val = stoi(value);
switch(index)
{
case 0: read_size = value; break;
case 1: k_mult = value; break;
case 2: chunks_per_wave = value; break;
case 3: chunk_size = value; break;
case 4: n_mult = value; break;
case 5: c_mult = value; break;
case 6: waves_c_in_group = value; break;
case 7: waves_k_in_group = value; break;
case 0: read_size = val; break;
case 1: k_mult = val; break;
case 2: chunks_per_wave = val; break;
case 3: chunk_size = val; break;
case 4: n_mult = val; break;
case 5: c_mult = val; break;
case 6: waves_c_in_group = val; break;
case 7: waves_k_in_group = val; break;
default: return false;
}
// this function may leave PerformanceConfigConvAsm1x1U in a partially valid or invalid state
return this->IsPartiallyValid(problem, index + 1);
}
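The switch converts the decoded string back to an int with stoi because every ConvAsm1x1U parameter is numeric; the string-valued interface exists so that solvers with non-numeric tokens (for example the "Default" and "OddC" entries in the new metadata file above) can share the same ModelSetParams plumbing. A hypothetical non-numeric case would apply the string directly:

    // Hypothetical sketch only; 'gemm_spec' is an invented member name:
    // case 14: gemm_spec = value; break; // value is "Default" or "OddC"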

static bool IsModelApplicable(const ExecutionContext& ctx, const ProblemDescription& problem)
bool PerformanceConfigConvAsm1x1U::IsModelApplicable(const ExecutionContext& ctx,
const ProblemDescription& problem) const
{
if(!miopen::IsEnabled(MIOPEN_DEBUG_CONV_DIRECT_ASM_1X1U_AI_HEUR{}))
#if MIOPEN_ENABLE_AI_KERNEL_TUNING
if(miopen::IsDisabled(MIOPEN_DEBUG_CONV_DIRECT_ASM_1X1U_AI_HEUR{}))
return false;
if(ctx.GetStream().GetDeviceName() != "gfx908")
return false;
if(problem.GetKernelStrideH() != 1)
return false;
return true;
#else
return false;
#endif
[Collaborator] @JehandadKhan do you think KernelTuningNet trained for one GPU might also work well for the others (like TunaNet does)? I don't think this would be the case here, because the mapping from kernel parameters to kernel runtime is a very unstable one, but I was wrong about TunaNet, too :)

}

static std::vector<float> TransformFeatures(const ProblemDescription& problem, std::size_t n)
Expand Down Expand Up @@ -426,15 +431,14 @@ void PerformanceConfigConvAsm1x1U::RunParmeterPredictionModel(const ExecutionCon
static const std::string& arch = ctx.GetStream().GetDeviceName();
static const std::string solver = "ConvAsm1x1U";
std::vector<float> features = TransformFeatures(problem, n);
if(ai::tuning::ModelSetParams(arch, solver, features, [&](int idx, int value) {
if(ai::tuning::ModelSetParams(arch, solver, features, true, [&](int idx, std::string value) {
return this->ModelApplyToken(idx, value, problem);
}))
{
MIOPEN_LOG_I("Params set by AI: " << ToString());
valid = true;
}
}
#endif

void PerformanceConfigConvAsm1x1U::StaticHeuristic(const ProblemDescription& problem)
{
@@ -488,17 +492,13 @@ void PerformanceConfigConvAsm1x1U::HeuristicInit(const ExecutionContext& ctx,
if(problem.GetInDataType() == miopenDouble)
MIOPEN_THROW("Double data type is not supported by ConvAsm1x1U");

#if MIOPEN_ENABLE_AI_KERNEL_TUNING
if(IsModelApplicable(ctx, problem))
{
bool valid = false;
RunParmeterPredictionModel(ctx, problem, valid);
if(valid)
return;
}
#else
std::ignore = ctx;
averinevg marked this conversation as resolved.
Show resolved Hide resolved
#endif
StaticHeuristic(problem);
MIOPEN_LOG_I(ToString());
}