Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replacing undocumented python example for Sparse Euclidian Distance with a meta-example #4945

Merged
merged 5 commits into from
Apr 3, 2020
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
20 changes: 20 additions & 0 deletions examples/meta/src/distance/sparseeuclidean.sg.in
Original file line number Diff line number Diff line change
@@ -0,0 +1,20 @@
LibSVMFile f_feats_a("@SHOGUN_DATA@/fm_train_sparsereal.dat")
LibSVMFile f_feats_b("@SHOGUN_DATA@/fm_test_sparsereal.dat")

#![create_features]
Features features_a = features(f_feats_a)
Features features_b = features(f_feats_b)
#![create_features]

#![create_instance]
Distance d = distance("SparseEuclideanDistance", lhs=features_a, rhs=features_a)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as the documentation stats SparseEuclideanDistance is expecting SparseFeatures.... imo the features you are reading above are not SparseFeatures but DenseFeatures.

plz make sure that you compile and test your changes before sending in a PR

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In all the other examples(.sg.in files), features_a and features_b are defined as variables of the Features class, and they directly become DenseFeatures by calling the constructor of the CDenseFeatures class.
How should we ensure that the features are sparse and the constructor of the CSparseFeatures class is called?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

none of the examples are actually using methods that requires SparseFeatures... for that actually you'll need to patch factory.h to be able to read sparse format (say libsvmfile) and create a SparseFeature from it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That might be a bit beyond an entrance task so might be worth starting an issue and then rebase this when it’s done?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah nvm! Does it work? :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Started new issue #4947

#![create_instance]

#![extract_distance]
RealMatrix distance_matrix_aa = d.get_distance_matrix()
#![extract_distance]

#![refresh_distance]
d.init(features_a, features_b)
RealMatrix distance_matrix_ab = d.get_distance_matrix()
#![refresh_distance]
26 changes: 0 additions & 26 deletions examples/undocumented/python/distance_sparseeuclidean.py

This file was deleted.

63 changes: 45 additions & 18 deletions src/shogun/util/factory.h
Original file line number Diff line number Diff line change
Expand Up @@ -18,6 +18,7 @@
#include <shogun/evaluation/MachineEvaluation.h>
#include <shogun/evaluation/SplittingStrategy.h>
#include <shogun/features/DenseFeatures.h>
#include <shogun/features/SparseFeatures.h>
#include <shogun/features/DenseSubsetFeatures.h>
#include <shogun/io/CSVFile.h>
#include <shogun/io/LibSVMFile.h>
Expand Down Expand Up @@ -106,25 +107,51 @@ namespace shogun
require(file, "No file provided.");
std::shared_ptr<Features> result = nullptr;

switch (primitive_type)
if(std::type_index(typeid(*file)) == std::type_index(typeid(LibSVMFile)))
{
case PT_FLOAT64:
result = std::make_shared<DenseFeatures<float64_t>>();
break;
case PT_FLOAT32:
result = std::make_shared<DenseFeatures<float32_t>>();
break;
case PT_FLOATMAX:
result = std::make_shared<DenseFeatures<floatmax_t>>();
break;
case PT_UINT8:
result = std::make_shared<DenseFeatures<uint8_t>>();
break;
case PT_UINT16:
result = std::make_shared<DenseFeatures<uint16_t>>();
break;
default:
not_implemented(SOURCE_LOCATION);
switch (primitive_type)
{
case PT_FLOAT64:
result = std::make_shared<SparseFeatures<float64_t>>();
break;
case PT_FLOAT32:
result = std::make_shared<SparseFeatures<float32_t>>();
break;
case PT_FLOATMAX:
result = std::make_shared<SparseFeatures<floatmax_t>>();
break;
case PT_UINT8:
result = std::make_shared<SparseFeatures<uint8_t>>();
break;
case PT_UINT16:
result = std::make_shared<SparseFeatures<uint16_t>>();
break;
default:
not_implemented(SOURCE_LOCATION);
}
}
else
{
switch (primitive_type)
{
case PT_FLOAT64:
result = std::make_shared<DenseFeatures<float64_t>>();
break;
case PT_FLOAT32:
result = std::make_shared<DenseFeatures<float32_t>>();
break;
case PT_FLOATMAX:
result = std::make_shared<DenseFeatures<floatmax_t>>();
break;
case PT_UINT8:
result = std::make_shared<DenseFeatures<uint8_t>>();
break;
case PT_UINT16:
result = std::make_shared<DenseFeatures<uint16_t>>();
break;
default:
not_implemented(SOURCE_LOCATION);
}
}
result->load(file);
return result;
Expand Down