Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Automatically use indexes configured declaratively #163

Closed
mfenniak opened this issue Nov 8, 2013 · 5 comments
Closed

Automatically use indexes configured declaratively #163

mfenniak opened this issue Nov 8, 2013 · 5 comments

Comments

@mfenniak
Copy link
Owner

mfenniak commented Nov 8, 2013

It would be cool if we didn't have to specify index names for Get, GetAll, Between, and OrderBy, and any other current or future operation that can optionally take an index name.

My thought is that we could extend the IObjectDatumConverter interface, which currently maps field & property references into names, to add an optional ability to have an index associated with a field. Then the DataContract & JSON serializers could have an attribute on fields (eg. [Indexed("id")]) that makes it automatic for Get, GetAll, Between, and OrderBy to use the appropriate index. However, this is pretty limited... RethinkDB supports indexes on expressions, not just fields, and we wouldn't be able to do that with this approach. So explicit index specification would always need to be supported, but the 95% case could work automatically.

I wonder if there's a good justification for why RethinkDB always requires explicit index naming... some reason why we should avoid this automatic idea...

@karlgrz
Copy link
Collaborator

karlgrz commented Nov 8, 2013

They talk about that somewhere in their site (I remember reading something
regarding it, just don't have laptop until later).
On Nov 7, 2013 9:17 PM, "Mathieu Fenniak" notifications@github.com wrote:

It would be cool if we didn't have to specify index names for Get, GetAll,
Between, and OrderBy, and any other current or future operation that can
optionally take an index name.

My thought is that we could extend the IObjectDatumConverter interface,
which currently maps field & property references into names, to add an
optional ability to have an index associated with a field. Then the
DataContract & JSON serializers could have an attribute on fields (eg.
[Indexed("id")]) that makes it automatic for Get, GetAll, Between, and
OrderBy to use the appropriate index. However, this is pretty limited...
RethinkDB supports indexes on expressions, not just fields, and we wouldn't
be able to do that with this approach. So explicit index specification
would always need to be supported, but the 95% case could work
automatically.

I wonder if there's a good justification for why RethinkDB always requires
explicit index naming... some reason why we should avoid this automatic
idea...


Reply to this email directly or view it on GitHubhttps://github.com//issues/163
.

@mfenniak
Copy link
Owner Author

mfenniak commented Nov 8, 2013

EqJoin is another potential index user.

I peeked through rethinkdb.com, Karl... the best I could find is a reference to them not having a query planner, so they don't automatically use secondary indexes. I think that makes a lot of sense from the server-side, but it leaves the door open for a client-side library to make index referencing a little more automatic. If you happen across any more clear reasoning, I'd love to hear it. :-)

@mfenniak
Copy link
Owner Author

mfenniak commented Nov 8, 2013

Upon further investigation, this might not be a good fit. GetAll, EqJoin, and Between are all operations that must operate against an index, and default to the primary-key index. My thought last night was that they "optionally" took an index, so we could lookup an attribute on the field and use the index name if available but do a non-indexed operation otherwise.

Since the indexes are required for these operations, I think the declarative model I had in mind would be a bad one. Removing an index attribute from a field would continue to compile, but operations against it would start failing at runtime. That's unfortunate.

I still think there's some potential for specifying indexes in a manner other than just having a string "indexName" parameter, but I'm not sure what a good alternative is.

@bchavez
Copy link
Collaborator

bchavez commented Nov 8, 2013

Here's how I work with indexes now with RethinkDB. I don't know if this helps or not:

    public static class Index
    {
        public class Account
        {
            public static string Logins = "logins_index";
        }
    }

    // DEFINE INDEXES
    internal static class Schema
    {
        static Schema()
        {
            Tables.RegisterTable<Account>()
                .AddIndex<Account>( x => x.IndexCreate( Index.Account.Logins, y => y.Logins, multiIndex: true ) );

        }

        public static Dictionary<Type, Table> Tables = new Dictionary<Type, Table>();
    }

    public static class ExtensionsForTable
    {
        public static Table RegisterTable<T>( this Dictionary<Type, Table> tables )
        {
            var table = new Table( typeof(T) );
            tables.Add( typeof(T), table );
            return table;
        }
    }

    public class Table
    {
        public Table( Type domainType )
        {
            DomainType = domainType;
            this.Name = domainType.Name;
            this.Indexes = new List<IWriteQuery<DmlResponse>>();

            var contract = RethinkDb.Newtonsoft.Configuration
                .ConfigurationAssembler
                .DefaultJsonSerializerSettings
                .ContractResolver
                .ResolveContract( DomainType ) as JsonObjectContract;

            var idprop = contract.Properties.GetProperty( "id", StringComparison.OrdinalIgnoreCase );
            var primaryKeyFiledName = idprop.PropertyName;

            this.TableDef = Db.Query.TableCreate( this.Name, primaryKey: primaryKeyFiledName );
        }

        public string Name { get; set; }
        public Type DomainType { get; set; }

        public IWriteQuery<DmlResponse> TableDef;

        public List<IWriteQuery<DmlResponse>> Indexes { get; set; }

        public ITableQuery<T> Query<T>()
        {
            return Db.Query.Table<T>( this.Name );
        }

        public Table AddIndex<T>(Func<ITableQuery<T>, IWriteQuery<DmlResponse>> indexDef )
        {
            var index = indexDef( Query<T>() );
            this.Indexes.Add( index );
            return this;
        }

        public Table Using(Func<IDatabaseQuery, IWriteQuery<DmlResponse>> tableDef )
        {
            var table = tableDef( Db.Query );
            this.Indexes.Add( table );
            return this;
        }
    }

Then querying GetAll with an index name:

        public async Task<Account> FindAsync( string login )
        {
            var results = this.Session.RunAsync( Table.GetAll( login,
                                           indexName: Index.Account.Logins ) );
            if ( await results.MoveNext() )
            {
                return results.Current;
            }
            return null;
        }

Generally, I use Schema as a class that defines all my tables and indexes. Then, inside a unit test, I get a collection of all the strongly typed table definitions Schema.Tables and create the entire database.

@mfenniak
Copy link
Owner Author

I decided to implement something similar to @bchavez's approach here; basically an object that represents an index, and provides a type-safe usage model wherever the index is accessed. For example:

var nameIndex = table.IndexDefine("name_index", o => o.Name);
var query = table.Between("Jack", "Jim", nameIndex);

In this example, nameIndex will be an object of type IIndex<T, string> (assuming the "Name" field on the object is a string). The Between query is then compile-time safe, guaranteeing that the left and right keys have the same type as the index. This type-safety seemed important with the introduction of the Group() query operator that works on an index.

This doesn't really "fix" what the issue was originally opened for, but it does make using indexes a lot nicer in rethinkdb-net. So, I'm going to call it good enough to close this issue.

Note that "IndexDefine" doesn't work on multiindexes, so I'm going to open a new issue for that...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants