Skip to content
This repository has been archived by the owner on Sep 6, 2021. It is now read-only.

Getting mode from file extension won't always work #2923

Closed
MarkMurphy opened this issue Feb 21, 2013 · 10 comments
Closed

Getting mode from file extension won't always work #2923

MarkMurphy opened this issue Feb 21, 2013 · 10 comments

Comments

@MarkMurphy
Copy link
Contributor

Getting mode from file extension won't always work where some files don't have extensions like Makefiles for example.

Instead I propose that the getModeFromFileExtension method be refactored to getModeFromFilePath where the filename including the extension can be compared to a regex representing each mode.

for example:

var Mode,
    modes = [],
    modesByName = {};

var modeList = [{
    name: "Makefile", 
    options: "text/plain", 
    mime: "text/plain"
    matches: "^GNUmakefile|^makefile|^Makefile|^OCamlMakefile|make"
} , {
    name: "HTML"
    options: {
        name: "htmlmixed",
        scriptTypes: [{matches: /\/x-handlebars-template|\/x-mustache/i, mode: null}]
    },
    mime: "text/html"
    matches: "html|htm|shtm|shtm|xhtml|cfm|cfml|cfc|dhtml|xht"
}];

var Mode = function(name, options, mime, matches) {
    this.name = name;
    this.options= options;
    this.mime= mime;

    var re;

    if (/\^/.test(matches)) {
        re = matches.replace(/\|(\^)?/g, function(a, b){
            return "$|" + (b ? "^" : "^.*\\.");
        }) + "$";
    } else {
        re = "^.*\\.(" + matches+ ")$";
    }   

    this.matches= new RegExp(re, "gi");
};

Mode.prototype.supportsFile = function(filename) {
    return filename.match(this.matches);
};

for (var i = 0, length = modeList.length; i < length; i++) {
    var mode = modeList[i];
    mode = new Mode(mode.name, mode.options, mode.mime, mode.matches);
    modesByName[mode.name.toLowerCase()] = mode;
    modes.push(mode);
}

function getModeFromPath(path) {
    var filename = path.split(/[\/\\]/).pop(),
         length = modes.length
         i;

    for (i = 0; i < length; i++) {
        if (modes[i].supportsFile(filename)) {
            mode = modes[i];
            break;
        }
    }

    if (!mode) {
         console.log("Called EditorUtils.js _getModeFromFilePath with an unhandled file name or extension: " + filename);
        return "";
    }

    return mode;
}
@pthiess
Copy link
Contributor

pthiess commented Feb 25, 2013

Hi Mark,
This seems like a reasonable approach to deal with filenames without an extension. We may want to discuss it regarding performance.
However, due to the teams bandwidth we would need to put it on the backlog for now.
Would you feel confident to work on this by yourself?
If so, that would be cool and I would encourage you to start a discussion about the best solution on our developer list (Google Groups).
Thanks a ton for your suggestion!

@peterflynn
Copy link
Member

Are there cases where you'd need a full-blown regexp? Or would a list of extensionless filenames suffice? Something like this:

{
    name: "Makefile", 
    options: "text/plain", 
    mime: "text/plain",
    fileExtensions: [],
    fileNames: ["GNUmakefile", "makefile", "Makefile", "OCamlMakefile", "make"]
}

I'd worry a little bit that regexps make it too easy to create bugs where language extensions match an overly-broad set of filenames. (I think your example above actually contains such a bug :-) -- there's no "^" before the last item, so any filename containing the contiguous substring "make" would get matched). Regexps also make it impractical to detect clashing extensions, whereas with simple lists we can automatically detect & warn when two language extensions are fighting over the same file.

@MarkMurphy
Copy link
Contributor Author

@peterflynn no bug, if you look at the Mode constructor, it assumes to add a '$' to the expression:

if (/\^/.test(matches)) {
    re = matches.replace(/\|(\^)?/g, function(a, b){
    return "$|" + (b ? "^" : "^.*\\.");
    }) + "$";
} else {
    re = "^.*\\.(" + matches+ ")$";
}   

this.matches= new RegExp(re, "gi");

In this case the regexps are so simplistic that there's really no reason to worry about using them for this.

Performance wise, it wouldn't be any faster that the current implementation of using the switch + whatever time ti takes to evaluate against each regex.

What I do like about about this approach is that it makes it much easier for me to quickly see what extension is associated to what language/mode.

Another approach I was considering was to create a key/value pair object where the keys would be extensions and extension-less file names and the value would be an object containing the mode name, options, mime, etc.

Here's an example:

var modes = {
    "html": { name: 'HTML', mode: ... },
     "htm": { alias: "html" },
    "makefile": { name: 'Makefile', mode: ... }
};

function getModeFromFilePath (path)  {
    var filename = ... ,
        extension = ... , 
        syntax = { mime: 'text/plain', mode: 'text/plain' };

    extension = extension ? extension : filename;

    while(extension && modes[extension] && modes[extension].alias) {
        extension = modes[extension].alias;
    }

    return extension && modes[extension] ? modes[extension] : syntax;
}

@peterflynn
Copy link
Member

@MarkMurphy: this is definitely about to become a lot more declarative -- see pull request #2844, in particular LanguageManager.js and the languages.json file that feeds into it.

Sorry about the "bug" confusion -- I didn't realize that you were preprocessing the "match" string and not using it as a literal regexp directly.

I think my concerns there remain, though... I'd prefer if we could do this via a simple list of filenames, similar to your last proposal above. (Except that I think we should keep it separate from the list of extensions -- otherwise an extensionless file whose name matches any file extension would get mapped to that mode automatically, which seems wrong).

@MarkMurphy
Copy link
Contributor Author

@peterflynn that makes sense.

@peterflynn
Copy link
Member

Cool. Well, feel free to make a branch off of #2844's and start playing with an implementation of this if you're interested :-)

@MarkMurphy
Copy link
Contributor Author

@peterflynn Sure, I should be able to have something done soon.

MarkMurphy added a commit to MarkMurphy/brackets that referenced this issue Mar 3, 2013
MarkMurphy added a commit to MarkMurphy/brackets that referenced this issue Mar 3, 2013
DennisKehrig added a commit that referenced this issue Mar 13, 2013
Fixing issue #2923 - Getting mode from file extension won't always work
@DennisKehrig
Copy link
Contributor

@MarkMurphy Can you verify this as working?

@MarkMurphy
Copy link
Contributor Author

@DennisKehrig Confirmed. Tested by creating an extension-less file named "Cakefile" in a project which Brackets successfully interpreted as a CoffeeScript file.

@DennisKehrig
Copy link
Contributor

@MarkMurphy Woohoo, thanks! Then I'll go ahead and close this one :)

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

4 participants