Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Java] Elision of underscore characters in property names leads to invalid Java #4805

Open
gwynjudd opened this issue Feb 15, 2017 · 30 comments

Comments

@gwynjudd
Copy link

Description

The swagger file, being generated from an external model contains some property names that contain underscores e.g. "CHT_Provider". It also contains some other property names that are identical once the underscores are removed e.g. "CHTProvider".

Swagger-codegen version

λ java -jar swagger-codegen\snapshot\swagger-codegen-cli-2.2.2-SNAPSHOT.jar version 2.2.2-SNAPSHOT

NOTE:
Also replicates with 2.2.1

Swagger declaration file content or url

Below see an edited version of the swagger file, removed all paths and not required definitions

{
  "swagger": "2.0",
  "info": {
    "version": "v2",
    "title": "StayinFront.WebAPI V2"
  },
  "host": "localhost",
  "basePath": "/revapi",
  "schemes": [
    "http"
  ],
  "paths": {},
  "definitions": {
    "ClassInstanceBaseModel": {
      "description": "Base data for a class instance",
      "type": "object",
      "properties": {
        "class": {
          "description": "The class name",
          "type": "string"
        }
      }
    },
    "enums_CHTProvider": {
      "description": "CHTProvider",
      "type": "object",
      "properties": {
        "description": {
          "description": "description",
          "type": "string"
        },
        "data": {
          "enum": [
            "MOX",
            "RC"
          ],
          "type": "string"
        }
      }
    },
    "CHT_Configuration_Data": {
      "type": "object",
      "properties": {
        "CHTProvider": {
          "$ref": "#/definitions/enums_CHTProvider"
        },
        "CHT_Provider": {
          "$ref": "#/definitions/ClassInstanceBaseModel"
        }
      }
    }
  }
}

Using the below command line to generate code, the following Java code is generated for the CHTConfigurationData class:

...

/**
 * CHTConfigurationData
 */
@javax.annotation.Generated(value = "class io.swagger.codegen.languages.JavaClientCodegen", date = "2017-02-16T11:28:41.907+13:00")
public class CHTConfigurationData {
  @SerializedName("CHTProvider")
  private EnumsCHTProvider cHTProvider = null;

  @SerializedName("CHT_Provider")
  private ClassInstanceBaseModel cHTProvider = null;
...
}


Obviously this fails to compile due to the duplicate member names. Adding underscore characters back into the generated Java in appropriate places seems to resolve the compilation errors.

I would expect that the underscores weren't removed, but failing that could an option be added to avoid this behaviour?

Command line used for generation

java -jar swagger-codegen\snapshot\swagger-codegen-cli-2.2.1.jar generate -i swagger-file.json -l java -o output-dir

Steps to reproduce
  1. generate Java source using the above command line
  2. attempt to compile
Related issues

Here is a similar issue relating to changing of case in property names:
#4066

Here is a similar issue in PHP language generation:
#4551

Here is a similar issue in node javascript code gen - there is a configuration option to disable the behaviour in this case:
#2766

@gwynjudd
Copy link
Author

I've tested a simple code change that seems to do what I want. Just for testing purposes, I commented out the code that removes underscores in camelize, it seems to have the desired effect. I can spend some time to implement such a feature properly, but I didn't want to do much more if I was going down the wrong track. Does it seem like a configurable feature to turn off removing underscores would be accepted? Or some other way to accomplish this?

    public static String camelize(String word, boolean lowercaseFirstLetter) {
        // Replace all slashes with dots (package separator)
        Pattern p = Pattern.compile("\\/(.?)");
        Matcher m = p.matcher(word);
        while (m.find()) {
            word = m.replaceFirst("." + m.group(1)/*.toUpperCase()*/); // FIXME: a parameter should not be assigned. Also declare the methods parameters as 'final'.
            m = p.matcher(word);
        }

        // case out dots
        String[] parts = word.split("\\.");
        StringBuilder f = new StringBuilder();
        for (String z : parts) {
            if (z.length() > 0) {
                f.append(Character.toUpperCase(z.charAt(0))).append(z.substring(1));
            }
        }
        word = f.toString();

        m = p.matcher(word);
        while (m.find()) {
            word = m.replaceFirst(
                    "" + Character.toUpperCase(m.group(1).charAt(0)) + m.group(1).substring(1)/*.toUpperCase()*/);
            m = p.matcher(word);
        }

        // Uppercase the class name.
        p = Pattern.compile("(\\.?)(\\w)([^\\.]*)$");
        m = p.matcher(word);
        if (m.find()) {
            String rep = m.group(1) + m.group(2).toUpperCase() + m.group(3);
            rep = rep.replaceAll("\\$", "\\\\\\$");
            word = m.replaceAll(rep);
        }

        if (false) {
            // Remove all underscores
            p = Pattern.compile("(_)(.)");
            m = p.matcher(word);
            while (m.find()) {
                word = m.replaceFirst(m.group(2).toUpperCase());
                m = p.matcher(word);
            }
        }

        if (lowercaseFirstLetter) {
            word = word.substring(0, 1).toLowerCase() + word.substring(1);
        }

        return word;
    }

@Ondra09
Copy link

Ondra09 commented Feb 17, 2017

Same for typescript-angular2:

"Request": {
                "device_type": {
                    "type": "string"
                },
            }
        },

generates:

export interface Request { 
    deviceType?: string;
}

Completely unusable with this bug. As from generated interface is created json with wrong keys.

@wing328
Copy link
Contributor

wing328 commented Feb 17, 2017

@Ondra09 please generate TS Angular2 from 2.3.0 branch (which should have the enhancement to deserialize the JSON with a property name mapping) or you can use the following option:

	modelPropertyNaming
	    Naming convention for the property: 'camelCase', 'PascalCase', 'snake_case' and 'original', which keeps the original name (Default: camelCase)

Ref: java -jar modules/swagger-codegen-cli/target/swagger-codegen-cli.jar config-help -l typescript-angular2

@wing328
Copy link
Contributor

wing328 commented Feb 17, 2017

@gwynjudd For the following model definition:

    "CHT_Configuration_Data": {
      "type": "object",
      "properties": {
        "CHTProvider": {
          "$ref": "#/definitions/enums_CHTProvider"
        },
        "CHT_Provider": {
          "$ref": "#/definitions/ClassInstanceBaseModel"
        }
      }

I assume the property names "CHTProvider" and "CHT_Provider" cannot be changed.

One workaround is to handle the special case in https://github.com/swagger-api/swagger-codegen/blob/master/modules/swagger-codegen/src/main/java/io/swagger/codegen/languages/AbstractJavaCodegen.java#L450

@wing328 wing328 modified the milestones: Future, v2.2.3 Feb 17, 2017
@Ondra09
Copy link

Ondra09 commented Feb 17, 2017

With modelPropertyNaming=original is everything working. This is often discussed in other threads too, now I can see it... It would be nice to have defaults set differently for different languages.

Thank you for great answer.

@wing328
Copy link
Contributor

wing328 commented Feb 17, 2017

@Ondra09 can you also try generating the client from 2.3.0 branch to see if the deserialization logic works for you?

Ideally we don't want developers changing the model property naming as we want model property naming conforming to the language style guide (e.g. camelCase for Typescript).

@Ondra09
Copy link

Ondra09 commented Feb 17, 2017

Same on 2.3.0 branch.

When is property renamed it will be automatically serialized with renamed name, instead of with correct one.

@gwynjudd
Copy link
Author

@wing328 yes that's correct I have no way to modify the property or object names

@wing328
Copy link
Contributor

wing328 commented Feb 20, 2017

Same on 2.3.0 branch.

Just want to make sure. You've done a mvn clean package to rebuild the JAR, right? If yes, I'll inform the author of that enhancement (property name mapping) to take a look.

@wing328
Copy link
Contributor

wing328 commented Feb 20, 2017

@gwynjudd understood. Have you tried the workaround I mentioned above? Would that help?

One workaround is to handle the special case in https://github.com/swagger-api/swagger-codegen/blob/master/modules/swagger-codegen/src/main/java/io/swagger/codegen/languages/AbstractJavaCodegen.java#L450

@gwynjudd
Copy link
Author

@wing328 do you mean the "if it's all uppper case, do nothing" bit? It wouldn't apply in my case because a number of classes do not match the regex:

    if (name.matches("^[A-Z_]*$")) {
        return name;
    }

@ataraxus
Copy link
Contributor

@wing328 the typescript generator does not have something similar like the java one which does the naming convention matching via annotations like @SerializedName or @JsonProperty. the typescript generator does camelcase by default. which just by accident until now was totally ok until i now generated a java client which tries to consume the API and fails since naming conventions dont match

@wing328
Copy link
Contributor

wing328 commented Feb 22, 2017

@ataraxus The feature is added via #4264 to 2.3.0

Can you pull the latest 2.3.0 and build the JAR locally to give it a try?

@Ondra09
Copy link

Ondra09 commented Feb 22, 2017

Same on 2.3.0 branch.
Just want to make sure. You've done a mvn clean package to rebuild the JAR, right? If yes, I'll inform the author of that enhancement (property name mapping) to take a look.

It was clean build. First try was from binaries obtained through brew.

@wing328
Copy link
Contributor

wing328 commented Feb 23, 2017

@Ondra09 you're right. Looks like the deserialization only looks into the "type" but there's no attribute name mapping defined.

@TiFu I wonder if you can take a look when you've time.

UPDATE: I review the code again and looks like there are attribute mapping defined.

@TiFu
Copy link
Contributor

TiFu commented Feb 23, 2017

The problem is that both cHTProvider and CHT_Provider are mapped to the same name (cHTProvider) while maintaining a different baseName. Is that correct?

I don't think that this is a bug with the Serializer/Deserializer. The serializer/deserializer assumes that a property appears only once in a class (and not twice because two properties are mapped to the same name).

This is a part of the code which is generated based on the MWE in the first comment and illustrates the issue:

export class CHTConfigurationData {
    'cHTProvider': EnumsCHTProvider;
    'cHTProvider': ClassInstanceBaseModel;

    static discriminator = undefined;

    static attributeTypeMap: Array<{name: string, baseName: string, type: string}> = [
        {
            "name": "cHTProvider",
            "baseName": "CHTProvider",
            "type": "EnumsCHTProvider"
        },
        {
            "name": "cHTProvider",
            "baseName": "CHT_Provider",
            "type": "ClassInstanceBaseModel"
        }    ];

    static getAttributeTypeMap() {
        return CHTConfigurationData.attributeTypeMap;
    }
}

The problem is caused by the Java-Code generating the mapping for properties. It produces a mapping which maps two properties to the same name.

Two possible solutions would be:

@gwynjudd
Copy link
Author

@TiFu I think the problem is that codegen generates code that can't compile because of the removal of underscores. The problem arises when underscores are removed from property names as shown in my original example, as well as if they are removed from other things such as class names. Just generally removing underscores is causing a problem because it results in duplicate names.

To me it seems like having an option to not remove underscores is the best solution. It doesn't result in invalid code, I think, but maybe I am missing something.

The first of your solutions would seem to mean that at codegen time, it would throw an error if this situation arises. Instead of being able to build the generated code, I would be unable to generate code at all. What would be the way forward in this case?

The second may work, if the class name problem is also resolved. But to me it doesn't seem as nice as just not removing underscores. Can you explain why underscores are removed? I don't understand that bit, it may help me to understand this solution.

@TiFu
Copy link
Contributor

TiFu commented Feb 26, 2017

The first solution is probably the worst of the possible solutions. The way forward, would be to change the modelPropertyNaming (e.g. use original instead), such that valid code can be generated.

The reason underscores are removed is that camel case normally doesn't contain underscores.

Introducing an option to not remove underscores would probably be the best option. Codegen should also detect duplicate class names/property names and at least print a warning/error (maybe with a hint to the appropriate flags to resolve duplicate identifiers).

@wing328 : Would this be an acceptable solution?

@wing328
Copy link
Contributor

wing328 commented Feb 27, 2017

@TiFu The issue you looked into is related to Java not Typescript.

For the TypeScript issue, I've opened a new issue for tracking: #4857

@wing328
Copy link
Contributor

wing328 commented Feb 27, 2017

Would this be an acceptable solution?

@TiFu we can introduce the option to keep/change the property naming but for this particular case, I think they can simply update https://github.com/swagger-api/swagger-codegen/blob/master/modules/swagger-codegen/src/main/java/io/swagger/codegen/languages/AbstractJavaCodegen.java#L450 to customize the behaviour in handling CHTProvider and CHT_Provider

@baynezy
Copy link

baynezy commented May 23, 2017

Is anyone working on this? As I am happy to fix it.

@wing328
Copy link
Contributor

wing328 commented May 25, 2017

@baynezy I don't think anyone is working on it.

Can you elaborate on the solution you want to apply to fix the isuse?

@Uebergeek702
Copy link

Uebergeek702 commented Jun 20, 2017

We have the same problem:
in Json there are two fields : ROW_ID (Long) and ROWID (String)

"PRAWJ202": {"type": "object","properties": {
"ROW_ID": { "format": "int64", "type": "integer" },
"TABLE_ID": { "type": "string" },
"ROWID": { "type": "string" }}}

and that is created:

   /**
   * Get ROW_ID
   * @return ROW_ID
  **/
  @ApiModelProperty(example = "null", value = "")
  public Long getROWID() {
    return ROW_ID;
  }

   /**
   * Get ROWID
   * @return ROWID
  **/
  @ApiModelProperty(example = "null", value = "")
  public String getROWID() {
    return ROWID;
  }

and that ends up in error: method getROWID() is already defined in class PRAWJ202
Tried with swagger-codegen-cli-2.2.2.jar

@baynezy : did you work on this?

@baynezy
Copy link

baynezy commented Jun 20, 2017

I didn't get onto it I'm afraid. I got swallowed up by other problems.

@Uebergeek702
Copy link

@baynezy : Can you give me a hint which class to touch? then I will give it a try - because I need that

@gwynjudd
Copy link
Author

@Uebergeek702 I didn't have time to come up with a good solution. I made a hack though which may help you to properly solve the problem. In DefaultCodegen.java, search for the text "// Remove all underscores" -my hack was to comment out the following lines, but a proper solution would involve making that behaviour controllable by option.

@wing328
Copy link
Contributor

wing328 commented Jun 21, 2017

@Uebergeek702
Copy link

@wing328 thank you for the hint - I also wrote you an email. Can you assign me to the issue? (Is that usual?) and help me to set everything up to contribute? Thx.

@Uebergeek702
Copy link

I did something - but I don't know how to contribute....

@wing328
Copy link
Contributor

wing328 commented Jun 22, 2017

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

7 participants